🚀 Quick Start Guide

1. Clone and Setup

git clone <repository-url>
cd rag-latency-demo
python setup.py

2. Run Benchmarks

# Comprehensive benchmark
python working_benchmark.py

# Ultimate speed test
python ultimate_benchmark.py
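
For readers who want to time a pipeline themselves, the following is a minimal sketch of how an average-latency figure could be measured. It is not taken from working_benchmark.py or ultimate_benchmark.py; the helper name `average_latency_ms`, its parameters, and the `fake_query` stand-in are all hypothetical.

```python
import statistics
import time
from typing import Callable


def average_latency_ms(query_fn: Callable[[str], object], question: str,
                       runs: int = 20, warmup: int = 2) -> float:
    """Return the mean wall-clock latency of query_fn(question) in milliseconds."""
    # Warm-up calls are excluded so one-time costs (model load, cold caches)
    # do not skew the average.
    for _ in range(warmup):
        query_fn(question)

    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        query_fn(question)
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.mean(samples)


if __name__ == "__main__":
    # Hypothetical stand-in for a RAG pipeline's query method.
    def fake_query(question: str) -> str:
        time.sleep(0.005)
        return f"answer to: {question}"

    latency = average_latency_ms(fake_query, "What is artificial intelligence?")
    print(f"{latency:.1f} ms average")
```

Any callable that takes a question string can be passed as `query_fn`, so the same helper could be pointed at each implementation in turn to compare them under identical conditions.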

3. Start API Server

uvicorn app.main:app --reload

Then open http://localhost:8000/docs in a browser for the interactive API documentation.

4. Test the API (PowerShell)

$body = @{
    question = "What is artificial intelligence?"
} | ConvertTo-Json

Invoke-RestMethod -Uri "http://localhost:8000/query" `
    -Method Post `
    -ContentType "application/json" `
    -Body $body

Or with curl (bash syntax shown; in Command Prompt, replace each trailing \ with ^):

curl -X POST http://localhost:8000/query \
  -H "Content-Type: application/json" \
  -d "{\"question\": \"What is artificial intelligence?\"}"
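
The same request can also be sent from Python using only the standard library. This is a sketch that mirrors the curl call above; the function names `build_query_request` and `ask` are hypothetical, and the shape of the JSON response is not documented here, so the result is returned as-is.

```python
import json
import urllib.request


def build_query_request(question: str,
                        base_url: str = "http://localhost:8000") -> urllib.request.Request:
    """Build a POST request for the /query endpoint with a JSON body."""
    payload = json.dumps({"question": question}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/query",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(question: str, base_url: str = "http://localhost:8000",
        timeout: float = 30.0):
    """Send the question to a running server and return the decoded JSON response."""
    request = build_query_request(question, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server from step 3 running, `ask("What is artificial intelligence?")` should return the parsed response body.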

Project Structure

rag-latency-demo/
├── app/                # Core application code
├── data/               # Sample documents (add your own)
├── models/             # ML models (downloaded automatically)
├── scripts/            # Utility scripts
├── benchmarks/         # Benchmark results (generated)
├── requirements.txt    # Python dependencies
├── Dockerfile          # Container deployment
└── README.md           # Full documentation

Key Files
app/rag_naive.py - Baseline implementation

app/rag_optimized.py - Optimized RAG with caching

app/no_compromise_rag.py - Ultimate optimization

app/main.py - FastAPI server

config.py - Configuration settings
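
To illustrate the kind of caching that app/rag_optimized.py is described as adding, here is a minimal sketch of memoizing repeated questions. It is an assumption about one plausible approach, not the code in that file; the `CachedRAG` wrapper and its normalization rule are hypothetical.

```python
from functools import lru_cache


class CachedRAG:
    """Wrap any object exposing query(question) and memoize repeated questions."""

    def __init__(self, rag, maxsize: int = 256):
        self._rag = rag
        # A per-instance LRU cache, so separate wrappers do not share entries.
        self._cached_query = lru_cache(maxsize=maxsize)(self._run)

    def _run(self, normalized_question: str):
        # Only reached on a cache miss; the wrapped pipeline does the real work.
        return self._rag.query(normalized_question)

    def query(self, question: str):
        # Lowercase and collapse whitespace so trivially different
        # spellings of the same question share one cache entry.
        normalized = " ".join(question.lower().split())
        return self._cached_query(normalized)
```

The trade-off in this sketch is that the wrapped pipeline receives the normalized question, and cached answers go stale if the documents in data/ change, so the cache would need to be cleared after re-initialization.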

Performance Results
Baseline (Naive RAG): 247 ms average latency

Optimized RAG: 179 ms average latency (1.4x faster than baseline)

No-Compromise RAG: 92 ms average latency (2.7x faster than baseline ⚡)
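
The speedup factors follow directly from the averages listed above: each is the baseline figure divided by the variant's figure. The short check below reproduces them; the variable and function names are illustrative only.

```python
baseline_ms = 247.0
results_ms = {"Optimized RAG": 179.0, "No-Compromise RAG": 92.0}


def speedup(baseline: float, candidate: float) -> float:
    """Return how many times faster the candidate is than the baseline."""
    return baseline / candidate


for name, ms in results_ms.items():
    reduction = 100 * (1 - ms / baseline_ms)
    print(f"{name}: {speedup(baseline_ms, ms):.1f}x faster, {reduction:.0f}% lower latency")

# Optimized RAG: 1.4x faster, 28% lower latency
# No-Compromise RAG: 2.7x faster, 63% lower latency
```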

Adding Your Own Data
Place text files (.txt, .md) in the data/ directory

Run: python scripts/initialize_rag.py

Test with: python -c "from app.rag_naive import NaiveRAG; r=NaiveRAG(); r.initialize(); print(r.query(\"Test query\"))"
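
Before running the initialization script, it can help to confirm which files would be eligible. The sketch below lists .txt and .md files under data/. It is a standalone pre-flight check, not part of scripts/initialize_rag.py; whether that script searches subdirectories is an assumption here, and `find_documents` is a hypothetical helper.

```python
from pathlib import Path


def find_documents(data_dir: str = "data", suffixes=(".txt", ".md")):
    """Return sorted paths of files under data_dir with a matching extension."""
    root = Path(data_dir)
    if not root.is_dir():
        return []
    # rglob also descends into subdirectories (an assumption about the layout).
    return sorted(
        path for path in root.rglob("*")
        if path.is_file() and path.suffix.lower() in suffixes
    )


if __name__ == "__main__":
    documents = find_documents()
    print(f"{len(documents)} document(s) found")
    for path in documents:
        print(f"  {path} ({path.stat().st_size} bytes)")
```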

Deployment
See DEPLOYMENT.md for production deployment instructions.