kikobarr/RAG-time-tester

RAG-Time Tester

A dual-chat Streamlit app for comparing Groq and OpenAI response latency when both models share the same RAG pipeline over Don Quixote.

Prerequisites

  1. Python 3.11+
  2. Dependencies installed with pip install -r requirements.txt
  3. Environment variables (or Streamlit secrets):
    • DATABASE_URL (Neon connection string)
    • OPENAI_API_KEY_CHAT (used for chat completions)
    • OPENAI_API_KEY_EMBED (used for embeddings in retriever.py)
    • GROQ_API_KEY_CHAT
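
A quick pre-flight check can catch a missing key before the app or the ingestion script starts. The sketch below is not part of the repository; the helper name missing_settings is hypothetical, and it only inspects environment variables (Streamlit secrets would need a separate lookup).

```python
import os

# Variable names taken from the prerequisites list above.
REQUIRED_VARS = (
    "DATABASE_URL",
    "OPENAI_API_KEY_CHAT",
    "OPENAI_API_KEY_EMBED",
    "GROQ_API_KEY_CHAT",
)


def missing_settings(env=None):
    """Return the required variable names that are unset or empty.

    `env` defaults to os.environ; pass a plain dict to check other sources.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

Calling missing_settings() with no arguments returns an empty list when all four variables are set in the current shell.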

Database migration

cd src/db
alembic upgrade head

This applies the migration 202502091200_clean_org_chart_schema, which brings the schema in line with what the ingestion and retrieval helpers expect.

Load Don Quixote

python src/rag/ingest.py --preset don-quixote

The preset pulls docs/don_quixote.txt, uses a plain-text chunker optimized for .txt prose, stores the text/embeddings in Neon, and skips any duplicate content automatically.
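
To illustrate the idea of plain-text chunking with duplicate skipping, here is a minimal sketch. It is an assumption about how such a step could work, not the code in src/rag/ingest.py: the function names, the paragraph-packing strategy, the max_chars default, and the SHA-256 content hash are all illustrative choices.

```python
import hashlib


def chunk_paragraphs(text, max_chars=1000):
    """Greedily pack blank-line-separated paragraphs into chunks.

    Each chunk holds at most max_chars characters, except that a single
    paragraph longer than max_chars is kept whole as its own chunk.
    """
    chunks, current = [], ""
    for para in (p.strip() for p in text.split("\n\n")):
        if not para:
            continue
        candidate = f"{current}\n\n{para}" if current else para
        if current and len(candidate) > max_chars:
            chunks.append(current)
            current = para
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def content_hash(chunk):
    """Stable fingerprint of a chunk's text, used to detect duplicates."""
    return hashlib.sha256(chunk.encode("utf-8")).hexdigest()


def new_chunks(chunks, seen_hashes):
    """Return chunks not seen before, recording their hashes in seen_hashes."""
    fresh = []
    for chunk in chunks:
        digest = content_hash(chunk)
        if digest not in seen_hashes:
            seen_hashes.add(digest)
            fresh.append(chunk)
    return fresh
```

In a real pipeline the set of seen hashes would presumably come from the database, so that re-running the ingestion leaves existing rows untouched.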

Run Streamlit locally

streamlit run src/app/streamlit_app.py

On Streamlit Community Cloud, set the same secrets plus optional overrides:

Secret              Purpose
OPEN_AI_CHAT_MODEL  Overrides the default OpenAI chat model (defaults to gpt-4o-mini)
GROQ_CHAT_MODEL     Overrides the Groq chat model (defaults to llama-3.1-70b)
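
The override behavior described above amounts to "use the secret if set, otherwise fall back to the default". A minimal sketch, assuming the values are read from environment variables; resolve_model is a hypothetical helper, and the defaults are the ones listed above.

```python
import os

# Defaults as documented in this README.
DEFAULT_MODELS = {
    "OPEN_AI_CHAT_MODEL": "gpt-4o-mini",
    "GROQ_CHAT_MODEL": "llama-3.1-70b",
}


def resolve_model(name, env=None):
    """Return the override for `name` if set and non-empty, else its default."""
    env = os.environ if env is None else env
    return env.get(name) or DEFAULT_MODELS[name]
```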

The app renders two chat panes (Groq left, OpenAI right), reuses a single prompt input, and logs latency deltas per run. Use the expander at the bottom of the page to inspect the retrieved context for each comparison.
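
A latency delta of this kind can be measured by timing each provider call with a monotonic clock. The sketch below is illustrative only: ask_groq and ask_openai are hypothetical callables standing in for the real client calls, and the calls run one after the other, which may differ from how the app schedules them.

```python
import time


def timed(fn, *args, **kwargs):
    """Call fn and return (result, elapsed_seconds) using a monotonic clock."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


def compare_latency(prompt, ask_groq, ask_openai):
    """Send the same prompt to both callables and report the timings.

    delta_seconds is OpenAI time minus Groq time, so a positive value
    means Groq answered faster.
    """
    groq_reply, groq_s = timed(ask_groq, prompt)
    openai_reply, openai_s = timed(ask_openai, prompt)
    return {
        "groq_reply": groq_reply,
        "openai_reply": openai_reply,
        "groq_seconds": groq_s,
        "openai_seconds": openai_s,
        "delta_seconds": openai_s - groq_s,
    }
```

Because both callables receive the same prompt (and, in the app, the same retrieved context), the delta reflects model and provider latency rather than differences in retrieval.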
