This project implements a minimal Retrieval-Augmented Generation (RAG) pipeline using Mistral AI, PostgreSQL with pgvector, and Flask.
The system follows a simple RAG pipeline:
- Markdown documents are loaded and chunked.
- Chunks are embedded using Mistral embeddings.
- Embeddings are stored in PostgreSQL with pgvector.
- At query time:
  - The query is embedded.
  - Top-k similar chunks are retrieved.
  - The LLM generates an answer using the retrieved context.
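The query-time steps above can be sketched in memory. This is only an illustration under stated assumptions, not the project's code: `chunk_text`, `cosine_similarity`, and `top_k` are hypothetical names, the character-based chunking is one possible strategy, and the toy 3-dimensional vectors stand in for Mistral embeddings. In the actual pipeline the similarity search is delegated to PostgreSQL with pgvector.

```python
import math


def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character chunks that overlap (hypothetical strategy)."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec: list[float], indexed: list[tuple[str, list[float]]], k: int = 3) -> list[str]:
    """Return the contents of the k chunks most similar to the query vector."""
    ranked = sorted(
        indexed,
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [content for content, _ in ranked[:k]]


# Toy index: (chunk text, embedding). Real embeddings would come from the model.
indexed = [
    ("chunk about licenses", [1.0, 0.0, 0.0]),
    ("chunk about indexing", [0.0, 1.0, 0.0]),
    ("chunk about voting", [0.0, 0.0, 1.0]),
]

context = top_k([0.9, 0.1, 0.0], indexed, k=2)
# The retrieved chunks become the context passed to the LLM.
prompt = "Answer using only this context:\n" + "\n".join(context)
```

Sorting the full list is fine for a handful of chunks; a database index is what makes this practical at scale.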
Create a .env file in the project root:
API_KEY=your_api_key
DOCS_PATH=/path/to/your/markdown/files
DB_USER=user
DB_PASSWORD=password
Set up a virtual environment and install the project requirements:
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
Run the following SQL commands in order after connecting to your PostgreSQL database.
First, enable the pgvector extension. This must be done once per database before using vector types.
CREATE EXTENSION IF NOT EXISTS vector;
Next, create the documents table. The embedding dimension must match your embedding model's output. In this project, embeddings have 1024 dimensions.
CREATE TABLE IF NOT EXISTS documents (
id BIGSERIAL PRIMARY KEY,
content TEXT NOT NULL,
embedding VECTOR(1024) NOT NULL
);
Confirm that the table and vector dimension are correct. The second query returns a row only once at least one document has been indexed.
\d documents;
SELECT vector_dims(embedding) FROM documents LIMIT 1;
Before running queries, you must index your documents:
python scripts/index_documents.py
Start the API server from the project root:
python run.py
This project includes an A/B evaluation interface to compare different RAG configurations (e.g., chunk size, temperature, top-k retrieval).
The system:
- Generates answers for multiple configurations
- Creates all unique pairwise comparisons per question
- Stores comparisons in PostgreSQL
- Allows human evaluators to vote
- Stores votes for later Bradley–Terry / Gaussian Process analysis
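As a rough idea of what the later Bradley-Terry analysis could look like, here is a minimal sketch using the standard minorization-maximization update. It is not part of this project: `bradley_terry` is a hypothetical helper, and it assumes each stored vote can be read as a `(config_a, config_b, winner)` triple in which `winner` names the preferred configuration.

```python
from collections import defaultdict


def bradley_terry(votes, iterations=200):
    """Estimate Bradley-Terry strengths from (config_a, config_b, winner) votes.

    Returns a dict of strengths normalized to sum to 1; higher means preferred more often.
    """
    wins = defaultdict(float)   # total wins per config
    games = defaultdict(float)  # number of comparisons per unordered pair
    configs = set()
    for a, b, winner in votes:
        if a == b or winner not in (a, b):
            raise ValueError(f"invalid vote: {(a, b, winner)}")
        configs.update((a, b))
        wins[winner] += 1
        games[frozenset((a, b))] += 1

    strengths = {c: 1.0 / len(configs) for c in configs}
    for _ in range(iterations):
        updated = {}
        for c in configs:
            # Sum n_ij / (p_i + p_j) over every opponent j that c was compared with.
            denom = 0.0
            for pair, n in games.items():
                if c in pair:
                    (other,) = pair - {c}
                    denom += n / (strengths[c] + strengths[other])
            updated[c] = wins[c] / denom if denom > 0 else strengths[c]
        total = sum(updated.values())
        strengths = {c: value / total for c, value in updated.items()}
    return strengths


# Assumed vote shape: config_1 preferred in 3 of 4 comparisons against config_2.
votes = [("config_1", "config_2", "config_1")] * 3 + [("config_1", "config_2", "config_2")]
strengths = bradley_terry(votes)
ranking = sorted(strengths, key=strengths.get, reverse=True)
```

A configuration that never wins is driven to strength 0 by this update, so a real analysis would likely want more votes per pair or some regularization.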
Run the following commands after connecting to PostgreSQL.
CREATE TABLE IF NOT EXISTS ab_pairs (
id SERIAL PRIMARY KEY,
question TEXT NOT NULL,
config_a TEXT NOT NULL,
config_b TEXT NOT NULL,
answered BOOLEAN DEFAULT FALSE
);
CREATE TABLE IF NOT EXISTS ab_results (
id SERIAL PRIMARY KEY,
question TEXT NOT NULL,
config_a TEXT NOT NULL,
config_b TEXT NOT NULL,
winner TEXT NOT NULL,
created_at TIMESTAMP DEFAULT NOW()
);
You must first generate responses for all configurations. This can be done with generate_data.ipynb.
Example expected format (data.json):
[
{
"config": "config_1",
"results": [
{
"question": "How can I search for license risks?",
"answer": "..."
}
]
}
]
Place this file in the project root.
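Before starting the evaluation server, it may help to sanity-check the file. The sketch below parses the format shown above and reports questions that some configuration has not answered. `load_results` and `missing_answers` are hypothetical helpers, and the check assumes every configuration is expected to answer the same set of questions.

```python
import json


def load_results(raw: str) -> dict[str, dict[str, str]]:
    """Parse the data.json structure into {config: {question: answer}}."""
    by_config = {}
    for entry in json.loads(raw):
        by_config[entry["config"]] = {
            result["question"]: result["answer"] for result in entry["results"]
        }
    return by_config


def missing_answers(by_config: dict[str, dict[str, str]]) -> list[tuple[str, str]]:
    """List (config, question) pairs where a config lacks a question that another config has."""
    all_questions = set().union(*(answers.keys() for answers in by_config.values()))
    return sorted(
        (config, question)
        for config, answers in by_config.items()
        for question in all_questions
        if question not in answers
    )


# Inline sample in the same shape as data.json; in practice, read the file instead.
sample = """
[
  {"config": "config_1", "results": [
    {"question": "q1", "answer": "..."},
    {"question": "q2", "answer": "..."}
  ]},
  {"config": "config_2", "results": [
    {"question": "q1", "answer": "..."}
  ]}
]
"""

by_config = load_results(sample)
gaps = missing_answers(by_config)  # config_2 has no answer for q2
```

To check the real file, pass `open("data.json").read()` to `load_results`.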
Run:
python3 -m app.api.testing_server
Then open:
http://127.0.0.1:5000/
- All unique config pairs are generated per question
- Each comparison is shown exactly once
- Votes are stored in ab_results
- Pairs are marked as answered in ab_pairs
- When all comparisons are complete, the interface stops
For 5 configs and 5 questions:
- 10 comparisons per question
- 50 total comparisons
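These counts follow from the number of unordered pairs among n configurations, n(n-1)/2, which is 10 for n = 5. The sketch below reproduces them with `itertools.combinations`. `build_pairs` and the placeholder config and question names are hypothetical and not taken from the project's code.

```python
from itertools import combinations


def build_pairs(configs: list[str], questions: list[str]) -> list[tuple[str, str, str]]:
    """Return one (question, config_a, config_b) tuple per unique config pair per question."""
    return [
        (question, config_a, config_b)
        for question in questions
        for config_a, config_b in combinations(configs, 2)
    ]


configs = [f"config_{i}" for i in range(1, 6)]        # 5 placeholder configs
questions = [f"question_{i}" for i in range(1, 6)]    # 5 placeholder questions

pairs = build_pairs(configs, questions)
print(len(pairs) // len(questions))  # 10 comparisons per question
print(len(pairs))                    # 50 total comparisons
```

Because `combinations` yields each unordered pair once, a configuration is never compared with itself, and (config_1, config_2) is not repeated as (config_2, config_1).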