This repository is a local-first, production-shaped RAG backend for enterprise use. It mirrors the larger AWS design with Docker services:
- FastAPI API for ingestion, retrieval, answer generation, feedback, and debug traces
- PostgreSQL with pgvector for vector retrieval and PostgreSQL full-text search
- MinIO for raw source artifacts and future extracted assets
- Ollama for optional local embeddings and answer generation
- Deterministic hash embeddings as the default so the system works before any model downloads
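To illustrate why a deterministic hash embedding needs no model download, here is a minimal sketch of the general technique (feature hashing of tokens into a fixed-size vector). It is not this repository's implementation: the function name `hash_embed`, the tokenizer, and the default dimension of 64 are all assumptions made for the example.

```python
import hashlib
import math
import re


def hash_embed(text: str, dim: int = 64) -> list[float]:
    """Map text to a fixed-size unit vector using only hashing (no model).

    Hypothetical helper: the name, tokenizer, and default dim are assumptions.
    """
    vec = [0.0] * dim
    for token in re.findall(r"\w+", text.lower()):
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        # The first four bytes pick the bucket, the fifth byte picks the sign.
        bucket = int.from_bytes(digest[:4], "big") % dim
        sign = 1.0 if digest[4] % 2 == 0 else -1.0
        vec[bucket] += sign
    # L2-normalize so cosine similarity reduces to a dot product.
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec
```

The same text always yields the same vector, which makes this kind of embedding convenient for plumbing tests. It only captures token overlap, not meaning, so it says little about real retrieval quality.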
Start the stack:
docker compose up --build
API:
http://localhost:8000
Frontend:
http://localhost:5173
These snapshots mirror the main frontend workflows: project setup and ingestion, document review, and grounded Q&A.
Create or switch projects, ingest the mounted sample folder, and upload files with ACL groups.
See indexed sources, chunk counts, and open the raw citation file for any revision.
Ask a question against the selected project, tune Top K, and inspect the returned citations and diagnostics.
Swagger docs:
http://localhost:8000/docs
MinIO console:
http://localhost:9001
user: minioadmin
password: minioadmin
Ingest the mounted sample documents:
Invoke-RestMethod -Method Post `
-Uri http://localhost:8000/ingest/local `
-ContentType application/json `
-Body '{"acl_groups":[]}'
The API ingests files mounted from sample_docs/ into /data/input.
The frontend can also ingest the same sample folder into the selected project, upload additional files from your browser, list indexed documents, open raw citation files, and ask questions against the selected project.
List projects:
Invoke-RestMethod http://localhost:8000/projects
Create a project:
Invoke-RestMethod -Method Post `
-Uri http://localhost:8000/projects `
-ContentType application/json `
-Body '{"name":"Networking"}'
Ingest the mounted sample folder into a specific project:
Invoke-RestMethod -Method Post `
-Uri http://localhost:8000/projects/<project-id>/ingest/local `
-ContentType application/json `
-Body '{"acl_groups":[]}'
Upload files into a project from the frontend at http://localhost:5173.
Search the index:
Invoke-RestMethod -Method Post `
-Uri http://localhost:8000/search `
-ContentType application/json `
-Body '{"query":"How do I troubleshoot VPN error 809?","top_k":3,"groups":[]}'
Generate an answer:
Invoke-RestMethod -Method Post `
-Uri http://localhost:8000/answer `
-ContentType application/json `
-Body '{"query":"How do I troubleshoot VPN error 809?","top_k":3,"groups":[]}'
If Ollama does not have the configured chat model yet, the API returns an extractive fallback from the highest-ranked chunk.
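For readers who are not on PowerShell, the /search and /answer calls can be sketched in Python with only the standard library. The endpoints and request body come from the commands in this README. The helper names `build_post`, `post_json`, and `ask` are hypothetical, and the response is printed as raw JSON because its shape is not documented here.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # API address from this README


def build_post(path: str, payload: dict) -> urllib.request.Request:
    """Build a JSON POST request without sending it (hypothetical helper)."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + path,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_json(path: str, payload: dict) -> dict:
    """Send the request and decode the JSON response."""
    with urllib.request.urlopen(build_post(path, payload), timeout=60) as resp:
        return json.load(resp)


def ask(query: str, top_k: int = 3, groups=None) -> None:
    """Run the same query against /search and /answer and print both results."""
    payload = {"query": query, "top_k": top_k, "groups": groups or []}
    print(post_json("/search", payload))
    print(post_json("/answer", payload))
```

With the stack running, calling `ask("How do I troubleshoot VPN error 809?")` should issue the same two requests as the PowerShell commands.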
Pull models into the Ollama container:
docker exec -it rag-poc-ollama ollama pull qwen2.5:0.5b
docker exec -it rag-poc-ollama ollama pull nomic-embed-text
To use Ollama embeddings, edit .env:
EMBEDDING_PROVIDER=ollama
EMBEDDING_DIM=768
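Then recreate the database volume, because pgvector columns are created with a fixed dimension and cannot store vectors of a different size. Note that the -v flag removes every named volume declared in the Compose file, not only the database volume: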
docker compose down -v
docker compose up --build
Ingestion does the following:
- Reads supported files from the local source directory.
- Stores raw bytes in MinIO.
- Extracts text from Markdown, text, HTML, PDF, and DOCX.
- Computes source, normalized text, and chunking fingerprints.
- Creates a new document revision only when content or chunking changed.
- Chunks with document title and section path context.
- Embeds each chunk.
- Atomically publishes the new revision and supersedes the old revision.
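The fingerprinting and revision steps above can be sketched as follows. This is an illustration under stated assumptions, not the repository's code: the function names, the whitespace-only normalization, and the rule that only the text and chunking fingerprints trigger a new revision are all hypothetical choices.

```python
import hashlib
import json
import re
from typing import Optional


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def fingerprints(raw: bytes, extracted_text: str, chunk_config: dict) -> dict:
    """Compute the three fingerprints named in the ingestion steps."""
    # Assumed normalization: collapse whitespace runs and trim the ends.
    normalized = re.sub(r"\s+", " ", extracted_text).strip()
    # Sorted keys keep the chunking fingerprint stable across dict orderings.
    config_bytes = json.dumps(chunk_config, sort_keys=True).encode("utf-8")
    return {
        "source": sha256_hex(raw),
        "text": sha256_hex(normalized.encode("utf-8")),
        "chunking": sha256_hex(config_bytes),
    }


def needs_new_revision(previous: Optional[dict], current: dict) -> bool:
    """Decide whether to publish a new revision (assumed rule)."""
    if previous is None:
        return True  # First time this document is seen.
    return (
        previous["text"] != current["text"]
        or previous["chunking"] != current["chunking"]
    )
```

Under these assumptions, re-saving a file with different whitespace changes the source fingerprint but not the text fingerprint, so no new revision is created. Changing the chunk size or overlap does create one.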
Retrieval does the following:
- Embeds the query.
- Runs pgvector cosine search.
- Runs PostgreSQL full-text search.
- Merges both result sets with reciprocal rank fusion.
- Applies ACL group filtering before returning chunks.
- Stores a retrieval trace for debugging.
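The merge and filter steps above can be sketched in a few lines. Reciprocal rank fusion scores each chunk as the sum of 1 / (k + rank) over the rankings it appears in. The constant k = 60 is a commonly used default and an assumption here, as are the function names and the rule that a chunk with no ACL groups is visible to everyone.

```python
from collections import defaultdict


def rrf_merge(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists of chunk ids with reciprocal rank fusion."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by chunk id for determinism.
    return sorted(scores, key=lambda cid: (-scores[cid], cid))


def acl_filter(chunk_ids: list[str], chunk_groups: dict, user_groups: list[str]) -> list[str]:
    """Keep chunks the caller may see (assumed rule: empty ACL means public)."""
    allowed = set(user_groups)
    return [
        cid
        for cid in chunk_ids
        if not chunk_groups.get(cid) or allowed & set(chunk_groups[cid])
    ]


vector_hits = ["a", "b", "c"]  # hypothetical pgvector cosine results
text_hits = ["c", "a", "d"]    # hypothetical full-text results
merged = rrf_merge([vector_hits, text_hits])
visible = acl_filter(merged, {"a": ["netops"], "b": ["hr"], "c": []}, ["netops"])
```

Here `merged` is `["a", "c", "b", "d"]`, because chunks found by both searches outrank chunks found by only one, and `visible` is `["a", "c", "d"]` for a caller in the `netops` group.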
Known limitations:
- The default hash embedding provider is for learning and plumbing tests, not final retrieval quality.
- Reranking is not implemented yet.
- Image extraction and image captioning are not implemented yet.
- Connectors for Document360 and SharePoint are not implemented yet.
- Auth is represented by request-supplied groups for now. A real deployment should validate tokens and derive groups server-side.
Possible next steps:
- Use a real embedding model by default.
- Add a reranker stage.
- Add async ingestion workers with Redis.
- Add extracted image/table assets.
- Add SharePoint and Document360 connectors.
- Add an eval harness for retrieval quality, stale content prevention, and ACL safety.