fuzonmedia/ragforge

Local RAG MVP

This repository is a local-first RAG backend for enterprise use, structured like a production deployment. It mirrors the larger AWS design using local Docker services:

  • FastAPI API for ingestion, retrieval, answer generation, feedback, and debug traces
  • PostgreSQL with pgvector for vector retrieval and PostgreSQL full-text search
  • MinIO for raw source artifacts and future extracted assets
  • Ollama for optional local embeddings and answer generation
  • Deterministic hash embeddings as the default so the system works before any model downloads
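
To illustrate why hash embeddings need no model download, here is a minimal sketch of one way to build them. The function name `hash_embed`, the 64-dimension default, and the signed-bucket scheme are assumptions for illustration and are not taken from this repository's code.

```python
import hashlib
import math
import re


def hash_embed(text: str, dim: int = 64) -> list:
    """Map text to a fixed-size unit vector using only hashing (no model).

    Hypothetical helper: the repository's real provider may differ.
    """
    vec = [0.0] * dim
    for token in re.findall(r"\w+", text.lower()):
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        # The first four bytes pick the bucket for this token.
        index = int.from_bytes(digest[:4], "big") % dim
        # The fifth byte picks a sign, so colliding tokens do not always add up.
        sign = 1.0 if digest[4] % 2 == 0 else -1.0
        vec[index] += sign
    norm = math.sqrt(sum(v * v for v in vec))
    # Normalize to unit length so cosine similarity behaves sensibly.
    return [v / norm for v in vec] if norm else vec
```

The same text always yields the same vector, which is enough to exercise storage and retrieval plumbing. Vectors built this way capture token overlap only, not meaning.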

Start the stack

docker compose up --build

API:

http://localhost:8000

Frontend:

http://localhost:5173

UI snapshots

These snapshots mirror the main frontend workflows: project setup and ingestion, document review, and grounded Q&A.

Project setup and ingestion

[Screenshot: Projects and ingestion panel]

Create or switch projects, ingest the mounted sample folder, and upload files with ACL groups.

Document library

[Screenshot: Document library view]

See indexed sources, chunk counts, and open the raw citation file for any revision.

Ask and answer

[Screenshot: Ask and answer view with citations]

Ask a question against the selected project, tune Top K, and inspect the returned citations and diagnostics.

Swagger docs:

http://localhost:8000/docs

MinIO console:

http://localhost:9001
user: minioadmin
password: minioadmin

Ingest sample documents

Invoke-RestMethod -Method Post `
  -Uri http://localhost:8000/ingest/local `
  -ContentType application/json `
  -Body '{"acl_groups":[]}'

The API ingests the files in sample_docs/, which is mounted into the API container at /data/input.

The frontend can also ingest the same sample folder into the selected project, upload additional files from your browser, list indexed documents, open raw citation files, and ask questions against the selected project.

Projects

List projects:

Invoke-RestMethod http://localhost:8000/projects

Create a project:

Invoke-RestMethod -Method Post `
  -Uri http://localhost:8000/projects `
  -ContentType application/json `
  -Body '{"name":"Networking"}'

Ingest the mounted sample folder into a specific project:

Invoke-RestMethod -Method Post `
  -Uri http://localhost:8000/projects/<project-id>/ingest/local `
  -ContentType application/json `
  -Body '{"acl_groups":[]}'

Upload files into a project from the frontend at http://localhost:5173.

Search

Invoke-RestMethod -Method Post `
  -Uri http://localhost:8000/search `
  -ContentType application/json `
  -Body '{"query":"How do I troubleshoot VPN error 809?","top_k":3,"groups":[]}'

Ask for an answer

Invoke-RestMethod -Method Post `
  -Uri http://localhost:8000/answer `
  -ContentType application/json `
  -Body '{"query":"How do I troubleshoot VPN error 809?","top_k":3,"groups":[]}'

If Ollama does not have the configured chat model yet, the API returns an extractive fallback from the highest-ranked chunk.
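
The commands above use PowerShell. For other shells, the same requests can be sketched in Python with the standard library only. The helper names `build_post` and `post_json` are hypothetical, and the shape of the JSON response is not documented here, so the sketch returns it unparsed beyond JSON decoding.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # API address from "Start the stack"


def build_post(path: str, payload: dict) -> urllib.request.Request:
    """Build a JSON POST request without sending it (hypothetical helper)."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + path,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_json(path: str, payload: dict):
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(build_post(path, payload), timeout=30) as resp:
        return json.load(resp)


# Same body as the PowerShell examples for /search and /answer.
question = {"query": "How do I troubleshoot VPN error 809?", "top_k": 3, "groups": []}
search_request = build_post("/search", question)
answer_request = build_post("/answer", question)
```

With the stack running, `post_json("/search", question)` and `post_json("/answer", question)` send the two requests.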

Optional: use Ollama models

Pull models into the Ollama container:

docker exec -it rag-poc-ollama ollama pull qwen2.5:0.5b
docker exec -it rag-poc-ollama ollama pull nomic-embed-text

To use Ollama embeddings, edit .env:

EMBEDDING_PROVIDER=ollama
EMBEDDING_DIM=768

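Then recreate the database volume, because a pgvector column has a fixed dimension and cannot store vectors of a different size. Note that `docker compose down -v` removes the stack's named volumes, so ingest your documents again afterwards: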

docker compose down -v
docker compose up --build

MVP behavior

Ingestion does the following:

  1. Reads supported files from the local source directory.
  2. Stores raw bytes in MinIO.
  3. Extracts text from Markdown, text, HTML, PDF, and DOCX.
  4. Computes source, normalized text, and chunking fingerprints.
  5. Creates a new document revision only when content or chunking changed.
  6. Chunks with document title and section path context.
  7. Embeds each chunk.
  8. Atomically publishes the new revision and supersedes the old revision.
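
Steps 4 and 5 above can be sketched as follows. This is a minimal illustration, assuming SHA-256 fingerprints, whitespace-collapsing normalization, and a revision decision based on the text and chunking fingerprints. The names `fingerprints` and `needs_new_revision` and the chunking keys are hypothetical and not taken from this repository.

```python
import hashlib
import json
from typing import Optional


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def fingerprints(raw: bytes, text: str, chunking: dict) -> dict:
    """Compute the three fingerprints for one source file."""
    normalized = " ".join(text.split())  # assumption: collapse all whitespace
    return {
        "source": sha256_hex(raw),
        "text": sha256_hex(normalized.encode("utf-8")),
        # sort_keys makes the hash independent of dictionary key order.
        "chunking": sha256_hex(json.dumps(chunking, sort_keys=True).encode("utf-8")),
    }


def needs_new_revision(previous: Optional[dict], current: dict) -> bool:
    """A new revision is needed for new documents or changed text/chunking."""
    if previous is None:
        return True
    return (
        previous["text"] != current["text"]
        or previous["chunking"] != current["chunking"]
    )
```

Under these assumptions, a file whose bytes change but whose normalized text does not (for example, trailing whitespace) would not trigger re-chunking or re-embedding.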

Retrieval does the following:

  1. Embeds the query.
  2. Runs pgvector cosine search.
  3. Runs PostgreSQL full-text search.
  4. Merges both result sets with reciprocal rank fusion.
  5. Applies ACL group filtering before returning chunks.
  6. Stores a retrieval trace for debugging.
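
Step 4 above, reciprocal rank fusion, can be sketched as below. Each chunk scores 1 / (k + rank) in every list where it appears, and the scores are summed. The constant k = 60 is a commonly used default and an assumption here, as are the function name and the chunk IDs.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k: int = 60):
    """Fuse several ranked lists of chunk IDs into one scored ranking."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Highest score first; ties are broken by chunk ID for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


vector_hits = ["c3", "c1", "c7"]   # hypothetical pgvector result order
keyword_hits = ["c1", "c9", "c3"]  # hypothetical full-text result order
fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
print([chunk_id for chunk_id, _ in fused])  # ['c1', 'c3', 'c9', 'c7']
```

Chunks that appear in both lists (c1 and c3) outrank chunks found by only one retriever, which is the main reason to fuse by rank rather than by raw score: cosine distances and full-text ranks are not on comparable scales.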

Important local limitations

  • The default hash embedding provider is for learning and plumbing tests, not final retrieval quality.
  • Reranking is not implemented yet.
  • Image extraction and image captioning are not implemented yet.
  • Connectors for Document360 and SharePoint are not implemented yet.
  • Auth is represented by request-supplied groups for now. A real deployment should validate tokens and derive groups server-side.
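
As a rough sketch of group-based filtering, the function below keeps a chunk when it has no ACL groups or shares at least one group with the caller. Treating an empty `acl_groups` list as unrestricted is an assumption, as are the function name and the chunk dictionary layout. The real service may well filter in SQL instead.

```python
def visible_chunks(chunks, user_groups):
    """Return only the chunks the caller's groups are allowed to see."""
    allowed = set(user_groups)
    return [
        chunk
        for chunk in chunks
        # Assumption: an empty ACL list means the chunk is unrestricted.
        if not chunk["acl_groups"] or allowed & set(chunk["acl_groups"])
    ]


chunks = [
    {"id": "c1", "acl_groups": []},
    {"id": "c2", "acl_groups": ["network-admins"]},
    {"id": "c3", "acl_groups": ["hr", "finance"]},
]
```

Because `user_groups` comes straight from the request body in this MVP, any caller can claim any group. That is why the groups should be derived server-side from a validated token in a real deployment.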

Next milestones

  1. Add a real embedding model by default.
  2. Add a reranker stage.
  3. Add async ingestion workers with Redis.
  4. Add extracted image/table assets.
  5. Add SharePoint and Document360 connectors.
  6. Add an eval harness for retrieval quality, stale content prevention, and ACL safety.

About

Production-grade local-first RAG platform with ingestion pipelines, hybrid retrieval, multimodal indexing, revision-aware publishing, and enterprise retrieval orchestration.
