A Retrieval-Augmented Generation (RAG) system built to answer insurance-related queries by extracting relevant information from uploaded policy documents (PDF / DOCX / EML) and generating concise, formal, human-style answers.
| Stage | Component | Description |
|---|---|---|
| 1️⃣ Data ingestion | `preprocessing.py` | Reads documents, splits them into chunks, generates embeddings, and builds the FAISS index |
| 2️⃣ Query handling | `query_final.py` | Reformulates the query → retrieves chunks via FAISS → reranks them with a Cross-Encoder → produces the LLM answer |
| 3️⃣ API layer | `router.py` | Exposes the `/hackrx/run` endpoint, which accepts a document URL plus multiple questions and returns answers |
The system behaves like an insurance agent — brief, factual, and strictly based on the documents.
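The ingestion stage above can be sketched in a few lines; the character-based splitting and the `chunk_size`/`overlap` values below are illustrative assumptions, not the actual parameters used in `preprocessing.py`:

```python
# Minimal sketch of the chunking step in the ingestion stage.
# chunk_size and overlap are hypothetical; preprocessing.py may differ.
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into overlapping character windows."""
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
    return chunks

# 1200 characters with step 450 -> windows start at 0, 450, 900 -> 3 chunks
print(len(chunk_text("x" * 1200)))  # 3
```

Overlap between consecutive chunks helps retrieval: a sentence that straddles a chunk boundary still appears whole in at least one window.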
```
├── preprocessing.py       # Load docs, chunk, embed, build FAISS index
├── query_final.py         # RAG pipeline + reranking + Groq LLM answering
├── router.py              # FastAPI routes (/hackrx/run)
├── utils.py               # File processing helpers
├── faiss_index/           # FAISS index + chunks (auto-generated)
├── all-MiniLM-L6-v2/      # Local SentenceTransformer model
├── local_cross_encoder/   # Local reranking model
├── data/                  # Optional document store
└── requirements.txt
```
```
git clone https://github.com/SilentCanary/Finserv-Insurance-Agent.git
cd Finserv-Insurance-Agent
pip install -r requirements.txt
```

Environment Variables (.env)
```
GROQ_KEY = your_groq_api_key
VALID_TOKEN = your_fastapi_auth_token
```

🧰 Running the API
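At runtime the application reads these values from the environment. One way that might look (illustrative only; how the repo actually loads the `.env` file, e.g. via `python-dotenv`, is an assumption):

```python
import os

# Illustrative: how GROQ_KEY / VALID_TOKEN might be read at runtime.
# A project with a .env file typically loads it into the environment first
# (e.g. with python-dotenv) before reaching this point.
os.environ.setdefault("GROQ_KEY", "your_groq_api_key")  # placeholder for the demo

groq_key = os.environ["GROQ_KEY"]            # raises KeyError if missing
valid_token = os.environ.get("VALID_TOKEN")  # returns None if missing
print(bool(groq_key))
```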
```
uvicorn main:app --reload
```

Endpoint

POST /hackrx/run

Example Request
```json
{
  "documents": "https://example.com/policy.pdf",
  "questions": [
    "Is flood damage covered?",
    "What is the waiting period for hospitalization?"
  ]
}
```
Example Response

```json
{
  "answers": [
    "Yes, flood damage is covered under Section 3 with exclusions.",
    "The policy requires a 30-day waiting period for hospitalization."
  ]
}
```
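A request like the one above can be sent from Python. This is a hedged sketch: the base URL and the Bearer auth scheme (using `VALID_TOKEN`) are assumptions about how the API is deployed and authenticated:

```python
import json
import urllib.request

# Hypothetical deployment URL; replace with your own host/port.
API_URL = "http://localhost:8000/hackrx/run"

payload = {
    "documents": "https://example.com/policy.pdf",
    "questions": [
        "Is flood damage covered?",
        "What is the waiting period for hospitalization?",
    ],
}

def build_request(token: str) -> urllib.request.Request:
    """Build the POST request; the Bearer auth scheme is an assumption."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )

req = build_request("your_fastapi_auth_token")
# urllib.request.urlopen(req)  # uncomment to send against a running server
print(req.get_method(), req.full_url)
```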
🧠 Core Technology
| Component | Purpose |
|---|---|
| SentenceTransformer | Computes embeddings for document chunks and user queries |
| FAISS | Performs vector similarity search for retrieval |
| CrossEncoder | Reranks retrieved chunks based on relevance |
| Groq LLM | Generates the final answer using the top-ranked chunks |
| LRU Cache | Speeds up execution by caching repeated queries and responses |
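The retrieve → rerank → cache flow in the table can be illustrated with plain NumPy standing in for the real models. Everything here is a toy: the 3-d vectors replace MiniLM embeddings, cosine similarity replaces both the FAISS search and the Cross-Encoder score, and `functools.lru_cache` stands in for the system's query cache:

```python
import zlib
from functools import lru_cache

import numpy as np

# Toy corpus: 4 chunks with hand-made 3-d "embeddings" (stand-in for MiniLM).
CHUNKS = ["flood cover", "waiting period", "premium table", "exclusions"]
EMB = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1], [0.9, 0.1, 0]], dtype=float)
EMB /= np.linalg.norm(EMB, axis=1, keepdims=True)  # unit-normalise rows

def embed(query: str) -> np.ndarray:
    """Toy deterministic embedding; real code uses a SentenceTransformer."""
    rng = np.random.default_rng(zlib.crc32(query.encode("utf-8")))
    v = rng.normal(size=3)
    return v / np.linalg.norm(v)

@lru_cache(maxsize=128)  # stands in for the system's LRU cache
def retrieve(query: str, k: int = 2) -> tuple[str, ...]:
    scores = EMB @ embed(query)          # cosine similarity = FAISS stand-in
    top = np.argsort(scores)[::-1][:k]   # take the k best chunks
    # A Cross-Encoder would rescore (query, chunk) pairs here; we reuse scores.
    return tuple(CHUNKS[i] for i in top)

print(retrieve("Is flood damage covered?"))
```

Repeating the same query hits the cache instead of re-running the search, which is the speed-up the LRU Cache row describes.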
👤 Author

Advitiya