Upload, Analyze, Ask: Understand PDFs with AI
AI-powered web app to upload PDFs, analyze them with AI, and ask questions about their content.
No in-page PDF viewer: just clean, fast, interactive analysis.
Live Demo: https://pdf-chatbot-chi.vercel.app/
- Quick Start
- Why PDF Chatbot?
- Demo & Screenshots
- System Architecture
- Features at a Glance
- Tech Stack
- Installation
- Usage
- Project Structure
- Security & Privacy
- Troubleshooting
- Roadmap
- Contributing
- Acknowledgements
- License
- Author
Quick Start
git clone <your-repo-url>
cd PDF_CHATBOT
docker build -t pdf-chatbot-backend ./backend
docker run -e PORT=8000 -p 8000:8000 pdf-chatbot-backend
Then serve the frontend (see Installation), open http://localhost:8080, and start chatting with your PDFs.
Why PDF Chatbot?
PDFs contain valuable information, but extracting insights from long documents is time-consuming and frustrating.
PDF Chatbot solves this by allowing you to:
- Upload multiple PDFs
- Define a persona and goal
- Instantly receive structured insights, summaries, and answers
No scrolling. No guessing. Just answers.
Demo & Screenshots
Clean landing dashboard with quick actions for PDF analysis
Drag-and-drop PDF upload with multi-file support
Define persona and job-to-be-done for focused analysis
Ask natural language questions across uploaded PDFs
Structured explanations, summaries, and extracted sections
System Architecture
- User uploads PDFs via frontend
- Backend extracts text, structure & tables
- Semantic ranking identifies relevant sections
- Persona & task guide AI output
- Results returned as structured JSON
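The ranking and JSON steps above can be sketched in a few lines. This is only an illustration, not the project's code: it swaps in a bag-of-words cosine similarity from the standard library in place of the sentence-transformers embeddings listed in the tech stack, and the function names (`tokenize`, `cosine`, `rank_sections`), the sample sections, and the JSON field names are all assumptions made for the example.

```python
import json
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_sections(sections, persona, job, top_k=3):
    """Score each section against the persona and job, best match first."""
    query = Counter(tokenize(f"{persona} {job}"))
    scored = []
    for section in sections:
        vector = Counter(tokenize(f"{section['title']} {section['text']}"))
        scored.append({**section, "score": round(cosine(query, vector), 4)})
    scored.sort(key=lambda s: s["score"], reverse=True)
    return scored[:top_k]


# Hypothetical extracted sections; a real run would get these from the PDF parser.
sections = [
    {"document": "guide.pdf", "page": 2, "title": "Budget hotels",
     "text": "Affordable hotels and hostels for a group trip."},
    {"document": "guide.pdf", "page": 7, "title": "History",
     "text": "The region was settled in the tenth century."},
]

result = {
    "persona": "Travel planner",
    "job": "Plan an affordable group trip",
    "sections": rank_sections(sections, "Travel planner",
                              "Plan an affordable group trip"),
}
print(json.dumps(result, indent=2))
```

An embedding model would replace `tokenize` and `cosine` with dense vectors, which lets sections match on meaning rather than shared words. The scoring, sorting, and JSON shape could stay the same.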
Features at a Glance
- Multi-PDF Upload & Analysis
- Persona-driven AI understanding
- Semantic search & ranking
- Structured summaries & explanations
- Fast, no embedded PDF viewer
- Fully Dockerized backend
- Light & Dark mode UI
Tech Stack
Frontend
- HTML, CSS, Vanilla JavaScript (fully custom, no frameworks)
- Font Awesome (CDN) for icons
- Custom animations & responsive design with pure CSS
- Fetch API + XMLHttpRequest for backend communication and progress tracking
- Static site: runs without build tools or bundlers
Backend
- Python 3.9+, FastAPI 0.110.0, Uvicorn 0.29.0
- PyMuPDF, pdfplumber, Pillow, pytesseract
- sentence-transformers, torch, scikit-learn, nltk, numpy<2.0
- CORS enabled
- Dockerized for deployment
Deployment
- Frontend: Vercel
- Backend: Render
Installation
- Clone the repository:
  git clone <your-repo-url>
  cd PDF_CHATBOT
- Build the backend Docker image:
  docker build -t pdf-chatbot-backend ./backend
- Run the backend:
  docker run -e PORT=8000 -p 8000:8000 pdf-chatbot-backend
- Serve the frontend with VSCode Live Server, Python's http.server, or any static server:
  cd frontend
  python -m http.server 8080
- Open http://localhost:8080
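If a script is handier than the command line, the same static serving can be done from Python with the standard library alone. A minimal sketch: the `make_server` helper is a hypothetical name, and the `frontend` directory and port 8080 defaults are taken from the commands above. Unlike `python -m http.server`, which listens on all interfaces by default, this version binds to localhost only.

```python
from functools import partial
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


def make_server(directory="frontend", port=8080):
    """Build an HTTP server that serves static files from `directory`."""
    handler = partial(SimpleHTTPRequestHandler, directory=directory)
    # Bind to localhost only, so the files are not exposed on the network.
    return ThreadingHTTPServer(("127.0.0.1", port), handler)


# To start serving (this call blocks until interrupted with Ctrl+C):
# make_server().serve_forever()
```

Run it from the repository root so that the relative `frontend` path resolves, then open http://localhost:8080 as before.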
Usage
- Click Upload PDFs, then drag and drop or select files
- Click Done to upload
- Click Analyze Collection, enter a persona and a job/task, then click Start Analysis
- Or click Quick Summary for fast document takeaways
- Use Ask Anything to ask questions and get explanations
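Analysis needs at least one PDF plus a persona and a job. The same precondition could also be checked before any work starts on the server. The sketch below is hypothetical: the function name `validate_analysis_request` and its messages are invented for illustration and are not taken from the project's backend.

```python
def validate_analysis_request(filenames, persona, job):
    """Return a list of problems; an empty list means the request is ready."""
    problems = []
    # Only files with a .pdf extension count toward the collection.
    pdfs = [name for name in filenames if name.lower().endswith(".pdf")]
    if not pdfs:
        problems.append("Select at least one PDF.")
    if not persona.strip():
        problems.append("Enter a persona.")
    if not job.strip():
        problems.append("Enter a job or task.")
    return problems


print(validate_analysis_request(["report.pdf"], "Analyst", "Summarize risks"))
print(validate_analysis_request([], "Analyst", "  "))
```

Returning every problem at once, rather than stopping at the first, lets a client show all missing fields in one pass.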
Project Structure
PDF_CHATBOT/
  backend/
    main.py                  # FastAPI app
    analyze_collections.py   # Semantic analysis
    heading_extractor.py     # PDF structure extraction
    summary.py               # Summary generation
    explain.py               # Explanations
    setup_offline_assets.py  # Offline cache/setup
    requirements.txt         # Python dependencies
    Dockerfile               # Container build file
  frontend/
    index.html
    script.js
    style.css
    ...
  README.md
  ...
Security & Privacy
- No in-browser PDF embedding
- Files processed securely on backend
- No chat history stored
- Minimal local storage usage
Troubleshooting
- If the Analyze button is disabled: select at least one PDF and fill in both the persona and job fields
- If analysis fails: check the backend logs, the CORS settings, or the API URL in script.js
- For Docker issues: ensure the ports are mapped and the backend is running
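When analysis fails, a quick first check is whether the backend answers at all. The sketch below uses only the standard library. The `backend_reachable` helper is a hypothetical name, and the URL assumes the local port mapping from the `docker run` command (`-p 8000:8000`), so adjust it for a deployed backend.

```python
import urllib.error
import urllib.request


def backend_reachable(base_url, timeout=3.0):
    """Return True if any HTTP response comes back from base_url."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, even if with an error status such as 404.
        return True
    except OSError:
        # Connection refused, DNS failure, or timeout (URLError is an OSError).
        return False


# Assumed local address, based on the port mapping used in this README.
url = "http://localhost:8000/"
print(url, "reachable" if backend_reachable(url) else "not reachable")
```

If this reports the backend as reachable while requests from the browser still fail, the cause is more likely the CORS settings or the API URL in script.js than the container itself.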
Roadmap
- RAG-based long-context QA
- Multi-language support
- Persona presets
- Confidence scoring & citations
- Offline mode
Contributing
Contributions are welcome!
If you'd like to improve this project, please follow these steps:
- Fork the repository
- Create a feature branch (git checkout -b feature-name)
- Commit your changes (git commit -m 'Add new feature')
- Push to your branch (git push origin feature-name)
- Open a Pull Request
Acknowledgements
This project wouldn't be possible without these amazing tools and libraries:
- FastAPI: Backend framework
- Sentence Transformers: Embeddings and NLP
- PyMuPDF: PDF parsing
- pdfplumber: Text extraction
- pytesseract: OCR
- Tailwind CSS & Bootstrap: Styling
- Font Awesome: Icons
License
MIT
Author
- Developed by Gudiwada Sruthi






