gudiwadasruthi/PDF_CHATBOT

PDF Chatbot

Upload Analyze Ask Understand PDFs with AI

PDF Chatbot – Upload, Analyze & Chat with PDFs

AI-powered web app to upload PDFs, analyze them with AI, and ask questions about their content.
No in-page PDF viewer, just clean, fast, and interactive analysis.

Live Demo: https://pdf-chatbot-chi.vercel.app/



Quick Start

git clone <your-repo-url>
cd PDF_CHATBOT
docker build -t pdf-chatbot-backend ./backend
docker run -e PORT=8000 -p 8000:8000 pdf-chatbot-backend

These commands start only the backend. Serve the frontend as described under Installation, then open http://localhost:8080 and start chatting with PDFs.


Why PDF Chatbot?

PDFs contain valuable information, but extracting insights from long documents is time-consuming and frustrating.

PDF Chatbot solves this by allowing you to:

  • Upload multiple PDFs
  • Define a persona and goal
  • Instantly receive structured insights, summaries, and answers

No scrolling. No guessing. Just answers.


Demo & Screenshots

Home Dashboard

Clean landing dashboard with quick actions for PDF analysis


Upload PDFs

Drag-and-drop PDF upload with multi-file support


βš™οΈ Analysis Settings (Persona & Task)

Define persona and job-to-be-done for focused analysis


Ask Anything Interface

Ask natural language questions across uploaded PDFs


AI-Powered Insights & Results

Structured explanations, summaries, and extracted sections


System Architecture

Architecture Diagram

Flow Overview

  1. User uploads PDFs via frontend
  2. Backend extracts text, structure & tables
  3. Semantic ranking identifies relevant sections
  4. Persona & task guide AI output
  5. Results returned as structured JSON
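
To make steps 3 to 5 concrete, here is a minimal sketch of ranking extracted sections against a persona and task and returning the result as JSON. It is not the project's code: the backend lists sentence-transformers for semantic ranking, whereas this sketch swaps in a plain bag-of-words cosine similarity so that it runs with the standard library only. The function names, the `(title, text)` section format, and the output fields are all assumptions made for illustration.

```python
import json
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_sections(sections, persona, job, top_k=3):
    """Rank (title, text) sections against a persona-and-job query.

    Hypothetical stand-in for the semantic ranking step: word-count
    vectors are used here instead of sentence embeddings.
    """
    query = Counter(tokenize(f"{persona} {job}"))
    scored = [
        {
            "title": title,
            "score": round(cosine(query, Counter(tokenize(f"{title} {text}"))), 3),
        }
        for title, text in sections
    ]
    scored.sort(key=lambda item: item["score"], reverse=True)
    return scored[:top_k]


if __name__ == "__main__":
    sections = [
        ("Pricing", "Monthly pricing plans and discounts for teams"),
        ("Installation", "Install the package and configure the server"),
    ]
    ranked = rank_sections(sections, persona="IT admin", job="configure the server")
    # Step 5: return the ranked sections as structured JSON.
    print(json.dumps({"sections": ranked}, indent=2))
```

Word overlap only captures literal matches. An embedding model would also match paraphrases, which is presumably why the backend depends on sentence-transformers.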


Features at a Glance

  • Multi-PDF Upload & Analysis
  • Persona-driven AI understanding
  • Semantic search & ranking
  • Structured summaries & explanations
  • Fast, no embedded PDF viewer
  • Fully Dockerized backend
  • Light & Dark mode UI


Tech Stack

Frontend

  • HTML, CSS, Vanilla JavaScript (fully custom, no frameworks)
  • Font Awesome (CDN) for icons
  • Custom animations & responsive design with pure CSS
  • Fetch API + XMLHttpRequest for backend communication and progress tracking
  • Static site: runs without build tools or bundlers

Backend

  • Python 3.9+, FastAPI 0.110.0, Uvicorn 0.29.0
  • PyMuPDF, pdfplumber, Pillow, pytesseract
  • sentence-transformers, torch, scikit-learn, nltk, numpy<2.0
  • CORS enabled
  • Dockerized for deployment

Deployment

  • Frontend: Vercel
  • Backend: Render

Installation

  1. Clone the repository:
    git clone <your-repo-url>
    cd PDF_CHATBOT
  2. Build the backend Docker image:
    docker build -t pdf-chatbot-backend ./backend
  3. Run the backend:
    docker run -e PORT=8000 -p 8000:8000 pdf-chatbot-backend
  4. Serve the frontend:
    • Use VSCode Live Server, Python http.server, or any static server:
      cd frontend
      python -m http.server 8080
      # or use Live Server extension in VSCode
    • Open http://localhost:8080

Usage

  1. Click Upload PDFs, then drag & drop or select files
  2. Click Done to upload
  3. Click Analyze Collection, enter a persona and job/task, then click Start Analysis
  4. Or click Quick Summary for fast document takeaways
  5. Use Ask Anything to ask questions and get explanations

Project Structure

PDF_CHATBOT/
 backend/
    main.py                  # FastAPI app
    analyze_collections.py   # Semantic analysis
    heading_extractor.py     # PDF structure extraction
    summary.py               # Summary generation
    explain.py               # Explanations
    setup_offline_assets.py  # Offline cache/setup
    requirements.txt         # Python dependencies
    Dockerfile               # Container build file
 frontend/
    index.html
    script.js
    style.css
    ...
 README.md
 ...

Security & Privacy

  • No in-browser PDF embedding
  • Files processed securely on backend
  • No chat history stored
  • Minimal local storage usage


Troubleshooting

  • If the Analyze button is disabled: select at least one PDF and fill in both the persona and job fields
  • If analysis fails: check the backend logs, the CORS configuration, or the API URL in script.js
  • For Docker issues: ensure the ports are mapped and the backend is running
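
One quick way to check the last point is to test whether anything is listening on the expected ports. The following is a small sketch using only the Python standard library. The helper name `is_port_open` is made up for this example, and ports 8000 and 8080 are the ones used in the Installation steps. It only confirms that a TCP connection succeeds, not that the API itself is healthy.

```python
import socket


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution errors.
        return False


if __name__ == "__main__":
    # Ports taken from the Installation section: backend 8000, frontend 8080.
    for name, port in (("backend", 8000), ("frontend", 8080)):
        state = "reachable" if is_port_open("localhost", port) else "not reachable"
        print(f"{name} on port {port}: {state}")
```

If the backend port is not reachable, check that the container is still running and that the `-p 8000:8000` mapping was passed to `docker run`.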

Roadmap

  • RAG-based long-context QA
  • Multi-language support
  • Persona presets
  • Confidence scoring & citations
  • Offline mode


Contributing

Contributions are welcome!
If you'd like to improve this project, please follow these steps:

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature-name)
  3. Commit your changes (git commit -m 'Add new feature')
  4. Push to your branch (git push origin feature-name)
  5. Open a Pull Request

Acknowledgements

This project wouldn't be possible without these amazing tools and libraries:


License

MIT

Author

  • Developed by Gudiwada Sruthi
