🎙️ PaperBiceps — AI-Powered Podcaster

Paper Biceps is an AI tool that transforms any written content — research papers, articles, blogs, or documents — into realistic podcast conversations. It uses LLMs and TTS to simulate an actual podcast episode between a host and an expert.
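
To make the host/expert idea concrete, here is a minimal sketch of how a generated script could be split into speaker turns before each turn is sent to TTS. The `Host:` / `Expert:` line prefixes and the `parse_turns` helper are assumptions made for illustration; the actual script format produced by the project may differ.

```python
import re
from typing import List, Tuple

# Assumption: one turn per line, prefixed with "Host:" or "Expert:".
TURN_PATTERN = re.compile(r"^\s*(Host|Expert)\s*:\s*(.+)$", re.IGNORECASE)


def parse_turns(script: str) -> List[Tuple[str, str]]:
    """Split a dialogue script into ordered (speaker, text) turns."""
    turns: List[Tuple[str, str]] = []
    for line in script.splitlines():
        match = TURN_PATTERN.match(line)
        if match:
            speaker = match.group(1).lower()
            turns.append((speaker, match.group(2).strip()))
        elif line.strip() and turns:
            # Unlabelled, non-empty line: treat it as a continuation
            # of the previous speaker's turn.
            prev_speaker, prev_text = turns[-1]
            turns[-1] = (prev_speaker, prev_text + " " + line.strip())
    return turns


if __name__ == "__main__":
    demo = "Host: Welcome to the show.\nExpert: Thanks for having me."
    for speaker, text in parse_turns(demo):
        print(f"{speaker}: {text}")
```

Each `(speaker, text)` pair could then be synthesized with a different voice and the resulting clips joined in order.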

🚀 Features

v0.2.0 (Current)

🌐 Chrome Extension - Browser-based podcast generation with floating microphone
🔊 Voice Explanations - Query input for spoken explanations of a specific section or image (e.g. "Explain figure 1.2", "Read abstract")
🎧 Built-in Audio Player - With download functionality
🖱️ Context Menu - Right-click for quick podcast generation
🧠 Gemini-powered - Advanced summarization and script generation
🎤 Dynamic Dialogue - Host-expert style podcast conversations
🗂️ Multiple Input Formats - PDF, TXT, DOCX, or URL
🎙️ Dual-speaker Audio - Using Deepgram TTS

v0.1.0 (Legacy)

🧠 Gemini-powered summarization and script generation
🎤 Dynamic host-expert style dialogue like a real podcast
🗂️ Accepts PDF, TXT, DOCX, or URL input
🎧 Dual-speaker audio generation using Deepgram TTS
🎙️ Option to use Streamlit UI or FastAPI API
🔊 Final podcast exported as an .mp3
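
Since both versions accept PDF, TXT, DOCX, or URL input, some routing step has to decide which extractor to use. The sketch below shows one way this could be done with the standard library only. The `detect_input_kind` helper and its return labels are hypothetical and are not taken from the project's code.

```python
from pathlib import Path
from urllib.parse import urlparse

# Assumption: file inputs are identified by extension alone.
SUPPORTED_EXTENSIONS = {".pdf": "pdf", ".txt": "txt", ".docx": "docx"}


def detect_input_kind(source: str) -> str:
    """Return 'url', 'pdf', 'txt', or 'docx' for a given input string."""
    parsed = urlparse(source)
    if parsed.scheme in ("http", "https") and parsed.netloc:
        return "url"
    suffix = Path(source).suffix.lower()
    if suffix in SUPPORTED_EXTENSIONS:
        return SUPPORTED_EXTENSIONS[suffix]
    raise ValueError(f"Unsupported input: {source!r}")


if __name__ == "__main__":
    for item in ("https://example.com/post", "paper.pdf", "notes.docx"):
        print(item, "->", detect_input_kind(item))
```

The returned label could then select the matching extractor, for example PyMuPDF for `pdf`, python-docx for `docx`, and trafilatura for `url`.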

🧰 Tech Stack

Backend

Python 3.10+ - Core language
FastAPI - Modern web framework
Pydantic - Data validation
Uvicorn - ASGI server
HTTPX/Requests - HTTP clients
Python-dotenv - Environment management

AI/ML Services

Google Gemini API - LLM for script generation
Deepgram TTS - Text-to-speech synthesis
spaCy - Natural language processing

Document Processing

PyMuPDF - PDF text extraction
python-docx - DOCX text extraction
trafilatura - Web content extraction

Audio Processing

Pydub - Audio manipulation

Frontend

Chrome Extension (Manifest v3)
Tailwind CSS - Extension styling
JavaScript - Extension logic

🏗️ Project Structure

PaperBiceps/
├── app/                          # Modular backend application
│   ├── main.py                   # FastAPI app entry point
│   ├── config.py                 # Configuration settings
│   ├── requirements.txt          # Python dependencies
│   ├── models/                   # Pydantic models
│   │   └── explain_request.py    # Request models
│   ├── routes/                   # API endpoints
│   │   ├── podcast.py            # Podcast generation endpoints
│   │   ├── explain.py            # Explanation endpoints
│   │   └── health.py             # Health check endpoints
│   ├── services/                 # Business logic
│   │   ├── gemini_service.py     # Gemini AI integration
│   │   ├── deepgram_service.py   # Deepgram TTS integration
│   │   └── scraping_service.py   # Web scraping service
│   └── utils/                    # Utility functions
│       ├── text_cleaning.py      # Text preprocessing
│       └── file_utils.py         # File handling utilities
├── extension/                    # Chrome extension
│   ├── manifest.json             # Extension manifest
│   ├── popup.html/js             # Extension UI
│   ├── content.js                # Content script
│   └── background.js             # Background service worker
├── .env                          # Environment variables
├── .gitignore                    # Git ignore rules
├── LICENSE                       # MIT License
├── paperbiceps_logo.jpg          # Project logo
└── README.md                     # This file

🧪 How to Run Locally

Clone the repo

git clone https://github.com/yourusername/PaperBiceps.git
cd PaperBiceps

Create a virtual environment

python -m venv .venv
source .venv/bin/activate  # Windows: .\.venv\Scripts\activate

Install dependencies
```
pip install -r app/requirements.txt
```

Add your API keys
Create a .env file in the root directory and add:

GEMINI_API_KEY=your_gemini_key_here
DEEPGRAM_API_KEY=your_deepgram_key_here
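
Before starting the server, it can help to confirm that both keys are actually set. The following is a small standard-library sketch of such a check. The project lists python-dotenv for environment management, so the `parse_env_text` helper here is only a simplified stand-in for illustration, and `missing_keys` is a hypothetical name.

```python
from pathlib import Path
from typing import Dict, List, Mapping

REQUIRED_KEYS = ("GEMINI_API_KEY", "DEEPGRAM_API_KEY")


def parse_env_text(text: str) -> Dict[str, str]:
    """Parse simple KEY=value lines; skips blanks and '#' comments."""
    values: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(env: Mapping[str, str]) -> List[str]:
    """Return the required keys that are absent or empty."""
    return [key for key in REQUIRED_KEYS if not env.get(key)]


if __name__ == "__main__":
    env_path = Path(".env")
    env = parse_env_text(env_path.read_text()) if env_path.exists() else {}
    missing = missing_keys(env)
    print("Missing keys:", missing if missing else "none")
```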

Run the backend API
```
cd app
uvicorn main:app --reload --host 0.0.0.0 --port 8000
```
The API will be available at http://localhost:8000
- API documentation: http://localhost:8000/docs
- Health check: http://localhost:8000/health
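
To verify from a script that the backend is up, something like the following could be used. It is a minimal sketch that only checks for an HTTP 200 from `/health`; the response body of that endpoint is not documented here, so it is not inspected. The `is_healthy` helper is a hypothetical name.

```python
import urllib.request


def is_healthy(base_url: str = "http://localhost:8000", timeout: float = 2.0) -> bool:
    """Return True if GET <base_url>/health answers with HTTP 200."""
    url = base_url.rstrip("/") + "/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # Covers connection errors, timeouts, and HTTP error statuses
        # (urllib's URLError and HTTPError both derive from OSError).
        return False


if __name__ == "__main__":
    print("Backend healthy:", is_healthy())
```
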

Load Chrome Extension
- Open chrome://extensions/
- Enable "Developer mode"
- Click "Load unpacked" and select the extension/ folder
- The extension microphone will appear on webpages

🛠️ Known Limitations

🐢 Script and audio generation can be slow, especially for long files or on limited bandwidth.
🤖 The Gemini API occasionally returns rate-limit or resource errors when it is overused or overloaded, particularly on the free tier.
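
One common way to soften transient rate-limit errors is to retry with exponential backoff. The sketch below is a generic illustration of that technique, not the project's own error handling; `retry_with_backoff` and its parameters are hypothetical, and in practice `retry_on` should be narrowed to whatever exception the API client raises for rate limits.

```python
import random
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    call: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
    retry_on: Tuple[Type[BaseException], ...] = (Exception,),
) -> T:
    """Call `call()`, retrying on failure with exponentially growing delays."""
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except retry_on:
            if attempt == max_attempts:
                raise  # Out of attempts: surface the last error.
            delay = base_delay * 2 ** (attempt - 1)
            # Small random jitter so parallel clients do not retry in lockstep.
            sleep(delay + random.uniform(0, delay * 0.1))
    raise RuntimeError("max_attempts must be at least 1")


if __name__ == "__main__":
    outcomes = iter([RuntimeError("rate limited"), "script text"])

    def flaky_call() -> str:
        item = next(outcomes)
        if isinstance(item, Exception):
            raise item
        return item

    print(retry_with_backoff(flaky_call, base_delay=0.1))
```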

🤝 Contributing

This is an open source project, and contributions are warmly welcome!
If you're interested in:

🏗️ Improving the modular architecture - Adding new services, optimizing existing ones
⚡ Performance optimization - Speeding up script generation and audio synthesis
🎭 Voice enhancement - Adding emotion and style using TTS attributes
🌍 Multi-language support - Extending to non-English content

Feel free to open a pull request or issue 🙌

Start by forking the repo, creating a new branch, and submitting a PR.

🧠 Ideal Use Cases

Listen to any important document on the go
Turn research papers into explainable podcast summaries
Convert blog posts or newsletters into audio for listeners
Build your own AI-powered podcast channel
Help visually impaired users "listen" to any document

📜 License

Licensed under the MIT License.
