Audio Intel

Sorry, boring name...

Audio Intel is a multimodal tool that takes in any audio file and analyzes it. In my humble opinion, its summary is quite comprehensive: it covers everything asked for in the assessment, viz. "a summary, key action items, sentiment trends, or topic tagging", and goes further. Additional quality-of-life features include segmented language recognition, clear delineation between speakers, and a profile of each speaker that identifies their attitudes, dynamics, and role, where discernible.

Frontend and backend interaction written in FastAPI.
Gemini Flash Lite 3.5 for transcription, timestamping, and source-to-English translation, with per-utterance language recognition.
pyannote API for diarization and timestamps.
Gemma 3 for intelligence analysis.
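
Because the transcript and the diarization both carry timestamps, one plausible way to combine them is to give each utterance the speaker whose turn overlaps it the most. The sketch below only illustrates that idea; the `Utterance` and `SpeakerTurn` types and the `assign_speakers` function are hypothetical names, not taken from this repository, and the actual app may align the two outputs differently.

```python
from dataclasses import dataclass


@dataclass
class Utterance:
    """A transcribed span of speech (hypothetical shape)."""
    start: float  # seconds
    end: float    # seconds
    text: str


@dataclass
class SpeakerTurn:
    """A diarization segment (hypothetical shape)."""
    start: float  # seconds
    end: float    # seconds
    speaker: str


def overlap(a_start: float, a_end: float, b_start: float, b_end: float) -> float:
    """Length in seconds of the intersection of two intervals (0 if disjoint)."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def assign_speakers(utterances, turns):
    """Label each utterance with the speaker whose turn overlaps it most.

    Utterances that overlap no turn are labeled "unknown".
    """
    labeled = []
    for u in utterances:
        best_speaker = "unknown"
        best_overlap = 0.0
        for t in turns:
            shared = overlap(u.start, u.end, t.start, t.end)
            if shared > best_overlap:
                best_overlap = shared
                best_speaker = t.speaker
        labeled.append((best_speaker, u.text))
    return labeled


if __name__ == "__main__":
    utterances = [Utterance(0.0, 2.0, "Hi there."), Utterance(2.5, 4.0, "Hello.")]
    turns = [SpeakerTurn(0.0, 2.2, "SPEAKER_00"), SpeakerTurn(2.2, 5.0, "SPEAKER_01")]
    for speaker, text in assign_speakers(utterances, turns):
        print(f"{speaker}: {text}")
```

Maximum overlap is a simple heuristic; it does not split an utterance that spans two speakers.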

Quick Start

1. Prerequisites

A Google AI Studio API key (GEMINI_API_KEY)
A pyannote API key (PYANNOTE_API_KEY) for speaker diarization

2. Configure

Edit .env and fill in your keys
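
As a quick sanity check before starting the app, a small script along these lines can confirm that both keys are set. It is a minimal sketch, assuming the values from .env have already been exported into the process environment (for example by Docker Compose or your shell); the `missing_keys` helper is a hypothetical name and not part of this repository.

```python
import os

# The two variable names listed under Prerequisites.
REQUIRED_KEYS = ("GEMINI_API_KEY", "PYANNOTE_API_KEY")


def missing_keys(env=None):
    """Return the required keys that are absent or blank in `env`.

    `env` defaults to the current process environment.
    """
    env = os.environ if env is None else env
    return [key for key in REQUIRED_KEYS if not env.get(key, "").strip()]


if __name__ == "__main__":
    missing = missing_keys()
    if missing:
        raise SystemExit("Missing keys: " + ", ".join(missing))
    print("All required keys are set.")
```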

3. Run

The app will be available at http://localhost:8000.
