Sorry, boring name...
Audio Intel is a multimodal tool that takes in any audio file and analyzes it. In my humble opinion, its summary is quite comprehensive: it covers everything asked for in the assessment, viz. "a summary, key action items, sentiment trends, or topic tagging", and goes further. Additional quality-of-life features include segmented language recognition, clear delineation between speakers, and a profile of each speaker that identifies their attitudes, dynamics, and role, where discernible.
- Frontend and backend interaction written in FastAPI.
- Gemini Flash Lite 3.5 for transcription and source-to-English translation, with per-utterance language recognition. Also used for timestamping.
- pyannote API for diarization and timestamps.
- Gemma 3 for intelligence analysis.
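Since both the transcription step and the diarization step produce timestamps, the two outputs have to be lined up somewhere. The sketch below shows one plausible way to do that: give each utterance the speaker whose turn overlaps it the most. This is an illustration only, not necessarily how Audio Intel does it; `Utterance`, `SpeakerTurn`, and `assign_speakers` are hypothetical names, and the field layout is an assumption rather than the actual Gemini or pyannote response format.

```python
from dataclasses import dataclass


@dataclass
class Utterance:
    """A transcribed utterance (hypothetical shape, times in seconds)."""
    start: float
    end: float
    text: str


@dataclass
class SpeakerTurn:
    """A diarization segment (hypothetical shape, times in seconds)."""
    start: float
    end: float
    speaker: str


def overlap(a_start: float, a_end: float, b_start: float, b_end: float) -> float:
    """Length in seconds of the intersection of two intervals (0.0 if disjoint)."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def assign_speakers(utterances, turns, unknown="UNKNOWN"):
    """Pair each utterance with the speaker whose turn overlaps it the most.

    Utterances that overlap no turn at all are labelled with `unknown`.
    """
    labelled = []
    for u in utterances:
        best_speaker, best_overlap = unknown, 0.0
        for t in turns:
            o = overlap(u.start, u.end, t.start, t.end)
            if o > best_overlap:
                best_speaker, best_overlap = t.speaker, o
        labelled.append((best_speaker, u))
    return labelled
```

A quadratic scan like this is fine for a single recording; for very long audio, sorting both lists by start time and sweeping once would be the obvious refinement.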
- A Google AI Studio API key (GEMINI_API_KEY)
- A pyannote API key (PYANNOTE_API_KEY) for speaker diarization
Edit .env and fill in your keys
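
If you want to confirm that both keys are filled in before starting the app, a small standalone check along these lines may help. It is a minimal sketch using only the standard library, and it assumes the .env file uses plain KEY=value lines; `parse_env_text` and `missing_keys` are hypothetical helpers, not part of Audio Intel, and the app itself may load the file differently.

```python
from typing import Mapping

# The two variable names come from the prerequisites above.
REQUIRED_KEYS = ("GEMINI_API_KEY", "PYANNOTE_API_KEY")


def parse_env_text(text: str) -> dict[str, str]:
    """Parse simple KEY=value lines, skipping blanks, comments, and malformed lines."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(env: Mapping[str, str]) -> list[str]:
    """Return the required keys that are absent or empty in `env`."""
    return [k for k in REQUIRED_KEYS if not env.get(k, "").strip()]


if __name__ == "__main__":
    from pathlib import Path

    env_path = Path(".env")  # assumed to sit in the current working directory
    env = parse_env_text(env_path.read_text()) if env_path.exists() else {}
    missing = missing_keys(env)
    print("Missing keys: " + ", ".join(missing) if missing else "All keys present.")
```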
The app will be available at http://localhost:8000.
