HuManKeat/audio-intel

Audio Intel

Sorry, boring name...

Audio Intel is a multimodal tool that takes in any audio file and analyzes it. In my humble opinion, its summary is quite comprehensive: it covers everything asked for in the assessment, viz. "a summary, key action items, sentiment trends, or topic tagging", and goes further. Additional quality-of-life features include segmented language recognition, clear delineation between speakers, and a profile of each speaker that identifies their attitudes, dynamics, and role, where discernible.

  1. Frontend and backend interaction written in FastAPI.

  2. Gemini Flash Lite 3.5 for transcription, timestamping, and source-to-English translation, with per-utterance language recognition.

  3. pyannote API for diarization and timestamps.

  4. Gemma 3 for intelligence analysis.
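Since both the transcription step and the diarization step produce timestamps, one plausible way to combine them is to give each transcribed utterance the speaker whose diarization turn overlaps it the most. The sketch below illustrates that idea only; it is not taken from this repository, and the names `Utterance`, `SpeakerTurn`, and `assign_speakers` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Utterance:
    """A transcribed span of speech (hypothetical shape, times in seconds)."""
    start: float
    end: float
    text: str


@dataclass
class SpeakerTurn:
    """A diarization segment (hypothetical shape, times in seconds)."""
    start: float
    end: float
    speaker: str


def overlap(a_start: float, a_end: float, b_start: float, b_end: float) -> float:
    """Length in seconds of the intersection of two intervals, or 0.0 if disjoint."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def assign_speakers(utterances, turns):
    """Label each utterance with the speaker of the turn that overlaps it most.

    Utterances that overlap no turn at all are labelled "UNKNOWN".
    """
    labelled = []
    for u in utterances:
        best = max(
            turns,
            key=lambda t: overlap(u.start, u.end, t.start, t.end),
            default=None,
        )
        if best is None or overlap(u.start, u.end, best.start, best.end) == 0.0:
            speaker = "UNKNOWN"
        else:
            speaker = best.speaker
        labelled.append((speaker, u.text))
    return labelled
```

Maximum overlap is a simple heuristic. It tolerates small disagreements between the two sets of timestamps, but it assigns only one speaker per utterance, so overlapping speech would need a finer-grained approach.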

demo

Quick Start

1. Prerequisites

2. Configure

Edit .env and fill in your keys
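To catch an unfilled key before the first request, a small startup check along these lines may help. This is a minimal sketch: the key names `GEMINI_API_KEY` and `PYANNOTE_API_KEY` are placeholders assumed for illustration, so substitute whatever names the project's `.env` actually defines.

```python
# Placeholder key names for illustration only; use the names from your .env.
REQUIRED_KEYS = ("GEMINI_API_KEY", "PYANNOTE_API_KEY")


def missing_keys(env, required=REQUIRED_KEYS):
    """Return the required keys that are absent or blank in the given mapping."""
    return [key for key in required if not env.get(key, "").strip()]
```

Calling `missing_keys(os.environ)` after the `.env` file has been loaded returns an empty list when every key is set.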

3. Run

The app will be available at http://localhost:8000.

