lavoix is a Python library + FastAPI server for speech workflows:
- Speech-to-Text (STT): Voxtral-first via Mistral API
- Text-to-Speech (TTS): Voxtral TTS path via Mistral API + OSS fallback (pyttsx3)
- Production shape: clean provider abstraction, typed config, HTTP API, and Python client
- Voxtral is used as the primary STT path (best quality for Mistral-native workflows).
- TTS is implemented with an OSS provider so you can run locally/offline.
- The service is provider-driven, so adding more engines (e.g. a future Mistral TTS endpoint) is straightforward.
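
To illustrate what "provider-driven" can mean in practice, here is a minimal sketch of a provider registry. It is not lavoix's actual internals: `STTProvider`, `EchoSTT`, and `ProviderRegistry` are hypothetical names, and the stand-in engine only reports the audio size instead of calling Voxtral or faster-whisper.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Protocol


class STTProvider(Protocol):
    """Hypothetical interface: anything that turns audio bytes into text."""

    def transcribe(self, audio: bytes, language: Optional[str] = None) -> str: ...


@dataclass
class EchoSTT:
    """Stand-in engine for illustration; a real one would call an STT backend."""

    label: str

    def transcribe(self, audio: bytes, language: Optional[str] = None) -> str:
        return f"[{self.label}] {len(audio)} bytes"


class ProviderRegistry:
    """Maps provider names (e.g. "mistral") to engine instances."""

    def __init__(self, default: str) -> None:
        self._default = default
        self._providers: Dict[str, STTProvider] = {}

    def register(self, name: str, provider: STTProvider) -> None:
        self._providers[name] = provider

    def get(self, name: Optional[str] = None) -> STTProvider:
        # Fall back to the configured default when no provider is requested.
        key = name or self._default
        if key not in self._providers:
            raise KeyError(f"unknown provider: {key}")
        return self._providers[key]


registry = ProviderRegistry(default="mistral")
registry.register("mistral", EchoSTT("mistral"))
registry.register("faster-whisper", EchoSTT("faster-whisper"))

print(registry.get().transcribe(b"abcd"))
```

With this shape, adding an engine is one new class plus one `register` call, and request handlers only depend on the `transcribe` signature.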
cd /home/wardn/dev/mistral-dev/lavoix
python3 -m venv .venv
source .venv/bin/activate
pip install -e ".[dev,tts-oss]"

Optional local STT fallback:
pip install -e ".[stt-oss]"

Create .env:
LAVOIX_MISTRAL_API_KEY=your_key_here
LAVOIX_VOXTRAL_MODEL=voxtral-mini-latest
LAVOIX_DEFAULT_STT_PROVIDER=mistral
LAVOIX_VOXTRAL_TTS_MODEL=voxtral-tts-latest
LAVOIX_DEFAULT_TTS_PROVIDER=mistral
LAVOIX_PORT=8090

Start the server:
lavoix-server

Health:
curl http://localhost:8090/healthz

Transcription:
curl -X POST http://localhost:8090/v1/stt/transcribe \
-F "file=@./sample.wav" \
-F "provider=mistral"

Synthesis:
curl -X POST http://localhost:8090/v1/tts/synthesize \
-H "content-type: application/json" \
-d '{"text":"Hello from lavoix","voice":"default","speed":1.0}' \
--output out.wav

Python client:
from lavoix import LavoixClient
client = LavoixClient("http://localhost:8090")
print(client.healthz())
stt = client.transcribe("sample.wav", provider="mistral")
print(stt["text"])
client.synthesize("Bonjour Erwin", "out.wav")

API endpoints:
- GET /healthz
- POST /v1/stt/transcribe (multipart form: file, optional language, optional provider)
- POST /v1/tts/synthesize (JSON body: text, voice, speed, optional provider)
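
The synthesis endpoint can also be called without the bundled client. The sketch below uses only the standard library and mirrors the curl example: it assumes the server listens on port 8090 and that the response body is the raw audio, as the `--output out.wav` usage suggests. `build_synthesize_request` and `synthesize_to_file` are hypothetical helper names, not part of lavoix.

```python
import json
import urllib.request
from typing import Optional

BASE_URL = "http://localhost:8090"  # assumption: matches LAVOIX_PORT=8090


def build_synthesize_request(
    text: str,
    voice: str = "default",
    speed: float = 1.0,
    provider: Optional[str] = None,
    base_url: str = BASE_URL,
) -> urllib.request.Request:
    """Build (but do not send) a POST request for /v1/tts/synthesize."""
    payload = {"text": text, "voice": voice, "speed": speed}
    if provider is not None:
        # "provider" is optional, so only send it when explicitly chosen.
        payload["provider"] = provider
    return urllib.request.Request(
        f"{base_url}/v1/tts/synthesize",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize_to_file(text: str, out_path: str, **kwargs) -> None:
    """Send the request and write the response body to out_path."""
    req = build_synthesize_request(text, **kwargs)
    with urllib.request.urlopen(req) as resp, open(out_path, "wb") as fh:
        fh.write(resp.read())
```

With the server running, `synthesize_to_file("Hello from lavoix", "out.wav")` should behave like the curl command. Keeping request construction separate from sending makes the payload easy to inspect or unit-test without a live server.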
- Otter forwards voice prompts to Lavoix for transcription.
- Seal consumes that flow through Otter (POST /v1/voice/prompts) and keeps voice-first task creation.
- Lavoix can also be used standalone as a generic STT/TTS service.
- Pre-commit config is included in .pre-commit-config.yaml.
- GitHub Actions CI runs lint/format checks and tests on push/PR.
Enable hooks locally:
pip install -e ".[dev]"
pre-commit install
pre-commit run --all-files

- If the Mistral API key is missing, the mistral STT/TTS providers are unavailable; use faster-whisper for STT and oss for TTS.
- pyttsx3 may require system speech backends (like espeak) depending on OS.
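
The fallback rule above can be applied on the caller side before sending requests. This is a sketch under assumptions, not lavoix's own behavior: whether the server falls back automatically is not stated here, `choose_providers` and `ProviderChoice` are hypothetical names, and the "mistral" defaults are assumed from the example .env.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ProviderChoice:
    stt: str
    tts: str


def choose_providers(env: Optional[Mapping[str, str]] = None) -> ProviderChoice:
    """Pick provider names, swapping out mistral when no API key is set."""
    env = os.environ if env is None else env
    stt = env.get("LAVOIX_DEFAULT_STT_PROVIDER", "mistral")
    tts = env.get("LAVOIX_DEFAULT_TTS_PROVIDER", "mistral")
    if not env.get("LAVOIX_MISTRAL_API_KEY", "").strip():
        # Without a key the mistral providers are unavailable,
        # so use the local engines named in the notes above.
        if stt == "mistral":
            stt = "faster-whisper"
        if tts == "mistral":
            tts = "oss"
    return ProviderChoice(stt=stt, tts=tts)


choice = choose_providers()
print(choice.stt, choice.tts)
```

The resulting names can be passed as the `provider` field of the STT and TTS requests. Note that faster-whisper requires the optional stt-oss extra to be installed.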