Local-first voice I/O for AI applications: TTS, STT, microphone control, streaming speech output, and optional voice cloning behind a small Python API.
AbstractVoice is useful on its own, and it is also the voice capability package
for the AbstractFramework ecosystem. It does not force you to run a daemon:
embed `VoiceManager` directly when you want an in-process library; install it
beside AbstractCore when you want OpenAI-compatible HTTP audio endpoints.
- TTS (default): Piper (cross-platform, no system deps)
- STT (default): faster-whisper
- Local assistant: `listen()` + `speak()` with playback/listening control
- Headless/server-friendly: `speak_to_bytes()`, `speak_to_file()`, `transcribe_*`
- Streaming TTS: `speak_to_audio_chunks()` and `open_tts_text_stream()`
- Voice cloning / heavier TTS (optional): OpenF5, Chroma, AudioDiT, OmniVoice
- AbstractCore plugin: discovered through the `abstractcore.capabilities_plugins` entry point
Status: alpha (0.8.x). The default Piper/faster-whisper path is usable
today; optional cloning and torch-based engines are heavier and should be
validated on your target hardware. The supported integrator surface is
documented in docs/api.md.
Next: docs/getting-started.md (recommended setup + first smoke tests).
AbstractVoice has three intended usage modes:
- Standalone Python library: call `VoiceManager` directly from a desktop app, local assistant, batch job, or your own backend.
- AbstractCore capability plugin: install it next to AbstractCore and let AbstractCore expose voice/audio capabilities to agents and OpenAI-compatible clients.
- AbstractFramework component: use it as the voice layer inside the wider AbstractFramework stack (https://github.com/lpalbou/abstractframework).
Key links:
- AbstractCore (agents/capabilities): https://abstractcore.ai and https://github.com/lpalbou/abstractcore
- AbstractFramework (umbrella): https://github.com/lpalbou/abstractframework
Integration points (code evidence):
- AbstractCore capability plugin entry point: `pyproject.toml` → `[project.entry-points."abstractcore.capabilities_plugins"]`; implementation: `abstractvoice/integrations/abstractcore_plugin.py`
- AbstractRuntime ArtifactStore adapter (optional, duck-typed): `abstractvoice/artifacts.py`
Important: AbstractVoice is a voice I/O library (TTS/STT + optional cloning), not an agent framework and not a standalone LLM server. That boundary is intentional: in the AbstractFramework stack, AbstractCore owns agents, provider routing, and OpenAI-compatible HTTP endpoints; AbstractVoice supplies the concrete voice implementation.
```mermaid
flowchart LR
    App["Your app / REPL"] --> VM["abstractvoice.VoiceManager"]
    VM --> TTS["Piper TTS"]
    VM --> STT["faster-whisper STT"]
    VM --> IO["sounddevice / PortAudio"]
    subgraph AbstractFramework
        AC["AbstractCore"] -. "capability plugin" .-> VM
        AR["AbstractRuntime"] -. "optional ArtifactStore" .-> VM
    end
```
The shipped AbstractCore integration is via the capability plugin above. The abstractvoice REPL is a demonstrator/smoke-test harness (see docs/repl_guide.md) and includes a minimal OpenAI-compatible LLM HTTP client (abstractvoice/examples/llm_provider.py) for convenience.
Install AbstractVoice into the same environment as AbstractCore:
```bash
pip install "abstractcore[server]" abstractvoice
```

AbstractCore discovers AbstractVoice through the `abstractcore.capabilities_plugins` entry point and can use it as:

- `core.voice.tts(...)` / `llm.voice.tts(...)` for TTS
- `core.audio.transcribe(...)` / `llm.audio.transcribe(...)` for STT
- OpenAI-compatible server endpoints when AbstractCore Server is running: `POST /v1/audio/speech`, `POST /v1/audio/transcriptions`
Minimal server smoke test:
```bash
python -m abstractcore.server.app
curl -X POST http://localhost:8000/v1/audio/speech \
  -H "Content-Type: application/json" \
  -d '{"input":"Hello from AbstractVoice through AbstractCore.","format":"wav"}' \
  --output hello.wav
curl -X POST http://localhost:8000/v1/audio/transcriptions \
  -F "file=@hello.wav" \
  -F "language=en"
```

For the current AbstractCore surface, see https://abstractcore.ai and
https://github.com/lpalbou/abstractcore.
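The same speech endpoint can be exercised from Python with only the standard library. A sketch assuming the smoke-test server above is running on localhost:8000 (`tts_request` is an illustrative helper; the JSON body mirrors the curl call):

```python
import json
import urllib.request

def tts_request(text: str, fmt: str = "wav",
                base_url: str = "http://localhost:8000") -> urllib.request.Request:
    """Build an OpenAI-compatible POST /v1/audio/speech request."""
    body = json.dumps({"input": text, "format": fmt}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/v1/audio/speech",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = tts_request("Hello from AbstractVoice through AbstractCore.")
# Sending it requires the AbstractCore server to be running:
# with urllib.request.urlopen(req) as resp:
#     open("hello.wav", "wb").write(resp.read())
```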
If you’re using the full AbstractFramework stack, install and run via the umbrella project and gateway tooling. Start here: https://github.com/lpalbou/abstractframework.
Requires Python >=3.10 (see pyproject.toml).
```bash
pip install abstractvoice
```

Optional extras (feature flags):

```bash
pip install "abstractvoice[all]"
```

Notes:

- `abstractvoice[all]` enables most optional features (incl. cloning + AEC + audio-fx), but does not include the GPU-heavy Chroma runtime, AudioDiT, or OmniVoice.
- For the full list of extras (and platform troubleshooting), see `docs/installation.md`.
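Since the extras gate which backends are importable, a quick environment check can save a confusing runtime error later. A sketch using only the standard library (`backend_status` is an illustrative helper; the two module names checked are the core runtime deps named in the architecture diagram above, not an exhaustive list):

```python
from importlib.util import find_spec

def backend_status(modules=("sounddevice", "faster_whisper")) -> dict:
    """Map module name -> whether it can be imported in this environment."""
    return {m: find_spec(m) is not None for m in modules}

for name, ok in backend_status().items():
    print(f"{name}: {'available' if ok else 'missing'}")
```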
Some features rely on large model weights/artifacts. AbstractVoice will not download these implicitly inside the REPL (offline-first).
After installing, prefetch explicitly (cross-platform).
Recommended (most users):

```bash
abstractvoice-prefetch --piper en
abstractvoice-prefetch --stt small
```

Optional (voice cloning artifacts):

```bash
pip install "abstractvoice[cloning]"
abstractvoice-prefetch --openf5

# Heavy (torch/transformers):
pip install "abstractvoice[audiodit]"
abstractvoice-prefetch --audiodit
pip install "abstractvoice[omnivoice]"
abstractvoice-prefetch --omnivoice

# GPU-heavy:
pip install "abstractvoice[chroma]"
abstractvoice-prefetch --chroma
```

Equivalent `python -m` form:

```bash
python -m abstractvoice download --piper en
python -m abstractvoice download --stt small
python -m abstractvoice download --openf5    # optional; requires abstractvoice[cloning]
python -m abstractvoice download --chroma    # optional; requires abstractvoice[chroma] (GPU-heavy)
python -m abstractvoice download --audiodit  # optional; requires abstractvoice[audiodit]
python -m abstractvoice download --omnivoice # optional; requires abstractvoice[omnivoice]
```

Notes:

- `--piper <lang>` downloads the Piper ONNX voice for that language into `~/.piper/models`.
- `--openf5` is ~5.4 GB.
- `--chroma` is very large (GPU-heavy).
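After prefetching, you can verify what actually landed on disk. A sketch assuming the default cache location from the notes above (`~/.piper/models`); `piper_voices` is an illustrative helper, not part of the package:

```python
from pathlib import Path

# Default Piper voice cache, per the prefetch notes above.
PIPER_MODELS = Path.home() / ".piper" / "models"

def piper_voices() -> list:
    """Names of prefetched Piper ONNX voices, or [] if none yet."""
    if not PIPER_MODELS.is_dir():
        return []
    return sorted(p.name for p in PIPER_MODELS.glob("*.onnx"))

print(piper_voices())
```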
```bash
abstractvoice --verbose
# or (from a source checkout):
python -m abstractvoice cli --verbose
```

Notes:

- Mic voice input is off by default for fast startup. Enable with `--voice-mode stop` (or in-session: `/voice stop`).
- The REPL is offline-first: no implicit model downloads. Use the explicit download commands above.
- The REPL is primarily a demonstrator. For production agent/server use in the AbstractFramework ecosystem, run AbstractCore and use AbstractVoice via its capability plugin (see `docs/api.md` → "Integrations").
See docs/repl_guide.md.
```python
from abstractvoice import VoiceManager

vm = VoiceManager()
vm.speak("Hello! This is AbstractVoice.")
```

See `docs/api.md` for the supported integrator contract.
At a glance:
- TTS: `speak()`, `stop_speaking()`, `pause_speaking()`, `resume_speaking()`, `speak_to_bytes()`, `speak_to_file()`
- STT: `transcribe_file()`, `transcribe_from_bytes()`
- Mic: `listen()`, `stop_listening()`, `pause_listening()`, `resume_listening()`
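The TTS and STT halves above compose naturally into a file round-trip. A hedged sketch using the method names from the list above (exact signatures are in `docs/api.md` and may differ; `roundtrip` is an illustrative helper, and it assumes the default Piper/faster-whisper models have been prefetched):

```python
def roundtrip(text: str, wav_path: str = "roundtrip.wav") -> str:
    """Synthesize text to a WAV file, then transcribe it back."""
    from abstractvoice import VoiceManager  # deferred: requires `pip install abstractvoice`

    vm = VoiceManager()
    vm.speak_to_file(text, wav_path)     # TTS -> WAV on disk
    return vm.transcribe_file(wav_path)  # STT -> text
```

Comparing the returned text against the input is a simple end-to-end smoke test for a headless deployment.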
- Docs index: `docs/README.md`
- Getting started: `docs/getting-started.md`
- FAQ: `docs/faq.md`
- Orientation: `docs/overview.md`
- Acronyms: `docs/acronyms.md`
- Public API: `docs/api.md`
- REPL guide: `docs/repl_guide.md`
- Install troubleshooting: `docs/installation.md`
- Multilingual support: `docs/multilingual.md`
- Architecture (internal): `docs/architecture.md` + `docs/adr/`
- Model management (Piper-first): `docs/model-management.md`
- Licensing notes: `docs/voices-and-licenses.md`
- Changelog: `CHANGELOG.md`
- Contributing: `CONTRIBUTING.md`
- Security: `SECURITY.md`
- Acknowledgments: `ACKNOWLEDGMENTS.md`
MIT. See LICENSE.