Run LLMs locally on Android, iOS, and Desktop — using a single Kotlin API.
Offline-first · Privacy-preserving · Kotlin Multiplatform
Llamatik is a Kotlin Multiplatform library that lets you run:
- 🧠 Large Language Models (LLMs) via
llama.cpp - 🎙 Speech-to-Text (STT) via
whisper.cpp
...fully on-device, with optional remote inference — all behind a unified Kotlin API.
No Python.
No mandatory servers.
Your models, your data, your device.
Designed for privacy-first, offline-capable, and cross-platform AI applications.
- ✅ Fully offline inference via llama.cpp
- ✅ On-device speech recognition via whisper.cpp
- ✅ No network required
- ✅ No data exfiltration
- ✅ Works with GGUF (LLMs) and GGML (Whisper) models
- ✅ Text generation (non-streaming & streaming)
- ✅ Context-aware generation (system + history)
- ✅ Schema-constrained JSON generation
- ✅ Embeddings for vector search & RAG
- ✅ On-device transcription
- ✅ Works fully offline
- ✅ 16kHz mono WAV support
- ✅ Selectable Whisper models
- ✅ Integrated model download + management
- ✅ Shared API across Android, iOS, Desktop
- ✅ Native C++ integration via Kotlin/Native
- ✅ Static frameworks for iOS
- ✅ JNI for Desktop
- ✅ Optional HTTP client for remote inference
- ✅ Drop-in backend server (
llamatik-backend) - ✅ Seamlessly switch between local and remote inference
Want to see Llamatik in action before integrating it?
The Llamatik App showcases:
- On-device inference
- Streaming generation
- Speech-to-text (Whisper)
- Privacy-first AI (no cloud required)
- Downloadable models
- 🧠 On-device chatbots & assistants
- 📚 Local RAG systems
- 🛰️ Hybrid AI apps (offline-first, online fallback)
- 🎮 Game AI & procedural dialogue
Your App
│
▼
LlamaBridge (shared Kotlin API)
│
├─ llamatik-core → Native llama.cpp (on-device)
├─ llamatik-client → Remote HTTP inference
└─ llamatik-backend → llama.cpp-compatible server
Switching between local and remote inference requires no API changes — only configuration.
- iOS Deployment Target: 16.6+
- Android MinSDK API: 26
Llamatik is published on Maven Central and follows semantic versioning.
- No custom Gradle plugins
- No manual native toolchain setup
- Works with standard Kotlin Multiplatform projects
dependencyResolutionManagement {
repositories {
google()
mavenCentral()
}
}
commonMain.dependencies {
implementation("com.llamatik:library:0.16.0")
}// Resolve model path (place GGUF in assets / bundle)
val modelPath = LlamaBridge.getModelPath("phi-2.Q4_0.gguf")
// Load model
LlamaBridge.initGenerateModel(modelPath)
// Generate text
val output = LlamaBridge.generate(
"Explain Kotlin Multiplatform in one sentence."
)The public Kotlin API is defined in LlamaBridge (an expect object with platform-specific actual implementations).
@Suppress("EXPECT_ACTUAL_CLASSIFIERS_ARE_IN_BETA_WARNING")
expect object LlamaBridge {
// Utilities
@Composable
fun getModelPath(modelFileName: String): String // copy asset/bundle model to app files dir and return absolute path
fun shutdown() // free native resources
// Embeddings
fun initModel(modelPath: String): Boolean // load embeddings model
fun embed(input: String): FloatArray // return embedding vector
// Text generation (non-streaming)
fun initGenerateModel(modelPath: String): Boolean // load generation model
fun generate(prompt: String): String
fun generateWithContext(
systemPrompt: String,
contextBlock: String,
userPrompt: String
): String
// Text generation (streaming)
fun generateStream(prompt: String, callback: GenStream)
fun generateStreamWithContext(
systemPrompt: String,
contextBlock: String,
userPrompt: String,
callback: GenStream
)
// Text generation with JSON schema (non-streaming)
fun generateJson(prompt: String, jsonSchema: String? = null): String
fun generateJsonWithContext(
systemPrompt: String,
contextBlock: String,
userPrompt: String,
jsonSchema: String? = null
): String
// Convenience streaming overload (callbacks)
fun generateStream(prompt: String, callback: GenStream)
fun generateStreamWithContext(
system: String,
context: String,
user: String,
onDelta: (String) -> Unit,
onDone: () -> Unit,
onError: (String) -> Unit
)
// Text generation with JSON schema (streaming)
fun generateJsonStream(prompt: String, jsonSchema: String? = null, callback: GenStream)
fun generateJsonStreamWithContext(
systemPrompt: String,
contextBlock: String,
userPrompt: String,
jsonSchema: String? = null,
callback: GenStream
)
fun nativeCancelGenerate() // cancel generation
}
interface GenStream {
fun onDelta(text: String)
fun onComplete()
fun onError(message: String)
}WhisperBridge exposes a small, platform-friendly wrapper around whisper.cpp for on-device speech-to-text.
The workflow is:
- Download a Whisper ggml model (e.g. ggml-tiny-q8_0.bin) to local storage (the app does this for you).
- Initialize Whisper once with the local model path.
- Record audio to a WAV file and transcribe it.
object WhisperBridge {
/** Returns a platform-specific absolute path for the model filename. */
fun getModelPath(modelFileName: String): String
/** Loads the model at [modelPath]. Returns true if loaded. */
fun initModel(modelPath: String): Boolean
/**
* Transcribes a WAV file and returns text.
* Tip: record WAV as 16 kHz, mono, 16-bit PCM for best compatibility.
*/
fun transcribeWav(wavPath: String, language: String? = null): String
/** Frees native resources. */
fun release()
}import com.llamatik.library.platform.WhisperBridge
val modelPath = WhisperBridge.getModelPath("ggml-tiny-q8_0.bin")
// 1) Init once (e.g. app start)
WhisperBridge.initModel(modelPath)
// 2) Record to a WAV file (16kHz mono PCM16) using your own recorder
val wavPath: String = "/path/to/recording.wav"
// 3) Transcribe
val text = WhisperBridge.transcribeWav(wavPath, language = null).trim()
println(text)
// 4) Optional: release on app shutdown
WhisperBridge.release()Note: WhisperBridge expects a WAV file path. Llamatik’s app uses AudioRecorder + AudioPaths.tempWavPath() to generate the WAV before calling transcribeWav(...).
Please go to the Backend README.md for more information.
- ✅ Built directly on llama.cpp and whisper.cpp
- ✅ Offline-first & privacy-preserving
- ✅ No runtime dependencies
- ✅ Open-source (MIT)
- ✅ Used by real Android & iOS apps
- ✅ Designed for long-term Kotlin Multiplatform support
Llamatik is already used in production apps on Google Play and App Store.
Want to showcase your app here? Open a PR and add it to the list 🚀
Llamatik is 100% open-source and actively developed.
- Bug reports
- Feature requests
- Documentation improvements
- Platform extensions
All contributions are welcome!
This project is licensed under the MIT License.
See LICENSE for details.
Built with ❤️ for the Kotlin community.



