feat: TTS audio mode — Kokoro voice personas, seekbar, conversational AI #236
alichherawalla wants to merge 98 commits into main from feat/tts-implementation
Conversation
Implements on-device text-to-speech using OuteTTS 0.3 (454 MB) + WavTokenizer (73 MB) via llama.rn, with react-native-audio-api for playback.

Two interface modes (user-switchable from Settings):
- Chat Mode: play/stop TTSButton on each assistant message bubble
- Audio Mode: waveform bubbles with auto-TTS after streaming, transcript expand, speed cycling, and PCM audio persisted to disk per message for repeat playback

New files:
- src/constants/ttsModels.ts — model URLs, RAM thresholds, cache config
- src/services/ttsService.ts — download, load, generate, persist, play
- src/stores/ttsStore.ts — Zustand store with Chat + Audio Mode actions
- src/hooks/useTTS.ts — convenience hook with RAM gate and weighted progress
- src/components/TTSButton/index.tsx — Chat Mode play/stop per message
- src/components/AudioMessageBubble/index.tsx — waveform bubble component
- src/screens/TTSSettingsScreen/index.tsx — download, mode, speed, cache

Modified:
- Message type: audioPath, waveformData, audioDurationSeconds, isGeneratingAudio
- ChatMessage: Audio Mode branch + TTSButton in meta row
- SettingsScreen: Text to Speech nav row
- Navigation: TTSSettings route
- stores/index.ts, services/index.ts: exports

Tests: 42 unit + integration tests covering service, store, and full flows

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Revert ChatMessage to main (avoids a pre-existing complexity lint failure when the file enters the push-range diff)
- Add Audio Mode + TTSButton to MessageRenderer instead — clean, under the limit
- Move the audioPath/waveformData/audioDurationSeconds/isGeneratingAudio fields from types/index.ts to types/tts.ts via module augmentation (keeps index.ts under the 350-line max)
- Add a react-native-audio-api global mock to jest.setup.ts so all test suites that transitively import ttsService can resolve the native module

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
In finalizeStreamingMessage, after addMessage() saves the assistant reply, check if Audio Mode is active and model is loaded — if so, fire useTTSStore.generateAndSave() in the background so the waveform bubble auto-generates instead of spinning indefinitely. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
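The suggested trigger can be sketched as a small pure function. The store shape here (interfaceMode, isModelLoaded, generateAndSave) is an assumption based on the PR description, not the actual ttsStore API:

```typescript
// Hypothetical sketch of the suggested background trigger: after the assistant
// reply is saved, fire TTS generation when Audio Mode is active and the model
// is loaded. Field and action names are assumptions from the PR text.
type TTSState = {
  settings: { interfaceMode: "chat" | "audio" };
  isModelLoaded: boolean;
  generateAndSave: (messageId: string) => Promise<void>;
};

export function maybeAutoGenerateAudio(tts: TTSState, messageId: string): boolean {
  if (tts.settings.interfaceMode !== "audio" || !tts.isModelLoaded) return false;
  // Fire and forget — the waveform bubble shows a spinner until audio fields land.
  void tts.generateAndSave(messageId).catch(() => {
    // In the real store the error path would clear isGeneratingAudio.
  });
  return true;
}
```

Returning a boolean makes the guard easy to unit-test without mocking the whole store.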
Code Review
This pull request implements a Text-to-Speech (TTS) service and store, enabling both Chat and Audio interface modes. The implementation includes model management, audio generation, file persistence, and playback controls. My feedback highlights that btoa and atob are not natively available in React Native and require polyfills or alternative base64 utilities, and suggests adding user feedback and logging when TTS generation fails due to unloaded models.
```typescript
    for (let i = 0; i < uint8.length; i++) {
      binary += String.fromCharCode(uint8[i]);
    }
    return btoa(binary);
  }

  private base64ToFloat32(base64: string): Float32Array {
    const binary = atob(base64);
```
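The review's btoa/atob concern can be addressed with Buffer-based helpers — a sketch assuming the `buffer` npm polyfill is available (React Native ships neither btoa/atob nor Buffer natively; in Node the module is built in):

```typescript
// Sketch: base64 conversion without btoa/atob, and without the per-character
// String.fromCharCode loop (which is also slow for large sample buffers).
import { Buffer } from "buffer";

export function float32ToBase64(samples: Float32Array): string {
  // View the Float32Array's underlying bytes directly.
  const bytes = new Uint8Array(samples.buffer, samples.byteOffset, samples.byteLength);
  return Buffer.from(bytes).toString("base64");
}

export function base64ToFloat32(base64: string): Float32Array {
  const decoded = Buffer.from(base64, "base64");
  // Copy into a fresh buffer so the Float32Array view is correctly aligned
  // and independent of Buffer's internal pooled memory.
  const copy = new Uint8Array(decoded.length);
  copy.set(decoded);
  return new Float32Array(copy.buffer);
}
```

An alternative is a dedicated package such as base64-js; the key point is to avoid relying on browser globals in React Native.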
src/stores/ttsStore.ts (outdated)
```typescript
if (!settings.enabled || !isModelLoaded) {
  return;
}
```
The check `if (!settings.enabled || !isModelLoaded)` is correct, but it would be better to give the user feedback when they try to speak while the model is not loaded, rather than failing silently. Also log the failure to aid debugging — swallowed failures make issues harder to trace.
References
- When catching errors or handling failures, log them instead of swallowing them to ensure failures are visible and to aid in debugging.
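One way to apply this feedback is to pull the guard into a testable function that logs and notifies instead of returning silently. `logger` and `notifyUser` below are stand-ins for whatever logging and toast utilities the app actually uses — an assumption, not the real API:

```typescript
// Hypothetical guard sketch: surface why speak() was skipped instead of
// silently returning. Dependency injection keeps it unit-testable.
type SpeakGuardDeps = {
  enabled: boolean;
  isModelLoaded: boolean;
  logger: { warn: (msg: string) => void };
  notifyUser: (msg: string) => void;
};

export function canSpeak(deps: SpeakGuardDeps): boolean {
  if (!deps.enabled) {
    deps.logger.warn("TTS speak skipped: TTS is disabled in settings");
    return false;
  }
  if (!deps.isModelLoaded) {
    deps.logger.warn("TTS speak skipped: model not loaded");
    deps.notifyUser("Load the TTS model in Settings to enable speech.");
    return false;
  }
  return true;
}
```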
Codecov Report

❌ Your patch check has failed because the patch coverage (47.79%) is below the target coverage (80.00%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files:

```
@@            Coverage Diff             @@
##             main     #236      +/-   ##
==========================================
- Coverage   85.65%   83.66%    -2.00%
==========================================
  Files         217      224       +7
  Lines       10766    11289     +523
  Branches     2888     3023     +135
==========================================
+ Hits         9222     9445     +223
- Misses        870     1138     +268
- Partials      674      706      +32
```
…, TTSButton placement

Critical fixes for TTS Audio Mode:
- Add updateMessageAudio() to chatStore — writes audioPath, waveformData, audioDurationSeconds, and isGeneratingAudio back to the conversation message (without this, the waveform bubble spun forever after generation)
- Wire the auto-TTS trigger in useChatScreen via a useEffect on isStreamingForThisConversation: detects streaming → stopped, checks Audio Mode + model loaded, calls triggerAudioModeGeneration(), which sets isGeneratingAudio:true, fires generateAndSave, then writes the audio fields or clears the flag on error
- Fix isGenerating logic: show the spinner only when isGeneratingAudio===true, not for every assistant message missing audioPath (which made all old messages spin forever in Audio Mode)
- Fix TTSButton placement: add a metaExtra prop to ChatMessage/MessageMetaRow so TTSButton renders inline in the timestamp row rather than below the bubble

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds a Voice row (volume icon + Chat/Audio/N/A badge) to the quick settings popover in the chat input. Tapping it:
- Toggles between Chat and Audio mode when models are downloaded
- Auto-loads/unloads the TTS model on switch
- Navigates to TTSSettings when models are not yet downloaded

This makes Audio Mode accessible without leaving the chat screen.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The ChatInput test mock for src/stores was missing useTTSStore, causing Popovers.tsx (which now uses useTTSStore) to throw on render. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
1. checkDownloadStatus() was never called on TTSSettingsScreen mount → the store always showed models as not downloaded after a fresh app start
2. speak() race condition: stop() during generation didn't prevent playback → set isSpeakingFlag=true before generate(), check it after, use finally
3. RNFS.stat() on a directory reports block size (~0), not total file size → replaced with a readDir() recursive sum of individual .pcm file sizes
4. Historical messages without audio showed a broken play button in Audio Mode → AudioMessageBubble is only rendered when msg.audioPath || msg.isGeneratingAudio

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
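The recursive size summation from fix 3 can be sketched as a pure function over a readDir-like API. The ReadDirItem shape loosely mirrors react-native-fs's readDir result; treat the exact field names as assumptions:

```typescript
// Sketch: sum file sizes by walking the directory tree, because stat() on a
// directory reports block size rather than the total content size.
type ReadDirItem = {
  path: string;
  size: number;
  isFile: () => boolean;
  isDirectory: () => boolean;
};
type ReadDir = (path: string) => Promise<ReadDirItem[]>;

export async function directorySize(readDir: ReadDir, path: string): Promise<number> {
  let total = 0;
  for (const entry of await readDir(path)) {
    if (entry.isFile()) {
      total += entry.size;
    } else if (entry.isDirectory()) {
      total += await directorySize(readDir, entry.path); // recurse into subdirs
    }
  }
  return total;
}
```

Injecting readDir keeps the function testable with a fake filesystem, which matches the test update described later in this PR.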
Replaced stat() mock with readDir() mocks matching the new recursive file-size summation approach. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…nto feat/tts-implementation
Replaces slider controls with a [–] value [+] stepper row for precise numeric input in settings screens. Supports min/max/step, optional decimal formatting, and testID for E2E automation. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Removes @react-native-community/slider from GenerationSettingsModal, ModelSettingsScreen, and TTSSettingsScreen. Every numeric control (temperature, top-p, GPU layers, speed, etc.) now uses the stepper for touch-friendly precise adjustment. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- MediaAttachment gains audioFormat and audioDurationSeconds fields
- audioRecorderService.stopRecording() now returns { path, durationSeconds } instead of just the path, enabling accurate audio bubble scrubbing
- ChatInput/Attachments.addAudioAttachment stores the duration
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…send

In Audio Mode, user voice recordings now appear as right-aligned audio bubbles instead of text messages, making both sides of the conversation audio-native.
- Voice.ts: adds a file-based transcription path (audioRecorderService + whisperService.transcribeFile) and an onAutoSend callback for atomic send with the audio attachment. Multimodal models skip transcription entirely.
- ChatInput: passes onAutoSend in Audio Mode; builds the MediaAttachment inline to avoid an async state-update race; uses attachmentsRef for sync reads.
- AudioMessageBubble: adds an isUser prop for the right-aligned, primary-tinted style.
- MessageRenderer: renders user audio attachments as AudioMessageBubble before the normal message path.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The streaming-complete useEffect only listed isStreamingForThisConversation in its deps, so activeConversation was captured stale. When streaming ended, the last message was always the old value — TTS generation was never triggered. Fix: read conversation and last message directly from useChatStore.getState() inside the effect instead of relying on the closed-over activeConversation. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
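The stale-closure failure above is the classic snapshot-vs-getState distinction. A minimal store sketch (mirroring the Zustand pattern, with hypothetical field names) shows why reading inside the effect must go through getState():

```typescript
// A callback created earlier keeps the snapshot it closed over, while
// getState() always returns the latest state at call time.
function createStore<T>(initial: T) {
  let state = initial;
  return {
    getState: () => state,
    setState: (next: T) => { state = next; },
  };
}

const chatStore = createStore({ lastMessage: "old reply" });

const snapshot = chatStore.getState();        // closed-over value (stale after updates)
const readFresh = () => chatStore.getState(); // reads at call time (always current)

chatStore.setState({ lastMessage: "new reply" });
// snapshot.lastMessage is still "old reply"; readFresh().lastMessage is "new reply"
```

In the effect, the fix reads `useChatStore.getState()` at the moment streaming ends instead of trusting the `activeConversation` captured when the effect was created.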
When no Whisper model is installed and the user taps the mic, show a CustomAlert offering to download Whisper Small (466 MB) immediately, rather than navigating away to VoiceSettings. UnavailableButton also now shows a download icon + percentage while the model is being fetched, so feedback is in-place. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds a TEXT TO SPEECH section alongside IMAGE GENERATION and TEXT GENERATION in the chat settings modal. Shows mode toggle (chat/audio), enable switch, speed stepper, and auto-play toggle. Deep-links to TTSSettingsScreen for full configuration. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
WHISPER_MODELS grows from 5 to 10 entries covering English-only and Multilingual variants for tiny/base/small/medium, plus Large v3 Turbo and Large v3. whisperService.downloadFromUrl(url, modelId) downloads any ggml .bin file from an arbitrary URL — enables installing community models from HuggingFace. whisperStore exposes it as downloadFromUrl action. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Rewrites the voice settings screen with three sections:
- Active model card with inline download progress and a remove action
- Curated models grouped by English-only / Multilingual (all sizes, tiny → large-v3)
- Live HuggingFace search bar (500 ms debounce) that queries ASR repos; tap a repo to expand and browse its ggml .bin files; tap a file to confirm and download via downloadFromUrl

huggingFaceService gains searchWhisperRepos() and getWhisperFiles() to power the HF search without coupling to the LLM model browser.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
llmMessages builds an input_audio content block from audio attachments when the active model reports audio support, bypassing Whisper entirely. llm.ts exposes getMultimodalSupport() so the voice layer can detect this. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- ttsStore: adds interfaceMode, speed, autoPlay, enabled settings; generateAndSave flow for Audio Mode; updateMessageAudio
- ttsService: OuteTTS generate+save path for AI audio bubbles
- TTSButton: play/stop per message with a generation spinner
- KokoroTTSManager + kokoroModels: scaffold for Tier 1 Kokoro TTS (not yet wired to react-native-executorch, marked not started)
- App.tsx: mounts KokoroTTSManager near the root
- packages: react-native-executorch, background-downloader, dr.pogodin/react-native-fs

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- ChatMessage: long-press action sheet gains a Speak option (delegates to ttsStore)
- ModelSettingsScreen: suppress a pre-existing exhaustive-deps lint warning
- Tests: update GenerationSettingsModal and ModelSettingsScreen tests for NumericStepper (gpu-layers-stepper-increment) replacing slider testIDs
- TTS_IMPLEMENTATION_PLAN: rewritten to reflect Audio Mode bidirectional voice conversation, the stale closure fix, and implementation status

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…sages
Two bugs causing broken Audio Mode:
1. AudioRecorder was recording at the system default rate (~44.1 kHz), producing WAV that Whisper interprets as static ("TV static" / [SOUND]). Fix: pass a preset with sampleRate:16000, BitDepth.Bit16 so the file is Whisper-compatible 16 kHz mono int16 PCM from the start.
2. buildOAIMessages was always including audio attachments as input_audio content blocks, even for models that don't support audio input (e.g. remote Qwen 3.5 2B / Gemma 42B). Those models replied "I cannot hear audio". Fix: buildOAIMessages now accepts a supportsAudio flag (default false) and only emits input_audio parts when the model declares audio support. llm.ts passes multimodalSupport.audio when calling it.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
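The supportsAudio gating from fix 2 can be sketched as a pure content builder. The part shapes follow the OpenAI-style chat schema the commit references; the exact field names are assumptions, not the repo's real types:

```typescript
// Sketch: only emit input_audio parts when the model declares audio support,
// so text-only models never see audio they can't interpret.
type Attachment = { type: "audio" | "image"; base64: string };
type ContentPart =
  | { type: "text"; text: string }
  | { type: "input_audio"; input_audio: { data: string; format: string } };

export function buildUserContent(
  text: string,
  attachments: Attachment[],
  supportsAudio = false,
): ContentPart[] {
  const parts: ContentPart[] = [{ type: "text", text }];
  for (const att of attachments) {
    if (att.type === "audio" && supportsAudio) {
      parts.push({
        type: "input_audio",
        input_audio: { data: att.base64, format: "wav" },
      });
    }
    // Audio attachments are dropped for models without audio support, so they
    // no longer reply "I cannot hear audio".
  }
  return parts;
}
```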
playFromFile was treating WAV bytes as raw Float32 PCM — a path designed for OuteTTS output only. WAV files have a 44-byte RIFF header plus int16 samples; reinterpreting them as Float32 produces pure static. Fix: use AudioContext.decodeAudioData(filePath), which properly parses the WAV header and decodes the samples. The file:// prefix is added if missing.

MessageRenderer now wraps user and assistant audio bubbles in a container View with paddingHorizontal:16 and marginVertical:8, matching the ChatMessage container layout so bubbles align correctly with the chat edges instead of touching the screen borders.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Audio type attachments were falling through to the FadeInImage branch, causing Image to try to load the WAV file path — resulting in a broken image placeholder that stretched the user bubble very wide (the 'super long' bubble issue). Audio attachments now render as a compact mic icon + 'Voice message' badge (matching the document badge style), keeping the bubble compact. In Audio Mode they never reach this code — they render as AudioMessageBubble. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add isAudioModeMessage to Message type and updateMessageAudio signature. Set flag in triggerAudioModeGeneration so mode switches don't reformat old text messages. MessageRenderer now checks msg.isAudioModeMessage instead of global ttsMode for assistant audio bubbles. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Bug 2: handlePlayPause calls speak() for AI bubbles (empty audioPath) instead of playMessage with an empty string. Remove the isGenerating spinner.
Bug 3: WaveformBars gets flex:1 + overflow:hidden, WAVEFORM_BARS 40→28, bubble overflow:hidden, maxWidth 80%→88%.
Bug 4: user bubble flips the play row order (speed+duration left, play right).
Bug 5: the voice cycling chip on AI bubbles reads/writes kokoroVoiceId.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Fix guard: it was checking isModelLoaded (OuteTTS, always false) instead of kokoroReady — so isAudioModeMessage was never stamped and all AI messages rendered as text in audio mode
- Add sentence-level streaming TTS: Kokoro now starts speaking each sentence as soon as the LLM finishes generating it, instead of waiting for the full response
- Fix invisible idle waveform: min bar height 3→6px, and an empty waveform now renders a sine-wave placeholder instead of nearly invisible flat bars

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add react-native-executorch mock to jest.setup.ts (voice configs + useTextToSpeech)
- Fix the TTS integration test: speak() now passes the callback as the 3rd arg
- Update VoiceRecordButton tests: tap-to-toggle, download prompt, no "Transcribing..." text
- Update VoiceSettingsScreen tests: new UI with English/Multilingual sections, Active badge
- Update DownloadManagerScreen tests: conditional active section, filter bar touchables
- Update messageContent test: stripControlTokens now trims output

157 suites, 5181 tests, all passing.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Use @react-native-community/slider (already installed) instead of custom PanResponder-based seekbar. Native component handles drag natively at 60fps — no JS thread bottleneck. Removes ~60 lines of PanResponder/measure/layout tracking code. Added slider mock to jest.setup.ts. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replace animated WaveformBars (VU-meter, wave bounce, 3 animation modes, Animated.Value refs) with simple static bars. Progress is now shown entirely by the native Slider component. Remove RMS amplitude calculation from KokoroTTSManager onNext callback. ~80 lines of animation code removed. No more JS thread contention from per-chunk amplitude updates. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…click play

- Transcript shows karaoke-style word highlighting based on playback progress — spoken words in full color, upcoming words muted
- Stop any TTS playback when the user starts recording (mic + speaker shouldn't overlap)
- Set isSpeaking + currentMessageId immediately before the 300ms Kokoro cleanup wait, so the UI shows a loading state right away when switching clips

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- KokoroTTSManager: 500ms cooldown after isSpeaking→false before applying a voice config change, giving the native ExecuTorch thread time to fully stop
- Transcript highlight: only the currently spoken word is highlighted (primary color + subtle background), not all spoken words
- Auto-scroll: ScrollView with maxHeight 120px, scrolls to keep the active word visible as playback progresses

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

- Remove word-level transcript highlighting — Kokoro doesn't provide word timestamps, so it was always off. Keep the transcript as plain text in a scrollable container (max 120px)
- Waveform bars now visually distinguish playing vs idle: playing bars are brighter (0.6–1.0 opacity), idle bars dimmer (0.25–0.6)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

- Waveform bars now tint as the playhead passes: played bars are bright, unplayed bars muted — like WhatsApp voice messages
- Progress is shown directly on the bars, with the Slider below for drag-to-seek interaction
- Increase the voice change cooldown to 1500ms to prevent a native crash

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

- Audio bubble uses a fixed width: 88% (not maxWidth) so it doesn't resize when the transcript opens
- Thinking block wrapper matches at width: 88% (was maxWidth: 85%)
- Both bubbles now render at exactly the same width

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

- Slider is now positioned on top of the waveform bars (centered vertically) instead of as a separate row below
- Slider track is transparent — the waveform bar coloring shows progress
- Slider thumb (dot) sits on top of the waveform at the current position
- Seekbar visible on both user and AI audio bubbles
- Removed the separate seekbar row — cleaner layout

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Thumb is transparent when progress=0 and not seeking. Only becomes visible (primary color) when audio is actively playing or user is dragging the slider. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Thumb always shows (primary color) so users know they can seek
- Expand seekOverlay to left/right -16px to compensate for the Android Slider's built-in ~16px internal padding — the thumb now aligns with the waveform bar highlighting

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Play button + waveform in the top row (waveform takes the full remaining width)
- "Show transcript", duration, and speed chip in a single meta row below
- Matches the WhatsApp voice message layout: play + waveform on top, info below

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Bars now distribute evenly across the entire container width instead of clustering together with fixed 2px gaps. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Increase to 48 bars with 1.5px gaps — fills the full width, looks denser
- Bigger speed chip (more padding, larger border radius) — easier to tap
- Voice change cooldown now uses the actual stream end timestamp instead of the isSpeaking state — waits 2 seconds from when the native stream actually stopped, not from when the JS flag flipped
- Both user and AI bubbles use the same width: 88%

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
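The timestamp-based cooldown described above can be sketched as a pure check; the 2000 ms constant matches the commit, while the function and field names are illustrative assumptions:

```typescript
// Sketch: gate voice changes on the recorded stream-end timestamp rather than
// the JS isSpeaking flag, which can flip before the native stream has stopped.
const VOICE_CHANGE_COOLDOWN_MS = 2000;

export function canChangeVoice(streamEndedAt: number | null, now: number): boolean {
  if (streamEndedAt === null) return true; // nothing has played yet
  return now - streamEndedAt >= VOICE_CHANGE_COOLDOWN_MS;
}
```

Passing `now` explicitly (instead of calling Date.now() inside) keeps the check deterministic in tests.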
Waveform bars now span edge-to-edge across the entire bubble width. Play button sits in the meta row below alongside show transcript, duration, and speed chip. No more asymmetric padding. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Reverted play button to left of waveform (standard layout). Reduced playRow gap from SPACING.sm to SPACING.xs so waveform extends further right. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Voice switch: key-based remount of KokoroTTSManager avoids a native SIGSEGV when executorch re-initializes with a new voice config. The outer component manages the cooldown, the inner component holds the hook. Sets kokoroReady=false during the switch so the UI shows a loader.
- Seekbar progress: playMessage's finally block now checks ownership (currentMessageId === messageId) before clearing state, preventing it from clobbering an in-flight speak() call's isSpeaking/isAudioPlaying. Added a playSessionId counter + retry loop (up to 10x 200ms) when executorch reports "model is currently generating" (code 104).
- Seekbar smoothness: timer interval 500ms→50ms, fractional seconds instead of Math.floor for continuous waveform bar progress.
- Transcript layout: split TranscriptSection into TranscriptToggle (stays in the metaRow with time/speed) and TranscriptContent (renders below), preventing text from squeezing against the duration/speed chip.
- Chat scroll: FlatList hidden (opacity:0) during initial layout, revealed after the first scrollToEnd settles. Mode switch (chat↔audio) resets scroll via extraData + scrollToEnd.
- Voice loader UI: track kokoroActiveVoiceId in the store, derive isChangingVoice in UI components from the settings vs active mismatch.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
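The retry loop mentioned for the "model is currently generating" case can be sketched generically. Note this simplified version retries on any failure, whereas the real fix presumably checks the specific error code (104); `speakFn` stands in for the actual executorch call:

```typescript
// Sketch: retry a speak call up to maxAttempts times with a short pause,
// for transient "engine still busy" failures.
export async function speakWithRetry(
  speakFn: () => Promise<void>,
  maxAttempts = 10,
  delayMs = 200,
): Promise<void> {
  for (let attempt = 1; ; attempt++) {
    try {
      await speakFn();
      return; // success
    } catch (err) {
      if (attempt >= maxAttempts) throw err; // give up after the last attempt
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
}
```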
…nto feat/tts-implementation
…tional Kokoro

- Audio mode now renders tool-call messages via ChatMessage (proper bubble + tool call UI) instead of dropping them as raw unstyled text. Plain assistant messages still render as AudioMessageBubble.
- Transcript ScrollView uses react-native-gesture-handler for reliable nested scrolling inside FlatList on Android. Moved the transcript outside the TouchableOpacity wrapper so it can capture scroll gestures.
- Action menu (long-press + 3-dot) added to both user and assistant audio bubbles: Copy + Resend for user, Copy + Regenerate for assistant.
- Kokoro TTS only loads in audio interface mode (App.tsx), saving RAM in chat mode.
- Post-stream ownership transfer: when all text was spoken by streaming chunks, transfers currentMessageId from 'streaming' to the real message ID so the AudioMessageBubble seekbar works.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When retrying a message while TTS is speaking, the audio bubble disappears but Kokoro continues playing natively. Now calls ttsStore.stop() before deleting messages in the retry handler. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Conditional mounting (audio mode only) caused Kokoro to not be ready during streaming — it takes ~10s to initialize, but fast models finish streaming before that. Streaming TTS chunks silently skipped because kokoroReady was false. Reverting to always-mounted so Kokoro is warm when streaming starts. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Streaming TTS chunks couldn't keep up with fast cloud models — Kokoro speaks slower than tokens arrive, causing a growing backlog of unspoken chunks, word skipping at transitions, and unpredictable playback. Replaced with a simpler approach: text streams normally as a ChatMessage, then when streaming ends the full response is spoken as a single TTS call with the real message ID. Clean, predictable, no word skipping. Also includes: stop in-flight TTS when new streaming begins, TTS stop on retry/resend, and text offset fix for post-stream remaining calc. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…aming ends" This reverts commit 6861c30.
Summary
Complete TTS audio mode implementation with Kokoro text-to-speech integration:
Test plan