🎤 Usage Guide

Running the App

Option 1: Background Service (Recommended)

cd voxterm
./toggle.sh     # Start or stop the service

This runs voice dictation as a background service. Run the script again to toggle it off.

Option 2: Foreground (for testing/debugging)

./run-streaming.sh     # Run in foreground, see live output

Press Ctrl+C to stop.

Monitoring Background Service

When running via ./toggle.sh, monitor the log file in real time:

tail -f /tmp/voice-dictation-streaming.log

The log shows:

Wake word detections ("computer" detected)
Transcription results
Command executions ("change mode", "send it", etc.)
Errors and debugging info

Press Ctrl+C to stop watching (service continues running).

Option 3: Direct Python Execution

source venv/bin/activate
python main_streaming.py

Custom Configuration

# Use a different wake word
python main_streaming.py --wake-word jarvis

# Use a different model
python main_streaming.py --model small

# Adjust sensitivity
python main_streaming.py --sensitivity 0.7
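
As a rough illustration of how these three flags could be parsed, here is a minimal sketch using Python's standard argparse module. It is not taken from main_streaming.py: the defaults ("computer", "tiny", 0.5) and the 0.0 to 1.0 sensitivity range are assumptions made for the example, and the helper names are hypothetical.

```python
import argparse


def sensitivity(value: str) -> float:
    """Parse a sensitivity value; the 0.0-1.0 range is an assumption."""
    number = float(value)
    if not 0.0 <= number <= 1.0:
        raise argparse.ArgumentTypeError("sensitivity must be between 0.0 and 1.0")
    return number


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the flags shown in this guide (defaults are assumed)."""
    parser = argparse.ArgumentParser(description="Voice dictation (illustrative sketch)")
    parser.add_argument("--wake-word", default="computer", help="word that starts dictation")
    parser.add_argument("--model", default="tiny", help="transcription model size")
    parser.add_argument("--sensitivity", type=sensitivity, default=0.5,
                        help="wake word sensitivity")
    return parser


# Equivalent to: python main_streaming.py --wake-word jarvis --sensitivity 0.7
args = build_parser().parse_args(["--wake-word", "jarvis", "--sensitivity", "0.7"])
print(args.wake_word, args.model, args.sensitivity)
```

Flags that are not passed fall back to their defaults, so the options can be combined freely on one command line.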

How It Works

1. Start the app: run ./toggle.sh
2. Wait for the prompt: you'll see "🎧 Say 'computer' to start dictation" in the log
3. Say the wake word: say "computer" clearly
4. Speak: say your command or text
5. Pause: wait briefly when done
6. Watch it type: your text appears in the terminal!
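
The steps above amount to a small two-state loop: wait for the wake word, then treat the next utterance as text or a command. The sketch below models that loop on already-transcribed strings. It is an assumption-laden simplification, not the app's actual code: audio capture, transcription, and key injection are left out, and State and run_session are hypothetical names.

```python
from enum import Enum, auto
from typing import Iterable, List


class State(Enum):
    WAITING_FOR_WAKE_WORD = auto()
    LISTENING = auto()


def run_session(utterances: Iterable[str], wake_word: str = "computer") -> List[str]:
    """Return the utterances that would be typed or executed, in order."""
    state = State.WAITING_FOR_WAKE_WORD
    handled: List[str] = []
    for utterance in utterances:
        phrase = utterance.strip()
        if state is State.WAITING_FOR_WAKE_WORD:
            # Everything is ignored until the wake word is heard.
            if phrase.lower() == wake_word:
                state = State.LISTENING
        else:
            # One utterance is handled per wake word, then the loop re-arms.
            handled.append(phrase)
            state = State.WAITING_FOR_WAKE_WORD
    return handled


print(run_session(["computer", "echo hello world", "computer", "send it"]))
```

Speech that does not follow the wake word is dropped in this model, which is why every "You:" line in the examples is preceded by "computer".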

Voice Commands

🔤 Typing Text

Just speak normally:

You:      "computer"
You:      "echo hello world"
Terminal: echo hello world█

🎯 Claude Mode Toggle

"change mode"         → Cycle through Claude's plan/edit/default modes
"change mode twice"   → Cycle through modes twice
"change mode three times" → Cycle three times

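One way the repeat count in these phrases could be interpreted is sketched below. This is an illustration only: the word lists and the function name change_mode_presses are hypothetical, and the guide does not say how the app actually parses the phrase or which key it sends.

```python
import re
from typing import Optional

# Spoken counts this sketch understands (an assumed, deliberately small list).
COUNT_WORDS = {"once": 1, "twice": 2, "one": 1, "two": 2, "three": 3, "four": 4, "five": 5}


def change_mode_presses(phrase: str) -> Optional[int]:
    """Return how many times to cycle modes, or None if the phrase is not a mode command."""
    text = phrase.strip().lower()
    if text == "change mode":
        return 1
    match = re.fullmatch(r"change mode (\w+)(?: times?)?", text)
    if not match:
        return None
    word = match.group(1)
    if word in COUNT_WORDS:
        return COUNT_WORDS[word]
    if word.isdigit() and int(word) > 0:
        return int(word)
    return None


for spoken in ("change mode", "change mode twice", "change mode three times"):
    print(spoken, "->", change_mode_presses(spoken))
```

Returning None for anything unrecognized lets the caller fall back to typing the phrase as ordinary text.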
📤 Text Submission

"send it"             → Submit the current input (press Enter)
"submit"              → Submit the current input (press Enter)

⬅️➡️ Cursor Navigation

"move left"           → Move left 1 position
"move left 5"         → Move left 5 positions
"move right 3"        → Move right 3 positions
"move to start"       → Jump to start of line
"move to end"         → Jump to end of line
"beginning"           → Jump to start of line

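The navigation phrases above map naturally onto repeated key presses. The following sketch shows one possible mapping; the key names ("Left", "Right", "Home", "End") and the function navigation_keys are illustrative assumptions, and it also assumes the transcriber produces digits ("5") rather than number words ("five").

```python
import re
from typing import List, Optional


def navigation_keys(phrase: str) -> Optional[List[str]]:
    """Translate a navigation phrase into key names, or None if it is not one."""
    text = phrase.strip().lower()
    if text in ("move to start", "beginning"):
        return ["Home"]
    if text == "move to end":
        return ["End"]
    match = re.fullmatch(r"move (left|right)(?: (\d+))?", text)
    if match:
        key = "Left" if match.group(1) == "left" else "Right"
        count = int(match.group(2) or 1)  # no number spoken means one position
        return [key] * count
    return None


print(navigation_keys("move left 5"))
print(navigation_keys("beginning"))
```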
✂️ Editing

"delete word"         → Delete previous word
"delete line"         → Clear entire line
"delete"              → Delete 1 character
"delete 3"            → Delete 3 characters
"backspace"           → Delete 1 character

Real-World Examples

Example 1: Simple Text Entry

Terminal: $

You:      "computer"
You:      "echo hello world"
Terminal: $ echo hello world█

You:      "computer"
You:      "send it"
Terminal: $ echo hello world
          hello world
          $█

Example 2: Editing Text

You:      "computer"
You:      "git status"
Terminal: $ git status█

You:      "computer"
You:      "delete word"
Terminal: $ git █

You:      "computer"
You:      "add all"
Terminal: $ git add all█
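
To make the effect of the editing commands concrete, the sketch below replays Example 2 on a plain string that stands in for the input line. It assumes the cursor is always at the end of the line, that "delete" and "backspace" both remove the character before the cursor, and that "delete word" keeps the space before the removed word (as the example output suggests). The function apply_phrase is hypothetical, not part of the app.

```python
def apply_phrase(line: str, phrase: str) -> str:
    """Apply one spoken phrase to the input line and return the new line."""
    text = phrase.strip().lower()
    if text == "delete line":
        return ""
    if text == "delete word":
        stripped = line.rstrip()
        cut = stripped.rfind(" ")  # -1 when the line holds a single word
        return stripped[: cut + 1]
    if text in ("delete", "backspace"):
        return line[:-1]
    parts = text.split()
    if len(parts) == 2 and parts[0] == "delete" and parts[1].isdigit():
        count = int(parts[1])
        return line[:-count] if count else line
    # Anything that is not an editing command is typed as-is.
    return line + phrase


line = ""
for spoken in ("git status", "delete word", "add all"):
    line = apply_phrase(line, spoken)
    print(repr(line))
```

The final line is "git add all", matching the terminal output in Example 2.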

Example 3: Using Change Mode

You:      "computer"
You:      "change mode"
Terminal: [Cycles through Claude's modes]

You:      "computer"
You:      "help me write a script"
Terminal: $ help me write a script█

You:      "computer"
You:      "send it"
Terminal: [Command submitted to Claude]

Tips & Tricks

🎯 Better Recognition

Speak clearly at normal volume
Speak naturally - just say the text you want typed
Pause briefly between commands
Use "change mode" to switch Claude modes
Use "send it" to submit input

🏃 Faster Transcription

python main_streaming.py --model tiny    # Use tiny model for speed

🎯 Better Accuracy

python main_streaming.py --model small   # Use small model for accuracy

🔊 More Sensitive Wake Word

python main_streaming.py --sensitivity 0.7

🔇 Less Sensitive Wake Word

python main_streaming.py --sensitivity 0.3

🎭 Different Wake Word

python main_streaming.py --wake-word jarvis

Available wake words: computer, jarvis, alexa, hey google, ok google, porcupine, bumblebee, terminator

Testing Components

Before running the full app, test individual components:

Test Microphone

cd voxterm
source venv/bin/activate
python test_mic.py

Speaks a test sound and shows audio levels. Press Ctrl+C to stop.

Test Wake Word Detection

cd voxterm
source venv/bin/activate
python wake_word_detector.py
# Say "computer" to test
# Press Ctrl+C to stop

Test Audio Recording

cd voxterm
source venv/bin/activate
python audio_recorder.py
# Speak after the prompt
# Your audio is saved to /tmp/test_recording.wav
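
If you want to confirm that the recording is not empty before moving on, a short check with Python's standard wave module can report its basic properties. This sketch assumes the file is an uncompressed PCM WAV; wav_summary is a hypothetical helper, not something shipped with the app.

```python
import wave


def wav_summary(path: str) -> dict:
    """Return channel count, sample rate, sample width, and duration of a WAV file."""
    with wave.open(path, "rb") as wav_file:
        frames = wav_file.getnframes()
        rate = wav_file.getframerate()
        return {
            "channels": wav_file.getnchannels(),
            "sample_rate_hz": rate,
            "sample_width_bytes": wav_file.getsampwidth(),
            "duration_s": frames / rate,
        }
```

Calling wav_summary("/tmp/test_recording.wav") after a test recording should show a duration greater than zero; a duration of zero suggests the microphone did not capture anything.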

Test Transcription

cd voxterm
source venv/bin/activate
python audio_recorder.py                        # Record a sample to /tmp/test_recording.wav
python transcriber.py /tmp/test_recording.wav   # Transcribe the recording

Test Keyboard Input

cd voxterm
source venv/bin/activate
python input_injector.py
# Focus your terminal within 3 seconds
# Watch it type test commands

Troubleshooting

Wake word not detected

Speak louder and more clearly
Try: python main_streaming.py --sensitivity 0.7
Check microphone with: python test_mic.py

Transcription is slow

Use a faster model: python main_streaming.py --model tiny
Close other applications
Check CPU usage

Text not appearing

Make sure the terminal has focus
Try clicking in the terminal before speaking
Test with: python input_injector.py

Wrong text transcribed

Speak more slowly and clearly
Use a more accurate model: python main_streaming.py --model small
Reduce background noise
Move closer to the microphone

Service not running

Check status: ./status.sh
Check logs: tail -f /tmp/voice-dictation-streaming.log
Restart: run ./toggle.sh once to stop the service, then run it again to start it

Stopping the App

Background service: ./toggle.sh (toggles off)
Foreground mode: Press Ctrl+C in the terminal running the app

Getting Help

Read README.md for full documentation
Check QUICKSTART.md for installation help
Test components individually to isolate issues

Enjoy hands-free terminal control! 🎤✨
