- Linux (Fedora / RHEL or Debian / Ubuntu)
- `bash` and `curl` — everything else is installed automatically
- An OpenAI-compatible LLM endpoint
- (Strongly recommended) A local SearXNG instance — Wintermute's `search_web` tool queries SearXNG for web search. Without it, searches fall back to DuckDuckGo's limited Instant Answer API. SearXNG is lightweight, privacy-respecting, and easy to deploy via Docker.
- (Recommended) A dedicated Matrix account for the bot
- (Optional) signal-cli (Java) — for Signal messenger integration. See Signal Setup for details.
Clone the repository and run the AI-driven onboarding script:
```shell
git clone https://git.mikoshi.de/overcuriousity/wintermute.git wintermute
cd wintermute
bash onboarding.sh
```

The script works in two phases:
Phase 1 (bash): Installs system dependencies (Python 3.12+, curl, uv, build tools, libolm, ffmpeg), runs uv sync, and asks for your primary LLM endpoint (URL, model, API key). It validates that the endpoint is reachable and supports function calling.
Phase 2 (AI-driven): Hands off to an AI configuration assistant powered by your own LLM. The assistant walks you through every config.yaml section conversationally:
- Inference backends and LLM role mapping
- Web interface settings
- Matrix integration (with live credential testing and test message delivery)
- Whisper voice transcription
- Convergence Protocol validators
- NL Translation
- Tasks, dreaming, memory harvest, scheduler, logging
- Systemd service installation
The AI gives recommendations, explains trade-offs, and runs in-flight validation (probing endpoints, testing Matrix login, triggering OAuth flows for Gemini/Kimi). Config values are written incrementally, so partial progress is preserved if you abort.
Experimental: The AI-driven onboarding requires a model with function-calling support (e.g. Qwen 2.5, Llama 3.1+, GPT-4, Gemini). The script tests this before handoff and warns if unsupported.
```shell
bash onboarding.sh --help     # Show all options
bash onboarding.sh --dry-run  # Show install plan without making changes
```
| Test | When |
|---|---|
| LLM endpoint reachability | After you provide the URL |
| Function-calling capability | Before handoff to AI assistant |
| Matrix homeserver reachability | When configuring Matrix |
| Matrix credential validation | Login test + immediate logout |
| Matrix message delivery | Optional test message to a room |
| Gemini OAuth | If gemini-cli provider is selected |
| Kimi-Code device-code auth | If kimi-code provider is selected |
The previous programmatic setup script is retained as setup.sh. Use it if your model doesn't support function calling or you prefer a non-AI workflow:
```shell
bash setup.sh
```

It walks through 5 stages (dependencies, Python environment, configuration, systemd, diagnostics) with traditional menu-driven prompts. See `bash setup.sh --help` for options (`--no-matrix`, `--no-systemd`, `--dry-run`).
When you choose to enable Matrix (via either script), the bot's password is collected — not a token. On first start, Wintermute logs in automatically, creates a device, sets up E2E encryption, cross-signs the device, and saves a recovery key to data/matrix_recovery.key. No manual curl commands or token pasting required.
Some homeservers (e.g. matrix.org) require a one-time browser approval for cross-signing on first start — Wintermute logs the exact URL. After that, everything is fully automatic, including token refresh on expiry.
See matrix-setup.md for details on E2E encryption, SAS verification, and troubleshooting.
```shell
git clone https://git.mikoshi.de/overcuriousity/wintermute.git wintermute
cd wintermute
```

E2E encryption requires libolm headers:
```shell
# Fedora / RHEL
sudo dnf install -y gcc gcc-c++ cmake make libolm-devel python3-devel

# Debian / Ubuntu
sudo apt-get install -y build-essential cmake libolm-dev python3-dev
```

```shell
# Install uv if you don't have it
curl -LsSf https://astral.sh/uv/install.sh | sh

# Create venv and install all dependencies
uv sync
```

```shell
cp config.yaml.example config.yaml
```

Open `config.yaml` and fill in at minimum the `inference_backends` and `llm` sections:
```yaml
inference_backends:
  - name: "main"
    provider: "openai"
    base_url: "https://api.openai.com/v1"  # or your local endpoint
    api_key: "sk-..."
    model: "gpt-4o"
    context_size: 128000
    max_tokens: 4096

llm:
  base: ["main"]
```

For Matrix, supply the bot's password — Wintermute handles login and device creation automatically:
```yaml
matrix:
  homeserver: https://matrix.org
  user_id: "@wintermute:matrix.org"
  password: "bot-account-password"
  access_token: ""  # auto-filled on first start
  device_id: ""     # auto-filled on first start
  allowed_users:
    - "@you:matrix.org"
```

See configuration.md for all options.
```shell
uv run wintermute
```

The web interface starts at http://127.0.0.1:8080 by default.
Uses Anthropic's native Messages API with prompt caching. Requires a paid API key from console.anthropic.com — pay-per-token billing. Claude Pro/Max subscriptions do not include API access.
Run `bash setup.sh` and select option 8) Anthropic when prompted. The script will:

- Ask for your API key (`sk-ant-api03-...`)
- Prompt for a model (default: `claude-sonnet-4-20250514`)
- Write `config.yaml` with `provider: "anthropic"`
```yaml
inference_backends:
  - name: "claude"
    provider: "anthropic"
    api_key: "sk-ant-api03-..."
    model: "claude-sonnet-4-20250514"
    context_size: 200000
    max_tokens: 8192

llm:
  base: ["claude"]
```

Available models: `claude-sonnet-4-20250514` (recommended), `claude-opus-4-20250514`, and `claude-haiku-4-20250414` (good for background tasks — compaction, dreaming, validation).
Alpha: The Gemini Cloud Code Assist integration is experimental. Known limitations include aggressive rate limiting from Google's API and occasional tool-call parsing issues. For production use, an OpenAI-compatible endpoint (llama-server, vLLM, OpenAI, etc.) is recommended.
Unstable / Alpha — The `gemini-cli` provider piggybacks on Google's Cloud Code Assist OAuth flow. Credentials may expire unpredictably and the upstream API surface may change without notice. Suitable for experimentation; not recommended as your only backend.
Wintermute can use Google's Gemini models for free via the Cloud Code Assist API,
using credentials from a locally-installed gemini-cli.
- Node.js and npm (for installing gemini-cli)
Run `bash onboarding.sh` and select option 6) Gemini (via gemini-cli) when prompted for the inference substrate. The script will:

- Check for (or install) gemini-cli
- Prompt for a model (default: `gemini-2.5-pro`)
- Run the OAuth flow (opens your browser for Google sign-in)
- Write `config.yaml` with `provider: "gemini-cli"`
```shell
# 1. Install gemini-cli
npm install -g @google/gemini-cli

# 2. Run the OAuth setup
uv run python -m wintermute.gemini_auth

# 3. Configure config.yaml
```

```yaml
inference_backends:
  - name: "gemini"
    provider: "gemini-cli"
    model: "gemini-2.5-pro"
    context_size: 1048576
    max_tokens: 8192

llm:
  base: ["gemini"]
```

Available models: `gemini-2.5-pro`, `gemini-2.5-flash`, `gemini-3-pro-preview`, `gemini-3-flash-preview`.
Credentials are saved to data/gemini_credentials.json and refreshed automatically.
On headless systems (no display server), the OAuth setup detects the missing DISPLAY/WAYLAND_DISPLAY and switches to a manual flow:
- Run `uv run python -m wintermute.gemini_auth`
- The script prints a Google OAuth URL — copy it and open it in a browser on any machine
- Sign in with your Google account and authorize access
- Your browser will redirect to `http://localhost:8085/...`, which won't load on a headless system — that's expected
- Copy the full redirect URL from your browser's address bar (including the `?code=` parameter)
- Paste it back into the terminal prompt
If credentials expire or become invalid while running, Wintermute shows a message in the chat suggesting to re-run the auth setup. Re-run `uv run python -m wintermute.gemini_auth` and restart the service.
When running as a systemd service, NVM paths are not in PATH by default (systemd uses a minimal environment). Wintermute automatically probes common installation paths at startup:
- `~/.nvm/versions/node/*/bin/gemini` (NVM — default)
- `~/.local/share/nvm/versions/node/*/bin/gemini` (NVM — XDG)
- `~/.volta/bin/gemini` (Volta)
- `~/.local/bin/gemini` (pipx / manual)
- `/usr/local/bin/gemini` (system-wide npm)
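The probing idea can be approximated in a few lines of shell. This is a sketch of the concept, not Wintermute's actual implementation — the function name and the home-directory argument are illustrative:

```shell
# Sketch of the startup probe: print the first executable `gemini` found
# under the given home directory, checking candidates in the order above.
find_gemini() {
  home="$1"
  for candidate in \
      "$home"/.nvm/versions/node/*/bin/gemini \
      "$home"/.local/share/nvm/versions/node/*/bin/gemini \
      "$home"/.volta/bin/gemini \
      "$home"/.local/bin/gemini \
      /usr/local/bin/gemini; do
    if [ -x "$candidate" ]; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  return 1  # nothing found
}
```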
If your Node installation is in an unusual location, add the node bin directory to the service's environment:
```ini
[Service]
Environment=PATH=/path/to/node/bin:%h/.local/bin:/usr/local/bin:/usr/bin
```

Wintermute supports Kimi-Code as an inference backend. Kimi-Code provides an OpenAI-compatible endpoint via a flat-rate subscription, authenticated with an OAuth device-code flow.
Run `bash onboarding.sh` and select option 7) Kimi-Code when prompted. The script will:

- Prompt for a model (default: `kimi-for-coding`)
- Run the device-code auth flow (prints a URL — open it in any browser)
- Write `config.yaml` with `provider: "kimi-code"`
```shell
# 1. Run device-code auth
uv run python -m wintermute.kimi_auth

# 2. Configure config.yaml
```

```yaml
inference_backends:
  - name: "kimi"
    provider: "kimi-code"
    model: "kimi-for-coding"
    context_size: 131072
    max_tokens: 8192

llm:
  base: ["kimi"]
```

Available models include `kimi-for-coding` (default), `kimi-code`, and `kimi-k2.5`
(supports reasoning — set reasoning: true). The full list is dynamic; see
configuration.md for details.
Credentials are stored in data/kimi_credentials.json and tokens are refreshed
automatically. If credentials are missing on startup, Wintermute auto-triggers the
device flow and broadcasts the verification URL to connected interfaces.
You can also authenticate manually via the /kimi-auth command in Matrix or the web UI.
By default, Wintermute uses local_vector for semantic memory search. This requires only an OpenAI-compatible embeddings endpoint (no external database).
- `local_vector` — SQLite + numpy. No external services beyond an embeddings endpoint. Recommended for most deployments.
- `qdrant` — Qdrant vector database. Only needed if you want a dedicated vector DB for larger-scale deployments.
To run Qdrant locally via Docker:
```shell
docker run -d --name qdrant -p 6333:6333 qdrant/qdrant
```

See the Qdrant documentation for more options.
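Once the container is up, a quick liveness check is possible against Qdrant's `/healthz` route (the port matches the docker command above):

```shell
# Probe the local Qdrant instance; prints a fallback message if it isn't up.
curl -s http://localhost:6333/healthz || echo "qdrant not reachable"
```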
An embeddings endpoint is required — without it, Wintermute will refuse to start. Any OpenAI-compatible /v1/embeddings endpoint works (local: text-embeddings-inference, llama.cpp, Infinity; cloud: OpenAI, Together, Fireworks).
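You can smoke-test your embeddings endpoint before starting Wintermute. This is a hedged sketch: the URL, port, environment variable, and model name below are placeholders for whatever your deployment uses:

```shell
# Build an OpenAI-style /v1/embeddings request body (model/input are examples).
embed_payload() {
  printf '{"model": "%s", "input": ["%s"]}' "$1" "$2"
}

# Placeholder endpoint and key — substitute your own values.
curl -s -X POST "http://127.0.0.1:8081/v1/embeddings" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $EMBEDDINGS_API_KEY" \
  -d "$(embed_payload text-embedding-example 'hello world')" \
  || echo "embeddings endpoint not reachable"
```

A usable endpoint answers with JSON containing an embedding vector per input string.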
Migration note: The legacy `fts5` (SQLite keyword search) backend has been removed. If upgrading from an older version, update your `config.yaml`: set `memory.backend` to `"local_vector"` and configure `memory.embeddings.endpoint`.
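Sketched out, a migrated section might look like the fragment below. The endpoint URL is a placeholder and the exact schema lives in configuration.md; only the key names (`memory.backend`, `memory.embeddings.endpoint`) come from the note above:

```yaml
memory:
  backend: "local_vector"                  # replaces the removed fts5 backend
  embeddings:
    endpoint: "http://127.0.0.1:8081/v1"   # placeholder — your embeddings endpoint
```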
Wintermute injects the current local time into every system prompt so the LLM has accurate time awareness. This relies on the scheduler.timezone setting in config.yaml:
```yaml
scheduler:
  timezone: "Europe/Berlin"  # Your local timezone
```

If running in a container, ensure the container's system clock is accurate (e.g. via NTP). The timezone does not need to match the host's `/etc/localtime` — Wintermute uses the configured timezone from `config.yaml` regardless of the system timezone.
Both onboarding.sh and setup.sh install a systemd service automatically. The scripts probe the user D-Bus session bus at runtime and choose the appropriate mode:
| Environment | Mode installed | Requires sudo |
|---|---|---|
| Standard Linux (desktop / VPS) | User service (`~/.config/systemd/user/`) | No |
| LXC container, FreeIPA/SSSD, minimal SSH | System service (`/etc/systemd/system/`) with `User=` | Yes (once) |
Lingering is enabled automatically via loginctl enable-linger so the service starts at boot without a login session.
Manual setup:
```shell
mkdir -p ~/.config/systemd/user
cat > ~/.config/systemd/user/wintermute.service <<EOF
[Unit]
Description=Wintermute AI Assistant
After=network-online.target
Wants=network-online.target

[Service]
Type=simple
WorkingDirectory=/path/to/wintermute
ExecStart=/path/to/uv run wintermute
Restart=on-failure
RestartSec=15
StandardOutput=journal
StandardError=journal

[Install]
WantedBy=default.target
EOF

loginctl enable-linger $USER
systemctl --user daemon-reload
systemctl --user enable --now wintermute
```

Control:

```shell
systemctl --user start wintermute
systemctl --user stop wintermute
systemctl --user restart wintermute
journalctl --user -u wintermute -f
```

In environments where the systemd user D-Bus session bus is unavailable — typically LXC containers managed by FreeIPA/SSSD, or other constrained environments — `systemctl --user` fails with:
```
Failed to connect to user scope bus via local transport: No such file or directory
```
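The condition behind that error can be checked by hand: the user session bus is a socket under `XDG_RUNTIME_DIR`. This probe mirrors the idea (the scripts' exact detection logic may differ):

```shell
# If the user-scope D-Bus socket exists, a user service will work;
# otherwise fall back to a system service with User=.
bus="${XDG_RUNTIME_DIR:-/run/user/$(id -u)}/bus"
if [ -S "$bus" ]; then
  mode="user service"
else
  mode="system service"
fi
echo "install mode: $mode"
```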
The setup and onboarding scripts detect this automatically and fall back to a system service. Manual setup (requires sudo):
```shell
sudo tee /etc/systemd/system/wintermute.service <<EOF
[Unit]
Description=Wintermute AI Assistant
After=network-online.target
Wants=network-online.target

[Service]
Type=simple
User=YOUR_USERNAME
WorkingDirectory=/path/to/wintermute
ExecStart=/path/to/uv run wintermute
Restart=on-failure
RestartSec=15
StandardOutput=journal
StandardError=journal

[Install]
WantedBy=multi-user.target
EOF

sudo systemctl daemon-reload
sudo systemctl enable --now wintermute
```

Control:

```shell
sudo systemctl start wintermute
sudo systemctl stop wintermute
sudo systemctl restart wintermute
journalctl -u wintermute -f
```