Commit f32804d

Merge pull request #9 from scalytics/modelsupport
Multi-provider LLM support with middleware, CLI, and FinOps
2 parents 5cb7dfe + 76bd408 commit f32804d

56 files changed, +7649 -678 lines changed

Makefile

Lines changed: 221 additions & 147 deletions
Large diffs are not rendered by default.

_tasks/provider-support.md

Lines changed: 606 additions & 0 deletions
Large diffs are not rendered by default.

docs/operations-admin/admin-guide.md

Lines changed: 73 additions & 25 deletions
@@ -45,19 +45,27 @@ Configuration values are resolved in this precedence (highest wins):

 ```go
 type Config struct {
-    Agents AgentsConfig `json:"agents"`
-    Channels ChannelsConfig `json:"channels"`
-    Providers ProvidersConfig `json:"providers"`
-    Gateway GatewayConfig `json:"gateway"`
-    Tools ToolsConfig `json:"tools"`
-    Group GroupConfig `json:"group"`
-    Orchestrator OrchestratorConfig `json:"orchestrator"`
-    Scheduler SchedulerConfig `json:"scheduler"`
-    ER1 ER1Config `json:"er1"`
-    Observer ObserverConfig `json:"observer"`
+    Paths PathsConfig `json:"paths"`
+    Model ModelConfig `json:"model"`
+    Agents AgentsConfig `json:"agents"`
+    Channels ChannelsConfig `json:"channels"`
+    Providers ProvidersConfig `json:"providers"`
+    Gateway GatewayConfig `json:"gateway"`
+    Tools ToolsConfig `json:"tools"`
+    Group GroupConfig `json:"group"`
+    Orchestrator OrchestratorConfig `json:"orchestrator"`
+    Scheduler SchedulerConfig `json:"scheduler"`
+    ER1 ER1IntegrationConfig `json:"er1"`
+    Observer ObserverMemoryConfig `json:"observer"`
+    ContentClassification ContentClassificationConfig `json:"contentClassification"`
+    PromptGuard PromptGuardConfig `json:"promptGuard"`
+    OutputSanitization OutputSanitizationConfig `json:"outputSanitization"`
+    FinOps FinOpsConfig `json:"finops"`
 }
 ```

+New sections added in this release: `Model`, `Paths`, `ContentClassification`, `PromptGuard`, `OutputSanitization`, `FinOps`. See [Configuration Keys](../reference/config-keys/) for details.
+
 ### Agent Configuration

 | Field | Default | Env Var | Description |
@@ -354,7 +362,7 @@ Isolation guarantees:

 ### Provider Architecture

-All providers use the OpenAI-compatible API format via a single `OpenAIProvider` implementation.
+KafClaw supports 11 LLM providers through a unified `LLMProvider` interface. Most use the OpenAI-compatible API format. Providers are identified by canonical IDs and selected via model strings in the format `provider-id/model-name`.

 ```go
 type LLMProvider interface {
@@ -363,26 +371,66 @@ type LLMProvider interface {
     Speak(ctx, *TTSRequest) (*TTSResponse, error)
     DefaultModel() string
 }
+```

-type Embedder interface {
-    Embed(ctx, *EmbeddingRequest) (*EmbeddingResponse, error)
-}
+### Supported Providers
+
+| Provider ID | Auth | Default Base |
+|---|---|---|
+| `claude` | API key | `https://api.anthropic.com/v1` |
+| `openai` | API key | _(configured)_ |
+| `gemini` | API key | Google AI Studio |
+| `gemini-cli` | OAuth | _(via Gemini CLI)_ |
+| `openai-codex` | OAuth | _(via Codex CLI)_ |
+| `xai` | API key | `https://api.x.ai/v1` |
+| `scalytics-copilot` | API key + base | _(configured)_ |
+| `openrouter` | API key | `https://openrouter.ai/api/v1` |
+| `deepseek` | API key | `https://api.deepseek.com/v1` |
+| `groq` | API key | `https://api.groq.com/openai/v1` |
+| `vllm` | optional key + base | _(configured)_ |
+
+For full provider setup, see [LLM Providers Reference](../reference/providers/).
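Review note: the `provider-id/model-name` format and the provider table in this hunk suggest a simple parsing step. The sketch below is one way to do it, not the code from this commit. `ParseModelString` and `knownProviders` are hypothetical names, and splitting on the first `/` only (so that a model name may itself contain slashes) is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// knownProviders mirrors the 11 provider IDs listed in the admin guide table.
var knownProviders = map[string]bool{
	"claude": true, "openai": true, "gemini": true, "gemini-cli": true,
	"openai-codex": true, "xai": true, "scalytics-copilot": true,
	"openrouter": true, "deepseek": true, "groq": true, "vllm": true,
}

// ParseModelString (hypothetical helper) splits "provider-id/model-name".
// Assumption: only the first "/" separates provider from model.
func ParseModelString(s string) (string, string, error) {
	provider, model, ok := strings.Cut(s, "/")
	if !ok || provider == "" || model == "" {
		return "", "", fmt.Errorf("model string %q is not in provider-id/model-name form", s)
	}
	if !knownProviders[provider] {
		return "", "", fmt.Errorf("unknown provider ID %q", provider)
	}
	return provider, model, nil
}

func main() {
	inputs := []string{
		"claude/claude-sonnet-4-5",
		"openrouter/anthropic/claude-sonnet-4-5", // illustrative nested name
		"gpt-4o", // no provider prefix, rejected
	}
	for _, s := range inputs {
		provider, model, err := ParseModelString(s)
		fmt.Printf("provider=%q model=%q err=%v\n", provider, model, err)
	}
}
```

Rejecting unknown IDs early keeps a typo in `model.name` from surfacing later as an opaque provider error.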
+
+### Provider Resolution Order
+
+1. Per-agent model (`agents.list[].model.primary`)
+2. Task-type routing (`model.taskRouting[category]`)
+3. Global model (`model.name`)
+4. Legacy OpenAI fallback
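Review note: the four-step resolution order above can be read as a first-match lookup. The following is a minimal sketch under stated assumptions, not the implementation in this commit. `ResolveModel`, the trimmed `ModelConfig` type, and the `legacyFallback` value are all hypothetical; the docs do not say which model the legacy OpenAI fallback selects.

```go
package main

import "fmt"

// ModelConfig is a trimmed, hypothetical view of the "model" config section.
type ModelConfig struct {
	Name        string            // model.name
	TaskRouting map[string]string // model.taskRouting[category]
}

// legacyFallback stands in for step 4; the concrete value is an assumption.
const legacyFallback = "openai/gpt-4o"

// ResolveModel returns the first non-empty candidate in the documented order.
func ResolveModel(agentPrimary, category string, m ModelConfig) string {
	if agentPrimary != "" { // 1. per-agent model
		return agentPrimary
	}
	if routed := m.TaskRouting[category]; routed != "" { // 2. task-type routing
		return routed
	}
	if m.Name != "" { // 3. global model
		return m.Name
	}
	return legacyFallback // 4. legacy OpenAI fallback
}

func main() {
	cfg := ModelConfig{
		Name:        "claude/claude-sonnet-4-5",
		TaskRouting: map[string]string{"security": "claude/claude-opus-4-6"},
	}
	fmt.Println(ResolveModel("groq/llama-3.3-70b", "security", cfg)) // agent wins
	fmt.Println(ResolveModel("", "security", cfg))                   // task routing
	fmt.Println(ResolveModel("", "creative", cfg))                   // global model
	fmt.Println(ResolveModel("", "creative", ModelConfig{}))         // legacy fallback
}
```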
+
+### Managing Credentials
+
+```bash
+# API key providers
+kafclaw models auth set-key --provider claude --key sk-ant-...
+
+# OAuth providers (Gemini, Codex)
+kafclaw models auth login --provider gemini
 ```

-### Capabilities
+See [Models CLI Reference](../reference/models-cli/) for all auth commands.
+
+### Middleware Chain
+
+A configurable middleware chain runs between the agent loop and the LLM provider:

-| Capability | Endpoint | Default Model |
-|------------|----------|---------------|
-| Chat completion | `/chat/completions` | `anthropic/claude-sonnet-4-5` |
-| Audio transcription | `/audio/transcriptions` | `whisper-1` |
-| Text-to-speech | `/audio/speech` | `tts-1` (voice: nova, format: opus) |
-| Embeddings | `/embeddings` | `text-embedding-3-small` |
+- **Content Classifier** — sensitivity tagging and model rerouting
+- **Prompt Guard** — PII/secret scanning (warn, redact, or block)
+- **Output Sanitizer** — response redaction and deny pattern filtering
+- **FinOps Recorder** — per-request cost calculation and budget warnings

-### API Key Fallback Chain
+See [Chat Middleware Reference](../reference/middleware/) for configuration.
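Review note: a chain that sits between the agent loop and the provider is commonly built as function wrappers. The sketch below illustrates that general pattern only. Every type and function name in it is hypothetical, the two toy stages use hard-coded markers instead of real PII or deny-pattern rules, and the actual middleware interfaces in this commit may look different.

```go
package main

import (
	"fmt"
	"strings"
)

// Request and Response are simplified stand-ins for chat payloads.
type Request struct{ Prompt string }
type Response struct{ Text string }

// Handler is the call into the provider; Middleware wraps a Handler.
type Handler func(Request) (Response, error)
type Middleware func(Handler) Handler

// Chain wraps h so that the first middleware listed runs outermost.
func Chain(h Handler, mws ...Middleware) Handler {
	for i := len(mws) - 1; i >= 0; i-- {
		h = mws[i](h)
	}
	return h
}

// promptGuard is a toy request-side stage in "redact" mode.
func promptGuard(next Handler) Handler {
	return func(r Request) (Response, error) {
		r.Prompt = strings.ReplaceAll(r.Prompt, "sk-secret", "[REDACTED]")
		return next(r)
	}
}

// outputSanitizer is a toy response-side stage.
func outputSanitizer(next Handler) Handler {
	return func(r Request) (Response, error) {
		resp, err := next(r)
		if err != nil {
			return resp, err
		}
		resp.Text = strings.ReplaceAll(resp.Text, "internal-host", "[REDACTED]")
		return resp, nil
	}
}

// echoProvider stands in for a real LLM provider call.
func echoProvider(r Request) (Response, error) {
	return Response{Text: "echo: " + r.Prompt + " via internal-host"}, nil
}

func main() {
	h := Chain(echoProvider, promptGuard, outputSanitizer)
	resp, err := h(Request{Prompt: "key is sk-secret"})
	fmt.Println(resp.Text, err) // prints "echo: key is [REDACTED] via [REDACTED] <nil>"
}
```

Request-side stages act before calling `next`, and response-side stages act after it returns, so one wrapper shape covers all four middleware roles.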

-1. `cfg.Providers.OpenAI.APIKey` (config or `KAFCLAW_OPENAI_API_KEY`)
-2. `OPENAI_API_KEY` environment variable
-3. `OPENROUTER_API_KEY` environment variable
+### Token & Cost Tracking
+
+Token usage and cost are tracked per request, per provider, per day in the timeline database.
+
+```bash
+kafclaw models stats # today's usage
+kafclaw models stats --days 7 # 7-day trend
+kafclaw status # includes provider info
+kafclaw doctor # warns on low rate limits
+```

 ---
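Review note: the "per request, per provider, per day" tracking described under Token & Cost Tracking can be pictured as a per-request cost calculation followed by aggregation on a (provider, day) key. This is a rough sketch with made-up prices and hypothetical names (`Usage`, `Price`, `Bucket`, `Aggregate`). It keeps totals in memory and does not model the timeline database.

```go
package main

import "fmt"

// Usage is one request's token counts.
type Usage struct {
	Provider         string
	Day              string // e.g. "2025-01-31"
	PromptTokens     int64
	CompletionTokens int64
}

// Price is in micro-dollars per one million tokens. Values used here are invented.
type Price struct{ InPerM, OutPerM int64 }

// CostMicroUSD computes tokens * price-per-million / 1e6 using integer math.
func CostMicroUSD(u Usage, p Price) int64 {
	return (u.PromptTokens*p.InPerM + u.CompletionTokens*p.OutPerM) / 1_000_000
}

// Bucket is the aggregation key: one total per provider per day.
type Bucket struct{ Provider, Day string }

// Aggregate sums per-request costs into per-provider, per-day totals.
func Aggregate(usages []Usage, prices map[string]Price) map[Bucket]int64 {
	totals := make(map[Bucket]int64)
	for _, u := range usages {
		totals[Bucket{u.Provider, u.Day}] += CostMicroUSD(u, prices[u.Provider])
	}
	return totals
}

func main() {
	prices := map[string]Price{"claude": {InPerM: 3_000_000, OutPerM: 15_000_000}}
	usages := []Usage{
		{"claude", "2025-01-31", 1000, 500},
		{"claude", "2025-01-31", 2000, 0},
	}
	for bucket, micro := range Aggregate(usages, prices) {
		fmt.Printf("%s %s: %d micro-USD\n", bucket.Provider, bucket.Day, micro)
	}
}
```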

docs/reference/cli-reference.md

Lines changed: 2 additions & 0 deletions
@@ -11,6 +11,7 @@ Primary command groups:
 - `kafclaw status` - runtime/config health snapshot
 - `kafclaw doctor` - diagnostics and setup checks
 - `kafclaw security` - security checks, deep audit, and safe remediation (`check|audit|fix`)
+- `kafclaw models` - manage LLM providers and models (`list|stats|auth login|auth set-key`)
 - `kafclaw config` / `kafclaw configure` - low-level and guided config changes
 - `kafclaw agent -m` - one-shot interaction
 - `kafclaw skills` - bundled/external skill lifecycle and auth/prereq flows (`enable|disable|list|status|enable-skill|disable-skill|verify|install|update|exec|prereq|auth`)
@@ -37,6 +38,7 @@ Detailed command examples:
 - [Getting Started](../start-here/getting-started/)
 - [User Manual - CLI Reference section](../start-here/user-manual/#3-cli-reference)
 - [Manage KafClaw](../operations-admin/manage-kafclaw/)
+- [Models CLI Reference](models-cli/) - provider management, auth, usage stats

 Skills execution example:
 - `kafclaw skills exec <skill-id> --input '{"text":"..."}'`

docs/reference/config-keys.md

Lines changed: 85 additions & 1 deletion
@@ -61,11 +61,92 @@ kafclaw status
 kafclaw doctor
 ```

+## Model Configuration
+
+```json
+{
+  "model": {
+    "name": "claude/claude-sonnet-4-5",
+    "maxTokens": 8192,
+    "temperature": 0.7,
+    "maxToolIterations": 20,
+    "taskRouting": {
+      "security": "claude/claude-opus-4-6",
+      "coding": "openai-codex/gpt-5.3-codex"
+    }
+  }
+}
+```
+
+| Key | Type | Description |
+|-----|------|-------------|
+| `model.name` | string | Global default model in `provider/model` format |
+| `model.maxTokens` | int | Max output tokens per LLM call |
+| `model.temperature` | float | Sampling temperature (0.0 - 1.0) |
+| `model.maxToolIterations` | int | Max tool-call rounds per request |
+| `model.taskRouting` | map | Category to model string overrides (`security`, `coding`, `tool-heavy`, `creative`) |
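Review note: the `model` keys in the table above map naturally onto a Go struct with JSON tags. The sketch below decodes the example and enforces the documented temperature range. The JSON keys come from the diff; the Go field names, the `LoadModelConfig` helper, and the decision to reject out-of-range temperatures (instead of clamping them) are assumptions.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ModelConfig field names are assumed; only the JSON keys are documented.
type ModelConfig struct {
	Name              string            `json:"name"`
	MaxTokens         int               `json:"maxTokens"`
	Temperature       float64           `json:"temperature"`
	MaxToolIterations int               `json:"maxToolIterations"`
	TaskRouting       map[string]string `json:"taskRouting"`
}

// Config holds only the "model" section for this illustration.
type Config struct {
	Model ModelConfig `json:"model"`
}

// LoadModelConfig (hypothetical) decodes the section and checks the 0.0 - 1.0 range.
func LoadModelConfig(raw []byte) (ModelConfig, error) {
	var cfg Config
	if err := json.Unmarshal(raw, &cfg); err != nil {
		return ModelConfig{}, fmt.Errorf("decode config: %w", err)
	}
	if t := cfg.Model.Temperature; t < 0.0 || t > 1.0 {
		return ModelConfig{}, fmt.Errorf("model.temperature %v is outside 0.0 - 1.0", t)
	}
	return cfg.Model, nil
}

func main() {
	raw := []byte(`{
	  "model": {
	    "name": "claude/claude-sonnet-4-5",
	    "maxTokens": 8192,
	    "temperature": 0.7,
	    "maxToolIterations": 20,
	    "taskRouting": {"security": "claude/claude-opus-4-6"}
	  }
	}`)
	m, err := LoadModelConfig(raw)
	fmt.Println(m.Name, m.MaxTokens, m.TaskRouting["security"], err)
}
```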
+
+## Provider Configuration
+
+```json
+{
+  "providers": {
+    "anthropic": { "apiKey": "sk-ant-...", "apiBase": "" },
+    "openai": { "apiKey": "sk-...", "apiBase": "" },
+    "gemini": { "apiKey": "AIza..." },
+    "xai": { "apiKey": "xai-..." },
+    "openrouter": { "apiKey": "sk-or-...", "apiBase": "https://openrouter.ai/api/v1" },
+    "deepseek": { "apiKey": "sk-...", "apiBase": "https://api.deepseek.com/v1" },
+    "groq": { "apiKey": "gsk_...", "apiBase": "https://api.groq.com/openai/v1" },
+    "vllm": { "apiKey": "", "apiBase": "http://localhost:8000/v1" },
+    "scalyticsCopilot": { "apiKey": "<token>", "apiBase": "https://copilot.scalytics.io/v1" }
+  }
+}
+```
+
+Each provider entry accepts `apiKey` and `apiBase`. See [LLM Providers](providers/) for details.
+
+## Per-Agent Model Configuration
+
+```json
+{
+  "agents": {
+    "list": [
+      {
+        "id": "main",
+        "model": {
+          "primary": "claude/claude-opus-4-6",
+          "fallbacks": ["openai/gpt-4o"]
+        },
+        "subagents": {
+          "model": "groq/llama-3.3-70b"
+        }
+      }
+    ]
+  }
+}
+```
+
+| Key | Type | Description |
+|-----|------|-------------|
+| `agents.list[].model.primary` | string | Primary model for this agent |
+| `agents.list[].model.fallbacks` | []string | Fallback models tried on transient errors |
+| `agents.list[].subagents.model` | string | Model for subagents spawned by this agent |
+
+## Middleware Configuration
+
+| Section | Reference |
+|---------|-----------|
+| `contentClassification` | [Content Classification](middleware/#content-classification) |
+| `promptGuard` | [Prompt Guard](middleware/#prompt-guard) |
+| `outputSanitization` | [Output Sanitizer](middleware/#output-sanitizer) |
+| `finops` | [FinOps Cost Attribution](middleware/#finops-cost-attribution) |
+
 ## Common Environment Variables

 - `OPENAI_API_KEY`
 - `OPENROUTER_API_KEY`
-- `KAFCLAW_AGENTS_MODEL`
+- `KAFCLAW_MODEL` — global model (e.g. `claude/claude-sonnet-4-5`)
 - `KAFCLAW_AGENTS_WORKSPACE`
 - `KAFCLAW_AGENTS_WORK_REPO_PATH`
 - `KAFCLAW_GATEWAY_HOST`
@@ -82,6 +163,9 @@ kafclaw doctor

 ## Related Docs

+- [LLM Providers](providers/)
+- [Models CLI](models-cli/)
+- [Chat Middleware](middleware/)
 - [Getting Started Guide](../start-here/getting-started/)
 - [KafClaw Administration Guide](../operations-admin/admin-guide/)
 - [Workspace Policy](../architecture-security/workspace-policy/)
