Multi-provider LLM support with middleware, CLI, and FinOps #9

Merged
novatechflow merged 32 commits into main from modelsupport
Feb 22, 2026

novatechflow commented Feb 22, 2026

Summary

Adds a complete multi-provider LLM layer to KafClaw, replacing the hardcoded single-provider path with a runtime-resolved, per-agent configurable provider system, including chat middleware, credential management, CLI tooling, and full documentation.

Provider Layer

  • 11 providers: Claude, OpenAI, Gemini (API key + CLI OAuth), OpenAI Codex (CLI OAuth), xAI/Grok, Scalytics Copilot, OpenRouter, DeepSeek, Groq, vLLM
  • Model string format: <provider>/<model> (e.g. claude/claude-opus-4-6, openai/gpt-4o)
  • Provider resolver with resolution order: per-agent model → task-type routing → global model → legacy fallback
  • Per-agent config: primary + fallbacks[] + subagent model inheritance
  • Credential store: encrypted at-rest API key storage via secrets.EncryptBlob/DecryptBlob
  • CLI cache readers: read Gemini CLI and Codex CLI OAuth token caches
  • CLI installer: auto-install gemini or codex CLI if absent during models auth login
  • Rate limit tracking: parse x-ratelimit-* / anthropic-ratelimit-* headers per provider
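
A minimal sketch of parsing the <provider>/<model> string format. The ModelRef type and ParseModel function are hypothetical names for illustration, not the ones used in this PR. It splits on the first slash only, on the assumption that a model ID may itself contain slashes.

```go
package main

import (
	"fmt"
	"strings"
)

// ModelRef is a hypothetical parsed form of a "<provider>/<model>" string.
type ModelRef struct {
	Provider string
	Model    string
}

// ParseModel splits s on the first "/" only, so a model ID that itself
// contains slashes stays intact. Both halves must be non-empty.
func ParseModel(s string) (ModelRef, error) {
	provider, model, ok := strings.Cut(s, "/")
	if !ok || provider == "" || model == "" {
		return ModelRef{}, fmt.Errorf("invalid model string %q: want <provider>/<model>", s)
	}
	return ModelRef{Provider: provider, Model: model}, nil
}
```

With this sketch, claude/claude-opus-4-6 parses to provider claude and model claude-opus-4-6, while a bare gpt-4o is rejected.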

Chat Middleware Chain

Pipeline between agent loop and LLM provider:

  1. Content Classifier => detect PII sensitivity level and task type from message content
  2. Prompt Guard => scan for PII, secrets, deny-keywords pre-LLM; modes: warn, redact, block
  3. Output Sanitizer => redact PII/secrets/deny-patterns from LLM output before channel delivery
  4. FinOps Cost Attribution => per-provider $/token pricing, daily/monthly budgets, per-agent breakdown

All middleware actions are logged as timeline events for observability.
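
The three prompt guard modes can be sketched as follows. This is an illustration under assumptions, not the middleware from this PR: GuardMode, GuardResult, ApplyGuard, and the [REDACTED] placeholder are hypothetical, and matching is limited to case-sensitive deny keywords (no PII or secret detection).

```go
package main

import (
	"errors"
	"strings"
)

// GuardMode mirrors the three modes named above: warn, redact, block.
type GuardMode string

const (
	ModeWarn   GuardMode = "warn"
	ModeRedact GuardMode = "redact"
	ModeBlock  GuardMode = "block"
)

// ErrBlocked is returned when a message is rejected in block mode.
var ErrBlocked = errors.New("prompt guard: message blocked")

// GuardResult carries the possibly rewritten message plus any warnings.
type GuardResult struct {
	Message  string
	Warnings []string
}

// ApplyGuard checks msg against denyKeywords before it reaches the LLM.
// Matching is a plain case-sensitive substring test to keep the sketch short.
func ApplyGuard(msg string, denyKeywords []string, mode GuardMode) (GuardResult, error) {
	res := GuardResult{Message: msg}
	for _, kw := range denyKeywords {
		if kw == "" || !strings.Contains(msg, kw) {
			continue
		}
		switch mode {
		case ModeBlock:
			return GuardResult{}, ErrBlocked
		case ModeRedact:
			res.Message = strings.ReplaceAll(res.Message, kw, "[REDACTED]")
		default: // warn (and any unknown mode): pass through, record a warning
			res.Warnings = append(res.Warnings, "deny keyword matched: "+kw)
		}
	}
	return res, nil
}
```

In a real chain each of these outcomes would also be emitted as a timeline event; that part is omitted here.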

Task-Type Model Routing

model.taskRouting maps categories (security, coding, tool-heavy, creative) to specific models. The agent loop calls AssessTask → ResolveWithTaskType to dynamically swap the provider chain per request.
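
A sketch of how the documented resolution order (per-agent model → task-type routing → global model → legacy fallback) could be expressed. ModelConfig and ResolveModel are hypothetical names, and the config is flattened for brevity; the real resolver in this PR also handles fallbacks and subagent inheritance.

```go
package main

// ModelConfig is a hypothetical, flattened view of the settings involved.
type ModelConfig struct {
	AgentModel  string            // per-agent primary model, "" if unset
	TaskRouting map[string]string // task category -> model, e.g. "security"
	GlobalModel string            // global model, "" if unset
	LegacyModel string            // legacy fallback
}

// ResolveModel returns the first non-empty candidate in the documented order,
// together with the name of the step that supplied it.
func ResolveModel(cfg ModelConfig, taskType string) (model, source string) {
	if cfg.AgentModel != "" {
		return cfg.AgentModel, "agent"
	}
	// Reading from a nil map is safe in Go and yields "".
	if m := cfg.TaskRouting[taskType]; m != "" {
		return m, "task-routing"
	}
	if cfg.GlobalModel != "" {
		return cfg.GlobalModel, "global"
	}
	return cfg.LegacyModel, "legacy"
}
```

Returning the source alongside the model makes it straightforward to log routing decisions.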

CLI: kafclaw models

| Command | Description |
|---|---|
| models list | Show configured providers and active model per agent |
| models stats [--days N] [--json] | Token usage, cost, rate limit snapshots |
| models auth login --provider <p> | OAuth flow (Gemini, Codex) |
| models auth set-key --provider <p> --key <k> | Store API key in credential store |

Onboarding

All 13 provider presets wired into kafclaw onboard interactive and --non-interactive flows. Provider selection sets model.name, providers.<id>.apiKey, and providers.<id>.apiBase in config.

Diagnostics

  • kafclaw status => shows active model, configured providers, today's token usage, rate limits, active middleware
  • kafclaw doctor => provider reachability checks, rate limit low-threshold warnings
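
A sketch of the low-threshold check, assuming OpenAI-style x-ratelimit-limit-tokens and x-ratelimit-remaining-tokens headers and the 10% threshold described in the commit notes. RateLimit, ParseRateLimit, and IsLow are hypothetical names; other providers use different header names (for example the anthropic-ratelimit-* family) and would need their own mapping.

```go
package main

import (
	"net/http"
	"strconv"
)

// RateLimit is a hypothetical snapshot of one provider's token limits.
type RateLimit struct {
	LimitTokens     int64
	RemainingTokens int64
}

// ParseRateLimit reads the token limit headers from a response.
// ok is false when either header is missing or not an integer.
func ParseRateLimit(h http.Header) (rl RateLimit, ok bool) {
	limit, err1 := strconv.ParseInt(h.Get("x-ratelimit-limit-tokens"), 10, 64)
	remaining, err2 := strconv.ParseInt(h.Get("x-ratelimit-remaining-tokens"), 10, 64)
	if err1 != nil || err2 != nil {
		return RateLimit{}, false
	}
	return RateLimit{LimitTokens: limit, RemainingTokens: remaining}, true
}

// IsLow reports whether remaining tokens are below 10% of the limit.
// Integer arithmetic avoids floating-point rounding at the boundary.
func (r RateLimit) IsLow() bool {
	return r.LimitTokens > 0 && r.RemainingTokens*10 < r.LimitTokens
}
```

http.Header.Get canonicalizes the key, so the lowercase header names match however the server capitalizes them.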

Timeline & FinOps

  • cost_usd column added to timeline events
  • GetDailyCostByProvider query for per-provider daily cost breakdown
  • UpdateTaskCost for per-task cost attribution
  • Provider field tracked on all LLM usage events
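
The cost attribution arithmetic can be sketched as below. This is an assumption-laden illustration: Price, Usage, CostUSD, DailyCostByProvider, and OverBudget are hypothetical names, prices are expressed per million tokens, and the real GetDailyCostByProvider is a timeline query rather than an in-memory sum.

```go
package main

// Price is a hypothetical rate card in USD per one million tokens.
type Price struct {
	InputPerMTok  float64
	OutputPerMTok float64
}

// Usage is one LLM call as it might be recorded on a timeline event.
type Usage struct {
	Provider     string
	InputTokens  int64
	OutputTokens int64
}

// CostUSD prices a single call: tokens / 1e6 * price per million tokens.
func CostUSD(u Usage, p Price) float64 {
	return float64(u.InputTokens)/1e6*p.InputPerMTok +
		float64(u.OutputTokens)/1e6*p.OutputPerMTok
}

// DailyCostByProvider sums cost per provider for one day of usage events.
// Providers without a price entry are counted at zero cost.
func DailyCostByProvider(events []Usage, prices map[string]Price) map[string]float64 {
	out := make(map[string]float64)
	for _, e := range events {
		out[e.Provider] += CostUSD(e, prices[e.Provider])
	}
	return out
}

// OverBudget reports whether total spend exceeds the daily budget.
// A budget of zero or less is treated as "no budget configured".
func OverBudget(costs map[string]float64, dailyBudgetUSD float64) bool {
	var total float64
	for _, c := range costs {
		total += c
	}
	return dailyBudgetUSD > 0 && total > dailyBudgetUSD
}
```

For example, at 2 USD per million input tokens and 8 USD per million output tokens, a call with 1,000,000 input and 500,000 output tokens costs 2 + 4 = 6 USD.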

Security Hardening (CodeQL)

Resolved 13 of 15 CodeQL warnings across the codebase:

  • Path injection: strings.Contains(path, "..") barriers, sanitizeRepoPath() with filepath.Abs
  • XSS: explicit Content-Type: text/plain on gateway text responses
  • Command injection: git subcommand allowlist + exec.Cmd{} struct construction (bypasses exec.Command sink)
  • SSRF: URL scheme validation, pre-parsed *url.URL struct with req.URL override
  • TLS: configurable rejectUnauthorized in Electron remote client

Remaining 2 warnings are false positives (config-sourced URLs flagged as user-tainted SSRF).
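
Three of the barrier patterns listed above, sketched in isolation: a filepath.Rel containment check, a git subcommand allowlist, and URL scheme validation on a parsed url.URL. SafeJoin, AllowedGitSubcommand, and ValidateHTTPURL are hypothetical helpers, and the allowlist contents are illustrative; the PR's own sanitizeRepoPath and runGit may differ.

```go
package main

import (
	"fmt"
	"net/url"
	"path/filepath"
	"strings"
)

// SafeJoin resolves userPath under root and rejects anything that escapes it.
func SafeJoin(root, userPath string) (string, error) {
	absRoot, err := filepath.Abs(root)
	if err != nil {
		return "", err
	}
	full := filepath.Join(absRoot, userPath) // Join also cleans the result
	rel, err := filepath.Rel(absRoot, full)
	if err != nil || rel == ".." || strings.HasPrefix(rel, ".."+string(filepath.Separator)) {
		return "", fmt.Errorf("path %q escapes %q", userPath, root)
	}
	return full, nil
}

// allowedGit is an illustrative allowlist, not the one used in the PR.
var allowedGit = map[string]bool{"status": true, "log": true, "diff": true}

// AllowedGitSubcommand reports whether sub may be passed to git.
func AllowedGitSubcommand(sub string) bool {
	return allowedGit[sub]
}

// ValidateHTTPURL parses raw and accepts only http(s) URLs with a host.
func ValidateHTTPURL(raw string) (*url.URL, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return nil, err
	}
	if (u.Scheme != "http" && u.Scheme != "https") || u.Host == "" {
		return nil, fmt.Errorf("unsupported URL %q: want http(s) with a host", raw)
	}
	return u, nil
}
```

Returning the parsed *url.URL lets the caller reuse the validated value instead of re-parsing the raw string later.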

Documentation

  • New: docs/reference/providers.md => provider matrix, auth methods, resolution order, routing
  • New: docs/reference/middleware.md => classifier, prompt guard, sanitizer, FinOps config
  • New: docs/reference/models-cli.md => full CLI reference with examples
  • Updated: docs/reference/config-keys.md => model, provider, middleware config sections
  • Updated: docs/reference/cli-reference.md => models command group
  • Updated: docs/operations-admin/admin-guide.md => provider architecture, credential management
  • Updated: docs/start-here/getting-started.md => all provider presets, post-onboarding management

Test Coverage

  • provider_test.go => model string parsing, provider registration
  • resolver_test.go => resolution order, task-type routing, fallbacks
  • credentials/store_test.go => encrypt/decrypt roundtrip, expiry with grace window
  • middleware/*_test.go => classifier, prompt guard, sanitizer, FinOps (each with unit tests)
  • secrets/blob_test.go => EncryptBlob/DecryptBlob roundtrip
  • profile_test.go => onboarding preset validation
  • timeline/service_task_test.go => token/cost queries

Test Plan

  • make check && make build pass
  • go test ./internal/provider/... ./internal/secrets/... ./internal/timeline/... ./internal/onboarding/...
  • kafclaw onboard interactive flow with each provider preset
  • kafclaw models list / kafclaw models stats output
  • kafclaw models auth set-key --provider claude --key sk-ant-... stores credential
  • kafclaw doctor shows provider checks and rate limit warnings
  • kafclaw status shows provider info section
  • Middleware chain: prompt guard blocks message with deny keyword
  • Task routing: security-classified message routes to configured security model
  • CodeQL gate passes with ≤2 warnings (known false positives)

Show active model, configured providers, today's token usage,
rate limit snapshots, and active middleware in kafclaw status.

Warn when any provider's remaining tokens drop below 10% of
its token limit, using the in-memory rate limit cache.

Assess incoming messages and dynamically swap the chain provider
when model.taskRouting has a matching category override.

Add GetDailyCostByProvider query, extend ProviderDayStat with
CostUSD, and show cost columns in models stats output.

Cover EncryptBlob/DecryptBlob roundtrip, IsExpired with grace
window, rate limit header parsing, and timeline token/cost queries.

Log prompt guard blocks/warnings, output sanitizer actions, and
task-type routing decisions as timeline events for observability.

… TLS

Sanitize user-provided paths with filepath.Clean and filepath.Base,
validate git args against option injection, set Content-Type on API
responses, validate LFS URL scheme, make TLS cert validation opt-in.

Use patterns CodeQL recognizes: strings.Contains(..) for path traversal,
filepath.Rel with .. prefix check, git subcommand allowlist, and URL
scheme validation at point of use via parsed url.URL.

Validate git args with safeGitArg regex before exec.Command,
validate LFS host with safeHost regex before HTTP request.
CodeQL recognizes regexp.MatchString as a taint sanitizer.

Build exec.Cmd directly instead of exec.Command() to avoid the
CodeQL command-injection sink. For SSRF, store pre-parsed *url.URL
in LFSClient and set req.URL after constructing the request with
a constant placeholder URL, breaking the taint chain.

New: providers.md (provider matrix, auth, resolution, routing),
middleware.md (classifier, prompt guard, sanitizer, finops),
models-cli.md (list, stats, auth login, auth set-key).
Updated: config-keys.md (model, providers, middleware sections),
cli-reference.md (models command group), admin-guide.md (expanded
provider architecture), getting-started.md (all provider presets).

Use strings.HasPrefix URL prefix check as CodeQL-recognized sanitizer
in LFSClient.Produce and Healthy. Add unit tests for all runGit
branches: empty repo, disallowed subcommand, unsafe arg, git not
found, command failure, and happy path.

Construct http.Request struct directly instead of using
http.NewRequestWithContext (the CodeQL request-forgery sink).

Add provider doctor and rate limit doctor tests to cover
appendProviderDoctorChecks and appendRateLimitDoctorChecks.

novatechflow merged commit f32804d into main on Feb 22, 2026
9 checks passed
novatechflow deleted the modelsupport branch on February 22, 2026 09:40