fix(metering): default local-provider pricing to $0 for uncataloged models#1055
fix(metering): default local-provider pricing to $0 for uncataloged models#1055AL-ZiLLA wants to merge 2 commits into
Conversation
1. CSP: x-frame-options SAMEORIGIN + frame-ancestors for localhost:3000 (allows Command Center iframe embedding) 2. reasoning serde alias: accept both "reasoning" (Gemma 4 via Ollama) and "reasoning_content" (DeepSeek-R1, Qwen3) in non-streaming responses Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
…odels estimate_cost_with_catalog previously fell back to ($1/M input, $3/M output) for any model not in the builtin catalog. Custom local Modelfiles — e.g. an Ollama `gemma4-agent:latest` built via the Ollama CLI — miss the catalog and so were charged as if they were a paid cloud model, tripping budget quotas on zero-cost inference. Fix: thread the provider string through estimate_cost_with_catalog and pick the fallback based on whether the provider runs inference locally. For ollama/vllm/lmstudio/lemonade/llamacpp/local, default to ($0, $0). Cloud providers still default to ($1, $3) so an unknown cloud model surfaces a cost estimate rather than hiding it. Catalog pricing always wins if the model IS registered — a known model tagged with a local provider hint still uses catalog prices. Added unit tests covering: local-unknown is free, cloud-unknown uses default, known model ignores the provider hint, case-insensitive provider matching, and every supported local-provider string. Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
|
Thanks @AL-ZiLLA — the metering fix is genuinely useful (local-GPU users running uncataloged Ollama/vLLM model IDs were getting phantom cloud pricing in budgets). But this PR bundles three unrelated concerns and can't land as one. Ask: please split into 3 PRs
CI is currently red on this branch; after splitting and rebasing on post-#1041 Thanks again — looking forward to landing (1) fast. |
jaberjaber23
left a comment
There was a problem hiding this comment.
Title says metering but the first diff hunk weakens X-Frame-Options from DENY to SAMEORIGIN and adds http://localhost:3000 to frame-ancestors. That is a clickjacking-relevant change hidden in a metering PR for a private Command Center deployment.
To merge:
- Split the metering pricing fix into its own PR (that part is fine on its own).
- The frame-ancestors change should be a separate, opt-in config field (e.g. dashboard.embed_origins in config.toml) defaulting to none. Otherwise every install ships a relaxed CSP for one operator's local proxy.
Holding here as request-changes for the security regression.
|
Closing this in favor of a clean re-submission. The security regression flagged on 2026-04-17 is still in the diff at d14f706:
Both ship in To get the metering fix landed, open a new PR with only the The CSP / X-Frame-Options change needs its own PR and must be opt-in via config (e.g. |
Bug
estimate_cost_with_catalogfalls back to(1.0, 3.0)per million tokens for any model not present in the builtin catalog. That unconditional fallback treats locally-served models — custom Ollama Modelfiles, vLLM variants, LM Studio / Lemonade / llama.cpp endpoints — as paid cloud models, even though they run on the user's own hardware and cost $0 per call.On my deployment, two agents on a custom Ollama Modelfile (
gemma4-agent:latest) were tripping the $2/hr and $8/day budget quotas with entirely fictional cost. Ledger shows 895 calls / $43.59 of phantom burn across two weeks — actual cost: $0.Repro on
main:Proof from my usage_events table:
gemma4:26bgemma4-agentBoth are local Ollama models producing zero-cost inference. The only difference is that
gemma4-agentis a user-built Modelfile alias and isn't in the builtin catalog.Fix
Thread the provider string through
estimate_cost_with_catalogand pick the fallback based on whether the provider runs inference locally:ollama,vllm,lmstudio,lm-studio,lemonade,llamacpp,llama.cpp,local) → fallback(0.0, 0.0)(1.0, 3.0)so an unknown cloud model surfaces a cost estimate rather than hiding itCatalog pricing always wins if the model is registered — a known cloud model won't get silenced by a mislabeled provider hint.
Impact
Callers updated
Three call sites in
crates/openfang-kernel/src/kernel.rsnow pass&manifest.model.provider. The manifestModelConfigalready carries this field (crates/openfang-types/src/agent.rs:375), so no storage or config changes are required.Tests
Added three new tests and updated three existing ones in
crates/openfang-kernel/src/metering.rs:test_estimate_cost_with_catalog_unknown_local_is_free— every supported local provider string returns $0 for an unknown model; verifies case-insensitive matchingtest_estimate_cost_with_catalog_known_model_ignores_provider_hint— catalog pricing wins over the provider hinttest_is_local_provider— unit test for the helperFull workspace
cargo test --releasepasses (1,300+ tests, 0 failures).cargo clippy -p openfang-runtime -p openfang-kernel -p openfang-api -- -D warningsclean.