Ollama models consistently print XML instead of calling tools or iterating. #10534
Replies: 2 comments
This is a common issue with local models and tool calling. The model is outputting tool-call syntax, but Continue is not parsing/executing it.

Root causes:

Fixes to try:

Models that work best for tools:

We run local coding agents at Revolution AI; Qwen 2.5 Coder with native tool support is the most reliable for autonomous iteration.
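The distinction above can be made concrete. This is a minimal sketch that classifies an assistant message as a native tool call versus tool-call markup printed into the text; the response shape (a `tool_calls` list on the message, as in Ollama's `/api/chat` output) and the tag names checked are assumptions based on common Qwen-style chat templates, not Continue's actual parser.

```python
# Sketch: distinguish a structured (native) tool call from tool-call XML
# that the model merely printed into its message text.
# Assumption: native calls arrive as message["tool_calls"], Ollama-style.

def classify_response(message: dict) -> str:
    """Return 'native', 'xml-in-content', or 'plain'."""
    if message.get("tool_calls"):
        return "native"
    content = message.get("content", "")
    if "<tool_call>" in content or "<function" in content:
        return "xml-in-content"
    return "plain"

# A model with working tool support returns structured calls:
native = {"role": "assistant", "content": "",
          "tool_calls": [{"function": {"name": "read_file",
                                       "arguments": {"path": "main.py"}}}]}
# The failure mode from this thread: tool syntax emitted as plain text.
broken = {"role": "assistant",
          "content": '<tool_call>{"name": "read_file"}</tool_call>'}

print(classify_response(native))  # native
print(classify_response(broken))  # xml-in-content
```

Running a quick check like this against the raw API response tells you whether the problem is the model (no structured calls at all) or the client-side parsing.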
The XML output instead of actual tool calls is a common issue with local models. Root cause:

**Fix 1: Use native tool calling format**

```yaml
models:
  - uses: ollama/qwen3-coder-30b
    toolCallFormat: native  # Not XML
```

**Fix 2: Add stop tokens**

```yaml
models:
  - uses: ollama/qwen3-coder-30b
    stop:
      - "</tool_call>"
      - "</function>"
```

**Fix 3: Use models with better tool support**

Some models handle tools more reliably:

**Fix 4: Custom prompt template**

```yaml
models:
  - uses: ollama/qwen3-coder-30b
    promptTemplates:
      tools: |
        You have access to these tools. Call them using JSON format:
        {"tool": "name", "args": {...}}
        Available tools:
        {{tools}}
```

**Fix 5: Increase context/tokens**

Tool calls sometimes fail when context is tight:

```yaml
models:
  - uses: ollama/qwen3-coder-30b
    contextLength: 32768
    maxTokens: 4096
```

**Workaround:**

We tune local models for Continue at Revolution AI; the stop token plus native format combo usually fixes this.
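As a last-resort workaround when the model keeps printing `<tool_call>…</tool_call>` as text, the JSON payload can be recovered from the output and executed manually. This is a sketch under the assumption that the model uses Qwen-style `<tool_call>` tags wrapping a JSON object; it is not part of Continue itself.

```python
# Fallback parser: recover JSON tool-call payloads that a local model
# printed as <tool_call>{...}</tool_call> text instead of emitting a
# structured call. Tag names are an assumption (Qwen-style templates).
import json
import re

TOOL_CALL_RE = re.compile(r"<tool_call>\s*(\{.*?\})\s*</tool_call>", re.DOTALL)

def extract_tool_calls(text: str) -> list[dict]:
    calls = []
    for payload in TOOL_CALL_RE.findall(text):
        try:
            calls.append(json.loads(payload))
        except json.JSONDecodeError:
            pass  # skip malformed payloads rather than crash
    return calls

output = ('I will read the file now.\n'
          '<tool_call>{"name": "read_file", '
          '"arguments": {"path": "src/main.rs"}}</tool_call>')
print(extract_tool_calls(output))
# [{'name': 'read_file', 'arguments': {'path': 'src/main.rs'}}]
```

The non-greedy match plus the closing-tag anchor lets the regex tolerate nested braces inside the JSON arguments, since backtracking extends the capture until `</tool_call>` actually follows.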
When I prompt with something that I would expect to kick off some iteration sequences, I consistently fail to get the agent to do anything autonomously.
For example, I'll prompt something like:
And it will print
And it stops there. Nothing else happens. I've tried rules and I've tried different models; I get varying levels of success and nothing resembling consistency.
I'm running these on a 7900 XTX and know these aren't the most powerful models, but I'd still expect tool calls not to fail in this way.
Here is another example; after repeating the prompt 3 times, it finally called the tool without silently stopping generation:
I searched for other support threads and found some, but most focus on gpt-oss. I've tried a number of qwen3 variants, including the instruct variants, and they all seem to suffer from this problem. Is there a trick to getting tool calls to behave consistently?
My config: