Ollama models consistently print XML instead of calling tools or iterating. #10534
Replies: 2 comments
This is a common issue with local models and tool calling. The model is outputting tool-call syntax, but Continue is not parsing/executing it.

Root causes:

Fixes to try:

Models that work best for tools:

We run local coding agents at Revolution AI; Qwen 2.5 Coder with native tool support is the most reliable for autonomous iteration.
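The distinction above can be made concrete. This is a minimal sketch that classifies an assistant message as a native tool call versus tool-call markup printed into the text; the response shape (a `tool_calls` list on the message, as in Ollama's `/api/chat` output) and the tag names checked are assumptions based on common Qwen-style chat templates, not Continue's actual parser.

```python
# Sketch: distinguish a structured (native) tool call from tool-call XML
# that the model merely printed into its message text.
# Assumption: native calls arrive as message["tool_calls"], Ollama-style.

def classify_response(message: dict) -> str:
    """Return 'native', 'xml-in-content', or 'plain'."""
    if message.get("tool_calls"):
        return "native"
    content = message.get("content", "")
    if "<tool_call>" in content or "<function" in content:
        return "xml-in-content"
    return "plain"

# A model with working tool support returns structured calls:
native = {"role": "assistant", "content": "",
          "tool_calls": [{"function": {"name": "read_file",
                                       "arguments": {"path": "main.py"}}}]}
# The failure mode from this thread: tool syntax emitted as plain text.
broken = {"role": "assistant",
          "content": '<tool_call>{"name": "read_file"}</tool_call>'}

print(classify_response(native))  # native
print(classify_response(broken))  # xml-in-content
```

Running a quick check like this against the raw API response tells you whether the problem is the model (no structured calls at all) or the client-side parsing.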
The XML output instead of actual tool calls is a common issue with local models. Root cause:

**Fix 1: Use native tool calling format**

```yaml
models:
  - uses: ollama/qwen3-coder-30b
    toolCallFormat: native  # Not XML
```

**Fix 2: Add stop tokens**

```yaml
models:
  - uses: ollama/qwen3-coder-30b
    stop:
      - "</tool_call>"
      - "</function>"
```

**Fix 3: Use models with better tool support**

Some models handle tools more reliably:

**Fix 4: Custom prompt template**

```yaml
models:
  - uses: ollama/qwen3-coder-30b
    promptTemplates:
      tools: |
        You have access to these tools. Call them using JSON format:
        {"tool": "name", "args": {...}}
        Available tools:
        {{tools}}
```

**Fix 5: Increase context/tokens**

Tool calls sometimes fail when context is tight:

```yaml
models:
  - uses: ollama/qwen3-coder-30b
    contextLength: 32768
    maxTokens: 4096
```

**Workaround:**

We tune local models for Continue at Revolution AI; the stop token plus native format combo usually fixes this.
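As a last-resort workaround when the model keeps printing `<tool_call>…</tool_call>` as text, the JSON payload can be recovered from the output and executed manually. This is a sketch under the assumption that the model uses Qwen-style `<tool_call>` tags wrapping a JSON object; it is not part of Continue itself.

```python
# Fallback parser: recover JSON tool-call payloads that a local model
# printed as <tool_call>{...}</tool_call> text instead of emitting a
# structured call. Tag names are an assumption (Qwen-style templates).
import json
import re

TOOL_CALL_RE = re.compile(r"<tool_call>\s*(\{.*?\})\s*</tool_call>", re.DOTALL)

def extract_tool_calls(text: str) -> list[dict]:
    calls = []
    for payload in TOOL_CALL_RE.findall(text):
        try:
            calls.append(json.loads(payload))
        except json.JSONDecodeError:
            pass  # skip malformed payloads rather than crash
    return calls

output = ('I will read the file now.\n'
          '<tool_call>{"name": "read_file", '
          '"arguments": {"path": "src/main.rs"}}</tool_call>')
print(extract_tool_calls(output))
# [{'name': 'read_file', 'arguments': {'path': 'src/main.rs'}}]
```

The non-greedy match plus the closing-tag anchor lets the regex tolerate nested braces inside the JSON arguments, since backtracking extends the capture until `</tool_call>` actually follows.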
When I prompt with something that I would expect to kick off some iteration sequences, I consistently fail to get the agent to do anything autonomously.
For example, I'll prompt something like:
And it will print
And it stops there. Nothing else happens. I've tried rules and I've tried different models; I get varying levels of success and nothing resembling consistency.
I'm running these on a 7900 XTX and know these aren't the most powerful models, but I'd still expect tool calls not to fail in this way.
Here is another example; after repeating the prompt 3 times, it finally called the tool without silently stopping generation:
I searched for other support threads and found some, but most focus on gpt-oss. I've tried a number of qwen3 variants, including the instruct variants, and they all seem to suffer from this problem. Is there a trick to getting tool calls to behave consistently?
My config: