fix: Opus prefill rejection during tool-call loops with thinking enabled#22404
chan1103 wants to merge 1 commit into anomalyco:dev
Thanks for your contribution! This PR doesn't have a linked issue. All PRs must reference an existing issue. Please:
See CONTRIBUTING.md for details.
The following comment was made by an LLM, it may be inaccurate: Related PR Found:
Related but Different Issue:
The current PR (#22404) is a focused fix for a specific edge case identified while working on #13286, and #22001 is the most directly related existing PR dealing with similar message shape issues.
Noting for reviewers: #17010 addresses the root cause of this (loop exit condition using ULID ordering instead of parentID). This PR is a safety net at the transport layer: it prevents the API rejection even if the loop doesn't exit when it should. Both fixes are complementary: #17010 stops the loop from continuing, this PR catches the case at the message level.
Some models (Opus 4.6) reject assistant-last messages as unsupported prefill when thinking is enabled. During tool-call loops, multi-step assistant messages can produce a trailing text-only model message after toModelMessages conversion. The loop continues (finish=tool-calls) but the next request ends with assistant. Strip trailing text-only assistant messages at the end of ProviderTransform.message() when options.thinking is set.
ab62259 to 1385351
Issue for this PR
Fixes #17982
Related: #13286, #22001, #13768, #13577
Type of change
What does this PR do?
I ran into another failure mode while chasing #13286:
What was happening here is that a single assistant turn could contain both a tool-call step and a text-only step after it. When `toModelMessages` flattens that into model messages, the text-only step becomes a trailing `assistant(text)` message. The loop is still open (finish === "tool-calls"), but the next request now ends with assistant instead of user.
bvironn also hit a similar message shape in #22001 (tool-call followed by text in the same turn), though with a different symptom.
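The message shape described above can be sketched roughly as follows. This is a simplified illustration, not the project's conversion code: the `Part`, `Step`, and `ModelMessage` types and the `flattenTurn` helper are hypothetical stand-ins for whatever `toModelMessages` actually does.

```typescript
// Hypothetical, simplified shapes; the real types in the codebase are richer.
type Part =
  | { type: "text"; text: string }
  | { type: "tool-call"; toolCallId: string; toolName: string }
  | { type: "tool-result"; toolCallId: string; output: string };

type ModelMessage =
  | { role: "user"; content: string }
  | { role: "assistant"; content: Part[] }
  | { role: "tool"; content: Part[] };

// One assistant turn is stored as a list of steps.
type Step = { parts: Part[] };

// Assumed stand-in for the conversion: each step becomes an assistant message,
// followed by a tool message when the step produced tool results.
function flattenTurn(steps: Step[]): ModelMessage[] {
  const out: ModelMessage[] = [];
  for (const step of steps) {
    const results = step.parts.filter((p) => p.type === "tool-result");
    const rest = step.parts.filter((p) => p.type !== "tool-result");
    if (rest.length > 0) out.push({ role: "assistant", content: rest });
    if (results.length > 0) out.push({ role: "tool", content: results });
  }
  return out;
}

const messages: ModelMessage[] = [
  { role: "user", content: "List the files" },
  ...flattenTurn([
    {
      parts: [
        { type: "tool-call", toolCallId: "t1", toolName: "ls" },
        { type: "tool-result", toolCallId: "t1", output: "a.ts" },
      ],
    },
    // Text-only step in the same turn, after the tool call.
    { parts: [{ type: "text", text: "Next I will read a.ts." }] },
  ]),
];

// Roles are now: user, assistant, tool, assistant.
const lastRole = messages[messages.length - 1].role;
console.log(lastRole); // "assistant"
```

In this sketch the request ends with an assistant message even though the turn is still mid-loop, which is the shape the rest of this description is about.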
That turns out to be model-dependent when thinking is on. Sonnet accepts it. Opus 4.6 rejects it.
"does not support assistant message prefill"
So the bug here isn't that the loop state is wrong. It's that we end up carrying forward a trailing text-only assistant message from the previous assistant turn, and some models treat that as unsupported prefill in thinking mode.
The fix strips trailing text-only assistant messages at the end of `ProviderTransform.message()` when `options.thinking` is set. I used a `while` loop so it also handles consecutive trailing text messages, and it covers both string content and array content.
I kept the scope narrow: `assistant` messages with tool calls, reasoning blocks, or other non-text content are left alone. If thinking isn't active, nothing changes. During an open tool-call loop, these trailing text-only assistant messages are intermediate conversion results rather than the final assistant turn, so stripping them doesn't lose meaningful content: the model re-derives its plan from tool results in the next iteration.
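For readers who want the stripping rule in concrete form, here is a minimal sketch under stated assumptions. It is not the diff from this PR: the `Message` and `ContentPart` types and both function names are hypothetical, and the `thinking` flag is passed as a plain boolean rather than read from `options.thinking`.

```typescript
// Hypothetical, simplified message shapes (not the project's actual types).
type ContentPart = { type: string; [key: string]: unknown };
type Message = {
  role: "system" | "user" | "assistant" | "tool";
  content: string | ContentPart[];
};

// True when an assistant message carries nothing but text.
// Handles both string content and array content.
function isTextOnlyAssistant(msg: Message): boolean {
  if (msg.role !== "assistant") return false;
  if (typeof msg.content === "string") return true;
  return msg.content.every((part) => part.type === "text");
}

// Drops trailing text-only assistant messages, but only when thinking is on.
// Returns a new array; the input is not mutated.
function stripTrailingAssistantText(msgs: Message[], thinking: boolean): Message[] {
  if (!thinking) return msgs;
  const result = [...msgs];
  // A loop, so consecutive trailing text messages are all removed.
  while (result.length > 0 && isTextOnlyAssistant(result[result.length - 1])) {
    result.pop();
  }
  return result;
}

const input: Message[] = [
  { role: "user", content: "List the files" },
  { role: "assistant", content: [{ type: "tool-call", toolCallId: "t1" }] },
  { role: "tool", content: [{ type: "tool-result", toolCallId: "t1" }] },
  { role: "assistant", content: [{ type: "text", text: "Next I will read a.ts." }] },
  { role: "assistant", content: "Reading now." },
];

const stripped = stripTrailingAssistantText(input, true);
console.log(stripped.map((m) => m.role)); // ["user", "assistant", "tool"]
```

The assistant message holding the tool call survives because it is not text-only, and the loop stops as soon as it reaches the tool message. With `thinking` set to `false` the input comes back unchanged.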
How did you verify your code works?
I reproduced the behavior directly with API calls:
Full suite locally: 1906 pass, 0 fail.
Screenshots / recordings
Not a UI change.
Checklist