fix(harness): normalize ToolMessage structured content in serialization #1167

Closed
mvanhorn wants to merge 2 commits into bytedance:main from mvanhorn:osc/1149-toolmessage-structured-content

Conversation

@mvanhorn
Contributor

Summary

When models return ToolMessage content as a list of content blocks (e.g., [{"type": "text", "text": "..."}]), the UI displays the raw Python repr string instead of the extracted text.
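
A minimal sketch of the symptom, assuming the content-block shape quoted above. The variable names and sample text are illustrative and are not taken from client.py; the "new" payload pulls the text out by hand only to show the intended result.

```python
import json

# Structured tool content as a list of content blocks (shape from the description above).
blocks = [{"type": "text", "text": "search finished"}]

# Old behaviour: non-string content is passed through str(), which yields a Python repr.
old_event = {"type": "tool", "content": str(blocks)}

# Intended behaviour: the payload carries only the extracted text.
new_event = {"type": "tool", "content": blocks[0]["text"]}

# What a frontend would receive on the wire in each case.
old_wire = json.dumps(old_event)
new_wire = json.dumps(new_event)

print(old_wire)
print(new_wire)
```

In the first payload the `content` field is a string that merely looks like a list, so a frontend has no structured data to unpack and can only render it literally.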

Why this matters

  • #1149 reports abnormal text display when qwen3-max returns structured tool responses
  • The same normalization pattern was already applied to titles in PR #1155, but the main message serialization path was not updated

Changes

Two call sites in client.py fell back to str(msg.content) whenever ToolMessage content was not a string. Both now call the existing _extract_text() helper, which already handles string content, list-of-blocks content, and other types:

  1. _serialize_message() (L242) - used in values events
  2. stream() (L348) - used in messages-tuple events
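
For readers without the source at hand, the kind of normalization described above can be sketched as follows. This is a hypothetical stand-in, not the actual _extract_text() from client.py: the name `extract_text_sketch`, the newline join, and the decision to skip non-text blocks are all assumptions made for illustration.

```python
from typing import Any


def extract_text_sketch(content: Any) -> str:
    """Hypothetical stand-in for _extract_text(); the real helper may differ."""
    # Plain string content passes through unchanged.
    if isinstance(content, str):
        return content

    # List content: collect bare strings and the "text" field of text blocks.
    if isinstance(content, list):
        parts = []
        for block in content:
            if isinstance(block, str):
                parts.append(block)
            elif isinstance(block, dict) and block.get("type") == "text":
                text = block.get("text")
                if isinstance(text, str):
                    parts.append(text)
        # Assumption: blocks are joined with newlines; non-text blocks are dropped.
        return "\n".join(parts)

    # Any other type falls back to str(), as the old code did for everything.
    return str(content)


print(extract_text_sketch([{"type": "text", "text": "first"}, "second"]))
```

With a helper of this shape, both call sites can pass msg.content through one function instead of branching on isinstance(msg.content, str) themselves.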

Added test_serialize_message_content.py with 6 regression tests covering string, list-of-blocks, mixed, and empty content.

Testing

cd backend && PYTHONPATH=. uv run pytest tests/test_serialize_message_content.py -v  # 6 passed
cd backend && PYTHONPATH=. uv run pytest tests/test_client.py -v                     # 81 passed (no regressions)
cd backend && uvx ruff check .                                                        # clean

Fixes #1149

This contribution was developed with AI assistance (Claude Code).

When models return ToolMessage content as a list of content blocks
(e.g., [{"type": "text", "text": "..."}]), _serialize_message and the
stream method used str() which produced raw Python repr strings in the
UI. Use _extract_text() instead, which already handles both string and
list content correctly.

The same normalization pattern was applied to titles in PR bytedance#1155 but
the main message serialization path was not updated.

Fixes bytedance#1149

Copilot AI left a comment

Pull request overview

Normalizes LangChain ToolMessage structured content (list-of-blocks) to plain text during backend serialization so the frontend doesn’t display Python repr(...) strings for tool results.

Changes:

  • Replace str(msg.content) with _extract_text(msg.content) for ToolMessage serialization in both _serialize_message() and stream().
  • Add regression tests covering _serialize_message() behavior for tool message content variants.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| backend/packages/harness/deerflow/client.py | Uses _extract_text() to normalize ToolMessage content for values and messages-tuple streaming events. |
| backend/tests/test_serialize_message_content.py | Adds regression tests to ensure _serialize_message() extracts plain text from ToolMessage list/block content. |

Comment on lines 240 to +242

      return {
          "type": "tool",
-         "content": msg.content if isinstance(msg.content, str) else str(msg.content),
+         "content": DeerFlowClient._extract_text(msg.content),

      data={
          "type": "tool",
-         "content": msg.content if isinstance(msg.content, str) else str(msg.content),
+         "content": self._extract_text(msg.content),

@WillemJiang
Collaborator

@mvanhorn, thanks for your contribution. The content the user shows in issue #1149 comes from the browser, but your patch is for the DeerFlow client. I don't think your PR can fix the issue.

@mvanhorn
Contributor Author

@WillemJiang - good question. The abnormal text the user sees in the browser originates from the backend serialization. When qwen3-max returns structured ToolMessage content (a list of content blocks), str(msg.content) in client.py sends the Python repr string [{'type': 'text', 'text': '...'}] to the frontend, which renders it literally.

This is the same pattern that was fixed for titles in PR #1155 - this PR extends _extract_text() to the two remaining serialization paths in _serialize_message() and stream().

If there's also a separate frontend rendering issue, this PR wouldn't cover that - but the backend producing clean text instead of Python repr strings should address the reported symptom. Happy to close if you've already fixed this elsewhere.

Development

Successfully merging this pull request may close these issues.

[runtime] Text is displayed abnormally every time the model requests assistance
