Python: Fix Bedrock non-ASCII escaping in JSON content blocks #6628

Open
kimnamu wants to merge 1 commit into microsoft:main from kimnamu:fix/bedrock-non-ascii

kimnamu commented Jun 19, 2026

Thanks for Agent Framework and the Bedrock integration — it's a pleasure to build on.

Closes #6627

Problem

When the Bedrock Converse API returns a structured json content block, BedrockChatClient._parse_message_contents serializes it to text with json.dumps(json_value). Since json.dumps defaults to ensure_ascii=True, non-ASCII characters (CJK, emoji, accented Latin, etc.) are escaped to \uXXXX and reach the user garbled.
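A minimal sketch of the behavior described above, using only the standard library. The payload is an illustrative value rather than a real Bedrock response, and the snippet exercises `json.dumps` directly instead of the client code.

```python
import json

# Illustrative payload; stands in for the value of a Bedrock `json` content block.
payload = {"greeting": "你好世界", "emoji": "🎉"}

# Default behavior: ensure_ascii=True escapes every non-ASCII character.
escaped = json.dumps(payload)

# With ensure_ascii=False the characters are emitted as-is.
preserved = json.dumps(payload, ensure_ascii=False)

print(escaped)
print(preserved)

# Both strings are valid JSON and decode to the same object,
# so the difference only affects what a reader sees in the text.
assert json.loads(escaped) == json.loads(preserved) == payload
```

Both forms are interchangeable for a JSON parser; only the unescaped form is readable when the string is shown to a user as plain text.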

Cause

This single call is the outlier. The sibling OpenAI client (python/packages/openai/agent_framework_openai/_chat_client.py) and 16+ other call sites across the repo already serialize user-facing / span data with ensure_ascii=False. PR #3894 ("Python: Fix non-ascii chars in span attributes") fixed the same class of issue in observability — this is the matching fix the Bedrock client was missing.

Change

One line: add ensure_ascii=False to the json.dumps for the Bedrock json content block, plus a regression test.

Before / After

| Item | Before | After |
| --- | --- | --- |
| json block {"greeting": "你好世界"} → text | {"greeting": "\u4f60\u597d\u4e16\u754c"} | {"greeting": "你好世界"} |
| Emoji "🎉" in json block | "\ud83c\udf89" | "🎉" |
| Text content blocks (block.get("text")) | ✅ unchanged | ✅ unchanged |
| Public API / method signatures / arguments | ✅ unchanged | ✅ unchanged |

Output remains valid JSON (round-trips via json.loads).

Test (red → green)

New test test_process_converse_response_preserves_non_ascii_in_json_block uses the existing key-free _StubBedrockRuntime pattern.
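The shape of such a regression check can be sketched as follows. This is not the test from the PR: `render_json_block` and `check_preserves_non_ascii` are hypothetical names, and the helper only stands in for the json-block branch of the parser rather than going through `_StubBedrockRuntime`.

```python
import json
from typing import Any


def render_json_block(block: dict[str, Any], *, ensure_ascii: bool = False) -> str:
    """Hypothetical stand-in for the parser branch that turns a json block into text."""
    return json.dumps(block["json"], ensure_ascii=ensure_ascii)


def check_preserves_non_ascii(text: str) -> None:
    """Assert that CJK and emoji survive unescaped and the text is still valid JSON."""
    assert "你好世界" in text
    assert "🎉" in text
    assert "\\u" not in text
    assert json.loads(text) == {"greeting": "你好世界", "emoji": "🎉"}


block = {"json": {"greeting": "你好世界", "emoji": "🎉"}}

# Patched behavior (ensure_ascii=False): the check passes.
check_preserves_non_ascii(render_json_block(block))

# Unpatched behavior (ensure_ascii=True): the same check fails.
try:
    check_preserves_non_ascii(render_json_block(block, ensure_ascii=True))
except AssertionError:
    print("escaped output fails the check")
```

Running the same assertions against both settings mirrors the red/green runs: the check fails on escaped output and passes once the characters are preserved.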

Against the unpatched source (bug present):

>       assert "你好世界" in text
E       assert '你好世界' in '{"greeting": "\\u4f60\\u597d\\u4e16\\u754c", "emoji": "\\ud83c\\udf89"}'
FAILED packages/bedrock/tests/test_bedrock_client.py::test_process_converse_response_preserves_non_ascii_in_json_block
======================== 1 failed, 2 warnings in 0.29s =========================

With the fix:

packages/bedrock/tests/test_bedrock_client.py .                          [100%]
======================== 1 passed, 2 warnings in 1.88s =========================

Full bedrock suite (no regressions) + lint/format clean:

$ uv run --package agent-framework-bedrock pytest packages/bedrock/tests/ -m "not integration" -q
..................................                                       [100%]
$ uv run ruff format --check packages/bedrock/...   # 2 files already formatted
$ uv run ruff check packages/bedrock/...            # All checks passed!

This contribution was prepared with the help of an AI agent (Claude Code); a human reviewed the change, rationale, and test results before submitting.

The Bedrock Converse `json` content block was serialized with
`json.dumps(json_value)`, whose default `ensure_ascii=True` escapes
CJK/emoji/accented characters to `\uXXXX` and surfaces garbled text.
Add `ensure_ascii=False` to match the sibling OpenAI client and the
16+ other call sites across the repo. Includes a regression test.

Closes microsoft#6627

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings June 19, 2026 15:14
moonbox3 added the python label Jun 19, 2026

Copilot AI left a comment


Pull request overview

Fixes a Bedrock Converse response parsing edge case where structured json content blocks were being serialized with json.dumps(..., ensure_ascii=True) (default), causing non‑ASCII characters to be escaped into \uXXXX sequences in user-visible text.

Changes:

  • Update Bedrock json content block serialization to use ensure_ascii=False so non‑ASCII characters are preserved.
  • Add a regression test ensuring CJK text and emoji remain unescaped and the emitted text still round-trips via json.loads.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| python/packages/bedrock/agent_framework_bedrock/_chat_client.py | Preserves non‑ASCII characters when converting Bedrock json content blocks into text. |
| python/packages/bedrock/tests/test_bedrock_client.py | Adds a unit test verifying non‑ASCII preservation and preventing regression. |

Development

Successfully merging this pull request may close these issues.

Python: Bedrock JSON content blocks escape non-ASCII characters to \uXXXX
