MCP agentic-tools adapter corrupts binary HTTP responses #36241

Description

@fmontes

Problem Statement

The api.request adapter in the MCP server (libs/agentic-tools/src/lib/http-client.ts) corrupts every binary HTTP response. In the response-parsing block (~line 226), any non-JSON body is read with response.text(), which decodes the bytes as UTF-8. For binary files (images, fonts, .ico, etc.) every byte sequence that is not valid UTF-8 is replaced with U+FFFD, so the original bytes are lost irreversibly.
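A minimal sketch of why the text path is lossy. It uses TextDecoder as a stand-in for what response.text() does to the body; the byte values are illustrative (the 8-byte PNG signature plus three arbitrary bytes) and are not taken from a real dotCMS asset.

```typescript
// PNG signature (8 bytes), then one invalid byte (0xFF) and one byte pair
// that happens to be a valid 2-byte UTF-8 sequence (0xC3 0xA9, "é").
const original = new Uint8Array([
  0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a, 0xff, 0xc3, 0xa9,
]);

// Stand-in for response.text(): UTF-8 decode with replacement of invalid bytes.
const decoded: string = new TextDecoder("utf-8").decode(original);

// Attempting to go back to bytes does not restore the input.
const reencoded: Uint8Array = new TextEncoder().encode(decoded);

// Count the U+FFFD replacement characters that were introduced.
const replacementCount: number = decoded.split("\uFFFD").length - 1;

console.log(original.length);  // 11 bytes in
console.log(decoded.length);   // 10 characters: 0xC3 0xA9 collapsed into one
console.log(replacementCount); // 2: both 0x89 and 0xFF became U+FFFD
console.log(reencoded.length); // 15 bytes out: each U+FFFD re-encodes as 3 bytes
```

Both 0x89 and 0xFF map to the same replacement character, so no re-encoding step can tell them apart afterwards.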

Impact: any MCP execute workflow that downloads/reads a binary file asset gets a corrupted result. Text assets (scss/vtl/css/js/json) are unaffected. This affects anyone using the MCP server to pull file assets out of dotCMS.

Layer: the bug is entirely in the MCP / agentic-tools adapter. The dotCMS endpoints are correct — /dA/{id} and /api/v2/assets/{id} return byte-exact 200 responses to a normal HTTP client.

Steps to Reproduce

  1. Through the MCP execute sandbox, call api.request({ path: "/api/v2/assets/{identifier-of-a-png}" }).
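  2. Compare the returned string's length with the real file size. They differ: each invalid byte has become U+FFFD, and byte runs that happen to form valid multi-byte UTF-8 sequences have collapsed into single characters, so the string is typically shorter.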
  3. Re-encode the returned string to bytes and write a file → it is a broken/invalid image.
  4. For comparison, curl -H "Authorization: Bearer $TOKEN" http://localhost:8080/dA/{identifier} returns the correct byte-exact file (HTTP 200, correct Content-Type, correct size).
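Steps 2 and 3 can be checked mechanically. The helper below is a hypothetical sketch, not part of the adapter: inspectReturnedBody and CorruptionReport are names invented here, and the expected byte length is assumed to come from an independent source such as the curl download in step 4.

```typescript
interface CorruptionReport {
  replacementChars: number;     // occurrences of U+FFFD in the returned string
  stringLength: number;         // length in UTF-16 code units
  reencodedByteLength: number;  // size after encoding the string back to UTF-8
  matchesExpectedSize: boolean; // true only if re-encoding reproduces the real size
}

// Hypothetical helper: summarizes how far a returned string is from the real file.
function inspectReturnedBody(body: string, expectedByteLength: number): CorruptionReport {
  const replacementChars = body.split("\uFFFD").length - 1;
  const reencodedByteLength = new TextEncoder().encode(body).length;
  return {
    replacementChars,
    stringLength: body.length,
    reencodedByteLength,
    matchesExpectedSize: reencodedByteLength === expectedByteLength,
  };
}

// Example: a 4-byte input whose first byte was invalid UTF-8.
console.log(inspectReturnedBody("\uFFFDPNG", 4));
```

A non-zero replacementChars count is a strong hint rather than proof, since a genuine text file may contain U+FFFD; the size mismatch is the more reliable signal.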

Acceptance Criteria

  • Binary file assets fetched via MCP api.request round-trip byte-exact (no corruption).
  • Response parsing branches: application/json → .json(); textual types (text/*, application/xml, application/javascript, +json/+xml, application/x-www-form-urlencoded) → .text(); everything else → response.arrayBuffer(), returned base64-encoded (e.g. { __dotcmsBinary, contentType, base64, byteLength }) so the bytes survive JSON.stringify across the sandbox boundary in execute.ts.
  • Non-OK binary responses are read as text so server error messages are preserved (current error path at ~line 231 assumes string/JSON).
  • A size cap is applied to the binary path (~25MB, matching MAX_REMOTE_FILE_BYTES).
  • Regression test with a small PNG fixture asserts round-tripped bytes match the source.
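The branching described in the criteria above could look roughly like the sketch below. It is not the actual http-client.ts code: classifyContentType, toBase64, parseResponseBody, BinaryEnvelope, and MAX_BINARY_BYTES are hypothetical names, the cap value is an assumption derived from the "~25MB" figure, and the fallback content type is an illustrative choice. It assumes a runtime with global Response and btoa (for example Node 18+).

```typescript
// Assumption: stands in for MAX_REMOTE_FILE_BYTES, whose exact value is not shown in the issue.
const MAX_BINARY_BYTES = 25 * 1024 * 1024;

type BodyKind = "json" | "text" | "binary";

interface BinaryEnvelope {
  __dotcmsBinary: true;
  contentType: string;
  base64: string;
  byteLength: number;
}

// Decide how to read the body from the Content-Type header alone.
function classifyContentType(contentType: string | null): BodyKind {
  // Drop parameters such as "; charset=utf-8" and normalize case.
  const [first = ""] = (contentType ?? "").split(";");
  const mime = first.trim().toLowerCase();
  if (mime === "application/json") return "json";
  const textual =
    mime.startsWith("text/") ||
    mime === "application/xml" ||
    mime === "application/javascript" ||
    mime === "application/x-www-form-urlencoded" ||
    mime.endsWith("+json") ||
    mime.endsWith("+xml");
  // A missing header falls through to "binary" here; that is a design choice to revisit.
  return textual ? "text" : "binary";
}

// Base64-encode raw bytes via a Latin-1 binary string.
function toBase64(bytes: Uint8Array): string {
  let binary = "";
  bytes.forEach((b) => {
    binary += String.fromCharCode(b);
  });
  return btoa(binary);
}

async function parseResponseBody(response: Response): Promise<unknown> {
  const contentType = response.headers.get("content-type");
  const kind = classifyContentType(contentType);

  if (kind === "json") return response.json();
  // Non-OK responses are read as text so server error messages stay readable.
  if (kind === "text" || !response.ok) return response.text();

  const buffer = await response.arrayBuffer();
  if (buffer.byteLength > MAX_BINARY_BYTES) {
    throw new Error(`Binary response of ${buffer.byteLength} bytes exceeds the ${MAX_BINARY_BYTES}-byte cap`);
  }
  const bytes = new Uint8Array(buffer);
  const envelope: BinaryEnvelope = {
    __dotcmsBinary: true,
    contentType: contentType ?? "application/octet-stream", // fallback is an assumption
    base64: toBase64(bytes),
    byteLength: bytes.byteLength,
  };
  return envelope;
}
```

This sketch checks the cap only after the body has been buffered; a real implementation might also reject early based on Content-Length. On the consuming side, the bytes can be recovered with Uint8Array.from(atob(envelope.base64), (c) => c.charCodeAt(0)), which is what a PNG-fixture regression test would compare against the source file.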

dotCMS Version

Latest from main branch (@dotcms/mcp-server@beta, libs/agentic-tools).

Severity

Medium - Some functionality impacted

Links

NA

Metadata

Projects

Status
New

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests
