
API Error 400: invalid_request_error when context accumulates large tool results #20

@HiepPP


Issue Description

The Kimi API returns a 400 invalid_request_error when the conversation context accumulates large tool results across multiple turns. The error provides no specific details about what validation failed.

Error Details

Timestamp: 2026-02-26T04:58:43.109Z
Error Response:

{
  "error": {
    "type": "invalid_request_error",
    "message": "Invalid request Error"
  },
  "type": "error"
}

HTTP Status: 400 (non-retryable)
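For reference, the non-retryable classification above can be reproduced with a small client-side check; a minimal sketch, where `raw` mirrors the error body shown above and `classify` is a hypothetical helper, not part of any real SDK:

```python
import json

# Raw body as returned by the API (copied from the error above).
raw = '{"error": {"type": "invalid_request_error", "message": "Invalid request Error"}, "type": "error"}'

def classify(status: int, body: str) -> str:
    """Return 'retryable' for 429/5xx; otherwise surface the error type as fatal."""
    if status == 429 or status >= 500:
        return "retryable"
    payload = json.loads(body)
    err_type = payload.get("error", {}).get("type", "unknown")
    return f"fatal:{err_type}"

print(classify(400, raw))  # fatal:invalid_request_error
```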

Token Progression Pattern

The session accumulated tokens steadily before the crash:

Turn   Input Tokens   Cache Read   Output Tokens   Note
1      20,748         0            ~26             Initial request
2      32,031         0            ~26             After file read #1
3      36,374         0            ~322            After file read #2
4      4,374          32,000       ~81             ToolSearch call - CRASH

Total context at error: ~44,000+ tokens
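The final turn's effective context can be recomputed from the table (numbers copied from above); the remaining gap to the reported ~44,000 total presumably comes from content not itemized in the table, such as tool definitions and prior outputs:

```python
# (input_tokens, cache_read, output_tokens) per turn, from the table above.
turns = [
    (20_748, 0, 26),
    (32_031, 0, 26),
    (36_374, 0, 322),
    (4_374, 32_000, 81),
]

# Effective context on the failing turn: fresh input plus cache-read tokens.
final_context = turns[-1][0] + turns[-1][1]
print(final_context)  # 36374
```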

Reproduction Steps

  1. Start a conversation with kimi-for-coding
  2. Perform multiple file read operations that return large text content (~180-200 lines each)
  3. Each file's content is added to the conversation history as a tool result
  4. Make a ToolSearch call that returns deferred tool references
  5. API returns 400 error immediately after
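The accumulation in steps 2-4 can be sketched as follows; the message shape is illustrative only (not the exact wire format), and `read_file_stub` stands in for a real ~190-line file read:

```python
def read_file_stub(n: int) -> str:
    # Stand-in for a real file read of ~180-200 lines (step 2).
    return "\n".join(f"line {i} of file {n}" for i in range(190))

history = [{"role": "user", "content": "Please inspect these files."}]

# Steps 2-3: each file read appends a tool_use/tool_result pair to history.
for n in (1, 2):
    history.append({"role": "assistant", "content": [
        {"type": "tool_use", "id": f"call_{n}", "name": "read_file"}]})
    history.append({"role": "user", "content": [
        {"type": "tool_result", "tool_use_id": f"call_{n}",
         "content": read_file_stub(n)}]})

# Step 4: the ToolSearch call that triggers the 400 is sent with this
# ever-growing history as context.
print(len(history))  # 5 messages before the failing request
```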

Key Technical Details

Message Structure Before Error

The conversation contained:

  • thinking blocks with signatures (encrypted thinking content)
  • tool_use messages with large input parameters
  • tool_result messages containing full file contents
  • tool_result with deferred tool references (71 total deferred tools returned)
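Given the suspected per-message size limits, auditing block sizes client-side can help isolate the culprit before sending; a sketch over a hypothetical history mirroring the block types listed above:

```python
import json

# Hypothetical content blocks mirroring the structure described above.
history = [
    {"type": "thinking", "signature": "abc...", "content": "..."},
    {"type": "tool_use", "name": "read_file", "input": {"path": "a.py"}},
    {"type": "tool_result", "content": "x" * 12_000},  # full file contents
    {"type": "tool_result",
     "content": json.dumps([{"ref": i} for i in range(71)])},  # deferred refs
]

# Rank blocks by serialized size to spot an oversized tool_result.
sizes = sorted(((len(json.dumps(b)), b["type"]) for b in history), reverse=True)
print(sizes[0][1])  # tool_result
```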

Cache Configuration

  • Using 32k token cache (cache_read_input_tokens: 32000)
  • Cache miss on final request

Client Behavior

  • Claude Code CLI v2.1.59
  • Implements 11-attempt retry logic
  • The first attempt fails with 400; no retries are attempted (correct behavior for client errors)
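The observed behavior (one attempt, then abandonment) matches standard retry policy; a minimal sketch, assuming a `send` callable that returns an HTTP status code:

```python
import time

def call_with_retries(send, max_attempts: int = 11) -> int:
    """Retry on 429/5xx; fail fast on other 4xx client errors."""
    status = 0
    for attempt in range(max_attempts):
        status = send()
        if status < 400:
            return status
        if 400 <= status < 500 and status != 429:
            return status  # non-retryable client error, e.g. the 400 above
        time.sleep(0)  # placeholder; real clients use exponential backoff
    return status

# A server that always returns 400 is abandoned after a single attempt.
attempts = []
print(call_with_retries(lambda: attempts.append(1) or 400))  # 400
print(len(attempts))  # 1
```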

Expected Behavior

API should either:

  1. Accept the request and process normally (if within limits)
  2. Return a specific error indicating:
    • Context length exceeded (with actual vs limit)
    • Message structure validation failed
    • Tool result size limit exceeded
    • Cache token limit exceeded

Actual Behavior

The generic invalid_request_error with the message "Invalid request Error" provides no diagnostic information about which validation failed.

Environment

  • Model: kimi-for-coding
  • Client: Claude Code CLI v2.1.59
  • Context management: 128k max output tokens (capped to 64k)
  • Tool use: Yes, multiple tool results in history
  • Cache: 32k ephemeral cache enabled

Questions for Kimi Team

  1. What is the actual context length limit for kimi-for-coding?
  2. Are there per-message size limits for tool results?
  3. Are there limits on deferred tool references count?
  4. Can error messages include diagnostic details about which validation failed?
  5. Is there a way to preview context usage before sending requests?
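Absent an official preview endpoint (question 5), clients can only approximate usage locally; a rough sketch using the common ~4-characters-per-token heuristic, which is an assumption and not Kimi's actual tokenizer:

```python
def estimate_tokens(messages) -> int:
    """Very rough token estimate: ~4 characters per token."""
    chars = sum(len(str(m.get("content", ""))) for m in messages)
    return chars // 4

# A single 160,000-character message estimates to ~40k tokens.
history = [{"role": "user", "content": "x" * 160_000}]
print(estimate_tokens(history))  # 40000
```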

Suggested Improvements

  1. Descriptive errors: Instead of "Invalid request Error", return:

    {
      "error": {
        "type": "context_length_exceeded",
        "message": "Context length exceeded limit",
        "details": {
          "current_tokens": 44374,
          "limit": 40000,
          "exceeded_by": 4374
        }
      }
    }
  2. Headers: Return usage headers (x-request-tokens, x-context-limit) for debugging

  3. Documentation: Publish clear limits for:

    • Total context length
    • Individual message size
    • Tool result size
    • Number of tool definitions
    • Cache token limits
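If usage headers like the proposed x-request-tokens and x-context-limit existed (they are a suggestion here, not a documented part of the API), a client could warn before hitting the limit:

```python
def check_headroom(headers: dict, threshold: float = 0.9) -> bool:
    """Return True if usage is below the given fraction of the (proposed) limit."""
    used = int(headers.get("x-request-tokens", 0))
    limit = int(headers.get("x-context-limit", 1))
    return used / limit < threshold

# Hypothetical headers for the failing turn: 44,374 tokens against a 40k limit.
print(check_headroom({"x-request-tokens": "44374",
                      "x-context-limit": "40000"}))  # False
```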
