Issue Description
The Kimi API returns a 400 invalid_request_error when the conversation context accumulates large tool results across multiple turns. The error provides no specific details about what validation failed.
Error Details
Timestamp: 2026-02-26T04:58:43.109Z
Error Response:
```json
{
  "error": {
    "type": "invalid_request_error",
    "message": "Invalid request Error"
  },
  "type": "error"
}
```

HTTP Status: 400 (non-retryable)
Token Progression Pattern
The session accumulated tokens steadily before the crash:
| Turn | Input Tokens | Cache Read | Output Tokens | Note |
|---|---|---|---|---|
| 1 | 20,748 | 0 | ~26 | Initial request |
| 2 | 32,031 | 0 | ~26 | After file read #1 |
| 3 | 36,374 | 0 | ~322 | After file read #2 |
| 4 | 4,374 | 32,000 | ~81 | ToolSearch call - CRASH |
Total context at error: ~44,000+ tokens
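The final-turn arithmetic can be reconstructed from the table: on turn 4 the context consists of the fresh input tokens plus the tokens served from the 32k cache. A short tally (numbers copied from the table above; output counts are approximate):

```python
# Per-turn token counts copied from the table above; output counts are approximate.
turns = [
    {"turn": 1, "input": 20_748, "cache_read": 0, "output": 26},
    {"turn": 2, "input": 32_031, "cache_read": 0, "output": 26},
    {"turn": 3, "input": 36_374, "cache_read": 0, "output": 322},
    {"turn": 4, "input": 4_374, "cache_read": 32_000, "output": 81},
]

# Cached tokens still occupy context even though they are billed differently,
# so the final-turn context is fresh input + cache reads.
final = turns[-1]
final_context = final["input"] + final["cache_read"]
print(final_context)  # 36374
```

The gap between this 36,374 baseline and the reported ~44,000+ presumably comes from content (the 71 deferred tool references) appended after the cache-read snapshot.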
Reproduction Steps
- Start a conversation with kimi-for-coding
- Perform multiple file read operations that return large text content (~180-200 lines each)
- Each file content gets added to conversation history as tool results
- Make a ToolSearch call that returns deferred tool references
- API returns 400 error immediately after
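Because the failure tracks accumulated tool-result size rather than any single message, a crude client-side estimate can flag risky requests before sending. This is only a sketch using the common ~4-characters-per-token heuristic, not Kimi's actual tokenizer:

```python
def rough_token_estimate(messages: list[dict]) -> int:
    """Crude pre-flight estimate: ~4 characters per token (heuristic only)."""
    chars = sum(len(str(m.get("content", ""))) for m in messages)
    return chars // 4

# Example: three large file reads of ~190 lines x ~60 chars each,
# roughly matching the reproduction above.
history = [{"role": "tool", "content": "x" * 190 * 60} for _ in range(3)]
print(rough_token_estimate(history))  # ~8550 tokens for the tool results alone
```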
Key Technical Details
Message Structure Before Error
The conversation contained:
- `thinking` blocks with signatures (encrypted thinking content)
- `tool_use` messages with large input parameters
- `tool_result` messages containing full file contents
- `tool_result` with deferred tool references (71 total deferred tools returned)
Cache Configuration
- Using 32k token cache (`cache_read_input_tokens: 32000`)
- Cache miss on final request
Client Behavior
- Claude Code CLI v2.1.59
- Implements 11-attempt retry logic
- First attempt fails with 400, no retries attempted (correct behavior for client errors)
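The no-retry-on-400 behavior follows the usual rule that 4xx responses (other than 429) are deterministic client errors that will fail identically on retry. A minimal sketch of that policy, where `send` is a hypothetical callable returning `(status, body)`:

```python
import time


def is_retryable(status: int) -> bool:
    # 429 (rate limit) and 5xx (server) may succeed on retry; other 4xx won't.
    return status == 429 or status >= 500


def send_with_retries(send, max_attempts: int = 11, sleep=time.sleep):
    for attempt in range(max_attempts):
        status, body = send()
        if not is_retryable(status):
            return status, body  # success, or a non-retryable error like this 400
        sleep(min(2 ** attempt, 30))  # exponential backoff, capped
    return status, body
```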
Expected Behavior
API should either:
- Accept the request and process normally (if within limits)
- Return a specific error indicating:
- Context length exceeded (with actual vs limit)
- Message structure validation failed
- Tool result size limit exceeded
- Cache token limit exceeded
Actual Behavior
Generic invalid_request_error with message "Invalid request Error" provides zero diagnostic information.
Environment
- Model: kimi-for-coding
- Client: Claude Code CLI v2.1.59
- Context management: 128k max output tokens (capped to 64k)
- Tool use: Yes, multiple tool results in history
- Cache: 32k ephemeral cache enabled
Questions for Kimi Team
- What is the actual context length limit for kimi-for-coding?
- Are there per-message size limits for tool results?
- Are there limits on deferred tool references count?
- Can error messages include diagnostic details about which validation failed?
- Is there a way to preview context usage before sending requests?
Suggested Improvements
- Descriptive errors: Instead of "Invalid request Error", return:

  ```json
  {
    "error": {
      "type": "context_length_exceeded",
      "message": "Context length exceeded limit",
      "details": {
        "current_tokens": 44374,
        "limit": 40000,
        "exceeded_by": 4374
      }
    }
  }
  ```

- Headers: Return usage headers (`x-request-tokens`, `x-context-limit`) for debugging
- Documentation: Publish clear limits for:
- Total context length
- Individual message size
- Tool result size
- Number of tool definitions
- Cache token limits