
API Error 400: invalid_request_error when context accumulates large tool results #20

@HiepPP


Issue Description

The Kimi API returns a 400 invalid_request_error when the conversation context accumulates large tool results across multiple turns. The error provides no specific details about what validation failed.

Error Details

Timestamp: 2026-02-26T04:58:43.109Z
Error Response:

{
  "error": {
    "type": "invalid_request_error",
    "message": "Invalid request Error"
  },
  "type": "error"
}

HTTP Status: 400 (non-retryable)
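For reference, the non-retryable classification above can be reproduced with a small client-side check; a minimal sketch, where `raw` mirrors the error body shown above and `classify` is a hypothetical helper, not part of any real SDK:

```python
import json

# Raw body as returned by the API (copied from the error above).
raw = '{"error": {"type": "invalid_request_error", "message": "Invalid request Error"}, "type": "error"}'

def classify(status: int, body: str) -> str:
    """Return 'retryable' for 429/5xx; otherwise surface the error type as fatal."""
    if status == 429 or status >= 500:
        return "retryable"
    payload = json.loads(body)
    err_type = payload.get("error", {}).get("type", "unknown")
    return f"fatal:{err_type}"

print(classify(400, raw))  # fatal:invalid_request_error
```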

Token Progression Pattern

The session accumulated tokens steadily before the crash:

Turn   Input Tokens   Cache Read   Output Tokens   Note
1      20,748         0            ~26             Initial request
2      32,031         0            ~26             After file read #1
3      36,374         0            ~322            After file read #2
4      4,374          32,000       ~81             ToolSearch call - CRASH

Total context at error: ~44,000+ tokens
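The final turn's effective context can be recomputed from the table (numbers copied from above); the remaining gap to the reported ~44,000 total presumably comes from content not itemized in the table, such as tool definitions and prior outputs:

```python
# (input_tokens, cache_read, output_tokens) per turn, from the table above.
turns = [
    (20_748, 0, 26),
    (32_031, 0, 26),
    (36_374, 0, 322),
    (4_374, 32_000, 81),
]

# Effective context on the failing turn: fresh input plus cache-read tokens.
final_context = turns[-1][0] + turns[-1][1]
print(final_context)  # 36374
```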

Reproduction Steps

  1. Start a conversation with kimi-for-coding
  2. Perform multiple file read operations that return large text content (~180-200 lines each)
  3. Each file's content is added to the conversation history as a tool result
  4. Make a ToolSearch call that returns deferred tool references
  5. API returns 400 error immediately after
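The accumulation in steps 2-4 can be sketched as follows; the message shape is illustrative only (not the exact wire format), and `read_file_stub` stands in for a real ~190-line file read:

```python
def read_file_stub(n: int) -> str:
    # Stand-in for a real file read of ~180-200 lines (step 2).
    return "\n".join(f"line {i} of file {n}" for i in range(190))

history = [{"role": "user", "content": "Please inspect these files."}]

# Steps 2-3: each file read appends a tool_use/tool_result pair to history.
for n in (1, 2):
    history.append({"role": "assistant", "content": [
        {"type": "tool_use", "id": f"call_{n}", "name": "read_file"}]})
    history.append({"role": "user", "content": [
        {"type": "tool_result", "tool_use_id": f"call_{n}",
         "content": read_file_stub(n)}]})

# Step 4: the ToolSearch call that triggers the 400 is sent with this
# ever-growing history as context.
print(len(history))  # 5 messages before the failing request
```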

Key Technical Details

Message Structure Before Error

The conversation contained:

  • thinking blocks with signatures (encrypted thinking content)
  • tool_use messages with large input parameters
  • tool_result messages containing full file contents
  • tool_result with deferred tool references (71 total deferred tools returned)
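Given the suspected per-message size limits, auditing block sizes client-side can help isolate the culprit before sending; a sketch over a hypothetical history mirroring the block types listed above:

```python
import json

# Hypothetical content blocks mirroring the structure described above.
history = [
    {"type": "thinking", "signature": "abc...", "content": "..."},
    {"type": "tool_use", "name": "read_file", "input": {"path": "a.py"}},
    {"type": "tool_result", "content": "x" * 12_000},  # full file contents
    {"type": "tool_result",
     "content": json.dumps([{"ref": i} for i in range(71)])},  # deferred refs
]

# Rank blocks by serialized size to spot an oversized tool_result.
sizes = sorted(((len(json.dumps(b)), b["type"]) for b in history), reverse=True)
print(sizes[0][1])  # tool_result
```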

Cache Configuration

  • Using 32k token cache (cache_read_input_tokens: 32000)
  • Cache miss on final request

Client Behavior

  • Claude Code CLI v2.1.59
  • Implements 11-attempt retry logic
  • The first attempt fails with 400; no retries are attempted (correct behavior for client errors)
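The observed behavior (one attempt, then abandonment) matches standard retry policy; a minimal sketch, assuming a `send` callable that returns an HTTP status code:

```python
import time

def call_with_retries(send, max_attempts: int = 11) -> int:
    """Retry on 429/5xx; fail fast on other 4xx client errors."""
    status = 0
    for attempt in range(max_attempts):
        status = send()
        if status < 400:
            return status
        if 400 <= status < 500 and status != 429:
            return status  # non-retryable client error, e.g. the 400 above
        time.sleep(0)  # placeholder; real clients use exponential backoff
    return status

# A server that always returns 400 is abandoned after a single attempt.
attempts = []
print(call_with_retries(lambda: attempts.append(1) or 400))  # 400
print(len(attempts))  # 1
```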

Expected Behavior

API should either:

  1. Accept the request and process normally (if within limits)
  2. Return a specific error indicating:
    • Context length exceeded (with actual vs limit)
    • Message structure validation failed
    • Tool result size limit exceeded
    • Cache token limit exceeded

Actual Behavior

The generic invalid_request_error with the message "Invalid request Error" provides no diagnostic information about which validation failed.

Environment

  • Model: kimi-for-coding
  • Client: Claude Code CLI v2.1.59
  • Context management: 128k max output tokens (capped to 64k)
  • Tool use: Yes, multiple tool results in history
  • Cache: 32k ephemeral cache enabled

Questions for Kimi Team

  1. What is the actual context length limit for kimi-for-coding?
  2. Are there per-message size limits for tool results?
  3. Are there limits on deferred tool references count?
  4. Can error messages include diagnostic details about which validation failed?
  5. Is there a way to preview context usage before sending requests?
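Absent an official preview endpoint (question 5), clients can only approximate usage locally; a rough sketch using the common ~4-characters-per-token heuristic, which is an assumption and not Kimi's actual tokenizer:

```python
def estimate_tokens(messages) -> int:
    """Very rough token estimate: ~4 characters per token."""
    chars = sum(len(str(m.get("content", ""))) for m in messages)
    return chars // 4

# A single 160,000-character message estimates to ~40k tokens.
history = [{"role": "user", "content": "x" * 160_000}]
print(estimate_tokens(history))  # 40000
```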

Suggested Improvements

  1. Descriptive errors: Instead of "Invalid request Error", return:

    {
      "error": {
        "type": "context_length_exceeded",
        "message": "Context length exceeded limit",
        "details": {
          "current_tokens": 44374,
          "limit": 40000,
          "exceeded_by": 4374
        }
      }
    }
  2. Headers: Return usage headers (x-request-tokens, x-context-limit) for debugging

  3. Documentation: Publish clear limits for:

    • Total context length
    • Individual message size
    • Tool result size
    • Number of tool definitions
    • Cache token limits
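If usage headers like the proposed x-request-tokens and x-context-limit existed (they are a suggestion here, not a documented part of the API), a client could warn before hitting the limit:

```python
def check_headroom(headers: dict, threshold: float = 0.9) -> bool:
    """Return True if usage is below the given fraction of the (proposed) limit."""
    used = int(headers.get("x-request-tokens", 0))
    limit = int(headers.get("x-context-limit", 1))
    return used / limit < threshold

# Hypothetical headers for the failing turn: 44,374 tokens against a 40k limit.
print(check_headroom({"x-request-tokens": "44374",
                      "x-context-limit": "40000"}))  # False
```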
