Description
I've encountered a significant discrepancy between token counts calculated by tiktoken locally and the actual token usage returned by the OpenAI API, specifically when messages contain tool calls.
Environment
- tiktoken version: 0.7.0
- Python version: 3.11
- Model: gpt-4.1-mini
Steps to Reproduce
- Prepare a message payload that includes tool calls (function calling)
- Calculate token count using tiktoken locally
- Send the same payload to OpenAI API
- Compare tiktoken's result with the `usage` field in the API response
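The steps above can be sketched as follows. This is a minimal, illustrative harness (the helper name and payload shapes are my own, not from the report); it flattens the request into one string so a local tokenizer has the same material the API bills for, including tool definitions:

```python
import json

def serialize_for_counting(messages, tools=None):
    """Flatten a Chat Completions payload into a single string for local
    token counting. This is only an approximation: the API's exact prompt
    rendering is not public, so counts derived from this can land in the
    right ballpark but will not match usage.prompt_tokens exactly."""
    parts = []
    for m in messages:
        # Serialize the whole message dict, so tool_calls, names, and
        # arguments are included, not just the `content` string.
        parts.append(json.dumps(m, ensure_ascii=False))
    for t in tools or []:
        # Tool/function definitions are part of the prompt the API
        # charges for, so a local count must include them too.
        parts.append(json.dumps(t, ensure_ascii=False))
    return "\n".join(parts)
```

With tiktoken, `len(encoding.encode(serialize_for_counting(messages, tools)))` can then be compared against `response.usage.prompt_tokens` from the API response.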
Expected Behavior
The token count calculated by tiktoken should be close to (or exactly match) the token usage reported by the API.
Actual Behavior
There is a large discrepancy between tiktoken's calculation and the API's reported token usage when tool calls are present in the message body.
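One likely contributor, shown here as a hypothetical sketch (the message values are made up, though the shape follows the Chat Completions schema): a naive count that only tokenizes each message's `content` string sees nothing at all in an assistant message whose payload lives entirely in `tool_calls`, so the local count comes out far below what the API bills:

```python
import json

# Hypothetical assistant message carrying a tool call.
message = {
    "role": "assistant",
    "content": None,
    "tool_calls": [
        {
            "id": "call_1",
            "type": "function",
            "function": {
                "name": "get_weather",
                "arguments": "{\"city\": \"Berlin\", \"unit\": \"celsius\"}",
            },
        }
    ],
}

# A content-only count sees an empty string here...
content_text = message.get("content") or ""

# ...while the text the API actually charges for includes the serialized
# tool call: the function name and its JSON arguments.
full_text = json.dumps(message)

print(len(content_text), len(full_text))
```

The direction of the gap matches the report: material invisible to a content-only count makes the local number an undercount.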
tiktoken calculation
```python
import tiktoken

# Note: the "gpt-4" encoding is used here even though the model under test
# is gpt-4.1-mini, possibly because encoding_for_model in this tiktoken
# version does not recognize the newer model name.
encoding = tiktoken.encoding_for_model("gpt-4")
```
Result
- tiktoken_count = 47,194
- api_count (from the API's `usage` field) = 140,384
- Difference: 93,190 tokens, i.e. tiktoken undercounts by roughly 66%