feat: add fine-grained cache token tracking across 5 LLM providers #139

Merged
JDOxygen merged 2 commits into jd-opensource:main from Takki-cd:feat/add-cache-tokens
Jun 1, 2026

@Takki-cd
Contributor

Replace the monolithic `cached_tokens` field with separate `cached_input_tokens` (cache read hits) and `cache_creation_input_tokens` (cache writes) for precise cost allocation, since providers price cached and non-cached tokens differently.
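
As a rough illustration of why the split helps with cost allocation, here is a minimal sketch. The `Usage` and `Prices` types, the `estimate_cost` helper, and every price value are hypothetical and not taken from this PR. The sketch also assumes `input_tokens` already includes cache reads and cache writes, which is not true for every provider.

```python
from dataclasses import dataclass


@dataclass
class Usage:
    # Assumption: input_tokens is the full prompt size, including
    # tokens served from cache and tokens written to cache.
    input_tokens: int
    output_tokens: int
    cached_input_tokens: int = 0          # cache read hits
    cache_creation_input_tokens: int = 0  # cache writes


@dataclass
class Prices:
    # Hypothetical prices per one million tokens.
    input: float
    cached_input: float
    cache_write: float
    output: float


def estimate_cost(usage: Usage, prices: Prices) -> float:
    """Price each token class separately and return the total cost."""
    uncached = max(
        usage.input_tokens
        - usage.cached_input_tokens
        - usage.cache_creation_input_tokens,
        0,
    )
    total = (
        uncached * prices.input
        + usage.cached_input_tokens * prices.cached_input
        + usage.cache_creation_input_tokens * prices.cache_write
        + usage.output_tokens * prices.output
    )
    return total / 1_000_000


usage = Usage(input_tokens=1000, output_tokens=200,
              cached_input_tokens=600, cache_creation_input_tokens=100)
prices = Prices(input=1.0, cached_input=0.1, cache_write=1.25, output=2.0)
cost = estimate_cost(usage, prices)
```

With a single `cached_tokens` counter, the read and write portions could not be priced at different rates.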

Provider-specific cache field mapping:

  • OpenAI / Doubao: prompt_tokens_details.cached_tokens
  • Anthropic: cache_read_input_tokens + cache_creation_input_tokens
  • DeepSeek: prompt_cache_hit_tokens
  • Gemini (native): cachedContentTokenCount
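
The mapping above could be normalized along these lines. This is a sketch, not the code from this PR: the function name `extract_cache_tokens` and the provider keys (`"openai"`, `"doubao"`, `"anthropic"`, `"deepseek"`, `"gemini"`) are assumptions, and the usage payload is assumed to be a plain dict as returned in each provider's JSON response.

```python
from typing import Any, Mapping, Tuple


def extract_cache_tokens(provider: str, usage: Mapping[str, Any]) -> Tuple[int, int]:
    """Return (cached_input_tokens, cache_creation_input_tokens).

    Provider names are hypothetical labels; field names follow the
    mapping listed in the PR description.
    """
    if provider in ("openai", "doubao"):
        # Cache reads are nested under prompt_tokens_details.
        details = usage.get("prompt_tokens_details") or {}
        return int(details.get("cached_tokens") or 0), 0
    if provider == "anthropic":
        # Only Anthropic reports cache writes as a separate field here.
        return (
            int(usage.get("cache_read_input_tokens") or 0),
            int(usage.get("cache_creation_input_tokens") or 0),
        )
    if provider == "deepseek":
        return int(usage.get("prompt_cache_hit_tokens") or 0), 0
    if provider == "gemini":
        return int(usage.get("cachedContentTokenCount") or 0), 0
    # Unknown provider: report no cache activity rather than guessing.
    return 0, 0


openai_usage = {"prompt_tokens": 120, "prompt_tokens_details": {"cached_tokens": 64}}
anthropic_usage = {"cache_read_input_tokens": 300, "cache_creation_input_tokens": 50}
```

Missing or `null` fields are treated as zero so that responses without cache activity still produce a valid pair.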

Also fix total_tokens to use the API-provided value directly, falling back to input + output only when the API does not return a total. Providers count differently: Gemini includes thinking tokens in its total, while OpenAI counts reasoning tokens as part of completion tokens.
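
The total-token rule can be sketched as follows. The helper name `resolve_total_tokens` and the choice of keys to probe (`total_tokens`, `totalTokenCount`) are assumptions for illustration, not the implementation in this PR.

```python
from typing import Any, Mapping


def resolve_total_tokens(
    usage: Mapping[str, Any], input_tokens: int, output_tokens: int
) -> int:
    """Prefer the provider-reported total; otherwise sum input and output."""
    # Assumed key names: total_tokens (OpenAI-style payloads) and
    # totalTokenCount (Gemini native payloads).
    for key in ("total_tokens", "totalTokenCount"):
        value = usage.get(key)
        if value is not None:
            return int(value)
    # Fallback when the API does not report a total.
    return input_tokens + output_tokens


# Gemini-style payload: the reported total also covers thinking tokens,
# so it is larger than prompt + candidates.
gemini_usage = {
    "promptTokenCount": 100,
    "candidatesTokenCount": 50,
    "thoughtsTokenCount": 30,
    "totalTokenCount": 180,
}
reported = resolve_total_tokens(gemini_usage, input_tokens=100, output_tokens=50)
fallback = resolve_total_tokens({}, input_tokens=100, output_tokens=50)
```

In the Gemini-style case a naive input + output sum would undercount by the thinking tokens, which is the kind of discrepancy the fix avoids.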

chenda14 added 2 commits May 28, 2026 22:56
Replace the monolithic `cached_tokens` field with separate
`cached_input_tokens` (cache read hits) and
`cache_creation_input_tokens` (cache writes) for precise cost
allocation, since providers price cached and non-cached tokens
differently.

Provider-specific cache field mapping:
- OpenAI / Doubao: prompt_tokens_details.cached_tokens
- Anthropic: cache_read_input_tokens + cache_creation_input_tokens
- DeepSeek: prompt_cache_hit_tokens
- Gemini (native): cachedContentTokenCount

Also fix total_tokens to use the API-provided value directly (e.g.
Gemini includes thinking tokens in total, OpenAI includes reasoning
in completion), with input+output as fallback when not available.
@JDOxygen JDOxygen merged commit 5b9008f into jd-opensource:main Jun 1, 2026
1 check passed

2 participants