
fix(usage): use OpenRouter's billed cost instead of estimating locally #1127

Draft
srtab wants to merge 1 commit into main from feat/openrouter-cost-tracking

Conversation

Owner

@srtab srtab commented Apr 24, 2026

For OpenRouter (DAIV's default Anthropic route), langchain_openai's usage parser only emits cache_read, never cache_creation. The previous code passed cache_write_tokens=0 to genai_prices, which then priced cache-creation tokens at the base input rate instead of the 1.25x/2x cache-write rate. With the default 1h cache TTL, cache writes were being undercharged by about 50%, producing run costs visibly lower than LangSmith's.
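
As a back-of-the-envelope check on the ~50% figure (illustrative per-token rates, not DAIV's pricing tables): with the 1h TTL, cache writes bill at 2x the base input rate, so pricing them at the base rate halves that part of the cost.

```python
# Rough arithmetic behind the ~50% undercharge (made-up rates, for illustration only).
base_input_rate = 3.00 / 1_000_000      # hypothetical $/token for regular input
cache_write_rate = 2 * base_input_rate  # 1h-TTL cache writes bill at 2x the input rate

cache_creation_tokens = 40_000

correct = cache_creation_tokens * cache_write_rate    # what should be billed
estimated = cache_creation_tokens * base_input_rate   # what cache_write_tokens=0 produced

print(f"correct: ${correct:.4f}  estimated: ${estimated:.4f}")
# correct: $0.2400  estimated: $0.1200 -> the cache-write portion is undercharged by 50%
```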

Switch to OpenRouter's authoritative cost field:

  • ChatOpenRouter subclasses ChatOpenAI and stashes usage.cost on the message's response_metadata so it survives streaming chunk merging (non-streaming already exposes it via llm_output propagation).
  • get_model routes OpenRouter calls through ChatOpenRouter and sets extra_body.usage.include + stream_usage so OpenRouter actually returns the cost field; both pieces are sketched right after this list.
  • DaivUsageCallbackHandler aggregates per-model provider cost alongside the existing token usage; build_usage_summary prefers it over the local genai_prices estimate and tags each model entry with cost_source (sketched further below).
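
A minimal sketch of the first two items, assuming recent langchain_openai internals: the private _create_chat_result hook differs across versions, the streaming chunk-merging handling of the real ChatOpenRouter is omitted, and get_model plus the "openrouter_cost" metadata key are illustrative names rather than the PR's actual identifiers.

```python
from langchain_core.outputs import ChatResult
from langchain_openai import ChatOpenAI


class ChatOpenRouter(ChatOpenAI):
    """Illustrative subclass: copy OpenRouter's billed cost onto the message."""

    def _create_chat_result(self, response, generation_info=None) -> ChatResult:
        result = super()._create_chat_result(response, generation_info)
        # OpenRouter includes usage.cost when extra_body.usage.include is set;
        # stash it on response_metadata so downstream callbacks can read it.
        usage = (result.llm_output or {}).get("token_usage") or {}
        cost = usage.get("cost")
        if cost is not None:
            for generation in result.generations:
                generation.message.response_metadata["openrouter_cost"] = cost
        return result


def get_model(model_name: str) -> ChatOpenAI:
    """Hypothetical factory: route OpenRouter models through ChatOpenRouter."""
    return ChatOpenRouter(
        model=model_name,
        base_url="https://openrouter.ai/api/v1",
        # Ask OpenRouter to report the billed cost, and keep usage
        # information on streamed responses as well.
        extra_body={"usage": {"include": True}},
        stream_usage=True,
    )
```

OpenRouter only returns the cost field when usage accounting is requested, which is why the factory sets extra_body and stream_usage together.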

Direct Anthropic / OpenAI / Google keep using genai_prices unchanged.
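
The aggregation side, again with assumed names (ProviderCostHandler stands in for DaivUsageCallbackHandler, and the "provider" / "genai_prices" cost_source values are illustrative):

```python
from collections import defaultdict

from langchain_core.callbacks import BaseCallbackHandler
from langchain_core.outputs import LLMResult


class ProviderCostHandler(BaseCallbackHandler):
    """Hypothetical stand-in for DaivUsageCallbackHandler's cost tracking."""

    def __init__(self) -> None:
        self.provider_cost: dict[str, float] = defaultdict(float)

    def on_llm_end(self, response: LLMResult, **kwargs) -> None:
        model = (response.llm_output or {}).get("model_name", "unknown")
        for generations in response.generations:
            for generation in generations:
                message = getattr(generation, "message", None)
                if message is None:
                    continue
                cost = message.response_metadata.get("openrouter_cost")
                if cost is not None:
                    self.provider_cost[model] += float(cost)


def build_usage_summary(handler: ProviderCostHandler, local_estimates: dict[str, float]) -> dict:
    """Prefer the provider-billed cost; fall back to the genai_prices estimate."""
    summary = {}
    for model, estimate in local_estimates.items():
        billed = handler.provider_cost.get(model)
        summary[model] = {
            "cost": billed if billed is not None else estimate,
            "cost_source": "provider" if billed is not None else "genai_prices",
        }
    return summary
```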

@srtab srtab self-assigned this Apr 24, 2026
@srtab srtab marked this pull request as draft April 24, 2026 23:12