Commit 3317d14
authored
feat: recognize provider cost data from usage.cost_details block (#439)
## Summary
Some providers include detailed cost breakdowns directly in the response
`usage` object rather than via SSE `: cost` comment lines. This PR adds
support for recognizing and using that cost data when available.
### The new format
Providers may return a `usage` block like:
```json
"usage": {
"prompt_tokens": 23,
"total_tokens": 66,
"completion_tokens": 43,
"cost": 0.00017465,
"cost_details": {
"total_cost": 0.00017465,
"input_cost": 0.00002415,
"output_cost": 0.0001505,
"cached_input_cost": 0,
"cache_write_input_cost": 0,
"upstream_inference_cost": 0.00017465,
"request_cost": 0,
"web_search_cost": 0,
"data_storage_cost": 0.00000106
}
}
```
When `cost_details` is present, we use the provider's actual per-bucket
breakdown (`input_cost`, `output_cost`, `cached_input_cost`,
`cache_write_input_cost`) directly instead of proportionally
distributing a total.
## Changes
- **`utils/usage-normalizer.ts`**: Add `extractUsageCostDetails()` —
safely extracts `cost_details` from provider usage blocks; returns
`null` when absent (providers that don't use this format are
unaffected). Also updated `normalizeOpenAIChatUsage()` to extract
`cache_write_tokens` from `prompt_tokens_details`.
- **`utils/provider-cost.ts`**: Add `applyUsageCostDetails()` — applies
per-bucket breakdown when available, falls back to proportional
distribution otherwise.
- **`services/inspectors/usage-logging.ts`**: Wire `cost_details`
extraction into the streaming cost path (only applies if no SSE-reported
cost was found).
- **`services/response-handler.ts`**: Same for the non-streaming (unary)
path.
## Key design decisions
- **Fully optional/defensive**: `extractUsageCostDetails()` returns
`null` when `usage.cost_details` doesn't exist — providers that don't
use this format are completely unaffected
- **SSE `: cost` comments take precedence**: The `!providerReportedCost`
guard ensures `cost_details` only applies when no SSE-reported cost was
found
- **Per-bucket breakdown preferred**: When the provider gives explicit
`input_cost`/`output_cost`/etc., we use those directly instead of
proportional splitting
## Test plan
- [x] `extractUsageCostDetails` — extracts from the new format, falls
back to `usage.cost`/`usage.estimated_cost`, returns null for
missing/invalid data
- [x] `applyUsageCostDetails` — uses per-bucket breakdown, falls back to
proportional, handles zero/null costs
- [x] `normalizeOpenAIChatUsage` — extracts `cache_write_tokens` from
`prompt_tokens_details`
- [x] Precedence: SSE `: cost` comments > `cost_details` > calculated
costs
- [x] All 1367 existing tests pass1 parent e7eb887 commit 3317d14
6 files changed
Lines changed: 629 additions & 5 deletions
File tree
- packages/backend/src
- services
- inspectors
- utils
- __tests__
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
10 | 10 | | |
11 | 11 | | |
12 | 12 | | |
| 13 | + | |
13 | 14 | | |
14 | 15 | | |
15 | | - | |
| 16 | + | |
16 | 17 | | |
17 | 18 | | |
18 | 19 | | |
| |||
149 | 150 | | |
150 | 151 | | |
151 | 152 | | |
| 153 | + | |
| 154 | + | |
| 155 | + | |
| 156 | + | |
| 157 | + | |
| 158 | + | |
| 159 | + | |
| 160 | + | |
| 161 | + | |
152 | 162 | | |
153 | 163 | | |
154 | 164 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
10 | 10 | | |
11 | 11 | | |
12 | 12 | | |
13 | | - | |
| 13 | + | |
| 14 | + | |
14 | 15 | | |
15 | 16 | | |
16 | 17 | | |
| |||
502 | 503 | | |
503 | 504 | | |
504 | 505 | | |
| 506 | + | |
| 507 | + | |
| 508 | + | |
| 509 | + | |
| 510 | + | |
| 511 | + | |
| 512 | + | |
| 513 | + | |
505 | 514 | | |
506 | 515 | | |
507 | 516 | | |
| |||
0 commit comments