
feat: add token usage and cost display to model responses #7

Merged
sena-labs merged 1 commit into main from claude/sleepy-joliot-6553d8
May 7, 2026

Conversation

@sena-labs
Owner

Summary

  • Adds SHOW_COST_INFO valve (default false) — when enabled, a footnote is appended to every model response showing token usage and cost
  • Adds COST_CURRENCY valve (select: USD/EUR/GBP/JPY/CAD/AUD, default USD) — controls the currency symbol shown next to the cost
  • Adds _format_cost_info() helper and _CURRENCY_SYMBOLS dict at module level
  • Both streaming and non-streaming paths are covered

What changed and why

OpenRouter already includes a usage object in every response (prompt_tokens, completion_tokens, total_tokens, cost). Until now the pipe discarded this data entirely. This PR surfaces it as an optional footnote after each reply, letting users see exactly what each conversation turn costs without leaving Open WebUI.

Non-streaming: usage is read from the final JSON response and appended to final_parts when SHOW_COST_INFO is true.
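
The non-streaming path described above can be sketched roughly as follows. This is a minimal illustration rather than the PR's actual code: `append_usage_footnote` and its `formatter` parameter are hypothetical names, with `formatter` standing in for `_format_cost_info()`.

```python
from typing import Callable, List


def append_usage_footnote(
    final_parts: List[str],
    response: dict,
    show_cost_info: bool,
    formatter: Callable[[dict], str],
) -> List[str]:
    """Append a usage footnote to final_parts when enabled and usage exists."""
    usage = response.get("usage")
    # A missing or empty usage object simply means no footnote is added.
    if show_cost_info and isinstance(usage, dict) and usage:
        final_parts.append(formatter(usage))
    return final_parts
```

Guarding on both the valve and the presence of `usage` keeps the "no usage field in response" case from the test plan a no-op rather than an error.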

Streaming: a latest_usage variable tracks the usage chunk as SSE lines are consumed (OpenRouter sends it in the last chunk before [DONE]); after the stream ends, the footnote is yielded.
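
A rough sketch of the streaming flow, assuming each SSE line has the form `data: <json>` and that text arrives under `choices[].delta.content`. The function name `stream_with_footnote` and the `formatter` parameter are hypothetical; only the `latest_usage` variable name comes from the description above.

```python
import json
from typing import Callable, Iterable, Iterator, Optional


def stream_with_footnote(
    lines: Iterable[str],
    show_cost_info: bool,
    formatter: Callable[[dict], str],
) -> Iterator[str]:
    """Yield streamed text, then a usage footnote once the stream ends."""
    latest_usage: Optional[dict] = None
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank lines and SSE comment lines
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        try:
            chunk = json.loads(payload)
        except json.JSONDecodeError:
            continue  # ignore malformed chunks in this sketch
        if not isinstance(chunk, dict):
            continue
        if chunk.get("usage"):
            latest_usage = chunk["usage"]  # remember the most recent usage chunk
        for choice in chunk.get("choices") or []:
            text = (choice.get("delta") or {}).get("content")
            if text:
                yield text
    # Emit the footnote only after all model text has been yielded.
    if show_cost_info and latest_usage:
        yield formatter(latest_usage)
```

Keeping only the latest usage chunk means a stream with no usage data at all finishes without a footnote and without raising.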

Cost formatting uses adaptive precision:

  • < 0.0001 → 6 decimal places ($0.000005)
  • 0.0001–0.01 → 5 decimal places ($0.00123)
  • >= 0.01 → 4 decimal places ($0.0567)
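
The thresholds above could be implemented along these lines. This is a sketch under the stated ranges, not the body of `_format_cost_info()`; the function name `format_cost` and the default `symbol` argument are assumptions.

```python
def format_cost(cost: float, symbol: str = "$") -> str:
    """Format a cost, using more decimal places for smaller amounts."""
    if cost < 0.0001:
        decimals = 6  # micro-costs, e.g. 0.000005
    elif cost < 0.01:
        decimals = 5  # small costs, e.g. 0.00123
    else:
        decimals = 4  # normal amounts, e.g. 0.0567
    return f"{symbol}{cost:.{decimals}f}"


print(format_cost(0.000005))  # $0.000005
print(format_cost(0.00123))   # $0.00123
print(format_cost(0.0567))    # $0.0567
```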

COST_CURRENCY is a display-only label: OpenRouter always bills in USD, so the amount is shown unconverted next to the selected symbol and no exchange-rate API call is needed.

Example output with SHOW_COST_INFO=true, COST_CURRENCY=USD:

[model response...]

---
*Tokens: 1,250 prompt + 342 completion = 1,592 total · Cost: $0.00234*
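
For illustration, a footnote in the format shown above could be assembled like this. The helper name `build_footnote` is hypothetical, the cost is passed in already formatted, and falling back to prompt + completion when `total_tokens` is missing is an assumption rather than confirmed behavior.

```python
from typing import Optional


def build_footnote(usage: dict, cost_text: Optional[str] = None) -> str:
    """Build the italic usage footnote that follows the horizontal rule."""
    prompt = int(usage.get("prompt_tokens", 0))
    completion = int(usage.get("completion_tokens", 0))
    # Assumption: derive the total when the API response omits it.
    total = int(usage.get("total_tokens", prompt + completion))
    # The "," format specifier adds thousands separators (1250 -> 1,250).
    line = f"Tokens: {prompt:,} prompt + {completion:,} completion = {total:,} total"
    if cost_text:
        line += f" · Cost: {cost_text}"
    return f"\n\n---\n*{line}*"


usage = {"prompt_tokens": 1250, "completion_tokens": 342, "total_tokens": 1592}
print(build_footnote(usage, "$0.00234"))
```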

Test plan

  • python test_pipe.py — 374 assertions, 0 failures (52 new assertions in section 33)
  • _format_cost_info() tested for: empty dict, tokens-only, zero cost, micro/small/normal costs, EUR/GBP/unknown currency, invalid cost value, missing total_tokens, output format/separators, large numbers with commas
  • Non-stream: SHOW_COST_INFO=false (no footnote), SHOW_COST_INFO=true USD/EUR, no usage field in response (no crash)
  • Stream: usage in final chunk, SHOW_COST_INFO=false, no usage chunk at all (no crash)
  • Valve defaults (SHOW_COST_INFO=false, COST_CURRENCY=USD) and select schema (6 options)
  • All prior 322 assertions continue to pass unchanged

🤖 Generated with Claude Code

Adds SHOW_COST_INFO and COST_CURRENCY valves that, when enabled, append
a formatted footnote to each response showing token counts (prompt,
completion, total) and cost in the selected currency symbol.

OpenRouter includes usage data in every API response — in the JSON body
for non-streaming and in the final SSE chunk before [DONE] for streaming.
Both paths now capture and surface this data when SHOW_COST_INFO is true.

Cost formatting uses adaptive precision: 6 decimals for micro-costs
(<0.0001), 5 for small (<0.01), 4 for normal amounts. Token counts use
comma separators for readability. COST_CURRENCY is a display-only label
(USD/EUR/GBP/JPY/CAD/AUD); OpenRouter always bills in USD.

Adds 52 new assertions (374 total, all passing).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings May 7, 2026 12:52
sena-labs merged commit 9c068c7 into main May 7, 2026
6 checks passed

Copilot AI left a comment

Pull request overview

Adds an optional “token usage + cost” footnote to OpenRouter model responses (both streaming and non-streaming) so Open WebUI users can see per-turn usage/cost directly in-chat.

Changes:

  • Introduces SHOW_COST_INFO and COST_CURRENCY valves and a _format_cost_info() formatter with a _CURRENCY_SYMBOLS map.
  • Appends formatted usage/cost info to non-stream responses when enabled.
  • Tracks the latest usage SSE chunk and appends the formatted footnote after streaming completes.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| openrouter_pipe.py | Adds valves + formatting helper; appends token/cost footnote in both non-stream and stream flows. |
| test_pipe.py | Adds unit/integration assertions covering formatting and valve behavior for both response modes. |

Comment thread openrouter_pipe.py
    if cost is not None:
        try:
            cost_f = float(cost)
            symbol = _CURRENCY_SYMBOLS.get(currency, f"{currency} ")
Comment thread openrouter_pipe.py
        description="Append token usage and cost to each response",
    )
    COST_CURRENCY: str = Field(
        default=os.getenv("OPENROUTER_COST_CURRENCY", "USD"),
Comment thread test_pipe.py
_assert(v.REQUEST_TIMEOUT == 90, "REQUEST_TIMEOUT 90")
_assert(v.MAX_RETRIES == 2, "MAX_RETRIES 2")
_assert(v.SHOW_COST_INFO is False, "SHOW_COST_INFO false by default")
_assert(v.COST_CURRENCY == "USD", "COST_CURRENCY USD by default")
sena-labs deleted the claude/sleepy-joliot-6553d8 branch May 7, 2026 20:10

2 participants