fix: use per-invocation usage on agent OTEL span instead of accumulated by Zelys-DFKH · Pull Request #2074 · strands-agents/sdk-python

Zelys-DFKH · 2026-04-05T20:42:53Z

Description

end_agent_span in tracer.py reported response.metrics.accumulated_usage on
each span, which grows with every request in a session. In a session with 10
requests each using 100k tokens, request 1 would correctly show 100k, request 2
would show 200k, request 3 would show 300k, and so on. Observability backends like
Langfuse then sum these values, producing wildly inflated token counts and cost
estimates.

The fix replaces accumulated_usage with response.metrics.latest_agent_invocation.usage,
which contains only the tokens consumed during the current agent invocation. The
accumulated_usage field is retained as a fallback for the edge case where no
invocation has been recorded.
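
The selection logic described above can be sketched as follows. This is a minimal illustration, not the actual tracer.py code: the `Metrics` and `Invocation` classes, the `usage_for_span` helper, and the `totalTokens` key are hypothetical stand-ins. Only the attribute names `accumulated_usage` and `latest_agent_invocation.usage` come from this PR.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Invocation:
    """Hypothetical stand-in for one agent invocation's metrics."""
    usage: dict


@dataclass
class Metrics:
    """Hypothetical stand-in for response.metrics."""
    accumulated_usage: dict
    latest_agent_invocation: Optional[Invocation] = None


def usage_for_span(metrics: Metrics) -> dict:
    """Prefer per-invocation usage; fall back to the session total."""
    invocation = metrics.latest_agent_invocation
    if invocation is not None:
        return invocation.usage
    return metrics.accumulated_usage


# Simulate three requests of 100 tokens each in one session.
reported_old, reported_new = [], []
running_total = 0
for _ in range(3):
    running_total += 100
    metrics = Metrics(
        accumulated_usage={"totalTokens": running_total},
        latest_agent_invocation=Invocation(usage={"totalTokens": 100}),
    )
    reported_old.append(metrics.accumulated_usage["totalTokens"])
    reported_new.append(usage_for_span(metrics)["totalTokens"])

print(sum(reported_old))  # a backend summing accumulated values sees 600
print(sum(reported_new))  # a backend summing per-invocation values sees 300
```

With only three requests the summed total is already double the real consumption, and the gap widens as the session gets longer.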

Related Issues

Resolves #2010

Documentation PR

N/A

Type of Change

Bug fix

Testing

Updated all four existing test_end_agent_span* tests to wire
latest_agent_invocation.usage on the mock (matching the value they were
already asserting, so expectations are unchanged).
Added test_end_agent_span_uses_invocation_not_accumulated_usage: sets
invocation usage to 100/200/300 tokens while accumulated usage is 300/600/900,
and asserts the span receives only the invocation values.
I ran hatch run prepare
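
A rough sketch of what the new test checks, using `unittest.mock`. The `record_usage` helper, the `inputTokens`/`outputTokens`/`totalTokens` keys, and the `gen_ai.usage.*` attribute names are assumptions made for illustration. The real test exercises `end_agent_span` in the SDK and may be structured differently.

```python
from unittest.mock import Mock


def record_usage(span, response):
    """Hypothetical helper: write token usage onto a span, preferring
    the latest invocation over the session-wide accumulated total."""
    invocation = response.metrics.latest_agent_invocation
    usage = invocation.usage if invocation is not None else response.metrics.accumulated_usage
    span.set_attribute("gen_ai.usage.input_tokens", usage["inputTokens"])
    span.set_attribute("gen_ai.usage.output_tokens", usage["outputTokens"])
    span.set_attribute("gen_ai.usage.total_tokens", usage["totalTokens"])


def test_uses_invocation_not_accumulated_usage():
    span = Mock()
    response = Mock()
    # Values from the PR description: invocation 100/200/300, accumulated 300/600/900.
    response.metrics.latest_agent_invocation.usage = {
        "inputTokens": 100, "outputTokens": 200, "totalTokens": 300,
    }
    response.metrics.accumulated_usage = {
        "inputTokens": 300, "outputTokens": 600, "totalTokens": 900,
    }

    record_usage(span, response)

    # Only the per-invocation values should reach the span.
    span.set_attribute.assert_any_call("gen_ai.usage.input_tokens", 100)
    span.set_attribute.assert_any_call("gen_ai.usage.output_tokens", 200)
    span.set_attribute.assert_any_call("gen_ai.usage.total_tokens", 300)


test_uses_invocation_not_accumulated_usage()
```

Giving the accumulated values different numbers from the invocation values is what makes the test meaningful: if both were identical, it would pass regardless of which field the span reads.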

Checklist

I have read the CONTRIBUTING document
I have added any necessary tests that prove my fix is effective or my feature works
I have updated the documentation accordingly
I have added an appropriate example to the documentation to outline the feature, or no new docs are needed
My changes generate no new warnings
Any dependent changes have been merged and published

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

end_agent_span was reporting response.metrics.accumulated_usage, which grows with every request in a session. In a 10-request session each using 100k tokens, request 2 would report 200k, request 3 would report 300k, etc., causing wildly inflated token counts in Langfuse and other OTEL backends.

Use response.metrics.latest_agent_invocation.usage instead, which contains only the tokens for the current agent invocation. Falls back to accumulated_usage when no invocation is recorded (shouldn't happen in practice but guards against edge cases).

Adds test_end_agent_span_uses_invocation_not_accumulated_usage to confirm that per-invocation values appear on the span when accumulated usage differs.

github-actions bot added the size/s label Apr 5, 2026

Zelys-DFKH requested a deployment to manual-approval with GitHub Actions on April 5, 2026 20:43 (waiting)

Zelys-DFKH wants to merge 1 commit into strands-agents:main from Zelys-DFKH:fix/agent-span-per-invocation-usage
