Commit 421bf6a
feat(streaming): emit OTel metrics for ttft, tps, and per-call token counts
Adds six metrics to TemporalStreamingModel.get_response so applications
that configure an OTel MeterProvider can see streaming-call behavior
without per-app instrumentation:
- agentex.llm.ttft (histogram, ms): time from request start to first
content delta. Captured on the first ResponseTextDeltaEvent /
ResponseReasoningTextDeltaEvent / ResponseReasoningSummaryTextDeltaEvent.
- agentex.llm.tps (histogram, tokens/s): output_tokens / stream_duration.
Use 1/tps for time-per-output-token (tpot).
- agentex.llm.input_tokens / output_tokens / cached_input_tokens /
reasoning_tokens (counters): pulled from the captured ResponsesAPI
Usage at end-of-stream. Cache hit rate is computed at query time as
rate(cached_input_tokens) / rate(input_tokens).
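The metric definitions above can be sketched in plain Python. This is a minimal illustration, not the code from this commit: `time_stream`, `tokens_per_second`, `cache_hit_rate`, and the injectable `clock` are hypothetical names, and it assumes stream_duration runs from request start to end-of-stream (the message does not say where the window starts).

```python
import time
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass
class StreamTiming:
    ttft_ms: Optional[float]  # None when the stream had no content delta
    duration_s: float         # assumed: request start to end-of-stream


def time_stream(
    events: Iterable[str],
    is_content_delta: Callable[[str], bool],
    clock: Callable[[], float] = time.perf_counter,
) -> StreamTiming:
    """Consume a stream and note when the first content delta arrives."""
    start = clock()
    first_delta_at: Optional[float] = None
    for event in events:
        # Only the first content delta counts toward ttft.
        if first_delta_at is None and is_content_delta(event):
            first_delta_at = clock()
    end = clock()
    ttft_ms = None if first_delta_at is None else (first_delta_at - start) * 1000.0
    return StreamTiming(ttft_ms=ttft_ms, duration_s=end - start)


def tokens_per_second(output_tokens: int, duration_s: float) -> Optional[float]:
    """tps = output_tokens / stream_duration; None when it is undefined."""
    if output_tokens <= 0 or duration_s <= 0:
        return None
    return output_tokens / duration_s


def cache_hit_rate(cached_input_tokens: int, input_tokens: int) -> Optional[float]:
    """Ratio of two counter increases over the same window (query-time math)."""
    if input_tokens <= 0:
        return None
    return cached_input_tokens / input_tokens
```

Time-per-output-token is then `1 / tps` whenever `tokens_per_second` returns a value. Returning `None` rather than `0` for undefined cases is a design choice of this sketch, so a caller can skip recording instead of skewing a histogram.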
Why
- The data was already captured (line 854: `captured_usage = response.usage`)
but never emitted as metrics. Apps could only see total LLM call
duration, not the meaningful breakdowns.
- Doing this in the SDK rather than each app means every consumer of
TemporalStreamingModel gets the metrics for free.
- Cardinality is bounded — only `model` is a metric attribute. Resource
attributes (service.name, k8s.*, etc.) come from the application's
configured OTel resource, so cross-app comparisons work cleanly in
Mimir/Prometheus.
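A hedged sketch of how six instruments with a single `model` attribute might be created and recorded through the OpenTelemetry Python metrics API. The meter name, `build_instruments`, `record_call`, and the `usage` dict shape are assumptions for illustration; the full counter names are read off the shorthand in the list above, and the actual diff may differ.

```python
from typing import Any, Dict, Mapping, Optional

COUNTERS = ("input_tokens", "output_tokens", "cached_input_tokens", "reasoning_tokens")


def build_instruments() -> Dict[str, Any]:
    """Create the two histograms and four counters on one meter."""
    # Imported lazily so the recording logic below can run without
    # opentelemetry-api installed.
    from opentelemetry import metrics

    # Meter name is an assumption. With no MeterProvider configured,
    # the API hands back no-op instruments.
    meter = metrics.get_meter("agentex.streaming")
    instruments: Dict[str, Any] = {
        "ttft": meter.create_histogram("agentex.llm.ttft", unit="ms"),
        "tps": meter.create_histogram("agentex.llm.tps", unit="tokens/s"),
    }
    for name in COUNTERS:
        instruments[name] = meter.create_counter(f"agentex.llm.{name}")
    return instruments


def record_call(
    instruments: Mapping[str, Any],
    model: str,
    ttft_ms: Optional[float],
    tps: Optional[float],
    usage: Mapping[str, int],
) -> None:
    """Record one streaming call; `model` is the only metric attribute."""
    attrs = {"model": model}  # a single low-cardinality attribute
    if ttft_ms is not None:
        instruments["ttft"].record(ttft_ms, attributes=attrs)
    if tps is not None:
        instruments["tps"].record(tps, attributes=attrs)
    for name in COUNTERS:
        # Missing usage fields are counted as zero in this sketch.
        instruments[name].add(usage.get(name, 0), attributes=attrs)
```

Per-request values such as task or trace IDs are deliberately kept out of `attrs`, since each distinct attribute value creates a new time series.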
The meter is a no-op when no MeterProvider is configured, so this is
safe for apps that don't run with OTel.
1 parent cf249b9 commit 421bf6a
1 file changed
Lines changed: 65 additions & 0 deletions
Added lines (new file numbering): 4, 30, 83-120, 685-690, 770-773, 1036-1050