Commit eebd8ac
adding session metrics observability to observability.py (#827)
## Summary
- Add session-level metrics collection to the Ambient runner's Langfuse
integration, tracking tool usage, token consumption, cost estimates,
failures, and interrupts across the entire session lifecycle
- Introduce `observability_models.py` with Pydantic models for tool
classification (`ToolCallType` enum with O(1) lookup), cost estimation
by model, clarification detection heuristics, and a `SessionMetric`
model that flattens all metrics for Langfuse
- Emit a "Claude Code - Session Metrics" span at session finalize with
all metrics embedded as span metadata (namespace, user, tool counts,
token totals, estimated cost) and the full `SessionMetric` model dump
as output
- DRY up duplicated usage-details construction across `end_turn()` and
`_close_turn_with_text()` into a shared `_build_usage_details()` static
helper, and extract a module-level `_TOKEN_KEYS` constant used in 3
locations
- Optimize the hot-path `_track_metrics_from_event()`: hoist
`classify_tool` import, accept pre-resolved `etype` from caller, and
cache tool classification in `_evt_tool_types` to avoid redundant
classify calls on `TOOL_CALL_END`
- Record human interrupts from `ClaudeBridge.interrupt()` into
observability metrics
- Fix test infrastructure: add `tests/__init__.py`, deduplicate event
factories from `test_observability_metrics.py` into shared
`conftest.py`, and extend `conftest.make_tool_end()` to accept
`**kwargs`
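
To illustrate the classification and caching points above, here is a minimal sketch rather than the code in this commit. `ToolCallType`, `classify_tool`, and `_evt_tool_types` are named in the summary; the enum members, the tool names in the lookup table, and the `MetricsTracker` class with its `on_tool_start()` / `on_tool_end()` methods are hypothetical stand-ins chosen for the example.

```python
from enum import Enum
from typing import Dict, Optional


class ToolCallType(str, Enum):
    """Hypothetical categories; the real enum members may differ."""

    FILE_READ = "file_read"
    FILE_WRITE = "file_write"
    SHELL = "shell"
    OTHER = "other"


# Assumed mapping: a dict keyed by tool name gives average O(1) lookup.
_TOOL_TYPES: Dict[str, ToolCallType] = {
    "Read": ToolCallType.FILE_READ,
    "Write": ToolCallType.FILE_WRITE,
    "Edit": ToolCallType.FILE_WRITE,
    "Bash": ToolCallType.SHELL,
}


def classify_tool(tool_name: str) -> ToolCallType:
    """Classify a tool by name, falling back to OTHER for unknown tools."""
    return _TOOL_TYPES.get(tool_name, ToolCallType.OTHER)


class MetricsTracker:
    """Hypothetical tracker that classifies once per tool call."""

    def __init__(self) -> None:
        self.tool_counts: Dict[ToolCallType, int] = {}
        # Cache keyed by tool call id, filled when the call starts.
        self._evt_tool_types: Dict[str, ToolCallType] = {}

    def on_tool_start(self, tool_call_id: str, tool_name: str) -> None:
        tool_type = classify_tool(tool_name)
        self._evt_tool_types[tool_call_id] = tool_type
        self.tool_counts[tool_type] = self.tool_counts.get(tool_type, 0) + 1

    def on_tool_end(self, tool_call_id: str) -> Optional[ToolCallType]:
        # Reuse the cached classification instead of calling classify_tool again.
        return self._evt_tool_types.pop(tool_call_id, None)
```

Popping the cached entry when the call ends keeps the cache bounded by the number of in-flight tool calls, which is one plausible reason to cache per event rather than per tool name.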
## Test plan
- [x] `pytest tests/test_observability_metrics.py` — 40 tests covering
tool classification, cost estimation, clarification detection, metrics
tracking via events, session summary emission, and `SessionMetric`
model serialization
- [x] `pytest tests/test_observability.py` — 45 existing tests pass (no
regressions)
- [x] `pytest tests/test_tracing_middleware.py` — 11 existing tests pass
with shared conftest imports
- [x] `ruff check` passes on all modified files
- [x] Manual: deploy to kind cluster with local Langfuse, create a
session, verify "Claude Code - Session Metrics" span appears with tool
counts, interrupts, token totals, and cost in metadata
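
As a rough sketch of how a shared event factory that accepts `**kwargs` might be used in these tests: only the name `make_tool_end()` comes from this commit. The dict-based event shape, the `tool_call_id` parameter, the `error` field, and the test function are assumptions made for illustration; the real factory in `conftest.py` may build a different event type.

```python
from typing import Any, Dict


def make_tool_end(tool_call_id: str = "tool-1", **kwargs: Any) -> Dict[str, Any]:
    """Build a TOOL_CALL_END event; extra keyword arguments become fields."""
    # Assumed event shape: a plain dict with a type and a tool call id.
    event: Dict[str, Any] = {"type": "TOOL_CALL_END", "tool_call_id": tool_call_id}
    event.update(kwargs)
    return event


def test_make_tool_end_accepts_extra_fields() -> None:
    # The hypothetical `error` field is passed through via **kwargs.
    event = make_tool_end("tool-42", error="timeout")
    assert event["type"] == "TOOL_CALL_END"
    assert event["tool_call_id"] == "tool-42"
    assert event["error"] == "timeout"
```

Accepting `**kwargs` lets individual tests attach the fields they care about without each test module defining its own copy of the factory.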
Signed-off-by: Nelesh Singla <117123879+nsingla@users.noreply.github.com>

1 parent 647947d commit eebd8ac
6 files changed
Lines changed: 1096 additions & 42 deletions
File tree
- components/runners/ambient-runner
- ambient_runner
- bridges/claude
- tests
Lines changed: 4 additions & 0 deletions
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| 139-141 | 139-141 | |
| | 142-145 | + (4 lines added) |
| 142-144 | 146-148 | |