Commit eebd8ac

adding session metrics observability to observability.py (#827)
## Summary

- Add session-level metrics collection to the Ambient runner's Langfuse integration, tracking tool usage, token consumption, cost estimates, failures, and interrupts across the entire session lifecycle
- Introduce `observability_models.py` with Pydantic models for tool classification (`ToolCallType` enum with O(1) lookup), cost estimation by model, clarification detection heuristics, and a `SessionMetric` model that flattens all metrics for Langfuse
- Emit a "Claude Code - Session Metrics" span at session finalize with all metrics embedded as span metadata (namespace, user, tool counts, token totals, estimated cost) and the full `SessionMetric` model dump as output
- DRY up duplicated usage-details construction across `end_turn()` and `_close_turn_with_text()` into a shared `_build_usage_details()` static helper, and extract a module-level `_TOKEN_KEYS` constant used in 3 locations
- Optimize the hot-path `_track_metrics_from_event()`: hoist `classify_tool` import, accept pre-resolved `etype` from caller, and cache tool classification in `_evt_tool_types` to avoid redundant classify calls on `TOOL_CALL_END`
- Record human interrupts from `ClaudeBridge.interrupt()` into observability metrics
- Fix test infrastructure: add `tests/__init__.py`, deduplicate event factories from `test_observability_metrics.py` into shared `conftest.py`, extend `conftest.make_tool_end()` to accept `**kwargs`

## Test plan

- [x] `pytest tests/test_observability_metrics.py`: 40 tests covering tool classification, cost estimation, clarification detection, metrics tracking via events, session summary emission, and `SessionMetric` model serialization
- [x] `pytest tests/test_observability.py`: 45 existing tests pass (no regressions)
- [x] `pytest tests/test_tracing_middleware.py`: 11 existing tests pass with shared conftest imports
- [x] `ruff check` passes on all modified files
- [x] Manual: deploy to kind cluster with local Langfuse, create a session, verify "Claude Code - Session Metrics" span appears with tool counts, interrupts, token totals, and cost in metadata

Signed-off-by: Nelesh Singla <117123879+nsingla@users.noreply.github.com>
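The summary mentions an enum-based tool classifier with O(1) lookup and a per-event cache (`_evt_tool_types`) that avoids re-classifying on `TOOL_CALL_END`. The contents of `observability_models.py` are not shown on this page, so the following is only a minimal sketch of how such a scheme might look. The enum members, the `_TOOL_TYPE_BY_NAME` table, the `MetricsTracker` class, and its method names are all hypothetical; only `ToolCallType`, `classify_tool`, and `_evt_tool_types` are names taken from the commit message.

```python
from enum import Enum


class ToolCallType(str, Enum):
    # Hypothetical categories; the real enum members are not shown here.
    FILE_READ = "file_read"
    FILE_WRITE = "file_write"
    SHELL = "shell"
    OTHER = "other"


# Hypothetical name-to-category table. A dict lookup is O(1) on average,
# which is one plausible reading of "O(1) lookup" in the summary.
_TOOL_TYPE_BY_NAME = {
    "Read": ToolCallType.FILE_READ,
    "Write": ToolCallType.FILE_WRITE,
    "Edit": ToolCallType.FILE_WRITE,
    "Bash": ToolCallType.SHELL,
}


def classify_tool(tool_name: str) -> ToolCallType:
    """Map a tool name to a category, falling back to OTHER."""
    return _TOOL_TYPE_BY_NAME.get(tool_name, ToolCallType.OTHER)


class MetricsTracker:
    """Hypothetical stand-in for the metrics-tracking part of the runner."""

    def __init__(self) -> None:
        self.tool_counts: dict[ToolCallType, int] = {}
        self.tool_failures: dict[ToolCallType, int] = {}
        # tool_call_id -> category, filled on start and consumed on end.
        self._evt_tool_types: dict[str, ToolCallType] = {}

    def on_tool_start(self, tool_call_id: str, tool_name: str) -> None:
        # Classify once, when the tool name is available, and remember it.
        tool_type = classify_tool(tool_name)
        self._evt_tool_types[tool_call_id] = tool_type
        self.tool_counts[tool_type] = self.tool_counts.get(tool_type, 0) + 1

    def on_tool_end(self, tool_call_id: str, failed: bool = False) -> ToolCallType:
        # Reuse the cached category instead of calling classify_tool again.
        tool_type = self._evt_tool_types.pop(tool_call_id, ToolCallType.OTHER)
        if failed:
            self.tool_failures[tool_type] = self.tool_failures.get(tool_type, 0) + 1
        return tool_type
```

Popping the cache entry on the end event keeps the dictionary from growing over a long session, which seems like a reasonable choice for a hot path, though the actual implementation may handle this differently.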
1 parent 647947d commit eebd8ac

6 files changed

Lines changed: 1096 additions & 42 deletions

components/runners/ambient-runner/ambient_runner/bridges/claude/bridge.py

Lines changed: 4 additions & 0 deletions
@@ -139,6 +139,10 @@ async def interrupt(self, thread_id: Optional[str] = None) -> None:
         logger.info(f"Interrupt request for thread={tid}")
         await worker.interrupt()

+        # Record interrupt in observability metrics
+        if self._obs:
+            self._obs.record_interrupt()
+
     # ------------------------------------------------------------------
     # Lifecycle methods
     # ------------------------------------------------------------------
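
The hunk above records an interrupt only when an observability object is attached to the bridge. As a rough illustration of that guard, here is a self-contained sketch; `FakeObservability`, `FakeWorker`, and `BridgeSketch` are hypothetical stand-ins, not the real `ClaudeBridge` or its observability manager, and the counter attribute is an assumption about what `record_interrupt()` might do.

```python
import asyncio
from typing import Optional


class FakeObservability:
    """Hypothetical stand-in; assumes record_interrupt() bumps a counter."""

    def __init__(self) -> None:
        self.interrupt_count = 0

    def record_interrupt(self) -> None:
        self.interrupt_count += 1


class FakeWorker:
    """Hypothetical worker that just remembers it was interrupted."""

    def __init__(self) -> None:
        self.interrupted = False

    async def interrupt(self) -> None:
        self.interrupted = True


class BridgeSketch:
    """Mirrors only the shape of the diff, not the real bridge."""

    def __init__(self, worker: FakeWorker, obs: Optional[FakeObservability] = None) -> None:
        self._worker = worker
        self._obs = obs

    async def interrupt(self) -> None:
        await self._worker.interrupt()
        # Same guard as in the diff: observability is optional.
        if self._obs:
            self._obs.record_interrupt()


obs = FakeObservability()
bridge_with_obs = BridgeSketch(FakeWorker(), obs)
bridge_without_obs = BridgeSketch(FakeWorker())

asyncio.run(bridge_with_obs.interrupt())
asyncio.run(bridge_without_obs.interrupt())  # no observability attached, no error
```

In this sketch the interrupt is counted after the worker call returns, matching the order in the diff, so an interrupt that raises inside the worker would not be recorded.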
