Problem
apaevt_flow events from llm_anthropic invocations include duration but not token counts. This makes it impossible to distinguish "this role takes longer because it processes more tokens" from "this role takes longer per token" without re-instrumenting from scratch.
In a coding-agent tracer run, eng2 consumed 60 LLM calls at 12.4 s avg vs eng1's 31 calls at 3.9 s avg. Without usage data we can't tell whether eng2 is context-bloated, output-heavy, or model-slow on its specific prompt shape.
Proposed fix
After each ChatAnthropic call returns, extract response.usage_metadata (or LangChain's AIMessage.usage_metadata) and attach it to the trace.result payload of the apaevt_flow op:leave event:
{
  "input_tokens": 1234,
  "output_tokens": 567,
  "cache_creation_input_tokens": 0,
  "cache_read_input_tokens": 0
}
This piggybacks on the existing flow event at no extra cost: no new event type, no new schema, no extra round-trip.
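A minimal sketch of the extraction step, under stated assumptions. It assumes the returned message exposes Anthropic's raw counts as a dict under response_metadata["usage"] and LangChain's normalized counts under usage_metadata (with cache counts in input_token_details); verify both against the installed langchain-anthropic version, since the two sources may not count cached input tokens the same way. The names extract_usage and attach_usage, and trace_result as a plain dict, are hypothetical and not part of the existing node code.

```python
from types import SimpleNamespace
from typing import Any, Dict

USAGE_KEYS = (
    "input_tokens",
    "output_tokens",
    "cache_creation_input_tokens",
    "cache_read_input_tokens",
)


def _as_int(value: Any) -> int:
    """Coerce a token count to int; anything missing or non-numeric becomes 0."""
    return int(value) if isinstance(value, (int, float)) else 0


def extract_usage(message: Any) -> Dict[str, int]:
    """Build the four-key usage dict from an AIMessage-like object.

    Assumption: provider-raw counts live in response_metadata["usage"];
    usage_metadata is used as a fallback for any key missing there.
    """
    raw = (getattr(message, "response_metadata", None) or {}).get("usage") or {}
    if not isinstance(raw, dict):
        raw = {}
    meta = getattr(message, "usage_metadata", None) or {}
    details = meta.get("input_token_details") or {}
    fallback = {
        "input_tokens": meta.get("input_tokens"),
        "output_tokens": meta.get("output_tokens"),
        "cache_creation_input_tokens": details.get("cache_creation"),
        "cache_read_input_tokens": details.get("cache_read"),
    }
    return {
        key: _as_int(raw[key] if raw.get(key) is not None else fallback[key])
        for key in USAGE_KEYS
    }


def attach_usage(trace_result: Dict[str, Any], message: Any) -> Dict[str, Any]:
    """Hypothetical hook: add a 'usage' entry to the op:leave trace.result payload."""
    trace_result["usage"] = extract_usage(message)
    return trace_result


# Stand-in for an AIMessage so the sketch runs without LangChain installed.
fake_message = SimpleNamespace(
    response_metadata={"usage": {"input_tokens": 1234, "output_tokens": 567}},
    usage_metadata=None,
)
print(attach_usage({}, fake_message))
```

Defaulting missing counts to 0 keeps the payload shape stable across calls, so downstream tooling can read the four keys without per-event checks.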
Acceptance
- apaevt_flow op:leave for any llm_anthropic invoke includes a usage dict.
- Caching verification (companion issue on prompt caching) becomes inspectable from the tracer file alone.
- Optional: also record on llm_openai, llm_bedrock, etc. if upstream usage shapes are similar. Out of scope for this issue; separate ticket.
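The first acceptance criterion could be checked mechanically along these lines. This is a sketch only: the event layout used here (top-level "node" and "op" keys, usage nested under "trace" then "result") is a hypothetical stand-in for the real tracer schema, and events_missing_usage is an illustrative name.

```python
from typing import Any, Dict, Iterable, List

REQUIRED_KEYS = {
    "input_tokens",
    "output_tokens",
    "cache_creation_input_tokens",
    "cache_read_input_tokens",
}


def events_missing_usage(events: Iterable[Dict[str, Any]]) -> List[int]:
    """Return indexes of llm_anthropic op:leave events without a full usage dict."""
    missing = []
    for index, event in enumerate(events):
        # Only llm_anthropic leave events are in scope for this issue.
        if event.get("node") != "llm_anthropic" or event.get("op") != "leave":
            continue
        result = (event.get("trace") or {}).get("result") or {}
        usage = result.get("usage")
        if not isinstance(usage, dict) or not REQUIRED_KEYS.issubset(usage):
            missing.append(index)
    return missing


# Made-up events in the assumed layout, for illustration only.
sample_events = [
    {"node": "llm_anthropic", "op": "leave", "trace": {"result": {"usage": {
        "input_tokens": 1234, "output_tokens": 567,
        "cache_creation_input_tokens": 0, "cache_read_input_tokens": 0}}}},
    {"node": "llm_anthropic", "op": "leave", "trace": {"result": {}}},
    {"node": "llm_openai", "op": "leave", "trace": {"result": {}}},
]
print(events_missing_usage(sample_events))  # prints [1]
```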
Suggested labels
enhancement, observability, nodes/llm_anthropic