Adding missing attributes and metric for llamaindex instrumentation#274
LlamaIndex instrumentation was missing several GenAI semantic convention attributes that LangChain already supports. Without these, users get incomplete observability — no response model info, no finish reasons, no streaming metrics, and no tool visibility. This change brings LlamaIndex to parity so users get consistent telemetry regardless of which framework they use.
Added 6 new LLM span attributes and TTFT metric emission.
Changed files
- `callback_handler.py` — Extract the new attributes from the raw LLM response, TTFT from the tracker, and tool definitions from the agent context
- `event_handler.py` — `TTFTTracker` class and `LlamaindexEventHandler` listening to `LLMChatInProgressEvent` for streaming chunk timing
- `invocation_manager.py` — TTFT tracking methods and a `find_agent_with_tools()` fallback for async ContextVar propagation
- `workflow_instrumentation.py` — Capture agent tools in `wrap_agent_run()` and register them with the invocation manager
- `__init__.py` — Wire up the `TTFTTracker` and event handler during instrumentation
- `test_agent_attributes.py` (new) — 9 tests: response model, finish reasons, token usage, max tokens, tool definitions, TTFT span + metric, non-streaming
- `test_ttft.py` (new) — 16 unit tests for `TTFTTracker` and ContextVar correlation
- `test_circuit_agent.py` (new) — Live integration test against the Circuit API (skipped in CI)
- `CHANGELOG.md` — Document the new features
- `.env.example` (new) — Example environment config for Circuit credentials
Configuration
Set `OTEL_INSTRUMENTATION_GENAI_CAPTURE_TOOL_DEFINITIONS=true` to enable capture of tool definitions.
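For illustration, a minimal sketch of how such an opt-in flag could be read; the helper name `tool_definitions_enabled` and the accepted truthy values are assumptions, not the instrumentation's actual code:

```python
import os

def tool_definitions_enabled() -> bool:
    # Hypothetical helper: reads the opt-in env var and treats
    # "true"/"1" (case-insensitive) as enabled. The real code
    # may parse this differently.
    value = os.environ.get(
        "OTEL_INSTRUMENTATION_GENAI_CAPTURE_TOOL_DEFINITIONS", "false"
    )
    return value.strip().lower() in ("true", "1")
```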
TTFT challenge: two separate systems
Calculating Time To First Chunk in LlamaIndex is significantly harder than in LangChain. LangChain has a single callback system — `on_llm_new_token` fires for every streaming token, so TTFT is simply `time.perf_counter() - start_time` on the first token callback.
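A minimal sketch of that single-callback approach; apart from `on_llm_new_token`, the class and method names here are illustrative, not LangChain's actual API surface:

```python
import time

class TTFTCallback:
    """Sketch of single-callback TTFT measurement: record a start time,
    then capture elapsed time on the first streamed token only."""

    def __init__(self):
        self.start_time = None
        self.ttft = None

    def on_llm_start(self):
        # Called once when the LLM request is issued.
        self.start_time = time.perf_counter()

    def on_llm_new_token(self, token: str):
        # Only the first token sets TTFT; later tokens are ignored.
        if self.ttft is None and self.start_time is not None:
            self.ttft = time.perf_counter() - self.start_time
```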
LlamaIndex has two separate instrumentation systems that don't share state: the callback system and the event/dispatcher system. The problem: the callback system never sees individual streaming tokens, and the event system has no access to the span/invocation objects. They also use different ID schemes (`event_id` vs. `span_id`).
Solution: TTFTTracker + ContextVar bridge
We introduced three components to bridge the gap: the `TTFTTracker`, the `LlamaindexEventHandler`, and TTFT lookup methods on the invocation manager.
When the callback handler's `on_event_end` fires, it calls `invocation_manager.get_ttft_for_event(event_id)`, which looks up the associated `span_id` in `TTFTTracker`, retrieves the calculated TTFT, and sets it on the `LLMInvocation`. The existing `MetricsEmitter` in the util layer then emits the `gen_ai.client.operation.time_to_first_chunk` histogram metric automatically — no changes needed in the emitter.
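The bridging logic above can be sketched roughly as follows. This is a simplified stand-in for the real `TTFTTracker` and invocation manager, collapsed into one class; every method name other than `get_ttft_for_event` is an assumption:

```python
import time

class TTFTTrackerSketch:
    """Illustrative sketch of correlating event-system chunk timing
    (keyed by event_id) with callback-system spans (keyed by span_id)."""

    def __init__(self):
        self._start = {}          # span_id -> request start time
        self._ttft = {}           # span_id -> seconds to first chunk
        self._event_to_span = {}  # event_id -> span_id (the ID bridge)

    def on_llm_start(self, span_id, event_id):
        # Record the start time and remember which event maps to this span.
        self._start[span_id] = time.perf_counter()
        self._event_to_span[event_id] = span_id

    def on_first_chunk(self, span_id):
        # Event system reports the first streaming chunk for a span.
        if span_id in self._start and span_id not in self._ttft:
            self._ttft[span_id] = time.perf_counter() - self._start[span_id]

    def get_ttft_for_event(self, event_id):
        # Callback system asks for TTFT using its own event_id.
        span_id = self._event_to_span.get(event_id)
        return self._ttft.get(span_id)
```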
Tool definitions: async ContextVar propagation
Tool definitions faced a similar cross-boundary challenge. Agent tools are registered in `wrap_agent_run()` (a synchronous wrapper), but the LLM callback fires inside async agent execution, where ContextVars may not propagate. We solved this by registering the agent's tools with the invocation manager in `wrap_agent_run()` and adding a `find_agent_with_tools()` fallback for cases where the ContextVar does not carry across the async boundary.
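A sketch of that registry-plus-fallback idea; all names except `find_agent_with_tools` are hypothetical, and the single-agent fallback heuristic is an assumption about how such a lookup could behave:

```python
import contextvars

# ContextVar: fast path when the async context propagates correctly.
_current_agent = contextvars.ContextVar("current_agent", default=None)
# Global registry: survives even when the ContextVar does not propagate.
_agents_with_tools = {}  # agent_id -> tool definitions

def register_agent_tools(agent_id, tools):
    # Called from the synchronous wrapper around agent execution.
    _agents_with_tools[agent_id] = tools
    _current_agent.set(agent_id)

def find_agent_with_tools():
    # Fast path: the ContextVar made it into this async context.
    agent_id = _current_agent.get()
    if agent_id is not None:
        return _agents_with_tools.get(agent_id)
    # Fallback: if exactly one agent is registered, attribute tools to it.
    if len(_agents_with_tools) == 1:
        return next(iter(_agents_with_tools.values()))
    return None
```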
What changed