fix: capture claude agent sdk session ids#222
Conversation
ralf0131
left a comment
There was a problem hiding this comment.
Summary
This PR adds session ID capture on agent, LLM, and tool spans for the Claude Agent SDK instrumentation. The core improvement is replacing the module-level global _client_managed_runs dict with a per-invocation local dict, eliminating cross-invocation state leakage. Session IDs are propagated from multiple sources (SystemMessage, StreamEvent, ResultMessage, client query, baggage) with entry baggage taking precedence. 562 lines of new tests provide comprehensive coverage. All CI checks pass.
Findings
- [Warning]
patch.py~line 731 — The removal ofotel_context.detach(empty_context_token)changes trace topology. Previously, each agent invocation explicitly detached to an empty context to guarantee independent root traces. Now, if a caller has an active span, the agent span becomes a child of that span. Teststest_wrap_query_sequential_calls_create_independent_root_tracesandtest_wrap_query_preserves_active_parent_contextverify both behaviors, so this is intentional. Consider noting this behavior change in the CHANGELOG entry (currently only mentions session ID capture). - [Info]
patch.py~line 940 —session_id = getattr(options, "resume", None)uses the Claude SDKresumefield as session_id. This is the correct convention for conversation resumption. - [Info]
patch.py~line 881 —_otel_session_idis set inwrap_claude_client_queryand read viagetattrwithNonedefault inwrap_claude_client_receive_response. The fallback is safe for cases wherereceive_responseis called without a precedingquery. - [Info] The module-level global
_client_managed_runs→ per-invocation local dict is a good concurrency safety improvement. This prevents state leakage between parallel/interleaved agent invocations.
Suggestions
The PR is well-structured and ready for merge. The one suggestion is to expand the CHANGELOG entry to mention the context propagation behavior change, since it affects trace topology for downstream consumers.
Automated review by github-manager-bot
d9cf73e to
aed5bb8
Compare
ralf0131
left a comment
There was a problem hiding this comment.
Summary
Re-review after new commits (squash). The CHANGELOG now documents the context-propagation behavior change, addressing the warning from the previous review. The core improvements remain solid: the global _client_managed_runs → per-invocation local dict eliminates cross-stream state leakage, entry-baggage session precedence is correctly implemented, and 13 new test cases provide comprehensive coverage.
CI Status
typecheckis failing, but the errors are invertexai/patch.py(lines 176, 260 — unnecessary# type: ignorecomments), not in the files changed by this PR. This is a pre-existing issue in the vertexai instrumentation module.- Multiple
LoongSuite loongsuite-instrumentation-claude-agent-sdk-*CI jobs are still pending — the PR\u2019s own test results cannot be fully confirmed until these complete. license/clapasses (CLA signed).
Findings
- [Info]
patch.py:611—_set_session_idcalled on everyStreamEventeven whensession_idisNone; consider early-return for the no-op case.
Previous Review Resolution
The previous review\u2019s warning about the CHANGELOG not mentioning the context-propagation change is now resolved — the entry reads \u201cpreserve active caller context so SDK traces attach to existing caller spans instead of being forced to independent roots.\u201d
Automated review by github-manager-bot
| if isinstance(event, dict): | ||
| session_id = event.get("session_id") | ||
|
|
||
| _set_session_id(agent_invocation, session_id) |
There was a problem hiding this comment.
[Info] _set_session_id is called on every StreamEvent even when session_id is None. When there is no session and no entry baggage, this is effectively a no-op but still triggers a baggage lookup via _entry_baggage_identity_attributes(). For high-throughput streaming paths, consider adding an early return when session_id is None and no entry baggage session is present.
Description
This PR captures Claude Agent SDK session IDs on agent, LLM, and tool spans so traces from resumed or client-managed SDK sessions can be correlated by
gen_ai.session.id.The change propagates session IDs from SDK init/system messages, stream events,
ClaudeSDKClient.query(..., session_id=...), standalonequery(..., options.resume=...), and result-message fallbacks. When an upstream Entry span has already propagatedgen_ai.session.idthrough OpenTelemetry Baggage, that Entry session is used ahead of the Claude SDK's internal session so downstream spans keep the request-level LoongSuite identity. It also preserves the caller's active OpenTelemetry context instead of clearing it with an empty context, matching the Robin fix for broken parent-child trace linkage, and keeps per-stream tool state local so concurrent streams that reuse tool IDs do not cross-contaminate session or trace state.Fixes # (N/A)
Type of change
How Has This Been Tested?
python $PIPELINE_SKILL_DIR/scripts/check_loongsuite_pr_readiness.py --repo .tox -e precommitOTEL_SEMCONV_STABILITY_OPT_IN=gen_ai_latest_experimental OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT=SPAN_ONLY python -m pytest instrumentation-loongsuite/loongsuite-instrumentation-claude-agent-sdk/tests/test_session_capture.py -qOTEL_SEMCONV_STABILITY_OPT_IN=gen_ai_latest_experimental OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT=SPAN_ONLY python -m pytest instrumentation-loongsuite/loongsuite-instrumentation-claude-agent-sdk/tests -q -m "not requires_cli"ANTHROPIC_BASE_URL=https://dashscope.aliyuncs.com/apps/anthropic ANTHROPIC_API_KEY=<configured> OTEL_SEMCONV_STABILITY_OPT_IN=gen_ai_latest_experimental OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT=SPAN_ONLY python -m pytest instrumentation-loongsuite/loongsuite-instrumentation-claude-agent-sdk/tests/test_attributes.py::test_span_attributes_semantic_conventions -q -sANTHROPIC_BASE_URL=https://dashscope.aliyuncs.com/apps/anthropic ANTHROPIC_API_KEY=<configured> OTEL_SEMCONV_STABILITY_OPT_IN=gen_ai_latest_experimental python -m pytest instrumentation-loongsuite/loongsuite-instrumentation-claude-agent-sdk/tests/test_attributes.py::test_span_attributes_no_sensitive_data -qReadtool call.weaver registry live-check -r <loongsuite-semantic-conventions>/model --advice-profile loongsuite-genai --input-format json.Validation Evidence
Spec and Scope
loongsuite-instrumentation-claude-agent-sdkruntime patch, focused session tests, and plugin changelog.Local Checks
python $PIPELINE_SKILL_DIR/scripts/check_loongsuite_pr_readiness.py --repo .python $PIPELINE_SKILL_DIR/scripts/plan_review_matrix.py --repo . --format markdowntox -e precommitOTEL_SEMCONV_STABILITY_OPT_IN=gen_ai_latest_experimental OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT=SPAN_ONLY python -m pytest instrumentation-loongsuite/loongsuite-instrumentation-claude-agent-sdk/tests/test_session_capture.py -qOTEL_SEMCONV_STABILITY_OPT_IN=gen_ai_latest_experimental OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT=SPAN_ONLY python -m pytest instrumentation-loongsuite/loongsuite-instrumentation-claude-agent-sdk/tests/test_session_capture.py::test_wrap_query_preserves_active_parent_context -qOTEL_SEMCONV_STABILITY_OPT_IN=gen_ai_latest_experimental OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT=SPAN_ONLY python -m pytest instrumentation-loongsuite/loongsuite-instrumentation-claude-agent-sdk/tests -q -m "not requires_cli"ANTHROPIC_BASE_URL=https://dashscope.aliyuncs.com/apps/anthropic ANTHROPIC_API_KEY=<configured> OTEL_SEMCONV_STABILITY_OPT_IN=gen_ai_latest_experimental OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT=SPAN_ONLY python -m pytest instrumentation-loongsuite/loongsuite-instrumentation-claude-agent-sdk/tests/test_attributes.py::test_span_attributes_semantic_conventions -q -sANTHROPIC_BASE_URL=https://dashscope.aliyuncs.com/apps/anthropic ANTHROPIC_API_KEY=<configured> OTEL_SEMCONV_STABILITY_OPT_IN=gen_ai_latest_experimental python -m pytest instrumentation-loongsuite/loongsuite-instrumentation-claude-agent-sdk/tests/test_attributes.py::test_span_attributes_no_sensitive_data -qtox -c tox-loongsuite.ini -e py312-test-loongsuite-instrumentation-claude-agent-sdk-latest -- -q -m "not requires_cli"codex-claude-review-loopgit diff --checkReal E2E Matrix
test_span_attributes_semantic_conventionsconsumed a bounded SDK query to completion.Readtool call.python -m pytest instrumentation-loongsuite/loongsuite-instrumentation-claude-agent-sdk/tests -q -m "not requires_cli"Telemetry and Weaver
gen_ai.session.id.gen_ai.session.id, AGENT, LLM, and TOOL spans use it instead of Claude's internal session id.weaver registry live-check -r <loongsuite-semantic-conventions>/model --input-source <generated JSON sample> --input-format json --advice-profile loongsuite-genai --skip-policies trueCI
Does This PR Require a Core Repo Change?
Checklist:
See contributing.md for styleguide, changelog guidelines, and more.