feat(claude-agent-sdk): capture gen_ai.skill.* on Skill load execute_tool span by rangemer333-cell · Pull Request #225 · alibaba/loongsuite-python

rangemer333-cell · 2026-06-22T01:01:36Z

Summary

Attach gen_ai.skill.name / gen_ai.skill.id / gen_ai.skill.description / gen_ai.skill.version to the execute_tool span of the built-in Skill tool in the Claude Agent SDK instrumentation.

Telemetry is bound to the ToolUseBlock(name="Skill") tool span — not to the subsequent UserMessage TextBlock that injects SKILL.md content (per the AGE-239 spec).

Attribute mapping

Source	Attribute
`ToolUseBlock.input.skill` / frontmatter `name`	`gen_ai.skill.name`
`claude:project:<skill-name>`	`gen_ai.skill.id`
`SKILL.md` frontmatter `description`	`gen_ai.skill.description`
`SKILL.md` frontmatter `version`	`gen_ai.skill.version`
`SystemMessage.data.cwd`	locates `<cwd>/.claude/skills/<name>/SKILL.md`

If start-time info is incomplete, skill name falls back to UserMessage.tool_use_result.commandName before the span closes.

Robustness

Skill metadata is read best-effort: any failure (missing/unreadable SKILL.md, malformed frontmatter, …) is swallowed and never affects the SDK call. When SKILL.md is unavailable, only gen_ai.skill.name/gen_ai.skill.id (derived from the tool input) are set.

Validation

PYTHONPATH="<repo>/util/opentelemetry-util-genai/src:<repo>/instrumentation-loongsuite/loongsuite-instrumentation-claude-agent-sdk/src" python -m pytest   instrumentation-loongsuite/loongsuite-instrumentation-claude-agent-sdk/tests/test_span_validation.py::test_skill_tool_span_attributes -q

python -m ruff check <changed-files>
python -m pytest   instrumentation-loongsuite/loongsuite-instrumentation-claude-agent-sdk/tests/test_span_validation.py   instrumentation-loongsuite/loongsuite-instrumentation-claude-agent-sdk/tests/test_with_cassettes.py   instrumentation-loongsuite/loongsuite-instrumentation-claude-agent-sdk/tests/test_task_subagent_real_data.py -q

All spec-scoped tests pass (24 passed); ruff clean.

Note: test_edge_cases.py / test_integration.py require a real claude-agent-sdk install and fail identically on clean main — pre-existing, unrelated to this change.

Draft PR per task scope — not for merge yet; review threads left open.

🤖 Generated with Claude Code

…tool span Attach gen_ai.skill.name/id/description/version to the execute_tool span of the built-in Skill tool. Telemetry is bound to the ToolUseBlock(name="Skill") tool span (not the SKILL.md-injecting UserMessage TextBlock). - skill.name from ToolUseBlock.input.skill (frontmatter.name fallback) - skill.id = claude:project:<skill-name> - skill.description/version read best-effort from <cwd>/.claude/skills/<name>/SKILL.md frontmatter (cwd from SystemMessage.data.cwd) - fallback to UserMessage.tool_use_result.commandName when start info incomplete - metadata read failures never propagate to the SDK call Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

CLAassistant · 2026-06-22T01:01:43Z

Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.

流屿 seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you have already a GitHub account, please add the email address used for this commit to your account.
_{You have signed the CLA already but the status is still pending? Let us recheck it.}

rangemer333-cell

Code review for AGE-239 — Skill telemetry on execute_tool span.

Verification against spec

Telemetry is on ToolUseBlock(name="Skill") execute_tool span, not on UserMessage TextBlock ✅
- _create_tool_spans_from_message (patch.py) creates ExecuteToolInvocation for name=="Skill" ToolUseBlocks and calls _apply_skill_metadata → handler.start_execute_tool.
- The UserMessage TextBlock carrying SKILL.md content is handled in _process_user_message and only feeds user_parts / collected_messages; no span attributes are written there. ✓
Four attributes map correctly, skill.id = claude:project:<name> ✅
- gen_ai.skill.name = frontmatter name (fallback to ToolUseBlock.input.skill).
- gen_ai.skill.id = f"{_SKILL_ID_PREFIX}{name}" with _SKILL_ID_PREFIX = "claude:project:".
- gen_ai.skill.description / gen_ai.skill.version from frontmatter; only written when non-empty.
SKILL.md frontmatter read is best-effort, failures do not escape ✅
- _read_skill_metadata swallows file-open errors → {}.
- _parse_skill_frontmatter swallows parse errors → {}.
- _apply_skill_metadata call site wraps in try/except → logger.warning only.
- _apply_skill_fallback call site also wrapped in try/except.
Tests + cassette coverage, regression set green ✅
- test_skill_tool_span_attributes: happy path, asserts four attributes + tool.call.id.
- test_skill_metadata_read_failure_does_not_break_sdk: cwd without SKILL.md → only name/id set.
- Regression set (24 passed) green; ruff clean.

Findings

[minor] _apply_skill_fallback branch is not exercised by any test.
In both new tests, skill_name is captured at span start from tool_input.get("skill"), so _apply_skill_fallback returns early on if tool_invocation.skill_name: return. The tool_use_result.commandName path is never hit. Additionally, create_mock_message_from_data in test_with_cassettes.py / test_span_validation.py does not propagate tool_use_result onto the mock UserMessage, so the cassette cannot exercise it either. Suggest adding a case where tool_input omits skill (or sets it empty) and tool_use_result.commandName provides the fallback, asserting skill_name/skill_id are recovered.

[nit] test_skill_load.yaml cwd: __SKILL_CWD__ is an unsubstituted placeholder.
No substitution logic exists, so at runtime _apply_skill_metadata looks for __SKILL_CWD__/.claude/skills/probe-skill/SKILL.md as a relative path and falls through to best-effort. The cassette effectively only serves as smoke; it does not exercise the frontmatter parse path. Either have the cassette support a real tmp-path substitution or document it as smoke-only.

[nit] In _apply_skill_fallback, skill_id does not use str(command_name).
skill_name = str(command_name) but skill_id = f"{_SKILL_ID_PREFIX}{command_name}". f-string calls __str__ implicitly so behavior is identical, but for consistency with the previous line prefer f"{_SKILL_ID_PREFIX}{str(command_name)}".

[nit] _apply_skill_fallback recovers skill_name but does not attempt to re-read SKILL.md.
Per spec this is acceptable (name/id only on fallback). If description/version are wanted on the fallback path, call _read_skill_metadata after commandName is known. Optional enhancement.

Summary

Implementation matches the spec on all four required points; best-effort handling is solid and regression set is clean. Main gap is test coverage for the fallback branch. Suggested event: COMMENT (not APPROVE) pending the fallback test.

rangemer333-cell · 2026-06-22T01:18:31Z

+        tool_invocation.skill_version = version
+
+
+def _apply_skill_fallback(


[minor] _apply_skill_fallback is not exercised by any test.

In both new tests, skill_name is captured at span start from tool_input.get("skill"), so this function returns early at if tool_invocation.skill_name: return. The tool_use_result.commandName path below is never hit.

Also, create_mock_message_from_data in test_with_cassettes.py and test_span_validation.py does not propagate tool_use_result onto the mock UserMessage, so the cassette cannot exercise this path either.

Suggest adding a case where tool_input omits skill (or sets it empty) and tool_use_result.commandName provides the fallback, asserting skill_name/skill_id are recovered from the result.

rangemer333-cell · 2026-06-22T01:18:31Z

+    command_name = tool_use_result.get("commandName")
+    if command_name:
+        tool_invocation.skill_name = str(command_name)
+        tool_invocation.skill_id = (


[nit] skill_id does not use str(command_name).

skill_name = str(command_name) on the previous line, but skill_id = f"{_SKILL_ID_PREFIX}{command_name}". The f-string calls __str__ implicitly so behavior is identical, but for consistency with the previous line prefer f"{_SKILL_ID_PREFIX}{str(command_name)}".

rangemer333-cell · 2026-06-22T01:18:31Z

+        return
+    if not isinstance(tool_use_result, dict):
+        return
+    command_name = tool_use_result.get("commandName")


[nit] Optional enhancement: after recovering skill_name from commandName, re-read SKILL.md via _read_skill_metadata so description/version can still be populated on the fallback path.

Per spec this is acceptable (name/id only on fallback), so feel free to leave as-is — just flagging the option.

rangemer333-cell · 2026-06-22T01:18:31Z

+  data:
+    type: system
+    subtype: init
+    cwd: __SKILL_CWD__


[nit] cwd: __SKILL_CWD__ is an unsubstituted placeholder.

No substitution logic exists, so at runtime _apply_skill_metadata looks for __SKILL_CWD__/.claude/skills/probe-skill/SKILL.md as a relative path and falls through to best-effort. The cassette effectively only serves as smoke; it does not exercise the frontmatter parse path.

Either have the cassette support a real tmp-path substitution (e.g. a fixture that rewrites __SKILL_CWD__ to tmp_path) or document it as smoke-only.

github-actions Bot assigned 123liuziming, Cirilla-zmh and ralf0131 Jun 22, 2026

github-actions Bot requested review from 123liuziming, Cirilla-zmh and ralf0131 June 22, 2026 01:01

rangemer333-cell commented Jun 22, 2026

View reviewed changes

sipercai closed this Jun 22, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(claude-agent-sdk): capture gen_ai.skill.* on Skill load execute_tool span#225

feat(claude-agent-sdk): capture gen_ai.skill.* on Skill load execute_tool span#225
rangemer333-cell wants to merge 1 commit into
alibaba:mainfrom
rangemer333-cell:feat/claude-agent-sdk-skill-telemetry

rangemer333-cell commented Jun 22, 2026

Uh oh!

CLAassistant commented Jun 22, 2026

Uh oh!

rangemer333-cell left a comment

Uh oh!

rangemer333-cell Jun 22, 2026

Uh oh!

rangemer333-cell Jun 22, 2026

Uh oh!

rangemer333-cell Jun 22, 2026

Uh oh!

rangemer333-cell Jun 22, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

6 participants

		tool_invocation.skill_version = version


		def _apply_skill_fallback(

Conversation

rangemer333-cell commented Jun 22, 2026

Summary

Attribute mapping

Robustness

Validation

Uh oh!

CLAassistant commented Jun 22, 2026

Uh oh!

rangemer333-cell left a comment

Choose a reason for hiding this comment

Verification against spec

Findings

Summary

Uh oh!

rangemer333-cell Jun 22, 2026

Choose a reason for hiding this comment

Uh oh!

rangemer333-cell Jun 22, 2026

Choose a reason for hiding this comment

Uh oh!

rangemer333-cell Jun 22, 2026

Choose a reason for hiding this comment

Uh oh!

rangemer333-cell Jun 22, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

6 participants