[WIP] feat(deepagents): add instrumentation plugin #190
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
Adds a new loongsuite-instrumentation-deepagents package to instrument the deepagents framework with ENTRY span creation, span enrichment, and GenAI metrics emission.
Changes:
- Introduces a `DeepAgentsInstrumentor` that wraps `deepagents.graph.create_deep_agent`, installs a LangChain callback enricher, and registers a metrics `SpanProcessor`.
- Adds internal helpers for extracting metadata/messages and for generating ENTRY spans + metrics.
- Adds a dedicated test suite plus packaging/docs (pyproject, README) and registers the package in the instrumentation index.
Reviewed changes
Copilot reviewed 16 out of 16 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| instrumentation-loongsuite/loongsuite-instrumentation-deepagents/src/opentelemetry/instrumentation/deepagents/__init__.py | New instrumentor wiring ENTRY patch, enricher, and metrics processor |
| instrumentation-loongsuite/loongsuite-instrumentation-deepagents/src/opentelemetry/instrumentation/deepagents/internal/_entry_patch.py | Wraps create_deep_agent and graph methods to create/manage ENTRY spans |
| instrumentation-loongsuite/loongsuite-instrumentation-deepagents/src/opentelemetry/instrumentation/deepagents/internal/_enricher.py | LangChain callback handler + callback-manager patch to enrich active spans |
| instrumentation-loongsuite/loongsuite-instrumentation-deepagents/src/opentelemetry/instrumentation/deepagents/internal/_metrics_processor.py | SpanProcessor that emits GenAI metrics from finished spans |
| instrumentation-loongsuite/loongsuite-instrumentation-deepagents/src/opentelemetry/instrumentation/deepagents/internal/_utils.py | Shared helpers for metadata/version detection and message conversion |
| instrumentation-loongsuite/loongsuite-instrumentation-deepagents/src/opentelemetry/instrumentation/deepagents/internal/_attributes.py | Centralized constants for attributes, span kinds, and metric names |
| instrumentation-loongsuite/loongsuite-instrumentation-deepagents/src/opentelemetry/instrumentation/deepagents/package.py | Declares supported deepagents versions and metrics support |
| instrumentation-loongsuite/loongsuite-instrumentation-deepagents/src/opentelemetry/instrumentation/deepagents/version.py | Declares package version for distribution |
| instrumentation-loongsuite/loongsuite-instrumentation-deepagents/src/opentelemetry/instrumentation/deepagents/internal/__init__.py | Internal package init |
| instrumentation-loongsuite/loongsuite-instrumentation-deepagents/tests/conftest.py | Test fixtures for tracing/metrics providers and env setup |
| instrumentation-loongsuite/loongsuite-instrumentation-deepagents/tests/test_entry_patch.py | Tests ENTRY patch wrapping behavior and span attributes |
| instrumentation-loongsuite/loongsuite-instrumentation-deepagents/tests/test_enricher.py | Tests span enrichment from callback handler |
| instrumentation-loongsuite/loongsuite-instrumentation-deepagents/tests/test_metrics.py | Tests metrics processor behavior and warnings |
| instrumentation-loongsuite/loongsuite-instrumentation-deepagents/pyproject.toml | New distributable project config + dependencies |
| instrumentation-loongsuite/loongsuite-instrumentation-deepagents/README.md | Package documentation and local install instructions |
| instrumentation-loongsuite/README.md | Adds deepagents instrumentation to the index table |
    async def on_chain_start_async(
        self,
        serialized: dict[str, Any],
        inputs: dict[str, Any],
        *,
        run_id: Any,
        parent_run_id: Any | None = None,
        tags: list[str] | None = None,
        metadata: dict[str, Any] | None = None,
        **kwargs: Any,
    ) -> Any:
        del serialized, inputs, run_id, parent_run_id, tags, kwargs
        self._enrich_agent_or_chain(metadata or {})

    async def on_tool_start_async(
        self,
        serialized: dict[str, Any],
        input_str: str,
        *,
        run_id: Any,
        parent_run_id: Any | None = None,
        tags: list[str] | None = None,
        metadata: dict[str, Any] | None = None,
        **kwargs: Any,
    ) -> Any:
        del input_str, run_id, parent_run_id, tags, metadata
        self._enrich_tool(serialized, kwargs)

    labels = {
        "spanKind": str(span_kind),
        "modelName": _model_name(attributes),
    }

    usage_labels = {
        "spanKind": SPAN_KIND_LLM,
        "modelName": labels["modelName"],
        "usageType": usage_type,
    }

    )

    _logger = logging.getLogger(__name__)
    _processors_by_provider_id: dict[int, "DeepAgentMetricsSpanProcessor"] = {}

    provider_id = id(tracer_provider)
    if provider_id in _processors_by_provider_id:
        return
    processor = DeepAgentMetricsSpanProcessor(meter_provider=meter_provider)
    tracer_provider.add_span_processor(processor)
    _processors_by_provider_id[provider_id] = processor

    ## Local Install

    Install the shared GenAI utility from the same source tree first, then install
    the dependent LangChain, LangGraph, and deepagents instrumentations:

ralf0131 left a comment
Review by github-manager-bot
Summary
New instrumentation for deepagents (built on LangGraph): wraps create_deep_agent and the resulting graph's invoke/ainvoke/stream/astream to produce ENTRY spans, propagates a subagent registry via contextvars, patches the subagent task tool, and adds a langchain-tracer callback + subagent task-context resolution. Substantial and well-tested. Since it touches the shared langchain tracer, please give those changes a careful eye.
Note: PR is marked [WIP]; focusing on design-level feedback that should survive further iteration.
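The wrapping and registry propagation described in the summary can be pictured with the sketch below. It is an illustration under assumptions, not the plugin's implementation: `subagent_registry`, `wrap_invoke`, and `FakeGraph` are hypothetical names, and a real wrapper would start and end the ENTRY span where the comments indicate.

```python
import contextvars
import functools

# Hypothetical context variable carrying the subagent registry for one call.
subagent_registry = contextvars.ContextVar("subagent_registry", default=None)


def wrap_invoke(method, registry):
    """Wrap a graph method so the registry is visible for the call's duration."""

    @functools.wraps(method)
    def wrapper(*args, **kwargs):
        token = subagent_registry.set(registry)
        try:
            # A real wrapper would start the ENTRY span here.
            return method(*args, **kwargs)
        finally:
            # ...and end it here, then restore the previous context value.
            subagent_registry.reset(token)

    return wrapper


class FakeGraph:
    """Hypothetical stand-in for the compiled agent graph."""

    def invoke(self, payload):
        registry = subagent_registry.get() or {}
        return {"input": payload, "subagents": sorted(registry)}


graph = FakeGraph()
graph.invoke = wrap_invoke(graph.invoke, {"researcher": "Finds sources"})
result = graph.invoke("hello")
```

Resetting the token in `finally` keeps the registry scoped to a single call, so code running outside the wrapped method does not observe it.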
Findings
- [Warning] `_entry_patch.py`: `_SubagentRunnableProxy` overrides only `invoke` and `ainvoke`, while `__getattr__` delegates everything else (incl. `stream`/`astream`) to the raw runnable. The top-level graph wraps all four methods, so streaming subagent graphs silently miss the ReAct-metadata injection that sync/async subagents get, which makes telemetry coverage depend on how the subagent is invoked. Either wrap `stream`/`astream` in the proxy too, or drop the proxy in favour of the same `_wrap_graph_methods` used at the top level.
- [Warning] `_entry_patch.py`: streaming ENTRY span lifecycle. The stream wrappers yield in a `try` with `_finish_entry` in `except Exception`, but no `finally`. If a consumer abandons a stream early (partial consume, no `close()`/`with`), the generator is closed by `GeneratorExit`, which is a `BaseException` and is not caught by `except Exception`. `_finish_entry` is therefore skipped and the ENTRY span is never ended via `stop_entry`/`fail_entry` (a leak). Add a `finally` (with a guard to avoid double-finish) so the entry span always ends.
- [Info] `_entry_patch.py` uses module-level mutable flags (`_is_entry_patched`, `_top_level_patched`, `_is_subagent_task_patched`, `_handler`). Concurrent `instrument()`/`uninstrument()` would race; document single-threaded setup at startup or guard with a lock.
- [Info] `_metrics_processor.py` / langchain_tracer.py: `_get_deepagents_subagent_task_context` iterates `list(self._runs.values())` per subagent chain start (O(n) per chain). Bounded and fine for typical traces, but note the cost for very large multi-subagent graphs.
- [Info] Truncated (2-line) Apache headers across these files; align with the full header used in e.g. the anthropic plugin for lint consistency.
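The lifecycle concern in the second warning can be reproduced in isolation. The sketch below is a minimal illustration, not the plugin's code: `finish_entry`, `wrap_stream`, and the `events` list are hypothetical stand-ins for `_finish_entry`, the stream wrappers, and span bookkeeping. It relies only on standard generator semantics: closing a generator raises `GeneratorExit` at the suspended `yield`, which `except Exception` does not catch, so the `finally` block with a guard is what ends the entry exactly once.

```python
events = []  # hypothetical stand-in for span bookkeeping


def finish_entry(status):
    """Hypothetical stand-in for the plugin's _finish_entry."""
    events.append(status)


def wrap_stream(inner):
    """Yield from `inner` and finish the entry exactly once, however it ends."""
    finished = False
    try:
        yield from inner
        finish_entry("ok")
        finished = True
    except Exception:
        finish_entry("error")
        finished = True
        raise
    finally:
        # Reached on GeneratorExit too, which `except Exception` skips.
        if not finished:
            finish_entry("abandoned")


def consume_partially():
    """Read one item, then close the stream early as an impatient consumer would."""
    events.clear()
    stream = wrap_stream(range(3))
    next(stream)
    stream.close()
    return list(events)
```

Without the `finally` branch, `consume_partially()` would record nothing, which corresponds to the leaked ENTRY span described in the finding.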
Suggestions
    # in the stream generator wrappers
    try:
        yield from inner
    finally:
        _finish_entry(invocation, token, result=None, exc=current_exc, finished=bool(...))

Cross-repo Note
No shared API surface with loongsuite-pilot; the langchain-tracer changes are internal to this plugin, so no cross-repo change is required.
Automated review by github-manager-bot
Description
Please include a summary of the change and which issue is fixed. Please also include relevant motivation and context. List any dependencies that are required for this change.
Fixes # (issue)
Type of change
Please delete options that are not relevant.
How Has This Been Tested?
Please describe the tests that you ran to verify your changes. Provide instructions so we can reproduce. Please also list any relevant details for your test configuration
Does This PR Require a Core Repo Change?
Checklist:
See contributing.md for styleguide, changelog guidelines, and more.