feat(langchain): backport Langchain root metadata flag to v3-stable#1605
@claude review
LGTM — clean, additive backport with well-scoped tests.
Overview
This PR adds a _get_langchain_observation_metadata helper to CallbackHandler.py that wraps the existing __join_tags_and_metadata and injects is_langchain_root: True whenever parent_run_id is None (i.e., the observation is the LangChain-level root). The helper is applied consistently across on_chain_start, on_tool_start, on_retriever_start, and __on_llm_action. Two test assertions are added to verify the flag on a standalone generation and on the root chain of an LCEL run.
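The helper's behaviour can be sketched roughly as follows. This is a simplified, standalone approximation and not the actual Langfuse source: join_tags_and_metadata is a hypothetical stand-in for the handler's private __join_tags_and_metadata method, and the parameter names and merge logic are assumptions made for illustration.

```python
from typing import Any, Dict, List, Optional
from uuid import UUID, uuid4


def join_tags_and_metadata(
    tags: Optional[List[str]] = None,
    metadata: Optional[Dict[str, Any]] = None,
) -> Optional[Dict[str, Any]]:
    """Hypothetical stand-in for the handler's private __join_tags_and_metadata."""
    merged: Dict[str, Any] = {}
    if tags:
        merged["tags"] = list(tags)
    if metadata:
        merged.update(metadata)
    return merged or None


def get_langchain_observation_metadata(
    parent_run_id: Optional[UUID],
    tags: Optional[List[str]] = None,
    metadata: Optional[Dict[str, Any]] = None,
) -> Optional[Dict[str, Any]]:
    """Return observation metadata, flagging the LangChain-level root."""
    observation_metadata = join_tags_and_metadata(tags, metadata)
    if parent_run_id is not None:
        # Child observation: metadata is passed through unchanged.
        return observation_metadata
    # Root observation: copy before adding the flag so the input is not mutated.
    flagged = dict(observation_metadata or {})
    flagged["is_langchain_root"] = True
    return flagged


# Root call (no parent run) versus child call (has a parent run).
root_meta = get_langchain_observation_metadata(None, tags=["demo"])
child_meta = get_langchain_observation_metadata(uuid4(), tags=["demo"])
```

In this sketch only root_meta carries is_langchain_root; child_meta is whatever the join step produced.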
Security risks
None. The change only adds a metadata flag to observation payloads; it does not touch auth, permissions, crypto, or any external API surface.
Level of scrutiny
Low. This is a purely additive change with no behavioral impact on existing functionality — existing consumers who do not check is_langchain_root are completely unaffected. The logic is simple and deterministic: flag is present iff parent_run_id is None. The one intentional asymmetry (trace-update path in on_chain_start still uses raw __join_tags_and_metadata without the root flag) is correct, since that path updates the trace metadata rather than the observation metadata.
Other factors
Tests are updated to assert the new flag, the pattern is consistent with how the newer branch already handles this, and linting/format checks were run. The PR is focused and does not touch any other logic.
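The shape of the LCEL assertion can be illustrated with a small self-contained check. The observation dictionaries below are made-up sample data, not output from the real test suite, and count_langchain_roots is a hypothetical helper; how the real tests obtain their observations is not shown here.

```python
from typing import Any, Dict, List


def count_langchain_roots(observations: List[Dict[str, Any]], obs_type: str) -> int:
    """Count observations of the given type whose metadata carries the root flag."""
    return sum(
        1
        for obs in observations
        if obs.get("type") == obs_type
        and (obs.get("metadata") or {}).get("is_langchain_root") is True
    )


# Made-up sample resembling an LCEL run: one root chain with nested children.
sample_observations = [
    {"type": "CHAIN", "name": "RunnableSequence", "metadata": {"is_langchain_root": True}},
    {"type": "CHAIN", "name": "ChatPromptTemplate", "metadata": {}},
    {"type": "GENERATION", "name": "ChatOpenAI", "metadata": None},
]

# Exactly one CHAIN observation should be marked as the root.
assert count_langchain_roots(sample_observations, "CHAIN") == 1
```

Asserting on the count, instead of checking only the first chain, also catches a regression where a child chain is flagged by mistake.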
Summary
is_langchain_root: true
Why
The v3 callback handler also creates a LangChain-root observation but did not explicitly mark it in observation metadata. Backporting the flag keeps the last major aligned with the newer branch and makes it easier for downstream consumers to identify the LangChain-created root observation reliably.
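On the consumer side, locating the root could look something like the sketch below. It assumes observations are available as plain dictionaries with a metadata key, which is an illustrative shape and not the Langfuse SDK's actual return type; find_langchain_root is a hypothetical name.

```python
from typing import Any, Dict, Iterable, Optional


def find_langchain_root(
    observations: Iterable[Dict[str, Any]],
) -> Optional[Dict[str, Any]]:
    """Return the first observation flagged as the LangChain root, if any."""
    for obs in observations:
        # Metadata may be missing or None on some observations.
        metadata = obs.get("metadata") or {}
        if metadata.get("is_langchain_root") is True:
            return obs
    return None
```

Checking an explicit flag avoids inferring the root from naming or nesting depth, which is the reliability gain described above.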
Impact
Consumers on v3-stable can detect the LangChain-created root observation via metadata.is_langchain_root without changing child observation behavior.
Validation
uv run ruff check langfuse/langchain/CallbackHandler.py tests/test_langchain.py
uv run ruff format --check langfuse/langchain/CallbackHandler.py tests/test_langchain.py
uv run pytest -q tests/test_langchain.py -k "callback_generated_from_trace_chat or callback_generated_from_lcel_chain"
v3-stable fails test collection with a pydantic.v1 compatibility error under 3.14
Disclaimer: Experimental PR review
Greptile Summary
This PR backports the is_langchain_root metadata flag to the v3-stable branch of the LangChain callback handler, ensuring that root-level LangChain observations (chains, tools, retrievers, and standalone LLM calls) can be reliably identified by downstream consumers.
Key changes:
- Adds a _get_langchain_observation_metadata() helper that delegates to __join_tags_and_metadata and appends is_langchain_root: True when parent_run_id is None
- Applies the helper in on_chain_start, on_tool_start, on_retriever_start, and __on_llm_action (covering both on_chat_model_start and on_llm_start)
- Keeps is_langchain_root out of the trace-level metadata in on_chain_start's update_trace call; the flag only lives on the observation
The implementation is correct: __join_tags_and_metadata always constructs a fresh dict, so the observation_metadata.copy() in the new helper is safe, and no shared-state mutation can occur. The keep_langfuse_trace_attributes forwarding is preserved for the LLM path as before.
Confidence Score: 5/5
on_chain_start when update_trace is enabled, and a stylistic difference between single-underscore _get_langchain_observation_metadata vs double-underscore __join_tags_and_metadata naming). No correctness, data-integrity, or reliability concerns were found.
Important Files Changed
langfuse/langchain/CallbackHandler.py: Adds the _get_langchain_observation_metadata helper that injects is_langchain_root: True when parent_run_id is None, and applies it consistently across on_chain_start, on_tool_start, on_retriever_start, and __on_llm_action; trace-level metadata update in on_chain_start is correctly left unchanged.
tests/test_langchain.py: Adds two assertions: one verifying is_langchain_root on a standalone root generation, and one verifying exactly one CHAIN-type observation is marked as root in an LCEL run.
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["LangChain Event\n(on_chain_start / on_tool_start /\non_retriever_start / __on_llm_action)"] --> B{"parent_run_id is None?"}
    B -- "Yes (root call)" --> C["_get_langchain_observation_metadata()\n→ __join_tags_and_metadata()\n→ copy + add is_langchain_root: True"]
    B -- "No (child call)" --> D["_get_langchain_observation_metadata()\n→ __join_tags_and_metadata()\n→ return as-is (no flag)"]
    C --> E["start_observation(metadata={...is_langchain_root: True...})"]
    D --> F["start_observation(metadata={...})"]
    E --> G{"on_chain_start\nupdate_trace?"}
    G -- "Yes" --> H["update_trace(metadata=__join_tags_and_metadata())\nNO is_langchain_root in trace"]
    G -- "No" --> I["Done"]
    F --> I
Reviews (1): Last reviewed commit: "mark LangChain roots in metadata"