feat(langchain): mark LangChain root observations in metadata by hassiebp · Pull Request #1604 · langfuse/langfuse-python

hassiebp · 2026-04-01T16:36:51Z

Summary

- Mark root LangChain observations with `is_langchain_root: true` in callback metadata.
- Apply the flag consistently at the root callback boundary, including root chains and standalone root LLM runs.
- Add focused regression tests covering a root chain and a root LLM callback.

Why

Root observations created through the LangChain callback handler were not explicitly marked in observation metadata. That made it harder for downstream consumers to identify the LangChain-created root observation reliably.

Impact

Consumers can now detect root LangChain observations via the is_langchain_root metadata key without changing child observation behavior.
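
A minimal sketch of how a downstream consumer might use the flag, assuming observations are available as plain dictionaries with an optional `metadata` field. The dictionary shape and the `find_langchain_roots` helper are illustrative assumptions, not part of the Langfuse SDK:

```python
from typing import Any, Dict, List


def find_langchain_roots(observations: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Return the observations whose metadata marks them as LangChain roots."""
    roots = []
    for obs in observations:
        # Metadata may be missing or None, so fall back to an empty dict.
        metadata = obs.get("metadata") or {}
        if metadata.get("is_langchain_root") is True:
            roots.append(obs)
    return roots


# Hypothetical observations, shaped for illustration only.
observations = [
    {"id": "obs-1", "type": "SPAN", "metadata": None},
    {"id": "obs-2", "type": "CHAIN", "metadata": {"is_langchain_root": True}},
    {"id": "obs-3", "type": "GENERATION", "metadata": {"ls_provider": "openai"}},
]

root_ids = [obs["id"] for obs in find_langchain_roots(observations)]
print(root_ids)
```

Child observations never carry the key, so the lookup uses `.get` rather than indexing.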

Validation

uv run pytest -q tests/test_langchain.py -k "is_langchain_root_metadata"

Disclaimer: Experimental PR review

Greptile Summary

This PR adds an is_langchain_root: True metadata flag to all root (top-level, no parent_run_id) LangChain observations, making it easier for downstream consumers to identify the entry-point span of a LangChain-driven trace without inspecting the call hierarchy.

Key changes:

- A new private helper `_get_langchain_observation_metadata` wraps the existing `__join_tags_and_metadata`, injecting `is_langchain_root: True` when `parent_run_id` is `None`; child observations are unaffected.
- The helper is applied uniformly across all four observation-creating callbacks: `on_chain_start`, `on_tool_start`, `on_retriever_start`, and `__on_llm_action` (covering both `on_llm_start` and `on_chat_model_start`).
- Two focused unit tests validate the root chain and root LLM paths using a lightweight fake client, with no external dependencies.
- Minor: `on_tool_start` merges arbitrary extra kwargs into `meta` after the flag is set, meaning a caller who happens to pass `is_langchain_root` in kwargs could silently overwrite it; the flag should ideally be applied last to be safe.
- `importlib.import_module(...)` is called inside the `_patch_langchain_client` test helper rather than as a top-level import, which is inconsistent with the project's import convention.

Confidence Score: 5/5

Safe to merge; all remaining findings are P2 style/hygiene suggestions that do not affect correctness.
The core logic is simple and correct — is_langchain_root is injected only for root observations and never touches child spans. The two new tests exercise the changed paths with a proper fake, and the existing test suite is unchanged. Both open findings are P2: a test-only import style issue and a theoretical (not practically reachable today) flag-overwrite edge case in on_tool_start.
No files require special attention.

Important Files Changed

| Filename | Overview |
| --- | --- |
| langfuse/langchain/CallbackHandler.py | Adds `_get_langchain_observation_metadata` helper that wraps `__join_tags_and_metadata` and injects `is_langchain_root: True` for all root observations (no `parent_run_id`); applied consistently to `on_chain_start`, `on_tool_start`, `on_retriever_start`, and `__on_llm_action`. |
| tests/test_langchain.py | Adds two focused unit tests for the new flag using a lightweight fake client/observation; covers root chain and root LLM paths but not root tool or retriever; uses `importlib.import_module` inside a helper function instead of a top-level import. |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["LangChain callback fires\n(on_chain_start / on_llm_start /\non_tool_start / on_retriever_start)"] --> B{"parent_run_id\nis None?"}
    B -- "Yes (root run)" --> C["__join_tags_and_metadata\n(tags, metadata)"]
    C --> D["observation_metadata\n(may be None)"]
    D --> E["root_metadata = observation_metadata.copy() or {}"]
    E --> F["root_metadata['is_langchain_root'] = True"]
    F --> G["start_observation(metadata=root_metadata)"]
    B -- "No (child run)" --> H["__join_tags_and_metadata\n(tags, metadata)"]
    H --> I["start_observation(metadata=observation_metadata)"]
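
The flow above can be sketched in standalone Python. This is an approximation under stated assumptions, not the handler's actual code: `join_tags_and_metadata` is a stand-in whose merge behaviour is assumed, and both functions are written as free functions rather than methods. The sketch guards the `None` case explicitly, since calling `.copy()` on `None` would raise.

```python
from typing import Any, Dict, List, Optional
from uuid import UUID, uuid4


def join_tags_and_metadata(
    tags: Optional[List[str]], metadata: Optional[Dict[str, Any]]
) -> Optional[Dict[str, Any]]:
    """Stand-in for the private __join_tags_and_metadata (assumed behaviour)."""
    merged: Dict[str, Any] = {}
    if tags:
        merged["tags"] = list(tags)
    if metadata:
        merged.update(metadata)
    return merged or None  # may be None, as the flowchart notes


def get_langchain_observation_metadata(
    parent_run_id: Optional[UUID],
    tags: Optional[List[str]] = None,
    metadata: Optional[Dict[str, Any]] = None,
) -> Optional[Dict[str, Any]]:
    observation_metadata = join_tags_and_metadata(tags, metadata)
    if parent_run_id is not None:
        # Child run: return the joined metadata unchanged.
        return observation_metadata
    # Root run: copy so the joined dict is not mutated, then add the flag.
    root_metadata = observation_metadata.copy() if observation_metadata else {}
    root_metadata["is_langchain_root"] = True
    return root_metadata


root = get_langchain_observation_metadata(None, tags=["demo"], metadata={"user": "a"})
child = get_langchain_observation_metadata(uuid4(), tags=["demo"], metadata={"user": "a"})
print(root)
print(child)
```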

Comments Outside Diff (1)

langfuse/langchain/CallbackHandler.py, lines 719-730

is_langchain_root can be silently overwritten by kwargs

For on_tool_start, after _get_langchain_observation_metadata returns {"is_langchain_root": True} (root run, no tags/metadata), the meta.update(kwargs) call immediately below merges all non-None kwargs into meta. If a caller ever passes is_langchain_root among the extra keyword arguments, the flag will be silently overwritten without warning.

While this is unlikely in practice today, setting the flag after the kwargs merge would be more resilient:
```python
meta = self._get_langchain_observation_metadata(
    parent_run_id=parent_run_id,
    tags=tags,
    metadata=metadata,
)

if not meta:
    meta = {}

meta.update(
    {key: value for key, value in kwargs.items() if value is not None}
)

if parent_run_id is None:
    meta["is_langchain_root"] = True
```
Alternatively, the _get_langchain_observation_metadata helper itself could be the canonical place — but the update call should happen first so the flag wins.
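
To make the ordering concern concrete, here is a small self-contained comparison of the two orderings. The function names and the kwargs values are hypothetical and only model the dictionary operations described above, not the real `on_tool_start` body:

```python
from typing import Any, Dict


def build_meta_flag_first(is_root: bool, kwargs: Dict[str, Any]) -> Dict[str, Any]:
    """Set the flag, then merge kwargs: a colliding key in kwargs wins."""
    meta: Dict[str, Any] = {"is_langchain_root": True} if is_root else {}
    meta.update({key: value for key, value in kwargs.items() if value is not None})
    return meta


def build_meta_flag_last(is_root: bool, kwargs: Dict[str, Any]) -> Dict[str, Any]:
    """Merge kwargs, then set the flag: the flag always wins for root runs."""
    meta: Dict[str, Any] = {}
    meta.update({key: value for key, value in kwargs.items() if value is not None})
    if is_root:
        meta["is_langchain_root"] = True
    return meta


# Hypothetical kwargs that happen to collide with the flag name.
conflicting_kwargs = {"is_langchain_root": False, "color": "green", "name": None}

flag_first = build_meta_flag_first(True, conflicting_kwargs)
flag_last = build_meta_flag_last(True, conflicting_kwargs)
print(flag_first["is_langchain_root"], flag_last["is_langchain_root"])
```

With the flag set first, the colliding value from kwargs replaces it; with the flag set last, the root marker survives the merge.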

Reviews (1): Last reviewed commit: "Merge branch 'main' into codex/langchain..."

Context used:

Rule used - Move imports to the top of the module instead of p...

Learnt From
langfuse/langfuse-python#1387

github-actions · 2026-04-01T16:39:13Z

@claude review

claude

LGTM — clean, well-scoped change with focused tests.

Overview

Adds a _get_langchain_observation_metadata helper to CallbackHandler.py that wraps the existing __join_tags_and_metadata call and injects is_langchain_root: True when parent_run_id is None. The helper is applied consistently at all four callback entry points: on_chain_start, on_tool_start, on_retriever_start, and __on_llm_action. Two unit tests using a lightweight fake-client pattern are added to tests/test_langchain.py.

Security risks

None. The change only writes a boolean flag into observation metadata; it does not touch auth, credentials, or any trust boundary.

Level of scrutiny

Low. This is an additive, metadata-only change that cannot break existing behaviour: consumers who do not look for is_langchain_root are unaffected, and child observations are explicitly left unchanged. The helper correctly uses .copy() to avoid mutating the dict returned by __join_tags_and_metadata.

Other factors

The two new tests directly exercise the flag for both the chain and LLM paths without requiring a live API, making them fast and reliable. The implementation follows the existing codebase style throughout.

claude · 2026-04-01T17:20:15Z

tests/test_langchain.py

    assert langchain_generation_span.input != ""
    assert langchain_generation_span.output is not None
    assert langchain_generation_span.output != ""
+    assert langchain_generation_span.metadata["is_langchain_root"] is True


🔴 The test test_callback_generated_from_trace_chat incorrectly asserts is_langchain_root on the GENERATION span rather than the CHAIN wrapper that is the actual LangChain root. When ChatOpenAI.invoke() is called, LangChain fires on_chain_start with parent_run_id=None first (creating a CHAIN wrapper), then on_chat_model_start with a non-None parent_run_id, so the flag lands on the CHAIN, not the GENERATION. The test should filter for the root CHAIN span (as test_callback_generated_from_lcel_chain correctly does) rather than asserting is_langchain_root on the GENERATION span.

What the bug is and how it manifests

Line 70 asserts langchain_generation_span.metadata["is_langchain_root"] is True on the ChatOpenAI GENERATION span. The _get_langchain_observation_metadata helper only sets is_langchain_root=True when parent_run_id is None. When ChatOpenAI.invoke() is called, LangChain fires two callbacks: on_chain_start (with parent_run_id=None, creating a CHAIN wrapper as the LangChain root) and then on_chat_model_start (with parent_run_id=<chain_run_id>, a non-None value). So the CHAIN wrapper gets is_langchain_root=True, while the GENERATION does not.

The specific code path that triggers it

In __on_llm_action, the metadata is set via self._get_langchain_observation_metadata(parent_run_id=parent_run_id, ...). When on_chat_model_start fires for a ChatOpenAI.invoke() call, parent_run_id equals the run ID of the previously created CHAIN wrapper (not None). The helper function reaches if parent_run_id is not None: return observation_metadata without setting the is_langchain_root key.

Why existing evidence confirms this, not refutes it

The test itself already asserts len(trace.observations) == 3 (line 54), which means there are exactly 3 observations: (1) the Langfuse parent from start_as_current_observation, (2) a CHAIN wrapper created by on_chain_start, and (3) a GENERATION created by on_chat_model_start. If LangChain only fired on_chat_model_start with parent_run_id=None (as refuters claim), there would be only 2 observations total, contradicting line 54. This count of 3 is also independently confirmed by test_multimodal, which similarly calls model.invoke() directly and asserts len(trace.observations) == 3. Furthermore, git history shows commit 840cf2a explicitly changed the observation count in this test from 2 to 3, documenting the LangChain behavioral change where on_chain_start now fires before on_chat_model_start even for direct model invocations.

Step-by-step proof

1. `chat.invoke(messages, config={"callbacks": [handler]})` is called.
2. LangChain fires `on_chain_start(run_id=UUID_A, parent_run_id=None)` → `_get_langchain_observation_metadata` sees `parent_run_id is None` and sets `is_langchain_root=True` on the CHAIN observation.
3. LangChain fires `on_chat_model_start(run_id=UUID_B, parent_run_id=UUID_A)` → `_get_langchain_observation_metadata` sees `parent_run_id is not None` and returns metadata without `is_langchain_root`.
4. The test filters for `o.type == "GENERATION"` and `o.name == "ChatOpenAI"`, finding observation UUID_B.
5. `langchain_generation_span.metadata["is_langchain_root"]` is either missing or `None`, causing the assertion on line 70 to fail (either `KeyError` or `AssertionError`).
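
The sequence in this proof can be replayed with a toy recorder. This is a sketch only: `RecordingHandler` is a hypothetical stand-in that models just the `parent_run_id` check, and it does not call LangChain or Langfuse.

```python
from typing import Any, Dict, List, Optional
from uuid import UUID, uuid4


class RecordingHandler:
    """Toy stand-in for the callback handler; stores one dict per observation."""

    def __init__(self) -> None:
        self.observations: List[Dict[str, Any]] = []

    def _metadata(self, parent_run_id: Optional[UUID]) -> Dict[str, Any]:
        # Only runs without a parent are marked as LangChain roots.
        return {"is_langchain_root": True} if parent_run_id is None else {}

    def on_chain_start(self, run_id: UUID, parent_run_id: Optional[UUID] = None) -> None:
        self.observations.append(
            {"run_id": run_id, "type": "CHAIN", "metadata": self._metadata(parent_run_id)}
        )

    def on_chat_model_start(self, run_id: UUID, parent_run_id: Optional[UUID] = None) -> None:
        self.observations.append(
            {"run_id": run_id, "type": "GENERATION", "metadata": self._metadata(parent_run_id)}
        )


handler = RecordingHandler()
chain_id, generation_id = uuid4(), uuid4()

# Same order as in the proof: the chain wrapper first, then the chat model as its child.
handler.on_chain_start(run_id=chain_id, parent_run_id=None)
handler.on_chat_model_start(run_id=generation_id, parent_run_id=chain_id)

flagged_types = [
    obs["type"] for obs in handler.observations if obs["metadata"].get("is_langchain_root")
]
print(flagged_types)
```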

How to fix it

Mirror the pattern used in test_callback_generated_from_lcel_chain: filter all observations for those with observation.metadata.get("is_langchain_root") and assert the single result has type == "CHAIN". Remove the incorrect line 70 assertion from test_callback_generated_from_trace_chat and replace it with a check that the CHAIN wrapper (not the GENERATION) carries the flag.
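
A hedged sketch of what the corrected check could look like. `FakeObservation`, `assert_single_chain_root`, and the observation names are assumptions for illustration; the real test would run the same filter over `trace.observations`.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class FakeObservation:
    """Hypothetical stand-in for an observation returned with a trace."""

    name: str
    type: str
    metadata: Optional[Dict[str, Any]] = None


def assert_single_chain_root(observations: List[FakeObservation]) -> FakeObservation:
    """Assert exactly one observation carries the flag and that it is a CHAIN."""
    roots = [
        obs for obs in observations if (obs.metadata or {}).get("is_langchain_root")
    ]
    assert len(roots) == 1, f"expected one LangChain root, found {len(roots)}"
    assert roots[0].type == "CHAIN"
    return roots[0]


# Three observations, matching the count the existing test already asserts.
observations = [
    FakeObservation(name="parent", type="SPAN"),
    FakeObservation(name="chain-wrapper", type="CHAIN", metadata={"is_langchain_root": True}),
    FakeObservation(name="ChatOpenAI", type="GENERATION", metadata={}),
]

root = assert_single_chain_root(observations)
print(root.name, root.type)
```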

mark LangChain roots in metadata

6dd8400

hassiebp marked this pull request as ready for review April 1, 2026 16:39

Merge branch 'main' into codex/langchain-root-metadata

03e0633

simplify LangChain root metadata tests

ccda750

claude bot reviewed Apr 1, 2026

hassiebp changed the title ~~[codex] Mark LangChain root observations in metadata~~ feat(langchain): mark LangChain root observations in metadata Apr 1, 2026

hassiebp merged commit 6f9eaf2 into main Apr 1, 2026
13 checks passed

hassiebp deleted the codex/langchain-root-metadata branch April 1, 2026 16:59

claude bot reviewed Apr 1, 2026

hassiebp merged 3 commits into main from codex/langchain-root-metadata