feat(langchain): mark LangChain root observations in metadata #1604
Merged · +50 −6
🔴 The test `test_callback_generated_from_trace_chat` incorrectly asserts `is_langchain_root` on the GENERATION span rather than on the CHAIN wrapper that is the actual LangChain root. When `ChatOpenAI.invoke()` is called, LangChain first fires `on_chain_start` with `parent_run_id=None` (creating a CHAIN wrapper), then `on_chat_model_start` with a non-`None` `parent_run_id`, so the flag lands on the CHAIN, not the GENERATION. The test should filter for the root CHAIN span (as `test_callback_generated_from_lcel_chain` correctly does) rather than asserting `is_langchain_root` on the GENERATION span.
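The callback ordering described above can be sketched in plain Python. The helper below is a simplified stand-in, not the real Langfuse API; only the `parent_run_id` gating mirrors the behavior this review describes:

```python
# Simplified stand-in for the metadata helper discussed in this review:
# the root flag is only set when the run has no parent.
def get_observation_metadata(parent_run_id):
    observation_metadata = {}
    if parent_run_id is not None:
        return observation_metadata  # child run: no root flag
    observation_metadata["is_langchain_root"] = True
    return observation_metadata


observations = {}

# ChatOpenAI.invoke() fires on_chain_start first (no parent)...
observations["UUID_A"] = {
    "type": "CHAIN",
    "metadata": get_observation_metadata(parent_run_id=None),
}
# ...then on_chat_model_start, parented to the CHAIN wrapper.
observations["UUID_B"] = {
    "type": "GENERATION",
    "metadata": get_observation_metadata(parent_run_id="UUID_A"),
}

# The flag lands on the CHAIN wrapper, not the GENERATION.
assert observations["UUID_A"]["metadata"].get("is_langchain_root") is True
assert "is_langchain_root" not in observations["UUID_B"]["metadata"]
```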
**What the bug is and how it manifests**
Line 70 asserts `langchain_generation_span.metadata["is_langchain_root"] is True` on the ChatOpenAI GENERATION span. The `_get_langchain_observation_metadata` helper only sets `is_langchain_root=True` when `parent_run_id is None`. When `ChatOpenAI.invoke()` is called, LangChain fires two callbacks: `on_chain_start` (with `parent_run_id=None`, creating a CHAIN wrapper as the LangChain root) and then `on_chat_model_start` (with `parent_run_id=<chain_run_id>`, a non-`None` value). So the CHAIN wrapper gets `is_langchain_root=True`, while the GENERATION does not.

**The specific code path that triggers it**
In `__on_llm_action`, the metadata is set via `self._get_langchain_observation_metadata(parent_run_id=parent_run_id, ...)`. When `on_chat_model_start` fires for a `ChatOpenAI.invoke()` call, `parent_run_id` equals the run ID of the previously created CHAIN wrapper (not `None`). The helper therefore reaches `if parent_run_id is not None: return observation_metadata` without ever setting the `is_langchain_root` key.

**Why existing evidence confirms this, not refutes it**
The test itself already asserts `len(trace.observations) == 3` (line 54), which means there are exactly 3 observations: (1) the Langfuse parent from `start_as_current_observation`, (2) a CHAIN wrapper created by `on_chain_start`, and (3) a GENERATION created by `on_chat_model_start`. If LangChain only fired `on_chat_model_start` with `parent_run_id=None` (as refuters claim), there would be only 2 observations total, contradicting line 54. This count of 3 is also independently confirmed by `test_multimodal`, which similarly calls `model.invoke()` directly and asserts `len(trace.observations) == 3`. Furthermore, git history shows commit 840cf2a explicitly changed the observation count in this test from 2 to 3, documenting the LangChain behavioral change where `on_chain_start` now fires before `on_chat_model_start` even for direct model invocations.

**Step-by-step proof**
1. `chat.invoke(messages, config={"callbacks": [handler]})` is called.
2. `on_chain_start(run_id=UUID_A, parent_run_id=None)` fires → `_get_langchain_observation_metadata` sees `parent_run_id is None` and sets `is_langchain_root=True` on the CHAIN observation.
3. `on_chat_model_start(run_id=UUID_B, parent_run_id=UUID_A)` fires → `_get_langchain_observation_metadata` sees `parent_run_id is not None` and returns metadata without `is_langchain_root`.
4. The test filters observations for `o.type == "GENERATION" and o.name == "ChatOpenAI"`, finding observation UUID_B.
5. `langchain_generation_span.metadata["is_langchain_root"]` is either missing or `None`, so the assertion on line 70 fails (with either a `KeyError` or an `AssertionError`).

**How to fix it**
Mirror the pattern used in `test_callback_generated_from_lcel_chain`: filter all observations for those with `observation.metadata.get("is_langchain_root")` and assert the single result has `type == "CHAIN"`. Remove the incorrect line 70 assertion from `test_callback_generated_from_trace_chat` and replace it with a check that the CHAIN wrapper (not the GENERATION) carries the flag.
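A hedged sketch of that corrected assertion pattern, using `SimpleNamespace` stand-ins for the three observations the test already expects (the observation names here are illustrative, not taken from the real trace):

```python
from types import SimpleNamespace

# Stand-ins for the three observations asserted by the test: the Langfuse
# parent span, the CHAIN wrapper, and the GENERATION.
trace = SimpleNamespace(observations=[
    SimpleNamespace(type="SPAN", name="parent", metadata={}),
    SimpleNamespace(type="CHAIN", name="chain-wrapper",
                    metadata={"is_langchain_root": True}),
    SimpleNamespace(type="GENERATION", name="ChatOpenAI", metadata={}),
])

# Filter for the root flag, as test_callback_generated_from_lcel_chain does.
root_observations = [
    o for o in trace.observations if o.metadata.get("is_langchain_root")
]
assert len(root_observations) == 1
assert root_observations[0].type == "CHAIN"  # the flag is on the CHAIN wrapper
```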