feat(docs): add RAG observability cookbook example #3062

Open
AryanBhansali-SE wants to merge 1 commit into langfuse:main from AryanBhansali-SE:docs-rag-observability-example

AryanBhansali-SE commented Jun 6, 2026

Adds a new cookbook guide covering how to instrument a RAG pipeline with Langfuse using @observe, start_as_current_observation(as_type="retriever"), and the LangChain CallbackHandler. Includes user/session context propagation and links to the RAGAS evaluation cookbook and Experiments docs as next steps.

Greptile Summary

This PR adds a new RAG observability cookbook guide demonstrating how to instrument a retrieval-augmented generation pipeline with Langfuse using @observe, start_as_current_observation(as_type="retriever"), and the LangChain CallbackHandler.

  • MDX guide (rag_observability_example.mdx): Five well-structured steps covering dependency installation, environment setup, pipeline tracing, user/session context propagation, and next-step evaluation links. Code uses correct Langfuse v3 APIs throughout.
  • Notebook stub (rag_observability_example.ipynb): Contains only a metadata markdown cell — no executable code cells — while the companion MDX holds all the actual examples. The route entry registers this as an interactive guide (isGuide: true, docsPath: null), so users who download or open the notebook will find it empty.
  • Routing/sidebar entries: _routes.json and meta.json additions are consistent with existing patterns.

Confidence Score: 3/5

The MDX guide is correct but the notebook shipped is an empty stub — users who download it get nothing runnable.

The MDX content and API usage are well-structured and correct. However, the .ipynb file has no code cells at all; every comparable guide in the repo ships with actual content in the notebook. Because _routes.json registers this as a guide with no separate docs path, the notebook is the primary interactive artifact and it is empty.

cookbook/rag_observability_example.ipynb needs the full code content added before merge.

Sequence Diagram

sequenceDiagram
    participant User
    participant rag_pipeline as rag_pipeline() @observe
    participant build_retriever as build_retriever()
    participant FAISS as FAISS VectorStore
    participant LangfuseCtx as Langfuse Context (retriever span)
    participant LLMChain as answer_chain (LangChain)
    participant OpenAI as OpenAI LLM
    participant Langfuse as Langfuse Backend

    User->>rag_pipeline: question, urls
    Note over rag_pipeline,Langfuse: @observe() opens root trace
    rag_pipeline->>build_retriever: urls
    build_retriever->>FAISS: WebBaseLoader + chunk + embed
    FAISS-->>rag_pipeline: retriever
    rag_pipeline->>LangfuseCtx: "start_as_current_observation(as_type=retriever)"
    LangfuseCtx->>FAISS: retriever.invoke(question)
    FAISS-->>LangfuseCtx: docs[]
    LangfuseCtx->>Langfuse: update output (chunks + sources)
    rag_pipeline->>LLMChain: "invoke({context, question}, callbacks=[CallbackHandler])"
    LLMChain->>OpenAI: prompt + context
    OpenAI-->>LLMChain: answer
    LLMChain->>Langfuse: generation span (prompt, tokens, response)
    LLMChain-->>rag_pipeline: answer string
    rag_pipeline-->>User: "{answer, num_docs}"
    rag_pipeline->>Langfuse: "flush (root trace closed by @observe)"
Prompt To Fix All With AI
Fix the following 2 code review issues. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 2
cookbook/rag_observability_example.ipynb:1-24
**Notebook is an empty stub with no runnable code**

The `.ipynb` file contains only a single metadata markdown cell. Every other cookbook registered under `_routes.json` with `isGuide: true` — including `error-analysis-llm-applications.ipynb` and `evaluation_of_rag_with_ragas.ipynb` — ships with actual content cells. When a user downloads this notebook from the cookbook page, they receive a file with zero executable code, while all the actual examples live only in the accompanying MDX. The route registration (`"isGuide": true`, `"docsPath": null`) signals this notebook is the primary guide artifact, making the missing content a real gap for anyone trying to run the cookbook interactively.

### Issue 2 of 2
content/guides/cookbook/rag_observability_example.mdx:146-185
**Step 5 silently depends on variables initialized in Step 3**

`rag_pipeline_with_context` calls `build_retriever`, uses `answer_chain` and `langfuse_handler`, and invokes `langfuse.start_as_current_observation` — all defined or initialized in Step 3's code block. There is no note directing the reader to run Step 3 first, and `langfuse_handler` is never re-declared in Step 5. A reader who navigates directly to Step 5 (e.g., via a search or link) will get `NameError` on first execution with no obvious hint about the missing setup.

Reviews (1): Last reviewed commit: "feat(docs): add RAG observability cookbo..."

Greptile also left 2 inline comments on this PR.

Adds a new cookbook guide covering how to instrument a RAG pipeline with
Langfuse using @observe, start_as_current_observation(as_type="retriever"),
and the LangChain CallbackHandler. Includes user/session context propagation
and links to the RAGAS evaluation cookbook and Experiments docs as next steps.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

claude Bot left a comment

Claude Code Review

This pull request is from a fork — automated review is disabled. A repository maintainer can comment @claude review to run a one-time review.

vercel Bot commented Jun 6, 2026

@AryanBhansali-SE is attempting to deploy a commit to the langfuse Team on Vercel.

A member of the Team first needs to authorize it.

@review-notebook-app

Check out this pull request on ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

dosubot Bot added the size:L label (this PR changes 100-499 lines, ignoring generated files) Jun 6, 2026
CLAassistant commented Jun 6, 2026

CLA assistant check
All committers have signed the CLA.

dosubot Bot added the documentation label (improvements or additions to documentation) Jun 6, 2026
Comment on lines +1 to +24
{
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "<!-- NOTEBOOK_METADATA title: \"RAG Observability with Langfuse\" sidebarTitle: \"RAG Observability\" description: \"Learn how to trace and observe RAG (Retrieval-Augmented Generation) pipelines with Langfuse using the @observe decorator and the retriever observation type.\" category: \"Observability\" -->"
      ]
    }
  ],
  "metadata": {
    "kernelspec": {
      "display_name": "Python 3",
      "language": "python",
      "name": "python3"
    },
    "language_info": {
      "name": "python",
      "version": "3.11.0"
    }
  },
  "nbformat": 4,
  "nbformat_minor": 4
}

P1 Notebook is an empty stub with no runnable code

The .ipynb file contains only a single metadata markdown cell. Every other cookbook registered under _routes.json with isGuide: true — including error-analysis-llm-applications.ipynb and evaluation_of_rag_with_ragas.ipynb — ships with actual content cells. When a user downloads this notebook from the cookbook page, they receive a file with zero executable code, while all the actual examples live only in the accompanying MDX. The route registration ("isGuide": true, "docsPath": null) signals this notebook is the primary guide artifact, making the missing content a real gap for anyone trying to run the cookbook interactively.
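
One way to catch an empty stub like this before merge is a small check that counts non-empty code cells in the notebook JSON. The sketch below is only an illustration using the standard library; `count_code_cells` and the two sample notebooks are hypothetical and not part of this PR or the repository's tooling. It assumes the nbformat v4 layout shown in the diff above, where `source` may be a string or a list of strings.

```python
import json


def count_code_cells(notebook_json: str) -> int:
    """Return the number of non-empty code cells in an nbformat v4 notebook."""
    nb = json.loads(notebook_json)
    count = 0
    for cell in nb.get("cells", []):
        source = cell.get("source", "")
        # nbformat allows `source` to be a single string or a list of lines.
        text = "".join(source) if isinstance(source, list) else source
        if cell.get("cell_type") == "code" and text.strip():
            count += 1
    return count


# Hypothetical sample: a metadata-only stub, like the notebook in this PR.
stub = json.dumps({
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["<!-- NOTEBOOK_METADATA -->"]},
    ],
    "metadata": {}, "nbformat": 4, "nbformat_minor": 4,
})

# Hypothetical sample: a notebook with one runnable cell.
filled = json.dumps({
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Title"]},
        {"cell_type": "code", "metadata": {}, "source": ["print('hi')"],
         "outputs": [], "execution_count": None},
    ],
    "metadata": {}, "nbformat": 4, "nbformat_minor": 4,
})

stub_cells = count_code_cells(stub)
filled_cells = count_code_cells(filled)
```

A pre-merge script could read each cookbook `.ipynb` file, call a function like this, and fail when the count is zero.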

Comment on lines +146 to +185
| Observation | Type | What it captures |
|---|---|---|
| `rag_pipeline` | span (root) | Question as input, final answer as output |
| `retrieve_documents` | retriever | Query as input, list of retrieved chunks with source URLs as output |
| LLM call | generation | Full prompt, model name, token usage, and raw response |

The `retriever` observation type renders with a distinct icon in the Langfuse UI, making it easy to filter all retrieval steps across your traces and spot quality issues in your document retrieval.

![RAG trace in Langfuse](/images/docs/tracing-rag.png)

## Step 5: Add user and session context

Tag traces with user and session identifiers to make filtering and debugging easier in production.

```python
from langfuse import observe, propagate_attributes

@observe()
def rag_pipeline_with_context(
    question: str,
    urls: list[str],
    user_id: str | None = None,
    session_id: str | None = None,
) -> dict:
    # Propagate user/session context to all nested observations
    with propagate_attributes(
        user_id=user_id,
        session_id=session_id,
        tags=["rag"],
        metadata={"urls": urls},
    ):
        retriever = build_retriever(urls)

        with langfuse.start_as_current_observation(
            as_type="retriever",
            name="retrieve_documents",
            input=question,
            metadata={"chunk_size": 512, "top_k": 4},
        ) as retriever_span:
            docs = retriever.invoke(question)
```

P2 Step 5 silently depends on variables initialized in Step 3

rag_pipeline_with_context calls build_retriever, uses answer_chain and langfuse_handler, and invokes langfuse.start_as_current_observation — all defined or initialized in Step 3's code block. There is no note directing the reader to run Step 3 first, and langfuse_handler is never re-declared in Step 5. A reader who navigates directly to Step 5 (e.g., via a search or link) will get NameError on first execution with no obvious hint about the missing setup.
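
Besides adding a note that points readers back to Step 3, the guide could fail early with a clear message instead of a bare `NameError`. The following is a minimal sketch of that idea, not a proposed patch: `missing_names` and `REQUIRED` are hypothetical, and the required names are the ones this comment lists. In a notebook the namespace would typically be `globals()`; a plain dict stands in for it here so the example runs on its own.

```python
def missing_names(namespace: dict, required: list[str]) -> list[str]:
    """Return the required names that are not defined in the given namespace."""
    return [name for name in required if name not in namespace]


# Names that Step 5 relies on, per the review comment above.
REQUIRED = ["build_retriever", "answer_chain", "langfuse_handler", "langfuse"]

# Simulated namespace in which only part of Step 3 has been run.
partial = {"build_retriever": object(), "langfuse": object()}

missing = missing_names(partial, REQUIRED)
message = (
    f"Run Step 3 first; undefined: {', '.join(missing)}" if missing else "ready"
)
```

With the simulated namespace, `message` names `answer_chain` and `langfuse_handler` as the missing setup, which is the hint a reader landing directly on Step 5 would need.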

Labels

documentation (Improvements or additions to documentation)
size:L (This PR changes 100-499 lines, ignoring generated files.)

Projects

None yet

2 participants