Created: 2025-01-08
Purpose: Document LangSmith tracing integration for RAG system
Status: ✅ COMPLETE
The RAG system now uses LangChain's ChatGoogleGenerativeAI for automatic LangSmith tracing. This means all LLM calls are automatically logged to LangSmith without custom tracking code.
The ContextAwareAgent now uses LangChain's ChatGoogleGenerativeAI:
```python
import os

from langchain_core.messages import HumanMessage, SystemMessage
from langchain_google_genai import ChatGoogleGenerativeAI

api_key = os.environ["GOOGLE_API_KEY"]  # Gemini API key

# Create LLM with automatic LangSmith tracing
llm = ChatGoogleGenerativeAI(
    model="gemini-2.0-flash-exp",
    google_api_key=api_key,
    temperature=0.7,
    convert_system_message_to_human=True,
)

# Invoke - automatically traced to LangSmith
messages = [
    SystemMessage(content="You are a helpful AI assistant..."),
    HumanMessage(content=prompt),
]
response = await llm.ainvoke(messages)  # inside an async method
```

Benefits:
- ✅ Automatic tracing - No custom tracking code needed
- ✅ Full transparency - All prompts, responses, tokens visible in LangSmith
- ✅ Performance metrics - Latency, token usage automatically tracked
- ✅ Error tracking - Failures and retries logged automatically
- ✅ Cost tracking - Token usage for cost estimation
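The token counts behind the cost-tracking bullet are also exposed on the LangChain response object itself, which is what LangSmith records. A minimal sketch, continuing from the snippet above; `usage_metadata` is populated by langchain-core when the provider reports usage, so availability depends on the model and library version:

```python
# After: response = await llm.ainvoke(messages)
usage = response.usage_metadata  # None if the provider reported no usage
if usage:
    # UsageMetadata is a dict-like structure with these keys
    print(usage["input_tokens"], usage["output_tokens"], usage["total_tokens"])
```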
Add to `.streamlit/secrets.toml`:

```toml
# LangSmith Configuration
LANGCHAIN_TRACING_V2 = "true"
LANGCHAIN_API_KEY = "your-langsmith-api-key-here"
LANGCHAIN_PROJECT = "ai-dev-agent-rag"
```

The RAG UI automatically loads these and sets environment variables.
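If you run the pipeline outside Streamlit (scripts, tests), the same three variables can be set directly in the environment before the LLM is invoked. A minimal sketch with the same values as `secrets.toml`:

```python
import os

# Must be set before the first traced LLM call
os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_API_KEY"] = "your-langsmith-api-key-here"
os.environ["LANGCHAIN_PROJECT"] = "ai-dev-agent-rag"
```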
When you use the RAG system, LangSmith automatically logs:
- **RAG Query Processing**
  - Original user query
  - Query rewriting variants
  - Key concept extraction
- **Context Retrieval**
  - Semantic search results
  - Retrieved document chunks
  - Relevance scores
- **LLM Generation**
  - Full prompt with context
  - System message
  - LLM response
  - Token usage
  - Latency
- **End-to-End Trace**
  - Complete flow from query → retrieval → generation → response
  - Time spent in each stage
  - Any errors or retries
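The `ChatGoogleGenerativeAI` call is traced automatically; if the rewriting and retrieval stages are plain Python functions rather than LangChain components, they can be wrapped with LangSmith's `@traceable` decorator so they appear as named spans inside the same trace. A minimal sketch; the function names and bodies here are hypothetical, not the project's actual API:

```python
from langsmith import traceable


@traceable(name="rewrite_query")
def rewrite_query(query: str) -> list[str]:
    """Appears as a named span in the parent trace."""
    return [query]  # a real implementation would generate variants


@traceable(name="retrieve_context")
def retrieve_context(variants: list[str]) -> list[str]:
    """Semantic search stage, stubbed here for illustration."""
    return [f"chunk for: {v}" for v in variants]
```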
View traces in the LangSmith dashboard: 🔗 https://smith.langchain.com/
- **Project: ai-dev-agent-rag**
  - All RAG queries listed chronologically
- **Individual Traces**
  - Full execution tree
  - Input (user query + context)
  - Output (LLM response)
  - Metadata (tokens, latency, model)
- **Performance Metrics**
  - Average response time
  - Token usage trends
  - Success/error rates
  - Cost estimates
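The same data can be pulled programmatically with the LangSmith SDK. A minimal sketch; it assumes `LANGCHAIN_API_KEY` is set in the environment:

```python
from datetime import datetime, timedelta, timezone

from langsmith import Client

client = Client()  # reads the API key from the environment

# Summarize the last 24 hours of runs in the project
runs = list(client.list_runs(
    project_name="ai-dev-agent-rag",
    start_time=datetime.now(timezone.utc) - timedelta(days=1),
))
errors = sum(1 for run in runs if run.error)
print(f"{len(runs)} runs, {errors} with errors")
```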
To generate a first trace:
- Navigate to the "🧪 Testing & Evaluation" page
- Enter a test query
- Click "🚀 Run Test"
- Go to the LangSmith dashboard to see the trace
To trace a full chat interaction:
- Navigate to the "💬 Agent Chat" page
- Ask any question
- Check LangSmith for the complete trace, including:
  - Query rewriting
  - Multi-stage retrieval
  - Context deduplication
  - LLM generation with full prompt
We removed custom tracking code because LangSmith provides:
- ✅ Better visualization
- ✅ Standard format
- ✅ No maintenance overhead
- ✅ Industry-standard tooling
We kept these for in-app visualization:
- Query variants (query rewriting stage)
- Retrieved context chunks with scores
- Multi-signal scoring breakdown
- Processing pipeline stages
Why? Users testing in the UI need immediate feedback without switching to LangSmith.
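As an illustration of that in-app feedback, retrieved chunks with scores can be rendered with plain Streamlit widgets. A hedged sketch; the data shape and actual UI code in the app differ:

```python
import streamlit as st

# Hypothetical shape of a retrieval result - illustrative only
chunks = [
    {"text": "LangSmith traces every LLM call...", "score": 0.91},
    {"text": "Set LANGCHAIN_TRACING_V2=true to enable...", "score": 0.84},
]

st.subheader("Retrieved context")
for i, chunk in enumerate(chunks, start=1):
    with st.expander(f"Chunk {i} (score: {chunk['score']:.2f})"):
        st.write(chunk["text"])
```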
The updated method in ContextAwareAgent:

```python
async def _call_gemini(self, prompt: str) -> str:
    """
    Call Gemini API with LangChain for automatic LangSmith tracing.
    """
    if LANGCHAIN_AVAILABLE:
        # Use LangChain - automatically traced
        llm = ChatGoogleGenerativeAI(
            model="gemini-2.0-flash-exp",
            google_api_key=api_key,
            temperature=0.7,
        )
        messages = [
            SystemMessage(content="You are a helpful AI assistant..."),
            HumanMessage(content=prompt),
        ]
        response = await llm.ainvoke(messages)
        return response.content
    else:
        # Fallback to direct API (no tracing)
        # ... direct genai call ...
```

The RAG UI enables LangSmith tracing from Streamlit secrets:

```python
# Enable LangSmith tracing from secrets
try:
    if 'LANGCHAIN_TRACING_V2' in st.secrets:
        os.environ['LANGCHAIN_TRACING_V2'] = str(st.secrets['LANGCHAIN_TRACING_V2'])
    if 'LANGCHAIN_API_KEY' in st.secrets:
        os.environ['LANGCHAIN_API_KEY'] = st.secrets['LANGCHAIN_API_KEY']
    if 'LANGCHAIN_PROJECT' in st.secrets:
        os.environ['LANGCHAIN_PROJECT'] = st.secrets['LANGCHAIN_PROJECT']
except Exception:
    pass  # Fall back to environment variables
```

To verify LangSmith integration is working:
- Added LangSmith API key to `.streamlit/secrets.toml`
- Set `LANGCHAIN_TRACING_V2 = "true"`
- Started the RAG UI (`streamlit run apps/rag_management_app.py`)
- Ran a test query in "Agent Chat" or "Testing & Evaluation"
- Checked the LangSmith dashboard at https://smith.langchain.com/
- Verified the trace appears with full prompt/response
- Confirmed token usage and latency metrics are visible
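A quick programmatic sanity check before running queries, a minimal sketch:

```python
import os

# Tracing only activates when these are present in the environment
for var in ("LANGCHAIN_TRACING_V2", "LANGCHAIN_API_KEY", "LANGCHAIN_PROJECT"):
    print(f"{var} = {os.environ.get(var, '<not set>')}")
```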
| Feature | Before | After (with LangSmith) |
|---|---|---|
| Tracing | Custom logging | ✅ Automatic LangChain tracing |
| Visibility | Local logs only | ✅ Web dashboard with search |
| Metrics | Manual calculation | ✅ Auto token/latency tracking |
| Debugging | Text logs | ✅ Visual execution tree |
| Cost Tracking | Manual | ✅ Automatic token-based |
| Sharing | Export logs | ✅ Share dashboard links |
| Maintenance | Custom code | ✅ Zero maintenance |
Status: ✅ Complete and Production Ready
Last Updated: 2025-01-08