Created: 2025-01-29
Purpose: Comprehensive overview of LangChain-compatible RAG system
Status: 🚀 Active Development
Build a LangChain-native RAG system with sophisticated multi-agent orchestration and Human-in-the-Loop (HITL) control using official LangChain patterns.
Five expert agents working together:
| Agent | Role | LLM Model | Responsibilities |
|---|---|---|---|
| QueryAnalystAgent | Query Understanding | Gemini 2.5 Flash | Intent classification, query rewriting, concept extraction |
| RetrievalSpecialistAgent | Context Retrieval | N/A (pure search) | Multi-strategy search, query expansion |
| ReRankerAgent | Result Ranking | N/A (algorithmic) | Multi-signal scoring, deduplication |
| QualityAssuranceAgent | Quality Validation | Gemini 2.5 Flash | Quality assessment, coverage analysis |
| WriterAgent | Response Synthesis | Gemini 2.5 Flash | Answer generation, citation, formatting |
Coordinated by: RAGSwarmCoordinator (LangGraph orchestration)
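In outline, the coordination is a sequential hand-off of shared state between agents. The sketch below is a minimal pure-Python illustration of that hand-off (the real coordinator is a LangGraph graph; these function bodies and state keys are illustrative, not the actual implementation):

```python
from typing import Callable

# Illustrative stand-ins for the five agents. Each takes and returns a
# shared state dict, mirroring how LangGraph nodes pass state along.
def query_analyst(state: dict) -> dict:
    state["intent"] = "factual"
    return state

def retrieval(state: dict) -> dict:
    state["candidates"] = ["doc_a", "doc_b"]
    return state

def re_ranker(state: dict) -> dict:
    state["context"] = sorted(state["candidates"])
    return state

def quality_assurance(state: dict) -> dict:
    state["quality"] = 0.9
    return state

def writer(state: dict) -> dict:
    state["answer"] = f"Answer using {len(state['context'])} sources"
    return state

def run_pipeline(query: str, agents: list[Callable[[dict], dict]]) -> dict:
    """Thread one state dict through the agents in order."""
    state: dict = {"query": query}
    for agent in agents:
        state = agent(state)
    return state

result = run_pipeline(
    "What is RAG?",
    [query_analyst, retrieval, re_ranker, quality_assurance, writer],
)
```

Each agent only reads and writes the shared state, which is what makes them independently testable and replaceable.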
Following official LangChain HITL patterns:
Three Implementation Options:
Option 1: Deep Agents

```python
from deepagents import create_deep_agent
from langgraph.checkpoint.memory import MemorySaver

rag_agent = create_deep_agent(
    model="anthropic:claude-sonnet-4-20250514",
    tools=[analyze_query, retrieve_context, rerank, assess_quality, generate],
    interrupt_on={
        "analyze_query": {"allowed_decisions": ["approve", "edit", "reject"]},
        "retrieve_context": {"allowed_decisions": ["approve", "edit", "reject"]},
        "rerank_results": {"allowed_decisions": ["approve", "reject"]},
        "assess_quality": {"allowed_decisions": ["approve", "reject"]},
    },
    checkpointer=MemorySaver(),
)
```

Option 2: `create_agent` with HITL middleware

```python
from langchain.agents.middleware import HumanInTheLoopMiddleware

agent = create_agent(
    model="anthropic:claude-sonnet-4-20250514",
    tools=[...],
    middleware=[HumanInTheLoopMiddleware(interrupt_on={...})],
)
```

Option 3: Custom LangGraph implementation

- Keep our sophisticated routing
- Use LangChain's `Command` pattern for resume
- Structured `HITLRequest`/`HITLResponse`
5 Strategic Human Review Points:

1. Query Analysis Review
   - Review intent classification and search strategy
   - Decisions: approve, edit, reject
2. Retrieval Results Review
   - Review retrieved sources and relevance
   - Decisions: approve, edit (add sources), reject
3. Re-ranking Review
   - Review ranked context quality
   - Decisions: approve, reject
4. Quality Assessment Review
   - Review quality score and completeness
   - Decisions: approve, reject (triggers re-retrieval)
5. Final Response Review
   - Review generated answer
   - Decisions: approve, revise, reject
Different tasks use different agent combinations:
Simple QA:

```
query_analyst → retrieval → writer → END
```

Research Article:

```
query_analyst → [HITL] → retrieval → [HITL] →
re_ranker → [HITL] → writer → [HITL] →
quality_assurance → [HITL] → END
```

Code Generation:

```
query_analyst → retrieval (code-focused) → [HITL] →
writer (code formatter) → [HITL] → END
```
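The routing above can be expressed as a simple lookup. The agent sequences follow the diagrams; the dict and function themselves are an illustrative sketch, not the coordinator's actual routing code:

```python
# Map each task type to its agent sequence; "HITL" marks a human checkpoint.
WORKFLOWS = {
    "simple_qa": ["query_analyst", "retrieval", "writer"],
    "research_article": [
        "query_analyst", "HITL", "retrieval", "HITL",
        "re_ranker", "HITL", "writer", "HITL",
        "quality_assurance", "HITL",
    ],
    "code_generation": [
        "query_analyst", "retrieval", "HITL", "writer", "HITL",
    ],
}

def route(task_type: str) -> list[str]:
    # Fall back to the simplest workflow for unknown task types
    return WORKFLOWS.get(task_type, WORKFLOWS["simple_qa"])
```

A table-driven router keeps workflow definitions declarative, so adding a new task type means adding one dict entry rather than new branching logic.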
See Task-Adaptive Workflows for details.
```
1. User Query
   ↓
2. QueryAnalystAgent
   - Intent: factual | conceptual | procedural | multi-hop
   - Generate 3-5 query variants
   - Recommend search strategy
   ↓
3. [HITL #1: Review Query Analysis]
   - Human: approve | edit | reject
   ↓
4. RetrievalSpecialistAgent
   - Execute multi-strategy search
   - Retrieve 20-30 candidates
   ↓
5. [HITL #2: Review Retrieved Sources]
   - Human: approve | add_source | retry | reject
   ↓
6. ReRankerAgent
   - Multi-signal scoring (semantic, keyword, quality, diversity)
   - Deduplication
   - Position optimization
   - Top 10 results
   ↓
7. [HITL #3: Review Ranked Context]
   - Human: approve | improve_ranking | more_sources
   ↓
8. QualityAssuranceAgent
   - Assess quality score (0-1)
   - Assess coverage
   - Trigger re-retrieval if needed
   ↓
9. [HITL #4: Review Quality Assessment]
   - Human: approve | retry_retrieval
   ↓
10. WriterAgent
    - Synthesize context into answer
    - Cite sources
    - Format response
    ↓
11. [HITL #5: Final Response Review]
    - Human: ship | revise | restart
    ↓
12. END
```
```
combined_score = (
    0.40 × semantic_similarity +   # Vector search score
    0.25 × keyword_overlap +       # Term matching
    0.20 × content_quality +       # Metadata & length
    0.15 × diversity               # Uniqueness
)
```

Re-retrieval is triggered automatically when any of the following holds:
- Quality score < 0.6
- Coverage < 0.5
- Fewer than 3 results
- The QA agent recommends it
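The trigger conditions reduce to a small predicate. This is a sketch using the thresholds listed above; the function name and signature are illustrative:

```python
def needs_reretrieval(quality: float, coverage: float,
                      num_results: int, qa_recommends: bool) -> bool:
    """Return True when any automatic re-retrieval condition fires."""
    return (
        quality < 0.6        # quality score below threshold
        or coverage < 0.5    # coverage below threshold
        or num_results < 3   # too few results retrieved
        or qa_recommends     # QA agent explicitly asked for a retry
    )
```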
Mitigates "lost in the middle" effect:
- Best results at beginning and end
- Middle results in reversed order
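One way to implement that ordering is the alternating placement below. This is a sketch assuming `ranked` is sorted best-first; it is not necessarily the exact scheme the ReRankerAgent uses:

```python
def reorder_for_position(ranked: list) -> list:
    """Place the strongest results at the start and end of the context
    window, pushing the weakest toward the middle, to counter the
    "lost in the middle" effect."""
    front = ranked[0::2]        # 1st, 3rd, 5th, ... best items lead
    back = ranked[1::2][::-1]   # 2nd-best goes last, 4th next-to-last, ...
    return front + back

# Ranks 1 (best) through 5 (worst) become [1, 3, 5, 4, 2]:
# best first, second-best last, worst in the middle.
reorder_for_position([1, 2, 3, 4, 5])
```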
- 5 specialized agents implemented
- RAGSwarmCoordinator with LangGraph
- Thread-based state persistence
- Multi-signal re-ranking
- Quality feedback loop
- Streamlit UI integration
- LangSmith tracing
- LangChain-compatible HITL (Decision: Deep Agents vs. Custom)
- Structured decision handling (approve/edit/reject)
- Decision validation per checkpoint
- Context preview improvements
- Task-adaptive routing (different workflows per task type)
- Multi-session project support
- Advanced source management (URLs, documents, categories)
- Comprehensive testing suite
- Performance benchmarking
| Metric | Target | Current |
|---|---|---|
| Response Quality | Excellent | Good → Excellent |
| Context Relevance | >0.85 | ~0.85 |
| Coverage | >85% | ~88% |
| Latency | <5s | 3-5s |
| Source Citations | Comprehensive | Basic → Comprehensive |
| Re-retrieval | Automatic | ✅ Automatic |
```shell
streamlit run apps/rag_management_app.py --server.port 8510
```

1. Navigate to "💬 Agent Chat"
2. Select "🔥 Agent Swarm (Best Quality)"
3. Enable HITL mode if desired
4. Ask your question
5. Review at each checkpoint
```python
from agents.rag import RAGSwarmCoordinator
from context.context_engine import ContextEngine

# Initialize
context_engine = ContextEngine(context_config)
await context_engine.initialize()
swarm = RAGSwarmCoordinator(context_engine, human_in_loop=True)

# Execute with HITL (thread_id keys the checkpointed state)
config = {"configurable": {"thread_id": "session_123"}}
result = await swarm.execute("Your query here", config=config)

# Handle interrupt
if result["status"] == "interrupted":
    # Present the checkpoint to a human for review
    human_decision = get_human_feedback()
    # Resume from the saved checkpoint
    result = swarm.resume(
        thread_id="session_123",
        human_input=human_decision,
        parent_run_id=result.get("run_id"),
    )
```

- HITL Implementation Plan - Detailed LangChain HITL implementation
- Task-Adaptive Workflows - Different workflows per task type
- Best Practices - Industry RAG best practices
- LangSmith Integration - Tracing and observability
- LangChain-Native: Use official LangChain patterns, not custom implementations
- Human-in-Control: HITL as primary interaction model, not afterthought
- Task-Adaptive: Workflows adapt to task type, not one-size-fits-all
- Quality-First: Automatic quality validation with feedback loops
- Transparent: Full observability via LangSmith
- Modular: Each agent independently testable and replaceable
Status: Active development with LangChain HITL patterns
Next Milestone: Complete LangChain-compatible HITL implementation
Last Updated: 2025-01-29