Skip to content

Examples/crm ops desk#264

Open
zhirafovod wants to merge 8 commits intomainfrom
examples/crm-ops-desk
Open

Examples/crm ops desk#264
zhirafovod wants to merge 8 commits intomainfrom
examples/crm-ops-desk

Conversation

@zhirafovod
Copy link
Copy Markdown
Contributor

feat: add CRM Ops Desk LangGraph multi-agent demo

Fully-native LangGraph example app exercising multi-agent orchestration
with SDOT auto-instrumentation. Replaces the previous ad-hoc CRM demo
with proper LangGraph patterns: TypedDict state, @tool-decorated
functions, ChatOpenAI.bind_tools(), ToolNode, and conditional edges.

  • Records Agent: MongoDB Atlas vector search for orders, tickets, refunds
  • Policy Agent: RAG-based policy retrieval with drift mode support
  • Action Agent: LLM-driven tool selection via bind_tools (no manual SDOT)
  • Audit Agent: Rationale generation with citations
  • 7 scenarios covering refunds, escalations, order inquiries, hallucination

Fully-native LangGraph example app exercising multi-agent orchestration
with SDOT auto-instrumentation. Replaces the previous ad-hoc CRM demo
with proper LangGraph patterns: TypedDict state, @tool-decorated
functions, ChatOpenAI.bind_tools(), ToolNode, and conditional edges.

- Records Agent: MongoDB Atlas vector search for orders, tickets, refunds
- Policy Agent: RAG-based policy retrieval with drift mode support
- Action Agent: LLM-driven tool selection via bind_tools (no manual SDOT)
- Audit Agent: Rationale generation with citations
- 7 scenarios covering refunds, escalations, order inquiries, hallucination
The action_agent → tool_executor → action_summarise loop created
multiple invoke_agent spans. Collapse the tool-calling loop into a
single action_node so the trace shape matches the original CRM app:
exactly one invoke_agent span per agent (Records, Policy, Action, Audit).
Close behavioral gaps identified by side-by-side review:
- Orders vector search: limit 3 → 1 (match old atlas_client.py)
- Policy region detection: add missing country codes (USA, ITALY, SPAIN)
  and default-to-EU for unknown countries
- create_ticket / update_ticket: restore `comments` field
- Cost model: restore per-tool variable costs instead of flat rate
- Resolution logic: add "no_action_required", "refund_state_explained",
  and "action_failed" to match old _determine_resolution()
- AuditOutput: restore `span_ids` field
Extract the monolithic app.py into app/ package matching the original
CRM demo repo structure: agents/, models/, rag/, tools/, graph.py.
No logic changes — just reorganization for maintainability.
@zhirafovod zhirafovod requested review from a team as code owners April 13, 2026 17:28
- Pass RunnableConfig to action_node and summarize_node so graph-level
  callbacks propagate to inner LLM and tool calls
- Tools now return JSON strings for reliable output capture
- Simplified explain_refund_state and explain_order_state DB lookups
- Improved Action Agent prompt for better context adherence
Add scenarios designed to exercise all 15 Galileo eval metrics:
- pii_leak_refund: input_pii + output_pii (PII in query, tool echoes it)
- prompt_injection_attempt: prompt_injection, context_adherence
- toxic_abusive_customer: input_toxicity, input_tone
- incomplete_multi_request: completeness, action_completion
- tool_failure_scenario: tool_error_rate, action_advancement
- vague_rambling_query: agent_efficiency, tool_selection_quality
- hostile_context_leakage: output_tone, output_toxicity

Add run-sdot-batch.sh for batch execution with mixed scenario plan
(30% baseline, 70% metric triggers) and random delay between runs.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant