A demonstration of the Databricks Multi-Agent Supervisor pattern using real USDA bee colony health and pollination data. The supervisor routes questions between structured agricultural data (Genie) and beekeeping guidance documents (Knowledge Assistant).
You build a supervisor agent that sits in front of two specialized sub-agents and decides which one (or both) should handle each user query:
- A user asks a data question ("Top 5 states by colony loss in Q1 2023?") — the supervisor routes to a Genie agent, which writes SQL against three Delta tables (~13,500 rows of real USDA data) and returns tabular results.
- A user asks a guidance question ("How should I monitor varroa mite levels?") — the supervisor routes to a Knowledge Assistant, which retrieves answers from four public-domain PDFs indexed via Vector Search.
- A user asks a question that needs both ("Which stressors hit California hardest in Q1 2023, and what should beekeepers do about it?") — the supervisor calls both agents and synthesizes a single answer that connects the data to actionable recommendations.
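The three routing cases above can be sketched in plain Python. This is only an illustration: the real supervisor relies on an LLM to choose sub-agents, and the keyword lists, the `route` and `handle` functions, and the stub agents below are all hypothetical stand-ins.

```python
from typing import Callable, Dict, List

# Hypothetical keyword hints. The real supervisor uses an LLM, not keywords.
DATA_HINTS = ("top ", "highest", "trend", "q1", "q2", "q3", "q4")
GUIDANCE_HINTS = ("how should", "what should", "recommend", "practices")

SubAgent = Callable[[str], str]


def route(question: str) -> List[str]:
    """Return the names of the sub-agents that should see this question."""
    q = question.lower()
    targets = []
    if any(hint in q for hint in DATA_HINTS):
        targets.append("genie")
    if any(hint in q for hint in GUIDANCE_HINTS):
        targets.append("knowledge_assistant")
    # Default to the document agent when nothing matches (an arbitrary choice).
    return targets or ["knowledge_assistant"]


def handle(question: str, agents: Dict[str, SubAgent]) -> str:
    """Delegate to each selected sub-agent and join the partial answers."""
    return "\n".join(agents[name](question) for name in route(question))


# Stub sub-agents standing in for Genie and the Knowledge Assistant.
agents = {
    "genie": lambda q: "[tabular result from SQL]",
    "knowledge_assistant": lambda q: "[passage retrieved from the PDFs]",
}

print(route("Top 5 states by colony loss in Q1 2023?"))
print(route("How should I monitor varroa mite levels?"))
print(handle("Which stressors hit California hardest in Q1 2023, "
             "and what should beekeepers do about it?", agents))
```

In the deployed demo the final join is replaced by an LLM synthesis step that ties the data to the recommendations.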
The whole stack deploys with one `databricks bundle deploy` and one `databricks bundle run`. The bundle creates the Delta tables, uploads the PDFs to a Unity Catalog Volume, and provisions the Genie Space and Knowledge Assistant. The only manual step is wiring up the Supervisor Agent in the UI (API coming shortly).
After running a few queries, you can inspect MLflow traces to see exactly how the supervisor routed each request, which sub-agents were called, and how long each step took — built-in observability with no extra instrumentation.
This demo showcases the following Databricks technologies:
- The Agent Bricks Supervisor pattern for routing and delegating to subagents
- Genie's ability to generate SQL from a natural-language question
- The ease of deploying a project from your local machine to a remote Databricks workspace with a Declarative Automation Bundle (DAB)
- MLflow observability for examining traces and evaluating responses with judges
A Supervisor Agent intelligently routes user queries to two specialized subagents:
| Subagent | Purpose | Databricks Component |
|---|---|---|
| Genie Agent | Structured-data queries about bee colonies (SQL, stats, trends) | Genie Space → Unity Catalog tables |
| Knowledge Assistant | Covers varroa mite management, pollinator conservation, agricultural habitat, and native plant guides. | AgentBricks Knowledge Assistant → Vector Search index |
| Synthesizer | Routes, delegates, synthesizes responses from both subagents | AgentBricks Supervisor Agent |
Structured data (Genie): ~13,500 rows of real USDA NASS data — annual honey metrics by state/year plus quarterly colony loss (including max colonies, deadout counts, and loss percent) and colony stressors by state/year/quarter (2015-2025).
Documents (Knowledge Assistant): 4 public-domain PDFs (~140 pages) covering varroa mite management, pollinator conservation, agricultural habitat, and native plant guides.
See docs/DATA_SOURCES.md for full sourcing, licensing, and refresh details.
- Databricks workspace with Unity Catalog enabled
- Databricks CLI v0.218+ (`databricks --version`)
- A catalog you can write to and a SQL Warehouse ID
- No API key needed — data snapshots and PDFs ship in the repo
Before following the Declarative Automation Bundle (DAB) steps below, please clone this repo.
In your Databricks workspace, navigate to the SQL Warehouses section.
- Select an existing Pro or Serverless SQL warehouse (or create one if needed). Copy the warehouse ID from its details page.
```bash
cd demos/bee-pollinator
databricks bundle deploy \
  --var="catalog=your_catalog" \
  --var="warehouse_id=your_warehouse_id"
databricks bundle run setup_demo \
  --var="catalog=your_catalog" \
  --var="warehouse_id=your_warehouse_id"
```

Add `--profile your_profile` if not using the default CLI profile.
This creates 3 Delta tables, uploads 4 PDFs to a UC Volume, and creates the Genie Space, Knowledge Assistant, and Supervisor Agent — all automated.
| Variable | Default | Description |
|---|---|---|
| `catalog` | `main` | Unity Catalog catalog name |
| `schema` | `bee_pollinator` | Schema for demo tables |
| `warehouse_id` | none (required) | SQL Warehouse ID for Genie Space |
The Knowledge Assistant indexes the PDFs after creation. For this demo's ~140 pages this typically takes ~10 minutes (sometimes longer). The Supervisor Agent will return apologetic, ungrounded responses to document questions until indexing completes — that's the signal to wait.
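One way to automate that wait is to re-ask a document question until the answer looks grounded. The sketch below is a generic polling helper, not part of the demo's scripts: `wait_until_grounded`, the `ask` callable (whatever you use to query the Supervisor Agent), and the `is_grounded` check are all hypothetical, and the default timeout and interval are arbitrary.

```python
import time
from typing import Callable


def wait_until_grounded(
    ask: Callable[[str], str],
    is_grounded: Callable[[str], bool],
    question: str,
    timeout_s: float = 1200.0,
    interval_s: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Re-ask `question` until `is_grounded(answer)` is true or time runs out."""
    deadline = time.monotonic() + timeout_s
    while True:
        answer = ask(question)
        if is_grounded(answer):
            return answer
        if time.monotonic() >= deadline:
            raise TimeoutError("Knowledge Assistant still looks unindexed")
        sleep(interval_s)
```

A caller might treat an answer as grounded once it cites one of the PDFs, for example `is_grounded=lambda a: "Varroa" in a`; the right check depends on how your responses are formatted.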
Confirm in the Databricks UI:
- Data > your catalog > your schema: 3 tables (`honey_production`, `colony_loss`, `colony_stressors`) and a `guidance_docs` volume with 4 PDFs
- Agents: Genie Space (`USDA Bee Health Data`), Knowledge Assistant (`Bee Health Documents`), and Supervisor Agent (`Bee Colony Health Advisor`) all present
Test the Supervisor Agent in the Agents UI with these queries:
| Type | Query |
|---|---|
| Data (Genie) | "Which 5 states had the highest colony loss percentage in Q4 2024, and what were their max colonies?" |
| Document (KA) | "What does the Varroa Management Guide recommend for monitoring mite levels?" |
| Cross-modal | "Which stressors affected California colonies most in Q1 2024, and what varroa management practices should California beekeepers prioritize?" |
Honey questions can stay annual. Colony-loss and stressor questions should stay quarterly because the USDA Honey Bee Colonies data in this demo is quarter-based. Use max_colonies with loss_colonies when you need quarter-specific scale.
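To make the quarterly framing concrete, here is a rough sketch of the kind of query a quarter-specific data question implies, run against an in-memory SQLite table rather than the real Delta tables. The `state`, `year`, and `quarter` column names are assumptions, the rows are made-up placeholders (not USDA figures), and the loss percentage is computed here as `loss_colonies / max_colonies` purely for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE colony_loss (
           state TEXT, year INTEGER, quarter INTEGER,
           max_colonies INTEGER, loss_colonies INTEGER)"""
)
# Placeholder rows for illustration only; these are NOT real USDA values.
conn.executemany(
    "INSERT INTO colony_loss VALUES (?, ?, ?, ?, ?)",
    [
        ("State A", 2024, 4, 10000, 1500),
        ("State B", 2024, 4, 2000, 600),
        ("State C", 2024, 4, 50000, 5000),
        ("State A", 2024, 3, 12000, 600),
    ],
)


def top_loss_states(conn, year, quarter, limit=5):
    """Rank states by loss percent within a single quarter."""
    sql = """
        SELECT state, max_colonies,
               ROUND(100.0 * loss_colonies / max_colonies, 1) AS loss_pct
        FROM colony_loss
        WHERE year = ? AND quarter = ?
        ORDER BY loss_pct DESC
        LIMIT ?
    """
    return conn.execute(sql, (year, quarter, limit)).fetchall()


result = top_loss_states(conn, 2024, 4)
print(result)
```

Filtering on both `year` and `quarter` keeps the ranking within one reporting period, and returning `max_colonies` next to the percentage shows the scale behind each figure.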
Or run the same three queries from the CLI:
```bash
pip install -r requirements.txt  # or: uv pip install -r requirements.txt
python scripts/verify_demo.py --supervisor "Bee Colony Health Advisor" --profile your_profile
```

The script resolves the display name to the supervisor's serving endpoint (`mas-XXXXXXXX-endpoint`) and reports pass/fail per query.
Alternatively, run the setup scripts from your local machine instead of through the bundle:

```bash
pip install -r requirements.txt  # or: uv pip install -r requirements.txt
python scripts/setup_data.py --catalog your_catalog --schema your_schema
python scripts/setup_agents.py --catalog your_catalog --schema your_schema --warehouse-id your_warehouse_id
```

This creates the Genie Space, Knowledge Assistant, and Supervisor Agent with no manual UI step. Use the same `--schema` for both commands.
The demo is a ~5 minute walkthrough of the three query types above, showing how the Supervisor routes each one differently:
- Data query — routes to Genie, generates SQL, returns tabular results
- Document query — routes to Knowledge Assistant, retrieves from PDFs, cites sources
- Cross-modal query — uses both agents and synthesizes a combined answer
After running the queries, show MLflow traces (Machine Learning > Experiments > find the Supervisor Agent experiment > Traces tab) to demonstrate built-in observability — you can see which sub-agents were called and how long each step took.
If any of the primary queries don't land well:
- Data: `Show me honey production trends in California over the last 5 years.`
- Document: `Which native plants should I recommend for spring forage in the Northeast?`
- Cross-modal: `Which stressors affected North Dakota colonies most in Q4 2024, and what management practices should beekeepers prioritize?`
For a more comprehensive evaluation beyond ad-hoc Genie Code judges, the eval_supervisor notebook runs 12 queries across all three routing patterns and scores every response using MLflow's GenAI evaluation framework — mlflow.genai.evaluate().
Each Supervisor query trace shows the judge's score and the rationale for that score.
The notebook sends 12 queries (4 Genie-only, 4 Knowledge-Assistant-only, 4 both) through the deployed Supervisor Agent and applies four scorers to each response:
| Scorer | Type | What it measures |
|---|---|---|
| Routing Correctness | `make_judge()` | Did the supervisor route to the correct sub-agent(s)? |
| Answer Correctness | Built-in `Correctness()` | Does the response contain the expected facts? |
| Completeness | `@scorer` + `make_judge()` | Does the response cover all expected elements? |
| Response Quality | Built-in `Guidelines()` | Does the response meet domain quality standards? |
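As a rough illustration of what the routing check measures, the sketch below pairs each query with the sub-agents it is expected to reach and compares that against the sub-agents actually called. It is a deterministic simplification, not the notebook's LLM judge: the `expected_agents` key, `routing_correct`, and `routing_accuracy` are hypothetical names, and the records only mirror the general `inputs` / `expectations` layout that `mlflow.genai.evaluate()` accepts.

```python
from typing import Dict, Iterable, List

# Evaluation records; "expected_agents" is a hypothetical expectation key.
eval_records = [
    {
        "inputs": {"question": "Which 5 states had the highest colony loss "
                               "percentage in Q4 2024, and what were their max colonies?"},
        "expectations": {"expected_agents": ["genie"]},
    },
    {
        "inputs": {"question": "What does the Varroa Management Guide recommend "
                               "for monitoring mite levels?"},
        "expectations": {"expected_agents": ["knowledge_assistant"]},
    },
    {
        "inputs": {"question": "Which stressors affected California colonies most in Q1 2024, "
                               "and what varroa management practices should California "
                               "beekeepers prioritize?"},
        "expectations": {"expected_agents": ["genie", "knowledge_assistant"]},
    },
]


def routing_correct(expected: Iterable[str], called: Iterable[str]) -> bool:
    """True when exactly the expected sub-agents were called (order ignored)."""
    return set(expected) == set(called)


def routing_accuracy(records: List[dict], called_by_question: Dict[str, List[str]]) -> float:
    """Fraction of records whose observed routing matches the expectation."""
    hits = sum(
        routing_correct(
            r["expectations"]["expected_agents"],
            called_by_question.get(r["inputs"]["question"], []),
        )
        for r in records
    )
    return hits / len(records)
```

In practice the list of called sub-agents would be read from each query's trace; here it is passed in as a plain dictionary keyed by question text.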
- Open `scripts/eval_supervisor.py` in your Databricks workspace (the bundle uploads it under `/Workspace/Users/<you>/.bundle/bee-pollinator-demo/dev/files/scripts/`)
- Attach the notebook to a cluster
- Set the two widgets at the top:
  - Supervisor: the serving endpoint for your Supervisor Agent (e.g., `mas-f6c439c0-endpoint`)
  - Judge Model URI: the model used for LLM judge scorers (e.g., `databricks:/databricks-gpt-5-4`)
- Run All Cells: the notebook installs dependencies, queries the agent, runs all four judges, and displays results
The evaluation takes 3-6 minutes depending on agent response times.
- MLflow experiment at `/Users/<you>/bee_pollinator_eval` with metrics logged per run
- Eval results table with per-query scores for routing, correctness, completeness, and quality
- Aggregate metrics displayed as an HTML dashboard in the notebook
- All traces are captured via `mlflow.openai.autolog()` for drill-down in the MLflow Traces tab
```bash
# Remove bundle-managed resources (job definition)
databricks bundle destroy
# Tables, schema, volume, and agents must be cleaned up separately:
# - Drop schema (cascades to tables and volume)
# - Delete Genie Space, Knowledge Assistant, and Supervisor Agent
#   from the UI or via SDK (w.supervisor_agents.delete_supervisor_agent,
#   w.knowledge_assistants.delete_knowledge_assistant, w.genie.trash_space)
```

| Problem | Solution |
|---|---|
| `bundle validate` auth error | Run `databricks auth login --profile your_profile` |
| `load_data` task fails reading CSVs | Verify bundle deployed: `databricks workspace ls "/Workspace/Users/<you>/.bundle/bee-pollinator-demo/dev/files/data/snapshots"` |
| `create_agents` fails with missing module or `supervisor_agents` attribute | Bump `databricks-sdk` to `>=0.106.0` in `databricks.yml` (bundle path) or `pyproject.toml` / `requirements.txt` (local CLI path) |
| "Table not found" in Genie | Verify tables exist in Data browser; re-run bundle run setup_demo |
| KA returns "No relevant documents" | Check PDFs in the guidance_docs volume; wait for indexing to finish |
| Supervisor routes incorrectly | Verify both sub-agents are added; check instructions are pasted correctly; test sub-agents individually first |
Data sources are public domain USDA datasets. PDF documents retain their original licenses (typically public domain or CC-BY for USDA publications).

