Skip to content

Latest commit

 

History

History

README.md

Bee Colony Health & Pollinator Supervisor Demo

A demonstration of the Databricks Multi-Agent Supervisor pattern using real USDA bee colony health and pollination data. The supervisor routes questions between structured agricultural data (Genie) and beekeeping guidance documents (Knowledge Assistant).

What this demo does

You build a supervisor agent that sits in front of two specialized sub-agents and decides which one (or both) should handle each user query:

  • A user asks a data question ("Top 5 states by colony loss in Q1 2023?") — the supervisor routes to a Genie agent, which writes SQL against three Delta tables (~13,500 rows of real USDA data) and returns tabular results.
  • A user asks a guidance question ("How should I monitor varroa mite levels?") — the supervisor routes to a Knowledge Assistant, which retrieves answers from four public-domain PDFs indexed via Vector Search.
  • A user asks a question that needs both ("Which stressors hit California hardest in Q1 2023, and what should beekeepers do about it?") — the supervisor calls both agents and synthesizes a single answer that connects the data to actionable recommendations.

The whole stack deploys with one databricks bundle deploy and one databricks bundle run. The bundle creates the Delta tables, uploads the PDFs to a Unity Catalog Volume, and provisions the Genie Space and Knowledge Assistant. The only manual step is wiring up the Supervisor Agent in the UI (API coming shorlty...).

After running a few queries, you can inspect MLflow traces to see exactly how the supervisor routed each request, which sub-agents were called, and how long each step took — built-in observability with no extra instrumentation.

The demo demonstrates the following Databricks technologies:

  • The Agent Bricks Supervisor and subagents routing and delegation pattern
  • The Genie's ability to generate SQL queries from a natural language query
  • The ease with which you can use Declarative Automation Bundle (DAB) from your local host and deploy your project to remote Datarbicks workspace.
  • Use MLflow's observability to examine traces and evaluate by adding judges

Architecture

architecture

A Supervisor Agent intelligently routes user queries to two specialized subagents:

Subagent Purpose Databricks Component
Genie Agent Structured data for bee colonies queries (SQL, stats, trends) Genie Space → Unity Catalog table
Knowledge Assistant Covers varroa mite management, pollinator conservation, agricultural habitat, and native plant guides. AgentBricks Knowledge Assistant → Vector Search index
Synthesizer Routes, delegates, synthesizes responses from both subagents AgentBricks Supervisor Agent

Structured data (Genie): ~13,500 rows of real USDA NASS data — annual honey metrics by state/year plus quarterly colony loss (including max colonies, deadout counts, and loss percent) and colony stressors by state/year/quarter (2015-2025).

Documents (Knowledge Assistant): 4 public-domain PDFs (~140 pages) covering varroa mite management, pollinator conservation, agricultural habitat, and native plant guides.

See docs/DATA_SOURCES.md for full sourcing, licensing, and refresh details.

Prerequisites

  • Databricks workspace with Unity Catalog enabled
  • Databricks CLI v0.218+ (databricks --version)
  • A catalog you can write to and a SQL Warehouse ID
  • No API key needed — data snapshots and PDFs ship in the repo

Setup

Before following the Declarative Automation Bundle (DAB) steps below, please clone this repo.

Step 1: Find your SQL Warehouse

In your Databricks workspace, navigate to the SQL Warehouses section.

  1. Select an existing Pro or Serverless SQL warehouse (or create one if needed). Copy the warehouse ID from its details page.

Step 2: Deploy and run the bundle (~10 minutes)

cd demos/bee-pollinator

databricks bundle deploy \
  --var="catalog=your_catalog" \
  --var="warehouse_id=your_warehouse_id"

databricks bundle run setup_demo \
  --var="catalog=your_catalog" \
  --var="warehouse_id=your_warehouse_id"

Add --profile your_profile if not using the default CLI profile.

This creates 3 Delta tables, uploads 4 PDFs to a UC Volume, and creates the Genie Space, Knowledge Assistant, and Supervisor Agent — all automated.

Variable Default Description
catalog main Unity Catalog catalog name
schema bee_pollinator Schema for demo tables
warehouse_id — (required) SQL Warehouse ID for Genie Space

Step 3: Wait for indexing

The Knowledge Assistant indexes the PDFs after creation. For this demo's ~140 pages this typically takes ~10 minutes (sometimes longer). The Supervisor Agent will return apologetic, ungrounded responses to document questions until indexing completes — that's the signal to wait.

Step 4: Verify

Confirm in the Databricks UI:

  • Data > your catalog > your schema: 3 tables (honey_production, colony_loss, colony_stressors) and a guidance_docs volume with 4 PDFs
  • Agents: Genie Space (USDA Bee Health Data), Knowledge Assistant (Bee Health Documents), and Supervisor Agent (Bee Colony Health Advisor) all present

Test the Supervisor Agent in the Agents UI with these queries:

Type Query
Data (Genie) "Which 5 states had the highest colony loss percentage in Q4 2024, and what were their max colonies?"
Document (KA) "What does the Varroa Management Guide recommend for monitoring mite levels?"
Cross-modal "Which stressors affected California colonies most in Q1 2024, and what varroa management practices should California beekeepers prioritize?"

Honey questions can stay annual. Colony-loss and stressor questions should stay quarterly because the USDA Honey Bee Colonies data in this demo is quarter-based. Use max_colonies with loss_colonies when you need quarter-specific scale.

Or run the same three queries from the CLI:

pip install -r requirements.txt    # or: uv pip install -r requirements.txt
python scripts/verify_demo.py --supervisor "Bee Colony Health Advisor" --profile your_profile

The script resolves the display name to the supervisor's serving endpoint (mas-XXXXXXXX-endpoint) and reports pass/fail per query.

Alternative: Local CLI setup (no DABs)

pip install -r requirements.txt    # or: uv pip install -r requirements.txt
python scripts/setup_data.py --catalog your_catalog --schema your_schema
python scripts/setup_agents.py --catalog your_catalog --schema your_schema --warehouse-id your_warehouse_id

This creates the Genie Space, Knowledge Assistant, and Supervisor Agent — no manual UI step. Use the same --schema for both commands.

Running the Demo

The demo is a ~5 minute walkthrough of the three query types above, showing how the Supervisor routes each one differently:

  1. Data query — routes to Genie, generates SQL, returns tabular results
  2. Document query — routes to Knowledge Assistant, retrieves from PDFs, cites sources
  3. Cross-modal query — uses both agents and synthesizes a combined answer

After running the queries, show MLflow traces (Machine Learning > Experiments > find the Supervisor Agent experiment > Traces tab) to demonstrate built-in observability — you can see which sub-agents were called and how long each step took.

Backup queries

If any of the primary queries don't land well:

  • Data: Show me honey production trends in California over the last 5 years.
  • Document: Which native plants should I recommend for spring forage in the Northeast?
  • Cross-modal: Which stressors affected North Dakota colonies most in Q4 2024, and what management practices should beekeepers prioritize?

Evaluate the Supervisor Agent with MLflow

For a more comprehensive evaluation beyond ad-hoc Genie Code judges, the eval_supervisor notebook runs 12 queries across all three routing patterns and scores every response using MLflow's GenAI evaluation framework — mlflow.genai.evaluate().

Evaluation results dashboard

Each Supervisor's query trace shows the judge's score and the rational for that score.

Evalution rational

What it evaluates

The notebook sends 12 queries (4 Genie-only, 4 Knowledge-Assistant-only, 4 both) through the deployed Supervisor Agent and applies four scorers to each response:

Scorer Type What it measures
Routing Correctness make_judge() Did the supervisor route to the correct sub-agent(s)?
Answer Correctness Built-in Correctness() Does the response contain the expected facts?
Completeness @scorer + make_judge() Does the response cover all expected elements?
Response Quality Built-in Guidelines() Does the response meet domain quality standards?

How to run it

  1. Open the scripts/eval_supervisor.py in your Databricks workspace (the bundle uploads it under /Workspace/Users/you/.bundle/bee-pollinator-demo/dev/files/scripts/)
  2. Attach the notebook to a cluster
  3. Set the two widgets at the top:
    • Supervisor — the serving endpoint for your Supervisor Agent (e.g., mas-f6c439c0-endpoint)
    • Judge Model URI — the model used for LLM judge scorers (e.g., databricks:/databricks-gpt-5-4)
  4. Run All Cells — the notebook installs dependencies, queries the agent, runs all four judges, and displays results

The evaluation takes 3-6 minutes depending on agent response times.

What you get

  • MLflow experiment at /Users/<you>/bee_pollinator_eval with metrics logged per run
  • Eval results table with per-query scores for routing, correctness, completeness, and quality
  • Aggregate metrics displayed as an HTML dashboard in the notebook
  • All traces are captured via mlflow.openai.autolog() for drill-down in the MLflow Traces tab

Teardown

# Remove bundle-managed resources (job definition)
databricks bundle destroy

# Tables, schema, volume, and agents must be cleaned up separately:
#   - Drop schema (cascades to tables and volume)
#   - Delete Genie Space, Knowledge Assistant, and Supervisor Agent
#     from the UI or via SDK (w.supervisor_agents.delete_supervisor_agent,
#     w.knowledge_assistants.delete_knowledge_assistant, w.genie.trash_space)

Troubleshooting

Problem Solution
bundle validate auth error Run databricks auth login --profile your_profile
load_data task fails reading CSVs Verify bundle deployed: databricks workspace ls "/Workspace/Users/<you>/.bundle/bee-pollinator-demo/dev/files/data/snapshots"
create_agents fails with missing module or supervisor_agents attribute Bump databricks-sdk to >=0.106.0 in databricks.yml (bundle path) or pyproject.toml / requirements.txt (local CLI path)
"Table not found" in Genie Verify tables exist in Data browser; re-run bundle run setup_demo
KA returns "No relevant documents" Check PDFs in the guidance_docs volume; wait for indexing to finish
Supervisor routes incorrectly Verify both sub-agents are added; check instructions are pasted correctly; test sub-agents individually first

License

Data sources are public domain USDA datasets. PDF documents retain their original licenses (typically public domain or CC-BY for USDA publications).