179 changes: 179 additions & 0 deletions examples/09_cricket_umpire/README.md
@@ -0,0 +1,179 @@
# 🏏 Cricket DRS AI — Third Umpire Decision Review System

> AI-powered Decision Review System for Women's Cricket using Gemini Live vision, YOLO pose detection, and real-time voice verdicts.

---

## 🎥 Demo Video

[▶ Watch Demo](https://youtu.be/j3rzsTp0sW0)

## 🌐 Deployment

| | Link |
|---|---|
| Frontend | [Link](https://ai-drs-vision-agents-sdk-jd.vercel.app/) |
| Backend | Runs locally (see setup below) |

> Backend requires a persistent Gemini Live WebSocket connection and cannot be hosted on free-tier platforms. Full local setup takes under 2 minutes.

---

## ✨ Features

- **Real-time video analysis** — Gemini Live watches your screen share and analyzes cricket footage frame by frame
- **Voice verdicts** — Third Umpire AI speaks the decision aloud (DECISION / REVIEW TYPE / REASON / CONFIDENCE)
- **YOLO pose detection** — Player body positions detected in real-time at 30 FPS
- **Two review types** — LBW and Run Out (the two most contested DRS decisions)
- **FastAPI trigger endpoint** — Button click sends a `POST /review/{review_type}` request to the local Review API, which forwards the review prompt to the live Gemini session
- **Custom DRS UI** — Built with Stream Video SDK, dark cricket stadium aesthetic
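
The review trigger in the feature list can be pictured as a small lookup from review type to prompt. The sketch below is illustrative only: `REVIEW_PROMPTS` and `prompt_for` are hypothetical names, the `runout` key is an assumed spelling, and the prompt texts are shortened. Rejecting unknown types is also an assumption; the endpoint in this example treats anything other than `lbw` as a Run Out review.

```python
# Hypothetical helper, not part of the example's code.
# Keys and shortened prompt texts are assumptions made for illustration.
REVIEW_PROMPTS = {
    "lbw": (
        "The on-field umpire has referred an LBW decision. "
        "Give your verdict in the required format."
    ),
    "runout": (
        "The on-field umpire has referred a Run Out decision. "
        "Give your verdict in the required format."
    ),
}


def prompt_for(review_type: str) -> str:
    """Return the prompt for a review type, rejecting unknown types."""
    key = review_type.strip().lower()
    if key not in REVIEW_PROMPTS:
        raise ValueError(f"unsupported review type: {review_type!r}")
    return REVIEW_PROMPTS[key]
```

An explicit mapping makes a typo in the URL fail loudly instead of silently starting a Run Out review.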

---

## 🛠 Tech Stack

| Layer | Technology |
|---|---|
| Vision AI | Gemini Live (google-genai) — real-time video + audio |
| Pose Detection | YOLO11n-pose (Ultralytics) — player skeleton tracking |
| Video Transport | Stream Video SDK (getstream) |
| Agent Framework | Vision Agents SDK (GetStream) |
| Backend API | FastAPI + Uvicorn |
| Frontend | React + Vite + Stream Video React SDK |
| Auth | JWT token server |

---

## 🚀 Setup

### Prerequisites
- Python 3.10+
- Node.js 18+
- Google API Key (Gemini Live access)
- Stream API Key + Secret

### 1. Clone & Install

```bash
git clone https://github.com/jaya6400/Vision-Agents.git
cd Vision-Agents
pip install -e agents-core
pip install -e plugins/getstream
pip install -e plugins/ultralytics
cd examples/09_cricket_umpire
pip install -r requirements.txt
```

### 2. Environment Variables

Create `.env` in `examples/09_cricket_umpire/`:

```env
GOOGLE_API_KEY=your_google_api_key
STREAM_API_KEY=your_stream_api_key
STREAM_API_SECRET=your_stream_api_secret
```
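Before starting the backend, it can help to confirm that all three variables are actually set. The following is a minimal sketch using only the standard library; `missing_vars` is a hypothetical helper, and it inspects the process environment, so it assumes the `.env` file has already been loaded (for example by `load_dotenv()`, as the agent script does).

```python
import os

# Variable names taken from the backend .env file in this example.
REQUIRED_BACKEND_VARS = ("GOOGLE_API_KEY", "STREAM_API_KEY", "STREAM_API_SECRET")


def missing_vars(env, required=REQUIRED_BACKEND_VARS):
    """Return the required names that are unset or blank in the mapping `env`."""
    return [name for name in required if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_vars(os.environ)
    if missing:
        print("Missing environment variables:", ", ".join(missing))
    else:
        print("All backend environment variables are set.")
```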

Create `.env` in `examples/09_cricket_umpire/frontend/`:

```env
VITE_STREAM_API_KEY=your_stream_api_key
```

### 3. Run

```bash
cd examples/09_cricket_umpire
bash run.sh
```

This starts:
- Token server on `http://localhost:8001`
- DRS Agent (Gemini Live + YOLO)
- Review API on `http://localhost:8002`
- Frontend on `http://localhost:5173`
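
To check that the services listed above came up, a quick port probe is often enough. This is a sketch under the assumption that each service listens on `localhost` at the port shown; `port_open` and `SERVICES` are hypothetical names, and the DRS Agent is left out because no port is documented for it.

```python
import socket

# Ports as listed in the README; the DRS Agent has no documented port.
SERVICES = {
    "Token server": 8001,
    "Review API": 8002,
    "Frontend": 5173,
}


def port_open(host: str, port: int, timeout: float = 0.5) -> bool:
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    for name, port in SERVICES.items():
        state = "up" if port_open("localhost", port) else "not reachable"
        print(f"{name} (port {port}): {state}")
```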

### 4. Usage

1. Open `http://localhost:5173`
2. Click **Start DRS Session**
3. Click **Share Screen** → select your cricket video tab
4. **Uncheck "Share tab audio"** in the Chrome dialog (important: tab audio overloads the WebRTC audio pipeline, see Known Issues)
5. Click **LBW Review** or **Run Out Review**
6. Hear the Third Umpire AI speak the verdict
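
The review buttons in step 5 boil down to a `POST` against the Review API, so a review can also be triggered from a script. The sketch below uses only the standard library and assumes the API is reachable at `http://localhost:8002`; `build_review_request` and `trigger_review` are hypothetical helper names.

```python
import json
import urllib.request


def build_review_request(
    review_type: str, base_url: str = "http://localhost:8002"
) -> urllib.request.Request:
    """Build the POST request for /review/{review_type} (empty body)."""
    url = f"{base_url.rstrip('/')}/review/{review_type}"
    return urllib.request.Request(url, data=b"", method="POST")


def trigger_review(review_type: str) -> dict:
    """Send the request and return the decoded JSON response."""
    request = build_review_request(review_type)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the services running and a session active, `trigger_review("lbw")` should return the endpoint's JSON, which in this example is either `{"status": "review triggered"}` or `{"error": "Agent not connected"}`.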

---

## 🗂 Project Structure

```
examples/09_cricket_umpire/
├── cricket_umpire.py     # Main agent + FastAPI review endpoint
├── cricket_umpire.md     # Gemini instructions (DRS rules)
├── token_server.py       # JWT auth server
├── run.sh                # One-command startup
├── requirements.txt      # Python dependencies
└── frontend/
    └── src/
        ├── App.jsx       # React UI
        └── App.css       # DRS styling

---

## 📸 Screenshots

**Frontend:**
![frontend](docs/screenshots/frontend.PNG)

**Agent Transcript:**
![realtime-ai-agent](docs/screenshots/realtime-voice-agent.PNG)

---

## ⚠️ Known Issues & Fixes

| Issue | Root Cause | Fix Applied |
|---|---|---|
| `AudioQueue buffer limit exceeded` | `SCREEN_SHARE_AUDIO` track overwhelming WebRTC pipeline | Uncheck "Share tab audio" in Chrome screen share dialog |
| `Pose processing TIMEOUT 12s` | YOLO running at full 1920×1080 resolution on CPU | Reduced YOLO input size to `imgsz=256` and dropped Gemini `fps` to 2 |
| `Edge connection is not set` | Screen share arrived before agent WebRTC fully connected | Agent joins before the user; the remaining race condition is handled by SDK retry |
| `Cannot handle offer in signaling state "closed"` | WebRTC renegotiation after audio track timeout | Resolved by disabling screen share audio |
| Agent giving verdict before screen share | Gemini responding to text prompt alone without video | Added video grounding check in instructions |
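
To get a feel for why `imgsz=256` relieves the pose-processing timeout, the arithmetic below estimates how many fewer pixels YOLO sees per frame. It is a rough sketch: it assumes an aspect-preserving resize whose longer side equals `imgsz`, and real preprocessing may pad or round differently. `scaled_size` and `pixel_reduction` are hypothetical helpers.

```python
def scaled_size(width: int, height: int, imgsz: int) -> tuple[int, int]:
    """Scale (width, height) so the longer side equals imgsz, keeping aspect ratio."""
    scale = imgsz / max(width, height)
    return round(width * scale), round(height * scale)


def pixel_reduction(width: int, height: int, imgsz: int) -> float:
    """Ratio of original pixel count to resized pixel count."""
    new_w, new_h = scaled_size(width, height, imgsz)
    return (width * height) / (new_w * new_h)


if __name__ == "__main__":
    print(scaled_size(1920, 1080, 256))      # (256, 144)
    print(pixel_reduction(1920, 1080, 256))  # 56.25
```

Under these assumptions a 1920×1080 frame shrinks to 256×144, roughly 56 times fewer pixels per inference.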

---

## 🏗 Architecture

```
┌─────────────────┐     screen share       ┌──────────────────────┐
│ Browser UI      │ ──────────────────────▶│ Stream Video SFU     │
│ (React + Vite)  │                        └──────────┬───────────┘
│                 │                                   │ video frames
│ Click Review    │ POST /review/lbw                  ▼
│ Button          │ ──────────────────────▶ ┌─────────────────────┐
└─────────────────┘                         │ cricket_umpire.py   │
                                            │                     │
                                            │ YOLO Pose (256px)   │
                                            │         ↓           │
                                            │ Gemini Live         │
                                            │ (fps=2, vision)     │
                                            │         ↓           │
                                            │ Speaks verdict      │
                                            │ via Stream audio    │
                                            └─────────────────────┘
```

---

## 📝 Blog Post

[Read the full writeup on Medium](https://medium.com/@jayadubey6402/i-built-an-ai-drs-system-for-womens-cricket-in-3-days-challenges-and-learnings-2ed784718b41)

---

## 🏆 Built For

Vision AI Hackathon 2026 — GetStream Vision Agents Challenge

*Women's Cricket • Decision Review System • Real-time AI Umpire*
96 changes: 96 additions & 0 deletions examples/09_cricket_umpire/cricket_umpire.md
@@ -0,0 +1,96 @@
# DRS System - Women's Cricket Third Umpire Agent

You are an AI-powered Third Umpire for professional women's cricket matches.
You assist the on-field umpire by reviewing two types of decisions using live video analysis.

--------------------------------------------------

YOUR ROLE

You are the Third Umpire in the DRS (Decision Review System).
The on-field umpire will call you for a review.
You watch the live video feed, analyze the footage carefully, and deliver a final verdict.

You only review:
1. Run Out
2. LBW (Leg Before Wicket)

--------------------------------------------------

HOW A REVIEW WORKS

The on-field umpire will say something like:
- "Third Umpire, please check this run out"
- "Referring to the third umpire for LBW"
- "Check the crease please"
- "Is she out? LBW appeal"

When you hear a review request:

Step 1 — Acknowledge:
Say: "Third Umpire reviewing. Please ensure the video feed is active."

Step 2 — Analyze:
Watch the video carefully. For LBW, track the ball trajectory from pitch to impact to stumps.
For Run Out, check the exact frame when the bails were removed vs bat/foot position.

Step 3 — Deliver verdict:
Speak your decision clearly in the REQUIRED FORMAT below.

--------------------------------------------------

RUN OUT RULES

Check:
- Exact moment bails were removed (stump broken)
- Position of bat or foot relative to the crease line

OUT: Bat or foot was NOT grounded behind the crease when stumps were broken
NOT OUT: Bat or foot was grounded behind crease before stumps were broken

--------------------------------------------------

LBW RULES

Check:
- Where did the ball pitch? (in line, outside off, outside leg)
- Where did it impact the pad? (in line with stumps or outside)
- Would it have hit the stumps? (trajectory projection)

OUT if ALL three:
- Pitched in line or outside off stump
- Impact in line with the stumps
- Projected to hit the stumps

NOT OUT if ANY:
- Pitched outside leg stump
- Impact outside the line of off stump while a shot was offered
- Ball missing stumps
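
The three LBW checks above combine into a single boolean rule. The sketch below encodes only the simplified rule as written in these instructions, not the full Laws of Cricket; `lbw_decision` and its argument names are hypothetical.

```python
VALID_PITCH_ZONES = {"in_line", "outside_off", "outside_leg"}


def lbw_decision(pitched: str, impact_in_line: bool, hitting_stumps: bool) -> str:
    """Apply the simplified rule: OUT only if all three checks pass."""
    if pitched not in VALID_PITCH_ZONES:
        raise ValueError(f"unknown pitch zone: {pitched!r}")
    # Pitching outside leg stump rules out LBW regardless of the other checks.
    pitched_ok = pitched in {"in_line", "outside_off"}
    if pitched_ok and impact_in_line and hitting_stumps:
        return "OUT"
    return "NOT OUT"
```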

--------------------------------------------------

REQUIRED VERDICT FORMAT

Speak this clearly every time:

DECISION: [OUT or NOT OUT]
REVIEW TYPE: [Run Out or LBW]
REASON: [One sentence — what you saw that determined the decision]
CONFIDENCE: [High / Medium / Low]

Example:
DECISION: OUT
REVIEW TYPE: Run Out
REASON: The bat was clearly in the air when the bails were removed.
CONFIDENCE: High
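
Because the verdict follows a fixed four-field layout, a transcript of it is easy to check mechanically. The following is a minimal sketch of such a check, assuming the verdict arrives as plain text with one `FIELD: value` pair per line; `parse_verdict` is a hypothetical helper and not part of the agent.

```python
FIELDS = ("DECISION", "REVIEW TYPE", "REASON", "CONFIDENCE")


def parse_verdict(text: str) -> dict[str, str]:
    """Parse 'FIELD: value' lines into a dict and require all four fields."""
    verdict: dict[str, str] = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        key = key.strip().upper()
        if sep and key in FIELDS:
            verdict[key] = value.strip()
    missing = [field for field in FIELDS if field not in verdict]
    if missing:
        raise ValueError(f"verdict is missing fields: {missing}")
    return verdict
```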

--------------------------------------------------

RULES

- Always give a final decision — never leave it unresolved
- If video is unclear: say "Third Umpire: Insufficient evidence. On-field decision stands."
- Speak calmly and authoritatively — like a professional TV third umpire
- No emojis, no markdown, no extra commentary
- Refer to players as "the batter", "the bowler", "the fielder"
- You are supporting women's cricket — treat every decision with full professionalism
80 changes: 80 additions & 0 deletions examples/09_cricket_umpire/cricket_umpire.py
@@ -0,0 +1,80 @@
import asyncio
import logging
from contextlib import asynccontextmanager

from dotenv import load_dotenv
from fastapi import FastAPI
from vision_agents.core import Agent, Runner, User
from vision_agents.core.agents import AgentLauncher
from vision_agents.plugins import gemini, getstream, ultralytics

logger = logging.getLogger(__name__)
load_dotenv()

# Global agent reference so HTTP endpoint can trigger it
active_agent: Agent | None = None


async def create_agent(**kwargs) -> Agent:
    agent = Agent(
        edge=getstream.Edge(),
        agent_user=User(name="Third Umpire DRS"),
        instructions="Read @cricket_umpire.md",
        llm=gemini.Realtime(fps=2),
        processors=[
            ultralytics.YOLOPoseProcessor(
                model_path="yolo11n-pose.pt",
                imgsz=256
            )
        ],
    )
    return agent


async def join_call(agent: Agent, call_type: str, call_id: str, **kwargs) -> None:
    global active_agent
    active_agent = agent
    call = await agent.create_call(call_type, call_id)
    async with agent.join(call):
        await agent.llm.simple_response(
            text="Say: Third Umpire DRS ready. Awaiting referral."
        )
        await agent.finish()
    active_agent = None


# Extra FastAPI app for review trigger endpoint
@asynccontextmanager
async def lifespan(app: FastAPI):
    yield

review_app = FastAPI(lifespan=lifespan)

@review_app.post("/review/{review_type}")
async def trigger_review(review_type: str):
    global active_agent
    if active_agent is None:
        return {"error": "Agent not connected"}

    if review_type == "lbw":
        prompt = "The on-field umpire has referred an LBW decision. Analyze what you can see in the current video feed. Check: 1) Did the ball pitch in line? 2) Was the impact in line with the stumps? 3) Was the ball going on to hit the stumps? Give your verdict in the required format: DECISION / REVIEW TYPE / REASON / CONFIDENCE"
    else:
        prompt = "The on-field umpire has referred a Run Out decision. Analyze what you can see in the current video feed. Check: 1) Was the bat grounded before the stumps were broken? 2) Was any part of the body behind the crease? Give your verdict in the required format: DECISION / REVIEW TYPE / REASON / CONFIDENCE"

    await active_agent.llm.simple_response(text=prompt)
    return {"status": "review triggered"}


if __name__ == "__main__":
    import threading
    import uvicorn

    def run_api():
        uvicorn.run(review_app, host="0.0.0.0", port=8002, log_level="warning")

    api_thread = threading.Thread(target=run_api, daemon=True)
    api_thread.start()
    logger.info("🌐 Review API running on http://localhost:8002")

    launcher = AgentLauncher(create_agent=create_agent, join_call=join_call)
    Runner(launcher=launcher).cli()
Binary file added examples/09_cricket_umpire/docs/videos/lbw.mp4
Binary file not shown.
Binary file added examples/09_cricket_umpire/docs/videos/ro.mp4
Binary file not shown.
24 changes: 24 additions & 0 deletions examples/09_cricket_umpire/frontend/.gitignore
@@ -0,0 +1,24 @@
# Logs
logs
*.log
npm-debug.log*
yarn-debug.log*
yarn-error.log*
pnpm-debug.log*
lerna-debug.log*

node_modules
dist
dist-ssr
*.local

# Editor directories and files
.vscode/*
!.vscode/extensions.json
.idea
.DS_Store
*.suo
*.ntvs*
*.njsproj
*.sln
*.sw?