Commit 9b81316

feat: add SmartHome-Bench video anomaly detection skill
New DeepCamera analysis skill wrapping the SmartHome-Bench dataset for evaluating VLM performance on video anomaly detection.

- SKILL.md with YAML manifest, params, and protocol docs
- config.yaml with mode/maxVideos/categories params
- run-benchmark.cjs: video download (yt-dlp), frame sampling (ffmpeg), multi-image VLM evaluation, binary anomaly scoring, JSONL protocol
- generate-report.cjs: HTML report with confusion matrix, per-category metrics (accuracy/precision/recall/F1), model comparison
- fixtures/annotations.json: 99 curated clips across 7 categories (Wildlife, Senior Care, Baby Monitoring, Pet Monitoring, Home Security, Package Delivery, General Activity)
- deploy.sh: system dep checks + npm install
- Added to skills.json registry and README catalog
1 parent 7d3e7a3 commit 9b81316

10 files changed

Lines changed: 2220 additions & 0 deletions

README.md

Lines changed: 1 addition & 0 deletions
@@ -43,6 +43,7 @@ Each skill is a self-contained module with its own model, parameters, and [commu
 | | [`dinov3-grounding`](skills/detection/dinov3-grounding/) | Open-vocabulary detection — describe what to find | 📐 |
 | | [`person-recognition`](skills/detection/person-recognition/) | Re-identify individuals across cameras | 📐 |
 | **Analysis** | [`home-security-benchmark`](skills/analysis/home-security-benchmark/) | [143-test evaluation suite](#-homesec-bench--how-secure-is-your-local-ai) for LLM & VLM security performance ||
+| | [`smarthome-bench`](skills/analysis/smarthome-bench/) | Video anomaly detection benchmark — 105 clips across 7 smart home categories ||
 | | [`vlm-scene-analysis`](skills/analysis/vlm-scene-analysis/) | Describe what happened in recorded clips | 📐 |
 | | [`sam2-segmentation`](skills/analysis/sam2-segmentation/) | Click-to-segment with pixel-perfect masks | 📐 |
 | **Transformation** | [`depth-estimation`](skills/transformation/depth-estimation/) | Monocular depth maps with Depth Anything v2 | 📐 |

skills.json

Lines changed: 37 additions & 0 deletions
@@ -96,6 +96,43 @@
         "medium",
         "large"
       ]
+    },
+    {
+      "id": "smarthome-bench",
+      "name": "SmartHome Video Anomaly Benchmark",
+      "description": "VLM evaluation suite for video anomaly detection in smart home camera footage — 7 categories, 105 curated clips from SmartHome-Bench.",
+      "version": "1.0.0",
+      "category": "analysis",
+      "path": "skills/analysis/smarthome-bench",
+      "tags": [
+        "benchmark",
+        "vlm",
+        "video",
+        "anomaly-detection",
+        "smart-home"
+      ],
+      "platforms": [
+        "linux-x64",
+        "linux-arm64",
+        "darwin-arm64",
+        "darwin-x64",
+        "win-x64"
+      ],
+      "requirements": {
+        "node": ">=18",
+        "ram_gb": 2,
+        "system_deps": [
+          "yt-dlp",
+          "ffmpeg"
+        ]
+      },
+      "capabilities": [
+        "benchmark",
+        "report_generation"
+      ],
+      "ui_unlocks": [
+        "benchmark_report"
+      ]
     }
   ]
 }
Lines changed: 159 additions & 0 deletions
@@ -0,0 +1,159 @@
---
name: SmartHome Video Anomaly Benchmark
description: VLM evaluation suite for video anomaly detection in smart home camera footage
version: 1.0.0
category: analysis
runtime: node
entry: scripts/run-benchmark.cjs
install: npm

requirements:
  node: ">=18"
  npm_install: true
  system_deps: [yt-dlp, ffmpeg]
  platforms: ["linux", "macos", "windows"]
---

# SmartHome Video Anomaly Benchmark

Benchmark suite evaluating VLM models on **video anomaly detection** across **7 smart home categories** — based on the [SmartHome-Bench](https://github.com/Xinyi-0724/SmartHome-Bench-LLM) dataset (1,203 YouTube smart home camera clips).

> **Key differentiator from HomeSec-Bench**: This benchmark requires **video understanding** (multi-frame sampling), not single-frame analysis.

## Setup

**Requires `npm install` + system dependencies.** The deployment agent should:
1. Check for `yt-dlp` and `ffmpeg` in PATH
2. Run `npm install` in the skill directory

Entry script: `scripts/run-benchmark.cjs`

### Verification

```bash
node scripts/run-benchmark.cjs --help
```

## Quick Start

### As an Aegis Skill (automatic)

When spawned by Aegis, configuration is injected via environment variables. The benchmark downloads video clips, samples frames, evaluates with VLM, and generates an HTML report.
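
The frame-sampling step in that pipeline might look roughly like the sketch below. It assumes evenly spaced timestamps and one `ffmpeg` invocation per frame. The function names and the default of 8 frames are hypothetical and are not taken from `run-benchmark.cjs`.

```javascript
// Hypothetical sketch: evenly spaced frame sampling for one clip.
// Function names and the default of 8 frames are illustrative assumptions,
// not taken from run-benchmark.cjs.

// Return `count` timestamps (in seconds) at the midpoints of equal segments,
// which avoids sampling the very first and very last frame of the clip.
function sampleTimestamps(durationSec, count = 8) {
  if (!(durationSec > 0) || !Number.isInteger(count) || count < 1) {
    throw new RangeError('durationSec must be > 0 and count a positive integer');
  }
  const step = durationSec / count;
  return Array.from({ length: count }, (_, i) => Number(((i + 0.5) * step).toFixed(3)));
}

// Build ffmpeg arguments that seek to time `t` and write a single JPEG frame.
function ffmpegFrameArgs(videoPath, t, outPath) {
  return ['-y', '-ss', String(t), '-i', videoPath, '-frames:v', '1', '-q:v', '2', outPath];
}
```

The argument array can be passed to `child_process.execFileSync('ffmpeg', args)`, which avoids shell quoting issues with file paths.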

### Standalone

```bash
# Run with local VLM (subset mode, 50 videos)
node scripts/run-benchmark.cjs --vlm http://localhost:5405

# Quick test with 10 videos
node scripts/run-benchmark.cjs --vlm http://localhost:5405 --max-videos 10

# Full benchmark (all curated clips)
node scripts/run-benchmark.cjs --vlm http://localhost:5405 --mode full

# Filter by category
node scripts/run-benchmark.cjs --vlm http://localhost:5405 --categories "Wildlife,Security"

# Skip download (re-evaluate cached videos)
node scripts/run-benchmark.cjs --vlm http://localhost:5405 --skip-download

# Skip report auto-open
node scripts/run-benchmark.cjs --vlm http://localhost:5405 --no-open
```

## Configuration

### Environment Variables (set by Aegis)

| Variable | Default | Description |
|----------|---------|-------------|
| `AEGIS_VLM_URL` | *(required)* | VLM server base URL |
| `AEGIS_VLM_MODEL` || Loaded VLM model ID |
| `AEGIS_SKILL_ID` || Skill identifier (enables skill mode) |
| `AEGIS_SKILL_PARAMS` | `{}` | JSON params from skill config |

> **Note**: This is a VLM-only benchmark. An LLM gateway is not required.

### User Configuration (config.yaml)

This skill includes a [`config.yaml`](config.yaml) that defines user-configurable parameters. Aegis parses this at install time and renders a config panel in the UI. Values are delivered via `AEGIS_SKILL_PARAMS`.

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `mode` | select | `subset` | Which clips to evaluate: `subset` (~50 clips) or `full` (all ~105 curated clips) |
| `maxVideos` | number | `50` | Maximum number of videos to evaluate |
| `categories` | text | `all` | Comma-separated category filter (e.g. `Wildlife,Security`) |
| `noOpen` | boolean | `false` | Skip auto-opening the HTML report in browser |
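
As a rough illustration of how a skill could consume `AEGIS_SKILL_PARAMS`, the sketch below parses the JSON string and falls back to the defaults in the table above. The helper name `loadSkillParams` and the choice to represent `all` as `null` are assumptions for illustration, not the script's actual behavior.

```javascript
// Hypothetical sketch: resolve skill parameters from AEGIS_SKILL_PARAMS.
// Defaults mirror the config.yaml table; the helper name is an assumption.
const DEFAULTS = Object.freeze({ mode: 'subset', maxVideos: 50, categories: 'all', noOpen: false });

function loadSkillParams(env = process.env) {
  let raw = {};
  try {
    raw = JSON.parse(env.AEGIS_SKILL_PARAMS || '{}');
  } catch {
    raw = {}; // malformed JSON: fall back to defaults
  }
  if (raw === null || typeof raw !== 'object' || Array.isArray(raw)) raw = {};

  const merged = { ...DEFAULTS, ...raw };
  const maxVideos = Number(merged.maxVideos);
  return {
    mode: merged.mode === 'full' ? 'full' : 'subset',
    maxVideos: Number.isFinite(maxVideos) && maxVideos > 0 ? Math.floor(maxVideos) : DEFAULTS.maxVideos,
    // 'all' (or an empty value) means no filter; otherwise split the comma-separated list.
    categories: merged.categories === 'all' || !merged.categories
      ? null
      : String(merged.categories).split(',').map((s) => s.trim()).filter(Boolean),
    noOpen: merged.noOpen === true,
  };
}
```

Validating each field instead of trusting the JSON keeps a malformed value in the config panel from aborting the run.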

### CLI Arguments (standalone fallback)

| Argument | Default | Description |
|----------|---------|-------------|
| `--vlm URL` | *(required)* | VLM server base URL |
| `--out DIR` | `~/.aegis-ai/smarthome-bench` | Results directory |
| `--max-videos N` | `50` | Max videos to evaluate |
| `--mode MODE` | `subset` | `subset` or `full` |
| `--categories LIST` | `all` | Comma-separated category filter |
| `--skip-download` || Skip video download, use cached |
| `--no-open` || Don't auto-open report in browser |
| `--report` | *(auto in skill mode)* | Force report generation |
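
One way to parse the flags in this table is Node's built-in `util.parseArgs`, sketched below. This assumes Node 18.3 or later and is not necessarily how `run-benchmark.cjs` parses its arguments. The function name `parseCli` and the returned field names are hypothetical.

```javascript
// Hypothetical sketch: parse the documented CLI flags with util.parseArgs.
// Assumes Node >= 18.3; parseCli and its return shape are illustrative only.
const os = require('node:os');
const path = require('node:path');
const { parseArgs } = require('node:util');

function parseCli(argv = process.argv.slice(2)) {
  const { values } = parseArgs({
    args: argv,
    strict: true, // unknown flags throw instead of being silently ignored
    options: {
      vlm: { type: 'string' },
      out: { type: 'string' },
      'max-videos': { type: 'string' },
      mode: { type: 'string' },
      categories: { type: 'string' },
      'skip-download': { type: 'boolean' },
      'no-open': { type: 'boolean' },
      report: { type: 'boolean' },
    },
  });
  if (!values.vlm) throw new Error('--vlm URL is required');

  // Apply the defaults from the table above.
  return {
    vlm: values.vlm,
    out: values.out ?? path.join(os.homedir(), '.aegis-ai', 'smarthome-bench'),
    maxVideos: Number.parseInt(values['max-videos'] ?? '50', 10),
    mode: values.mode ?? 'subset',
    categories: values.categories ?? 'all',
    skipDownload: values['skip-download'] === true,
    noOpen: values['no-open'] === true,
    report: values.report === true,
  };
}
```

For example, `parseCli(['--vlm', 'http://localhost:5405', '--max-videos', '10'])` yields `maxVideos: 10` with every other option at its default.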

## Protocol

### Aegis → Skill (env vars)
```
AEGIS_VLM_URL=http://localhost:5405
AEGIS_SKILL_ID=smarthome-bench
AEGIS_SKILL_PARAMS={}
```

### Skill → Aegis (stdout, JSON lines)
```jsonl
{"event": "ready", "model": "SmolVLM2-2.2B", "system": "Apple M3"}
{"event": "suite_start", "suite": "Wildlife"}
{"event": "test_result", "suite": "Wildlife", "test": "smartbench_0003", "status": "pass", "timeMs": 4500}
{"event": "suite_end", "suite": "Wildlife", "passed": 12, "failed": 3}
{"event": "complete", "passed": 78, "total": 105, "timeMs": 480000, "reportPath": "/path/to/report.html"}
```

Human-readable output goes to **stderr** (visible in Aegis console tab).
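
The stdout/stderr split can be sketched as follows. The helper names `formatEvent`, `emit`, and `log` are hypothetical. Only the event shapes come from the examples above.

```javascript
// Hypothetical sketch of the stdout/stderr split described above.
// formatEvent, emit, and log are illustrative names, not the script's helpers.

// Serialize one protocol event as a single JSON line.
function formatEvent(event, fields = {}) {
  return JSON.stringify({ event, ...fields }) + '\n';
}

// Machine-readable events go to stdout, one JSON object per line.
function emit(event, fields = {}, out = process.stdout) {
  out.write(formatEvent(event, fields));
}

// Human-readable progress goes to stderr so it never corrupts the JSONL stream.
function log(message, err = process.stderr) {
  err.write(`${message}\n`);
}
```

Calling `emit('suite_start', { suite: 'Wildlife' })` followed by `log('Running Wildlife suite...')` writes the event to stdout and the progress message to stderr, so a consumer can parse stdout line by line without filtering.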

## Test Suites (7 Categories)

| Suite | Description | Anomaly Examples |
|-------|-------------|------------------|
| 🦊 Wildlife | Wild animals near home cameras | Bear on porch, deer in garden, coyote at night |
| 👴 Senior Care | Elderly activity monitoring | Falls, wandering, unusual inactivity |
| 👶 Baby Monitoring | Infant/child safety | Stroller rolling, child climbing, unsupervised |
| 🐾 Pet Monitoring | Pet behavior detection | Pet illness, escaped pets, unusual behavior |
| 🔒 Home Security | Intrusion & suspicious activity | Break-ins, trespassing, porch pirates |
| 📦 Package Delivery | Package arrival & theft | Stolen packages, misdelivered, weather damage |
| 🏠 General Activity | General smart home events | Unusual hours activity, appliance issues |

Each clip is evaluated for **binary anomaly detection**: the VLM predicts normal (0) or abnormal (1), compared against expert annotations.
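
A minimal sketch of one evaluation step follows, assuming the VLM server accepts OpenAI-style chat completion requests with multiple `image_url` content parts. The prompt wording, the helper names, and the reply-parsing rule are all hypothetical and may differ from what `run-benchmark.cjs` does.

```javascript
// Hypothetical sketch: build a multi-image request and parse a 0/1 answer.
// Assumes an OpenAI-compatible chat completions endpoint; the prompt text
// and both helper names are illustrative, not taken from run-benchmark.cjs.

function buildAnomalyRequest(model, framesBase64) {
  return {
    model,
    temperature: 0,
    messages: [{
      role: 'user',
      content: [
        {
          type: 'text',
          text: 'These frames are sampled in order from one smart home camera clip. '
            + 'Answer 1 if the clip shows an anomaly, otherwise answer 0.',
        },
        // One image part per sampled frame, encoded as a JPEG data URL.
        ...framesBase64.map((b64) => ({
          type: 'image_url',
          image_url: { url: `data:image/jpeg;base64,${b64}` },
        })),
      ],
    }],
  };
}

// Map a free-text reply to 0, 1, or null when no standalone label is found.
function parseAnomalyLabel(reply) {
  const match = /\b([01])\b/.exec(String(reply));
  return match ? Number(match[1]) : null;
}
```

Returning `null` for an unparseable reply lets the caller decide whether to count it as a failed test or retry, instead of silently treating it as normal.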

## Metrics

Per-category and overall:
- **Accuracy** — correct predictions / total
- **Precision** — true positives / predicted positives
- **Recall** — true positives / actual positives
- **F1-Score** — harmonic mean of precision & recall
- **Confusion Matrix** — TP, FP, TN, FN breakdown
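
The metrics listed above can be computed from label/prediction pairs as in the sketch below. The function names and the convention of returning 0 when a denominator is 0 are assumptions. The report generator may handle empty categories differently.

```javascript
// Hypothetical sketch: binary classification metrics from 0/1 labels.
// Names and the zero-denominator convention are illustrative assumptions.

// pairs: array of { label, predicted }, each 0 (normal) or 1 (abnormal).
function confusionMatrix(pairs) {
  const m = { tp: 0, fp: 0, tn: 0, fn: 0 };
  for (const { label, predicted } of pairs) {
    if (predicted === 1) m[label === 1 ? 'tp' : 'fp'] += 1;
    else m[label === 1 ? 'fn' : 'tn'] += 1;
  }
  return m;
}

function computeMetrics({ tp, fp, tn, fn }) {
  const ratio = (num, den) => (den === 0 ? 0 : num / den);
  const precision = ratio(tp, tp + fp);
  const recall = ratio(tp, tp + fn);
  return {
    accuracy: ratio(tp + tn, tp + fp + tn + fn),
    precision,
    recall,
    // Harmonic mean of precision and recall.
    f1: ratio(2 * precision * recall, precision + recall),
  };
}
```

Running the same two functions once per category and once over all clips gives both the per-category and the overall figures.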

## Results

Results are saved to `~/.aegis-ai/smarthome-bench/` as JSON. An HTML report with per-category breakdown, confusion matrix, and model comparison is auto-generated.

## Requirements

- Node.js ≥ 18
- `npm install` (for `openai` SDK dependency)
- `yt-dlp` (video download from YouTube)
- `ffmpeg` (frame extraction from video clips)
- Running VLM server (must support multi-image input)

## Citation

Based on [SmartHome-Bench: A Comprehensive Benchmark for Video Anomaly Detection in Smart Homes Using Multi-Modal Foundation Models](https://arxiv.org/abs/2506.12992).
Lines changed: 25 additions & 0 deletions
@@ -0,0 +1,25 @@
params:
  - key: mode
    label: Evaluation Mode
    type: select
    options: [subset, full]
    default: subset
    description: "Which clips to evaluate: subset (~50 videos) or full (all ~105 curated clips)"

  - key: maxVideos
    label: Max Videos
    type: number
    default: 50
    description: Maximum number of videos to evaluate (overrides mode)

  - key: categories
    label: Categories
    type: text
    default: all
    description: "Comma-separated category filter, e.g. Wildlife,Security (default: all)"

  - key: noOpen
    label: Don't auto-open report
    type: boolean
    default: false
    description: Skip opening the HTML report in browser after completion
Lines changed: 50 additions & 0 deletions
@@ -0,0 +1,50 @@
#!/usr/bin/env bash
# SmartHome-Bench deployment script
# Called by Aegis deployment agent during skill installation

set -e

SKILL_DIR="$(cd "$(dirname "$0")" && pwd)"
echo "📦 Deploying SmartHome-Bench from: $SKILL_DIR"

# ── Check system dependencies ─────────────────────────────────────────────────

echo "🔍 Checking system dependencies..."

if ! command -v yt-dlp &>/dev/null; then
  echo "⚠️ yt-dlp not found. Attempting install..."
  if command -v brew &>/dev/null; then
    brew install yt-dlp
  elif command -v pip3 &>/dev/null; then
    pip3 install yt-dlp
  elif command -v apt-get &>/dev/null; then
    sudo apt-get install -y yt-dlp 2>/dev/null || pip3 install yt-dlp
  else
    echo "❌ Cannot install yt-dlp automatically. Please install manually:"
    echo " pip install yt-dlp OR brew install yt-dlp"
    exit 1
  fi
fi
echo " ✅ yt-dlp: $(yt-dlp --version)"

if ! command -v ffmpeg &>/dev/null; then
  echo "⚠️ ffmpeg not found. Attempting install..."
  if command -v brew &>/dev/null; then
    brew install ffmpeg
  elif command -v apt-get &>/dev/null; then
    sudo apt-get install -y ffmpeg
  else
    echo "❌ Cannot install ffmpeg automatically. Please install manually:"
    echo " brew install ffmpeg OR apt-get install ffmpeg"
    exit 1
  fi
fi
echo " ✅ ffmpeg: $(ffmpeg -version 2>&1 | head -1)"

# ── Install npm dependencies ──────────────────────────────────────────────────

echo "📦 Installing npm dependencies..."
cd "$SKILL_DIR"
npm install --production

echo "✅ SmartHome-Bench deployed successfully"
