Commit 65f3ca5

Merge pull request #178 from SharpAI/feature/coral-tpu-detection
feat: add YOLO 2026 Coral TPU detection skill (Docker-based)
2 parents e908615 + c8ae2ac commit 65f3ca5

File tree: 25 files changed (+2507 −0 lines)

README.md

Lines changed: 50 additions & 0 deletions
@@ -58,6 +58,8 @@ Each skill is a self-contained module with its own model, parameters, and [commu
| Category | Skill | What It Does | Status |
|----------|-------|--------------|:------:|
| **Detection** | [`yolo-detection-2026`](skills/detection/yolo-detection-2026/) | Real-time 80+ class detection — auto-accelerated via TensorRT / CoreML / OpenVINO / ONNX | |
| | [`yolo-detection-2026-coral-tpu`](skills/detection/yolo-detection-2026-coral-tpu/) | Google Coral Edge TPU — ~4ms inference via USB accelerator ([Docker-based](#detection--segmentation-skills)) | 🧪 |
| | [`yolo-detection-2026-openvino`](skills/detection/yolo-detection-2026-openvino/) | Intel NCS2 USB / Intel GPU / CPU — multi-device via OpenVINO ([Docker-based](#detection--segmentation-skills)) | 🧪 |
| **Analysis** | [`home-security-benchmark`](skills/analysis/home-security-benchmark/) | [143-test evaluation suite](#-homesec-bench--how-secure-is-your-local-ai) for LLM & VLM security performance | |
| **Privacy** | [`depth-estimation`](skills/transformation/depth-estimation/) | [Real-time depth-map privacy transform](#-privacy--depth-map-anonymization) — anonymize camera feeds while preserving activity | |
| **Segmentation** | [`sam2-segmentation`](skills/segmentation/sam2-segmentation/) | Interactive click-to-segment with Segment Anything 2 — pixel-perfect masks, point/box prompts, video tracking | |
@@ -70,6 +72,54 @@ Each skill is a self-contained module with its own model, parameters, and [commu
> **Registry:** All skills are indexed in [`skills.json`](skills.json) for programmatic discovery.

### Detection & Segmentation Skills

Detection and segmentation skills process visual data from camera feeds — detecting objects, segmenting regions, or analyzing scenes. All skills use the same **JSONL stdin/stdout protocol**: Aegis writes a frame to a shared volume, sends a `frame` event on stdin, and reads `detections` from stdout. This means every detection skill — whether running natively or inside Docker — is interchangeable from Aegis's perspective.
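A minimal host-side sketch of this exchange, using only the standard library. The event and field names (`frame`, `detections`, `frame_id`, `objects`) follow the protocol examples elsewhere in this commit; the helper names and the `path` field are illustrative assumptions, not the actual Aegis implementation:

```python
import json

def make_frame_event(frame_id, camera_id, path):
    # Build the JSONL `frame` event Aegis would write to the skill's stdin.
    # Exact keys are an assumption based on the protocol description.
    return json.dumps({"event": "frame", "frame_id": frame_id,
                       "camera_id": camera_id, "path": path})

def parse_skill_event(line):
    # Parse one JSONL line read back from the skill's stdout.
    event = json.loads(line)
    return event["event"], event

msg = make_frame_event(42, "front_door", "/tmp/aegis_detection/frame.jpg")
kind, payload = parse_skill_event(
    '{"event": "detections", "frame_id": 42, '
    '"objects": [{"class": "person", "confidence": 0.85, "bbox": [100, 50, 300, 400]}]}'
)
```

Because every skill speaks this line-oriented format, swapping backends is just a matter of pointing Aegis at a different process.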

```mermaid
graph TB
    CAM["📷 Camera Feed"] --> GOV["Frame Governor (5 FPS)"]
    GOV --> |"frame.jpg → shared volume"| PROTO["JSONL stdin/stdout Protocol"]

    PROTO --> NATIVE["🖥️ Native: yolo-detection-2026"]
    PROTO --> DOCKER["🐳 Docker: Coral TPU / OpenVINO"]

    subgraph Native["Native Skill (runs on host)"]
        NATIVE --> ENV["env_config.py auto-detect"]
        ENV --> TRT["NVIDIA → TensorRT"]
        ENV --> CML["Apple Silicon → CoreML"]
        ENV --> OV["Intel → OpenVINO IR"]
        ENV --> ONNX["AMD / CPU → ONNX"]
    end

    subgraph Container["Docker Container"]
        DOCKER --> CORAL["Coral TPU → pycoral"]
        DOCKER --> OVIR["OpenVINO → Ultralytics OV"]
        DOCKER --> CPU["CPU fallback"]
        CORAL -.-> USB["USB/IP passthrough"]
        OVIR -.-> DRI["/dev/dri · /dev/bus/usb"]
    end

    NATIVE --> |"stdout: detections"| AEGIS["Aegis IPC → Live Overlay + Alerts"]
    DOCKER --> |"stdout: detections"| AEGIS
```

- **Native skills** run directly on the host — [`env_config.py`](skills/lib/env_config.py) auto-detects the GPU and converts the model to the fastest available format (TensorRT, CoreML, OpenVINO IR, ONNX)
- **Docker skills** wrap hardware-specific runtimes in a container — cross-platform USB/device access without installing native drivers
- **Same output** — Aegis sees identical JSONL from every skill, so detection overlays, alerts, and forensic analysis work with any backend

#### LLM-Assisted Skill Installation

Skills are installed by an **autonomous LLM deployment agent**, not by brittle shell scripts. When you click "Install" in Aegis, a focused mini-agent session reads the skill's `SKILL.md` manifest and figures out what to do:

1. **Probe** — reads `SKILL.md`, `requirements.txt`, and `package.json` to understand what the skill needs
2. **Detect hardware** — checks for NVIDIA (CUDA), AMD (ROCm), Apple Silicon (MPS), Intel (OpenVINO), or CPU-only
3. **Install** — runs the right commands (`pip install`, `npm install`, `docker build`) with the correct backend-specific dependencies
4. **Verify** — runs a smoke test to confirm the skill loads before marking it complete
5. **Determine launch command** — figures out the exact `run_command` needed to start the skill and saves it to the registry
This means community-contributed skills don't need a bespoke installer — the LLM reads the manifest and adapts to whatever hardware you have. If something fails, it reads the error output and tries to fix it autonomously.
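As a rough illustration of step 1, the probe can be as simple as slicing the YAML frontmatter out of `SKILL.md`. `read_frontmatter` below is a hypothetical stdlib-only sketch, not the agent's actual code:

```python
def read_frontmatter(text):
    # Return the raw YAML between the opening and closing '---' fences
    # of a SKILL.md manifest.
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("no frontmatter fence")
    end = lines[1:].index("---") + 1  # index of the closing fence
    return "\n".join(lines[1:end])

manifest = """---
name: yolo-detection-2026-coral-tpu
runtime: docker
---
# Coral TPU Object Detection
"""
fm = read_frontmatter(manifest)
```

A real agent would hand the extracted YAML to a proper parser before deciding between `pip install`, `npm install`, or `docker build`.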
## 🚀 Getting Started with [SharpAI Aegis](https://www.sharpai.org)

Dockerfile

Lines changed: 53 additions & 0 deletions

@@ -0,0 +1,53 @@
```dockerfile
# ─── Coral TPU Detection — Runtime Image ─────────────────────────────────────
# Runs YOLO inference on Google Coral Edge TPU via pycoral/tflite-runtime.
# Built locally on the user's machine. No external model registry.
#
# Build: docker build -t aegis-coral-tpu .
# Run:   docker run -i --rm --device /dev/bus/usb \
#            -v /tmp/aegis_detection:/tmp/aegis_detection \
#            aegis-coral-tpu

FROM debian:bullseye-slim

# Avoid interactive prompts
ENV DEBIAN_FRONTEND=noninteractive
ENV PYTHONUNBUFFERED=1

# ─── System dependencies ─────────────────────────────────────────────────────
RUN apt-get update && apt-get install -y --no-install-recommends \
    python3 \
    python3-pip \
    python3-dev \
    gnupg2 \
    ca-certificates \
    curl \
    usbutils \
    libusb-1.0-0 \
    && rm -rf /var/lib/apt/lists/*

# ─── Add Google Coral package repository ─────────────────────────────────────
RUN echo "deb https://packages.cloud.google.com/apt coral-edgetpu-stable main" \
    > /etc/apt/sources.list.d/coral-edgetpu.list \
    && curl -fsSL https://packages.cloud.google.com/apt/doc/apt-key.gpg \
    | apt-key add - \
    && apt-get update \
    && apt-get install -y --no-install-recommends \
    libedgetpu1-std \
    && rm -rf /var/lib/apt/lists/*

# ─── Python dependencies ─────────────────────────────────────────────────────
COPY requirements.txt /app/requirements.txt
RUN pip3 install --no-cache-dir -r /app/requirements.txt

# ─── Application code ────────────────────────────────────────────────────────
COPY scripts/ /app/scripts/
COPY models/ /app/models/

WORKDIR /app

# ─── Shared volume for frame exchange ────────────────────────────────────────
VOLUME ["/tmp/aegis_detection"]

# ─── Entry point ─────────────────────────────────────────────────────────────
# stdin/stdout JSONL protocol — same as yolo-detection-2026
ENTRYPOINT ["python3", "scripts/detect.py"]
```
SKILL.md

Lines changed: 175 additions & 0 deletions

@@ -0,0 +1,175 @@
---
name: yolo-detection-2026-coral-tpu
description: "Google Coral Edge TPU — real-time object detection via Docker"
version: 1.0.0
icon: assets/icon.png
entry: scripts/detect.py
deploy: deploy.sh
runtime: docker

requirements:
  docker: ">=20.10"
  platforms: ["linux", "macos", "windows"]

parameters:
  - name: auto_start
    label: "Auto Start"
    type: boolean
    default: false
    description: "Start this skill automatically when Aegis launches"
    group: Lifecycle

  - name: confidence
    label: "Confidence Threshold"
    type: number
    min: 0.1
    max: 1.0
    default: 0.5
    description: "Minimum detection confidence — default is lower than GPU models due to INT8 quantization"
    group: Model

  - name: classes
    label: "Detect Classes"
    type: string
    default: "person,car,dog,cat"
    description: "Comma-separated COCO class names (80 classes available)"
    group: Model

  - name: fps
    label: "Processing FPS"
    type: select
    options: [0.2, 0.5, 1, 3, 5, 15]
    default: 5
    description: "Frames per second — Edge TPU handles 15+ FPS easily"
    group: Performance

  - name: input_size
    label: "Input Resolution"
    type: select
    options: [320, 640]
    default: 320
    description: "320 fits fully on the TPU (~4ms); 640 runs partially on CPU (~20ms)"
    group: Performance

  - name: tpu_device
    label: "TPU Device"
    type: select
    options: ["auto", "0", "1", "2", "3"]
    default: "auto"
    description: "Which Edge TPU to use — auto selects the first available"
    group: Performance

  - name: clock_speed
    label: "TPU Clock Speed"
    type: select
    options: ["standard", "max"]
    default: "standard"
    description: "Max is faster but runs hotter — needs active cooling for sustained use"
    group: Performance

capabilities:
  live_detection:
    script: scripts/detect.py
    description: "Real-time object detection on live camera frames via Edge TPU"

category: detection
mutex: detection
---

# Coral TPU Object Detection

Real-time object detection using the Google Coral Edge TPU accelerator. Runs inside Docker for cross-platform support. Detects 80 COCO classes (person, car, dog, cat, etc.) with ~4ms inference on 320×320 input.

## Requirements

- **Google Coral USB Accelerator** (USB 3.0 port recommended)
- **Docker Desktop 4.35+** (all platforms — Linux, macOS, Windows)
- USB 3.0 cable (the included cable is recommended)
- Adequate cooling for sustained inference

## How It Works

```
┌─────────────────────────────────────────────────────┐
│ Host (Aegis-AI)                                     │
│  frame.jpg → /tmp/aegis_detection/                  │
│  stdin ──→ ┌──────────────────────────────┐         │
│            │ Docker Container             │         │
│            │  detect.py                   │         │
│            │   ├─ loads _edgetpu.tflite   │         │
│            │   ├─ reads frame from volume │         │
│            │   └─ runs inference on TPU   │         │
│  stdout ←──│  → JSONL detections          │         │
│            └──────────────────────────────┘         │
│  USB ──→ /dev/bus/usb (Linux) or USB/IP (Mac/Win)   │
└─────────────────────────────────────────────────────┘
```

1. Aegis writes the camera frame JPEG to the shared `/tmp/aegis_detection/` volume
2. Sends a `frame` event via stdin JSONL to the Docker container
3. `detect.py` reads the frame and runs inference on the Edge TPU
4. Returns a `detections` event via stdout JSONL
5. Same protocol as `yolo-detection-2026` — Aegis sees no difference
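The container side of steps 2–4 boils down to a read-infer-write loop. Here is a hedged sketch of that shape with the hardware call stubbed out — `run_inference` stands in for the pycoral interpreter and this is not the real `detect.py`:

```python
import io
import json

def detection_loop(stdin, stdout, run_inference):
    # Read `frame` events line by line, run the model, emit `detections`.
    for line in stdin:
        event = json.loads(line)
        if event.get("event") != "frame":
            continue  # ignore other control events
        objects = run_inference(event)  # the Edge TPU would be invoked here
        stdout.write(json.dumps({
            "event": "detections",
            "frame_id": event["frame_id"],
            "camera_id": event.get("camera_id"),
            "objects": objects,
        }) + "\n")
        stdout.flush()  # unbuffered output keeps Aegis's overlay live

# Demo with an in-memory pipe and a stubbed model
out = io.StringIO()
stub = lambda ev: [{"class": "person", "confidence": 0.85,
                    "bbox": [100, 50, 300, 400]}]
detection_loop(
    io.StringIO('{"event": "frame", "frame_id": 42, "camera_id": "front_door"}\n'),
    out, stub)
```

Because the loop only touches stdin/stdout and the shared volume, the same skeleton works whether the model runs on a Coral TPU, OpenVINO, or plain CPU.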
## Platform Setup

### Linux
```bash
# USB Coral should be auto-detected
# Docker uses --device /dev/bus/usb for direct access
./deploy.sh
```

### macOS (Docker Desktop 4.35+)
```bash
# Docker Desktop USB/IP handles USB passthrough
# ARM64 Docker image builds natively on Apple Silicon
./deploy.sh
```

### Windows
```powershell
# Docker Desktop 4.35+ with USB/IP support
# Or WSL2 backend with usbipd-win
.\deploy.bat
```

## Model

Ships with a pre-compiled `yolo26n_edgetpu.tflite` (YOLO 2026 nano, INT8 quantized, 320×320). To compile custom models:

```bash
# Requires x86_64 Linux or Docker --platform linux/amd64
python scripts/compile_model.py --model yolo26s --size 320
```

## Performance

| Input Size | Inference | On-chip | Notes |
|-----------|-----------|---------|-------|
| 320×320 | ~4ms | 100% | Fully on TPU, best for real-time |
| 640×640 | ~20ms | Partial | Some layers run on CPU (model segmented) |

> **Cooling**: The USB Accelerator's aluminum case acts as a heatsink. If it is too hot to touch during continuous inference, it will thermal-throttle. Consider active cooling or `clock_speed: standard`.

## Protocol

Same JSONL as `yolo-detection-2026`:

### Skill → Aegis (stdout)
```jsonl
{"event": "ready", "model": "yolo26n_edgetpu", "device": "coral", "format": "edgetpu_tflite", "tpu_count": 1, "classes": 80}
{"event": "detections", "frame_id": 42, "camera_id": "front_door", "objects": [{"class": "person", "confidence": 0.85, "bbox": [100, 50, 300, 400]}]}
{"event": "perf_stats", "total_frames": 50, "timings_ms": {"inference": {"avg": 4.1, "p50": 3.9, "p95": 5.2}}}
```

### Bounding Box Format
`[x_min, y_min, x_max, y_max]` — pixel coordinates (xyxy).
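For consumers that expect a different convention (e.g. COCO-style xywh), the conversion is a one-liner. This helper is an illustration only, not part of the skill:

```python
def xyxy_to_xywh(bbox):
    # [x_min, y_min, x_max, y_max] -> [x_min, y_min, width, height]
    x1, y1, x2, y2 = bbox
    return [x1, y1, x2 - x1, y2 - y1]

xyxy_to_xywh([100, 50, 300, 400])  # a 200x350-pixel box
```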
## Installation

```bash
./deploy.sh
```

The deployer builds the Docker image locally, probes for TPU devices, and sets the runtime command. No packages are pulled from external registries beyond the Docker base image and the Coral apt repo.
config.yaml

Lines changed: 65 additions & 0 deletions

@@ -0,0 +1,65 @@
```yaml
# Coral TPU Detection Skill — Configuration Schema
# Parsed by Aegis skill-registry-service.cjs → parseConfigYaml()
# Format: params[] with key, type, label, default, description, options

params:
  - key: auto_start
    label: Auto Start
    type: boolean
    default: false
    description: "Start this skill automatically when Aegis launches"

  - key: confidence
    label: Confidence Threshold
    type: number
    default: 0.5
    description: "Minimum detection confidence (0.1–1.0). Lower than GPU YOLO — Edge TPU INT8 quantization produces softer scores"

  - key: fps
    label: Frame Rate
    type: select
    default: 5
    description: "Detection processing rate — Edge TPU is fast enough for real-time"
    options:
      - { value: 0.2, label: "Ultra Low (0.2 FPS)" }
      - { value: 0.5, label: "Low (0.5 FPS)" }
      - { value: 1, label: "Normal (1 FPS)" }
      - { value: 3, label: "Active (3 FPS)" }
      - { value: 5, label: "High (5 FPS)" }
      - { value: 15, label: "Real-time (15 FPS)" }

  - key: classes
    label: Detection Classes
    type: string
    default: "person,car,dog,cat"
    description: "Comma-separated COCO class names to detect"

  - key: input_size
    label: Input Resolution
    type: select
    default: 320
    description: "Image size for inference — 320 fits fully on the TPU (~4ms), 640 runs partially on CPU (~20ms)"
    options:
      - { value: 320, label: "320×320 (fastest, fully on-chip)" }
      - { value: 640, label: "640×640 (more accurate, partially CPU)" }

  - key: tpu_device
    label: TPU Device
    type: select
    default: auto
    description: "Which Edge TPU to use — 'auto' selects the first available device"
    options:
      - { value: auto, label: "Auto-detect (first available)" }
      - { value: "0", label: "TPU 0" }
      - { value: "1", label: "TPU 1" }
      - { value: "2", label: "TPU 2" }
      - { value: "3", label: "TPU 3" }

  - key: clock_speed
    label: TPU Clock Speed
    type: select
    default: standard
    description: "Edge TPU operating frequency — 'max' is faster but runs hotter and needs cooling"
    options:
      - { value: standard, label: "Standard (lower power, cooler)" }
      - { value: max, label: "Maximum (faster inference, needs cooling)" }
```
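To show how a consumer of this schema might use it, here is a hypothetical validator for a single `params[]` entry — a sketch only; the real parsing lives in `skill-registry-service.cjs` → `parseConfigYaml()`:

```python
def validate_param(spec, value):
    # Coerce/validate a user-supplied value against one params[] entry,
    # falling back to the schema default for out-of-range select values.
    if spec["type"] == "boolean":
        return bool(value)
    if spec["type"] == "number":
        return float(value)
    if spec["type"] == "select":
        allowed = [opt["value"] for opt in spec["options"]]
        return value if value in allowed else spec["default"]
    return str(value)

# One entry of the schema above, written as a plain dict for brevity
input_size = {
    "key": "input_size", "type": "select", "default": 320,
    "options": [{"value": 320, "label": "320x320"},
                {"value": 640, "label": "640x640"}],
}
```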
