Commit 8c002f9

Merge pull request #136 from SharpAI/develop
Develop
2 parents 8d67e55 + 8b19435 commit 8c002f9

12 files changed: +795 −89 lines changed


README.md

Lines changed: 57 additions & 8 deletions
@@ -24,13 +24,22 @@
 
 ---
 
+## 🗺️ Roadmap
+
+- [x] **Skill architecture** — pluggable `SKILL.md` interface for all capabilities
+- [x] **Skill Store UI** — browse, install, and configure skills from Aegis
+- [x] **AI/LLM-assisted skill installation** — community-contributed skills installed and configured via an AI agent
+- [x] **GPU / NPU / CPU (AIPC)-aware installation** — auto-detect hardware, install matching frameworks, convert models to the optimal format
+- [x] **Hardware environment layer** — shared [`env_config.py`](skills/lib/env_config.py) for auto-detection + model optimization across NVIDIA, AMD, Apple Silicon, Intel, and CPU
+- [ ] **Skill development** — 18 skills across 9 categories, actively expanding with community contributions
+
 ## 🧩 Skill Catalog
 
 Each skill is a self-contained module with its own model, parameters, and [communication protocol](docs/skill-development.md). See the [Skill Development Guide](docs/skill-development.md) and [Platform Parameters](docs/skill-params.md) to build your own.
 
 | Category | Skill | What It Does | Status |
 |----------|-------|--------------|:------:|
-| **Detection** | [`yolo-detection-2026`](skills/detection/yolo-detection-2026/) | Real-time 80+ class object detection | ✅ |
+| **Detection** | [`yolo-detection-2026`](skills/detection/yolo-detection-2026/) | Real-time 80+ class detection — auto-accelerated via TensorRT / CoreML / OpenVINO / ONNX | ✅ |
 | | [`dinov3-grounding`](skills/detection/dinov3-grounding/) | Open-vocabulary detection — describe what to find | 📐 |
 | | [`person-recognition`](skills/detection/person-recognition/) | Re-identify individuals across cameras | 📐 |
 | **Analysis** | [`home-security-benchmark`](skills/analysis/home-security-benchmark/) | [131-test evaluation suite](#-homesec-bench--how-secure-is-your-local-ai) for LLM & VLM security performance | ✅ |
@@ -48,13 +57,6 @@ Each skill is a self-contained module with its own model, parameters, and [commu
 
 > **Registry:** All skills are indexed in [`skills.json`](skills.json) for programmatic discovery.
 
-### 🗺️ Roadmap
-
-- [x] **Skill architecture** — pluggable `SKILL.md` interface for all capabilities
-- [x] **Full skill catalog** — 18 skills across 9 categories with working scripts
-- [ ] **Skill Store UI** — browse, install, and configure skills from Aegis
-- [ ] **Custom skill packaging** — community-contributed skills via GitHub
-- [ ] **GPU-optimized containers** — one-click Docker deployment per skill
 
 ## 🚀 Getting Started with [SharpAI Aegis](https://www.sharpai.org)
 
@@ -89,6 +91,53 @@ The easiest way to run DeepCamera's AI skills. Aegis connects everything — cam
 </table>
 
 
+## 🎯 YOLO 2026 — Real-Time Object Detection
+
+State-of-the-art detection running locally on **any hardware**, fully integrated as a [DeepCamera skill](skills/detection/yolo-detection-2026/).
+
+### YOLO26 Models
+
+YOLO26 (Jan 2026) eliminates NMS and DFL for cleaner exports and lower latency. Pick the size that fits your hardware:
+
+| Model | Params | Latency (optimized) | Use Case |
+|-------|--------|:-------------------:|----------|
+| **yolo26n** (nano) | 2.6M | ~2ms | Edge devices, real-time on CPU |
+| **yolo26s** (small) | 11.2M | ~5ms | Balanced speed & accuracy |
+| **yolo26m** (medium) | 25.4M | ~12ms | Accuracy-focused |
+| **yolo26l** (large) | 52.3M | ~25ms | Maximum detection quality |
+
+All models detect **80+ COCO classes**: people, vehicles, animals, everyday objects.
+
+### Hardware Acceleration
+
+The shared [`env_config.py`](skills/lib/env_config.py) **auto-detects your GPU** and converts the model to the fastest native format — zero manual setup:
+
+| Your Hardware | Optimized Format | Runtime | Speedup vs PyTorch |
+|---------------|-----------------|---------|:------------------:|
+| **NVIDIA GPU** (RTX, Jetson) | TensorRT `.engine` | CUDA | **3-5x** |
+| **Apple Silicon** (M1–M4) | CoreML `.mlpackage` | ANE + GPU | **~2x** |
+| **Intel** (CPU, iGPU, NPU) | OpenVINO IR `.xml` | OpenVINO | **2-3x** |
+| **AMD GPU** (RX, MI) | ONNX Runtime | ROCm | **1.5-2x** |
+| **Any CPU** | ONNX Runtime | CPU | **~1.5x** |
+
+### Aegis Skill Integration
+
+Detection runs as a **parallel pipeline** alongside VLM analysis — never blocks your AI agent:
+
+```
+Camera → Frame Governor → detect.py (JSONL) → Aegis IPC → Live Overlay
+          5 FPS                ↓
+                    perf_stats (p50/p95/p99 latency)
+```
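The JSONL leg of the pipeline above amounts to `detect.py` writing one JSON object per line to stdout for the Aegis IPC reader. A minimal sketch of such an emitter — the `emit` helper is illustrative, not the skill's actual code; the event shapes follow the documented protocol:

```python
import json
import sys

def emit(event: dict) -> str:
    """Write one protocol event as a single JSONL line on stdout."""
    line = json.dumps(event)
    sys.stdout.write(line + "\n")
    sys.stdout.flush()  # flush per event so the IPC reader sees it immediately
    return line

# Handshake, then one detections frame, mirroring the documented events.
emit({"event": "ready", "model": "yolo26n", "classes": 80, "fps": 5})
frame = emit({"event": "detections", "frame_id": 42, "objects": [
    {"class": "person", "confidence": 0.92, "bbox": [100, 50, 300, 400]},
]})
```

Because each line is a complete JSON document, the reader never needs framing beyond newline splitting.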
+
+- 🖱️ **Click to setup** — one button in Aegis installs everything, no terminal needed
+- 🤖 **AI-driven environment config** — autonomous agent detects your GPU, installs the right framework (CUDA/ROCm/CoreML/OpenVINO), converts models, and verifies the setup
+- 📺 **Live bounding boxes** — detection results rendered as overlays on RTSP camera streams
+- 📊 **Built-in performance profiling** — aggregate latency stats (p50/p95/p99) emitted every 50 frames
+- **Auto start** — set `auto_start: true` to begin detecting when Aegis launches
+
+📖 [Full Skill Documentation →](skills/detection/yolo-detection-2026/SKILL.md)
+
 ## 📊 HomeSec-Bench — How Secure Is Your Local AI?
 
 **HomeSec-Bench** is a 131-test security benchmark that measures how well your local AI performs as a security guard. It tests what matters: Can it detect a person in fog? Classify a break-in vs. a delivery? Resist prompt injection? Route alerts correctly at 3 AM?

skills/detection/yolo-detection-2026/SKILL.md

Lines changed: 73 additions & 13 deletions
@@ -1,11 +1,19 @@
 ---
 name: yolo-detection-2026
 description: "YOLO 2026 — state-of-the-art real-time object detection"
-version: 1.0.0
+version: 2.0.0
 icon: assets/icon.png
 entry: scripts/detect.py
+deploy: deploy.sh
 
 parameters:
+  - name: auto_start
+    label: "Auto Start"
+    type: boolean
+    default: false
+    description: "Start this skill automatically when Aegis launches"
+    group: Lifecycle
+
   - name: model_size
     label: "Model Size"
     type: select
@@ -45,6 +53,13 @@ parameters:
     description: "auto = best available GPU, else CPU"
     group: Performance
 
+  - name: use_optimized
+    label: "Hardware Acceleration"
+    type: boolean
+    default: true
+    description: "Auto-convert model to optimized format for faster inference"
+    group: Performance
+
 capabilities:
   live_detection:
     script: scripts/detect.py
@@ -64,6 +79,50 @@ Real-time object detection using the latest YOLO 2026 models. Detects 80+ COCO o
 | medium | Moderate | High | Accuracy-focused deployments |
 | large | Slower | Highest | Maximum detection quality |
 
+## Hardware Acceleration
+
+The skill uses [`env_config.py`](../../lib/env_config.py) to **automatically detect hardware** and convert the model to the fastest format for your platform. Conversion happens once during deployment and is cached.
+
+| Platform | Backend | Optimized Format | Expected Speedup |
+|----------|---------|------------------|:----------------:|
+| NVIDIA GPU | CUDA | TensorRT `.engine` | ~3-5x |
+| Apple Silicon (M1+) | MPS | CoreML `.mlpackage` | ~2x |
+| Intel CPU/GPU/NPU | OpenVINO | OpenVINO IR `.xml` | ~2-3x |
+| AMD GPU | ROCm | ONNX Runtime | ~1.5-2x |
+| CPU (any) | CPU | ONNX Runtime | ~1.5x |
+
+### How It Works
+
+1. `deploy.sh` detects your hardware via `env_config.HardwareEnv.detect()`
+2. Installs the matching `requirements_{backend}.txt` (e.g. CUDA → includes `tensorrt`)
+3. Pre-converts the default model to the optimal format
+4. At runtime, `detect.py` loads the cached optimized model automatically
+5. Falls back to PyTorch if optimization fails
+
+Set `use_optimized: false` to disable auto-conversion and use raw PyTorch.
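The repo's `env_config.HardwareEnv` encapsulates step 1; a minimal sketch of the probe logic it implies (illustrative only — the function name and checks here are assumptions, and the real `env_config.py` is more thorough, e.g. it also distinguishes Intel NPUs):

```python
import platform
import shutil

def detect_backend() -> str:
    """Pick the best available backend, roughly mirroring deploy.sh step 1.

    Order matters: prefer a discrete accelerator, fall back to CPU.
    """
    if shutil.which("nvidia-smi"):   # NVIDIA driver present -> CUDA/TensorRT
        return "cuda"
    if shutil.which("rocm-smi"):     # AMD ROCm stack installed
        return "rocm"
    if platform.system() == "Darwin" and platform.machine() == "arm64":
        return "mps"                 # Apple Silicon -> CoreML/ANE
    return "cpu"                     # safe fallback: ONNX Runtime on CPU

backend = detect_backend()
requirements = f"requirements_{backend}.txt"  # step 2: matching deps file
```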
+
+## Auto Start
+
+Set `auto_start: true` in the skill config to start detection automatically when Aegis launches. The skill will begin processing frames from the selected camera immediately.
+
+```yaml
+auto_start: true
+model_size: nano
+fps: 5
+```
+
+## Performance Monitoring
+
+The skill emits `perf_stats` events every 50 frames with aggregate timing:
+
+```jsonl
+{"event": "perf_stats", "total_frames": 50, "timings_ms": {
+  "inference": {"avg": 3.4, "p50": 3.2, "p95": 5.1},
+  "postprocess": {"avg": 0.15, "p50": 0.12, "p95": 0.31},
+  "total": {"avg": 3.6, "p50": 3.4, "p95": 5.5}
+}}
+```
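The `avg`/`p50`/`p95` fields above are just summary statistics over one 50-frame window of per-frame timings. A hedged sketch of how such aggregation might look — `detect.py`'s actual implementation may differ (e.g. percentile interpolation):

```python
import statistics

def aggregate(samples_ms: list[float]) -> dict:
    """Summarize one window of per-frame timings into perf_stats fields."""
    ordered = sorted(samples_ms)

    def pct(p: float) -> float:
        # nearest-rank percentile over the sorted window
        idx = min(len(ordered) - 1, round(p / 100 * (len(ordered) - 1)))
        return ordered[idx]

    return {
        "avg": round(statistics.fmean(ordered), 2),
        "p50": pct(50),
        "p95": pct(95),
    }

stats = aggregate([3.1, 3.2, 3.4, 3.3, 5.6])
```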
+
 ## Protocol
 
 Communicates via **JSON lines** over stdin/stdout.
@@ -75,10 +134,11 @@ Communicates via **JSON lines** over stdin/stdout.
 
 ### Skill → Aegis (stdout)
 ```jsonl
-{"event": "ready", "model": "yolo2026n", "device": "mps", "classes": 80, "fps": 5}
+{"event": "ready", "model": "yolo2026n", "device": "mps", "backend": "mps", "format": "coreml", "gpu": "Apple M3", "classes": 80, "fps": 5}
 {"event": "detections", "frame_id": 42, "camera_id": "front_door", "timestamp": "...", "objects": [
   {"class": "person", "confidence": 0.92, "bbox": [100, 50, 300, 400]}
 ]}
+{"event": "perf_stats", "total_frames": 50, "timings_ms": {"inference": {"avg": 3.4}}}
 {"event": "error", "message": "...", "retriable": true}
 ```
 
@@ -90,20 +150,20 @@ Communicates via **JSON lines** over stdin/stdout.
 {"command": "stop"}
 ```
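On the host side, handling this protocol reduces to reading stdout line by line and branching on the `event` key. A minimal sketch — the handler and its return strings are illustrative, not Aegis's actual code:

```python
import json

def handle_line(line: str) -> str:
    """Dispatch one JSONL message from the skill; return the branch taken."""
    msg = json.loads(line)
    if msg.get("event") == "detections":
        return f"overlay {len(msg['objects'])} boxes"
    if msg.get("event") == "perf_stats":
        return "update latency panel"
    if msg.get("event") == "error" and msg.get("retriable"):
        return "restart skill"
    return "ignore"  # unknown or non-event messages

branch = handle_line(
    '{"event": "detections", "frame_id": 1, "objects": [{"class": "person"}]}'
)
```

Keying every message on `event` keeps the protocol forward-compatible: new event types are simply ignored by older hosts.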
 
-## Hardware Support
-
-| Platform | Backend | Performance |
-|----------|---------|-------------|
-| Apple Silicon (M1+) | MPS | 20-30 FPS |
-| NVIDIA GPU | CUDA | 25-60 FPS |
-| AMD GPU | ROCm | 15-40 FPS |
-| CPU (modern x86) | CPU | 5-15 FPS |
-| Raspberry Pi 5 | CPU | 2-5 FPS |
-
 ## Installation
 
-The `deploy.sh` bootstrapper handles everything — Python environment, GPU backend detection, and dependency installation. No manual setup required.
+The `deploy.sh` bootstrapper handles everything — Python environment, GPU backend detection, dependency installation, and model optimization. No manual setup required.
 
 ```bash
 ./deploy.sh
 ```
+
+### Requirements Files
+
+| File | Backend | Key Deps |
+|------|---------|----------|
+| `requirements_cuda.txt` | NVIDIA | `torch` (cu124), `tensorrt` |
+| `requirements_mps.txt` | Apple | `torch`, `coremltools` |
+| `requirements_intel.txt` | Intel | `torch`, `openvino` |
+| `requirements_rocm.txt` | AMD | `torch` (rocm6.2), `onnxruntime-rocm` |
+| `requirements_cpu.txt` | CPU | `torch` (cpu), `onnxruntime` |
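The table suggests a simple selection step inside `deploy.sh`; a sketch under stated assumptions (variable names are illustrative, and the real script also creates the Python environment and runs model conversion):

```shell
# Map the detected backend to its requirements file, defaulting to CPU.
BACKEND="${BACKEND:-cpu}"               # e.g. set by the hardware probe
REQ_FILE="requirements_${BACKEND}.txt"
if [ ! -f "$REQ_FILE" ]; then
    REQ_FILE="requirements_cpu.txt"     # safe fallback per the table above
fi
echo "installing from $REQ_FILE"
# pip install -r "$REQ_FILE"
```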

skills/detection/yolo-detection-2026/config.yaml

Lines changed: 8 additions & 0 deletions
@@ -56,3 +56,11 @@ params:
       - { value: cuda, label: "NVIDIA CUDA" }
       - { value: mps, label: "Apple Silicon (MPS)" }
       - { value: rocm, label: "AMD ROCm" }
+
+  - key: use_optimized
+    label: Hardware Acceleration
+    type: boolean
+    default: true
+    description: "Auto-convert model to optimized format (TensorRT/CoreML/OpenVINO/ONNX) for faster inference"
+
