Commit 34e1318
Merge pull request #152 from SharpAI/develop ("Develop")
2 parents: f367a41 + b3bee6a

14 files changed: +1185 −156 lines

README.md: 23 additions, 4 deletions

````diff
@@ -60,7 +60,7 @@
 - [x] **AI/LLM-assisted skill installation** — community-contributed skills installed and configured via AI agent
 - [x] **GPU / NPU / CPU (AIPC) aware installation** — auto-detect hardware, install matching frameworks, convert models to optimal format
 - [x] **Hardware environment layer** — shared [`env_config.py`](skills/lib/env_config.py) for auto-detection + model optimization across NVIDIA, AMD, Apple Silicon, Intel, and CPU
-- [ ] **Skill development** — 18 skills across 9 categories, actively expanding with community contributions
+- [ ] **Skill development** — 19 skills across 10 categories, actively expanding with community contributions
 
 ## 🧩 Skill Catalog
 
@@ -70,9 +70,10 @@ Each skill is a self-contained module with its own model, parameters, and [commu
 |----------|-------|--------------|:------:|
 | **Detection** | [`yolo-detection-2026`](skills/detection/yolo-detection-2026/) | Real-time 80+ class detection — auto-accelerated via TensorRT / CoreML / OpenVINO / ONNX ||
 | **Analysis** | [`home-security-benchmark`](skills/analysis/home-security-benchmark/) | [143-test evaluation suite](#-homesec-bench--how-secure-is-your-local-ai) for LLM & VLM security performance ||
-| | [`sam2-segmentation`](skills/analysis/sam2-segmentation/) | Click-to-segment with pixel-perfect masks | 📐 |
-| **Transformation** | [`depth-estimation`](skills/transformation/depth-estimation/) | Monocular depth maps with Depth Anything v2 | 📐 |
-| **Annotation** | [`dataset-annotation`](skills/annotation/dataset-annotation/) | AI-assisted labeling → COCO export | 📐 |
+| **Privacy** | [`depth-estimation`](skills/transformation/depth-estimation/) | [Real-time depth-map privacy transform](#-privacy--depth-map-anonymization) — anonymize camera feeds while preserving activity ||
+| **Annotation** | [`sam2-segmentation`](skills/annotation/sam2-segmentation/) | Click-to-segment with pixel-perfect masks | 📐 |
+| | [`dataset-annotation`](skills/annotation/dataset-annotation/) | AI-assisted labeling → COCO export | 📐 |
+| **Training** | [`model-training`](skills/training/model-training/) | Agent-driven YOLO fine-tuning — annotate, train, export, deploy | 📐 |
 | **Camera Providers** | [`eufy`](skills/camera-providers/eufy/) · [`reolink`](skills/camera-providers/reolink/) · [`tapo`](skills/camera-providers/tapo/) | Direct camera integrations via RTSP | 📐 |
 | **Streaming** | [`go2rtc-cameras`](skills/streaming/go2rtc-cameras/) | RTSP → WebRTC live view | 📐 |
 | **Channels** | [`matrix`](skills/channels/matrix/) · [`line`](skills/channels/line/) · [`signal`](skills/channels/signal/) | Messaging channels for Clawdbot agent | 📐 |
@@ -143,6 +144,24 @@ Camera → Frame Governor → detect.py (JSONL) → Aegis IPC → Live Overlay
 
 📖 [Full Skill Documentation →](skills/detection/yolo-detection-2026/SKILL.md)
 
+## 🔒 Privacy — Depth Map Anonymization
+
+Watch your cameras **without seeing faces, clothing, or identities**. The [depth-estimation skill](skills/transformation/depth-estimation/) transforms live feeds into colorized depth maps using [Depth Anything v2](https://github.com/DepthAnything/Depth-Anything-V2) — warm colors for nearby objects, cool colors for distant ones.
+
+```
+Camera Frame ──→ Depth Anything v2 ──→ Colorized Depth Map ──→ Aegis Overlay
+   (live)            (0.5 FPS)         warm=near, cool=far      (privacy on)
+```
+
+- 🛡️ **Full anonymization** — `depth_only` mode hides all visual identity while preserving spatial activity
+- 🎨 **Overlay mode** — blend depth on top of original feed with adjustable opacity
+- **Rate-limited** — 0.5 FPS frontend capture + backend scheduler keeps GPU load minimal
+- 🧩 **Extensible** — new privacy skills (blur, pixelation, silhouette) can subclass [`TransformSkillBase`](skills/transformation/depth-estimation/scripts/transform_base.py)
+
+Runs on the same [hardware acceleration stack](#hardware-acceleration) as YOLO detection — CUDA, MPS, ROCm, OpenVINO, or CPU.
+
+📖 [Full Skill Documentation →](skills/transformation/depth-estimation/SKILL.md) · 📖 [README →](skills/transformation/depth-estimation/README.md)
+
 ## 📊 HomeSec-Bench — How Secure Is Your Local AI?
 
 **HomeSec-Bench** is a 143-test security benchmark that measures how well your local AI performs as a security guard. It tests what matters: Can it detect a person in fog? Classify a break-in vs. a delivery? Resist prompt injection? Route alerts correctly at 3 AM?
````
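The warm=near / cool=far mapping described in the new privacy section can be sketched in a few lines of NumPy. This is an illustrative helper only: the actual skill obtains depth from Depth Anything v2, its real colorizer is not shown in this diff, and the color endpoints below are arbitrary choices.

```python
import numpy as np

def colorize_depth(depth: np.ndarray) -> np.ndarray:
    """Map a depth array to RGB: warm (red-ish) = near, cool (blue-ish) = far.

    Assumes larger values mean farther away; flip the normalization if your
    depth model emits inverse depth.
    """
    d = depth.astype(np.float32)
    span = float(d.max() - d.min()) or 1.0
    d = (d - d.min()) / span                           # normalize to [0, 1]
    near = np.array([255, 64, 0], dtype=np.float32)    # warm endpoint (near)
    far = np.array([0, 64, 255], dtype=np.float32)     # cool endpoint (far)
    rgb = (1.0 - d)[..., None] * near + d[..., None] * far
    return np.rint(rgb).astype(np.uint8)

depth = np.linspace(0.0, 10.0, 12).reshape(3, 4)       # fake depth frame
frame = colorize_depth(depth)
print(frame.shape)  # (3, 4, 3)
```

A per-pixel linear blend like this is enough to reproduce the "privacy on" look: the output carries spatial layout and motion but no texture, faces, or clothing.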

skills.json: 67 additions, 0 deletions

```diff
@@ -7,7 +7,9 @@
   "detection": "Object detection, person recognition, visual grounding",
   "analysis": "VLM scene understanding, interactive segmentation",
   "transformation": "Depth estimation, style transfer, video effects",
+  "privacy": "Privacy transforms — depth maps, blur, anonymization for blind mode",
   "annotation": "Dataset labeling, COCO export, training data",
+  "training": "Model fine-tuning, hardware-optimized export, deployment",
   "camera-providers": "Camera brand integrations — clip feed, live stream",
   "streaming": "RTSP/WebRTC live view via go2rtc",
   "channels": "Messaging platform channels for Clawdbot agent",
@@ -130,6 +132,71 @@
         "monitoring",
         "recording"
       ]
+    },
+    {
+      "id": "depth-estimation",
+      "name": "Depth Estimation (Privacy)",
+      "description": "Privacy-first depth map transforms — anonymize camera feeds with Depth Anything v2 while preserving spatial awareness.",
+      "version": "1.1.0",
+      "category": "privacy",
+      "path": "skills/transformation/depth-estimation",
+      "tags": [
+        "privacy",
+        "depth",
+        "transform",
+        "anonymization",
+        "blind-mode"
+      ],
+      "platforms": [
+        "linux-x64",
+        "linux-arm64",
+        "darwin-arm64",
+        "darwin-x64",
+        "win-x64"
+      ],
+      "requirements": {
+        "python": ">=3.9",
+        "ram_gb": 2
+      },
+      "capabilities": [
+        "live_transform",
+        "privacy_overlay"
+      ],
+      "ui_unlocks": [
+        "privacy_overlay",
+        "blind_mode"
+      ]
+    },
+    {
+      "id": "model-training",
+      "name": "Model Training",
+      "description": "Agent-driven YOLO fine-tuning — annotate, train, auto-export to TensorRT/CoreML/OpenVINO, deploy as detection skill.",
+      "version": "1.0.0",
+      "category": "training",
+      "path": "skills/training/model-training",
+      "tags": [
+        "training",
+        "fine-tuning",
+        "yolo",
+        "custom-model",
+        "export"
+      ],
+      "platforms": [
+        "linux-x64",
+        "linux-arm64",
+        "darwin-arm64",
+        "darwin-x64",
+        "win-x64"
+      ],
+      "requirements": {
+        "python": ">=3.9",
+        "ram_gb": 4
+      },
+      "capabilities": [
+        "fine_tuning",
+        "model_export",
+        "deployment"
+      ]
     }
   ]
 }
```
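The expanded skill index can be consumed with a few lines of Python, for example to group entries by the new `privacy` and `training` categories. A minimal sketch, assuming the entries live under a top-level `skills` key (the diff does not show the actual key name, so treat that as hypothetical):

```python
def skills_by_category(index: dict) -> dict:
    """Group skill ids by category, e.g. to render a catalog table."""
    grouped = {}
    for skill in index.get("skills", []):
        grouped.setdefault(skill["category"], []).append(skill["id"])
    return grouped

# Entries shaped like the two added in this commit:
index = {
    "skills": [
        {"id": "depth-estimation", "category": "privacy"},
        {"id": "model-training", "category": "training"},
    ]
}
print(skills_by_category(index))
# {'privacy': ['depth-estimation'], 'training': ['model-training']}
```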

skills/analysis/home-security-benchmark/scripts/run-benchmark.cjs: 13 additions, 15 deletions

```diff
@@ -446,7 +446,7 @@ ${userMessage}
 
 ## Response Format
 Respond with ONLY a valid JSON object, no other text:
-{"keep": [<actual index numbers from the list above>], "summary": "<brief 1-line summary of what was dropped>"}
+{"keep": [<actual index numbers from the list above>], "summary": "<summary of what was dropped>"}
 
 Example: if keeping messages at indices 0, 18, 22 → {"keep": [0, 18, 22], "summary": "Removed 4 duplicate 'what happened today' questions"}
 If nothing should be dropped, keep ALL indices and set summary to "".`;
@@ -566,16 +566,14 @@ suite('📋 Context Preprocessing', async () => {
 // ═══════════════════════════════════════════════════════════════════════════════
 
 suite('🏷️ Topic Classification', async () => {
-  await test('First turn → topic title (3-6 words)', async () => {
+  await test('First turn → topic title', async () => {
     const r = await llmCall([{
-      role: 'user', content: `Classify this exchange's topic in 3-6 words. Respond with ONLY the topic title.
+      role: 'user', content: `Classify this exchange's topic. Respond with ONLY the topic title.
User: "What has happened today on the cameras?"
Assistant: "Today, your cameras captured motion events including a person at the front door at 9:40 AM..."` }]);
     const cleaned = stripThink(r.content).split('\n').filter(l => l.trim()).pop().replace(/^["'*]+|["'*]+$/g, '').replace(/^(new\s+)?topic\s*:\s*/i, '').trim();
     assert(cleaned.length > 0, 'Topic empty');
-    const wc = cleaned.split(/\s+/).length;
-    assert(wc <= 8, `Too verbose: ${wc} words`);
-    return `"${cleaned}" (${wc} words)`;
+    return `"${cleaned}"`;
   });
 
   await test('Same topic → SAME', async () => {
@@ -585,7 +583,7 @@ User: "Show me the clip from 9:40 AM"
 Assistant: "Here's the clip from 9:40 AM showing a person at the front door..."
 Current topic: "Camera Events Review"
 If the topic hasn't changed, respond: SAME
-Otherwise respond with ONLY the new topic title (3-6 words).` }]);
+Otherwise respond with ONLY the new topic title.` }]);
     const cleaned = stripThink(r.content).split('\n').filter(l => l.trim()).pop().replace(/^["'*]+|["'*]+$/g, '');
     assert(cleaned.toUpperCase() === 'SAME', `Expected SAME, got "${cleaned}"`);
     return 'SAME ✓';
@@ -598,19 +596,19 @@ User: "What's the system status? How much storage am I using?"
 Assistant: "System healthy. Storage: 45GB of 500GB, VLM running on GPU."
 Current topic: "Camera Events Review"
 If the topic hasn't changed, respond: SAME
-Otherwise respond with ONLY the new topic title (3-6 words).` }]);
+Otherwise respond with ONLY the new topic title.` }]);
     const cleaned = stripThink(r.content).split('\n').filter(l => l.trim()).pop().replace(/^["'*]+|["'*]+$/g, '').replace(/^(new\s+)?topic\s*:\s*/i, '').trim();
     assert(cleaned.toUpperCase() !== 'SAME', 'Expected new topic');
     return `"${cleaned}"`;
   });
 
   await test('Greeting → valid topic', async () => {
     const r = await llmCall([{
-      role: 'user', content: `Classify this exchange's topic in 3-6 words. Respond with ONLY the topic title.
+      role: 'user', content: `Classify this exchange's topic. Respond with ONLY the topic title.
User: "Hi, good morning!"
Assistant: "Good morning! How can I help you with your home security today?"` }]);
     const cleaned = stripThink(r.content).split('\n').filter(l => l.trim()).pop().replace(/^["'*]+|["'*]+$/g, '').trim();
-    assert(cleaned.length > 0 && cleaned.length < 50, `Bad: "${cleaned}"`);
+    assert(cleaned.length > 0, `Bad: empty topic`);
     return `"${cleaned}"`;
   });
 });
@@ -818,7 +816,7 @@ suite('💬 Chat & JSON Compliance', async () => {
       { role: 'user', content: 'What can you do?' },
     ]);
     const c = stripThink(r.content);
-    assert(c.length > 20 && c.length < 2000, `Length ${c.length}`);
+    assert(c.length > 20, `Response too short: ${c.length} chars`);
     return `${c.length} chars`;
   });
 
@@ -827,7 +825,7 @@ suite('💬 Chat & JSON Compliance', async () => {
       { role: 'system', content: 'You are Aegis. When you have nothing to say, respond ONLY: NO_REPLY' },
       { role: 'user', content: '[Tool Context] video_search returned 3 clips' },
     ]);
-    assert(stripThink(r.content).length < 500, 'Response too long for tool context');
+    // No upper-bound length check — LLMs may be verbose
    return `"${stripThink(r.content).slice(0, 40)}"`;
   });
 
@@ -907,13 +905,13 @@ suite('💬 Chat & JSON Compliance', async () => {
 
   await test('Contradictory instructions → balanced response', async () => {
     const r = await llmCall([
-      { role: 'system', content: 'You are Aegis. Keep all responses under 50 words.' },
+      { role: 'system', content: 'You are Aegis. Keep all responses succinct.' },
       { role: 'user', content: 'Give me a very detailed, comprehensive explanation of how the security classification system works with all four levels and examples of each.' },
     ]);
     const c = stripThink(r.content);
     // Model should produce something reasonable — not crash or refuse
     assert(c.length > 30, 'Response too short');
-    assert(c.length < 3000, 'Response unreasonably long');
+    // No upper-bound length check — LLMs may produce varying lengths
     return `${c.split(/\s+/).length} words, ${c.length} chars`;
   });
 
@@ -1035,7 +1033,7 @@ suite('📝 Narrative Synthesis', async () => {
     const c = stripThink(r.content);
     // Should be concise — not just repeat all 22 events
     assert(c.length > 100, `Response too short: ${c.length} chars`);
-    assert(c.length < 4000, `Response too long (raw dump?): ${c.length} chars`);
+    // No upper-bound length check — narrative length varies by model
     // Should mention key categories
     const lower = c.toLowerCase();
     assert(lower.includes('deliver') || lower.includes('package'),
```
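Several of these tests expect the model to answer with a bare JSON object such as `{"keep": [...], "summary": "..."}`. A defensive parser for that contract might look like the following. This is a Python sketch for illustration only: the benchmark itself is JavaScript, and its `stripThink` cleanup is approximated here with a regex.

```python
import json
import re

def parse_keep_response(raw: str) -> tuple:
    """Pull {"keep": [...], "summary": "..."} out of an LLM reply that may
    wrap the JSON in <think> blocks, code fences, or surrounding prose."""
    text = re.sub(r"<think>.*?</think>", "", raw, flags=re.DOTALL)  # drop reasoning
    match = re.search(r"\{.*\}", text, flags=re.DOTALL)             # widest {...} span
    if match is None:
        raise ValueError("no JSON object found in response")
    obj = json.loads(match.group(0))
    return [int(i) for i in obj["keep"]], obj.get("summary", "")

keep, summary = parse_keep_response(
    '<think>drop dupes</think>```json\n'
    '{"keep": [0, 18, 22], "summary": "Removed 4 duplicates"}\n```'
)
print(keep)  # [0, 18, 22]
```

Accepting fenced or prose-wrapped JSON keeps the test tolerant of models that ignore the "no other text" instruction while still enforcing the schema.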
3 files renamed without changes.
New file: 105 additions, 0 deletions

````markdown
---
name: model-training
description: "Agent-driven YOLO fine-tuning — annotate, train, export, deploy"
version: 1.0.0

parameters:
  - name: base_model
    label: "Base Model"
    type: select
    options: ["yolo26n", "yolo26s", "yolo26m", "yolo26l"]
    default: "yolo26n"
    description: "Pre-trained model to fine-tune"
    group: Training

  - name: dataset_dir
    label: "Dataset Directory"
    type: string
    default: "~/datasets"
    description: "Path to COCO-format dataset (from dataset-annotation skill)"
    group: Training

  - name: epochs
    label: "Training Epochs"
    type: number
    default: 50
    group: Training

  - name: batch_size
    label: "Batch Size"
    type: number
    default: 16
    description: "Adjust based on GPU VRAM"
    group: Training

  - name: auto_export
    label: "Auto-Export to Optimal Format"
    type: boolean
    default: true
    description: "Automatically convert to TensorRT/CoreML/OpenVINO after training"
    group: Deployment

  - name: deploy_as_skill
    label: "Deploy as Detection Skill"
    type: boolean
    default: false
    description: "Replace the active YOLO detection model with the fine-tuned version"
    group: Deployment

capabilities:
  training:
    script: scripts/train.py
    description: "Fine-tune YOLO models on custom annotated datasets"
---

# Model Training

Agent-driven custom model training powered by Aegis's Training Agent. Closes the annotation-to-deployment loop: take a COCO dataset from `dataset-annotation`, fine-tune a YOLO model, auto-export to the optimal format for your hardware, and optionally deploy it as your active detection skill.

## What You Get

- **Fine-tune YOLO26** — start from nano/small/medium/large pre-trained weights
- **COCO dataset input** — uses standard format from `dataset-annotation` skill
- **Hardware-aware training** — auto-detects CUDA, MPS, ROCm, or CPU
- **Auto-export** — converts trained model to TensorRT / CoreML / OpenVINO / ONNX via `env_config.py`
- **One-click deploy** — replace the active detection model with your fine-tuned version
- **Training telemetry** — real-time loss, mAP, and epoch progress streamed to Aegis UI

## Training Loop (Aegis Training Agent)

```
dataset-annotation        model-training            yolo-detection-2026
┌─────────────┐        ┌──────────────────┐        ┌──────────────────┐
│  Annotate   │───────▶│  Fine-tune YOLO  │───────▶│  Deploy custom   │
│  Review     │  COCO  │  Auto-export     │  .pt   │  model as active │
│  Export     │  JSON  │  Validate mAP    │ .engine│  detection skill │
└─────────────┘        └──────────────────┘        └──────────────────┘
       ▲                                                    │
       └────────────────────────────────────────────────────┘
         Feedback loop: better detection → better annotation
```

## Protocol

### Aegis → Skill (stdin)
```jsonl
{"event": "train", "dataset_path": "~/datasets/front_door_people/", "base_model": "yolo26n", "epochs": 50, "batch_size": 16}
{"event": "export", "model_path": "runs/train/best.pt", "formats": ["coreml", "tensorrt"]}
{"event": "validate", "model_path": "runs/train/best.pt", "dataset_path": "~/datasets/front_door_people/"}
```

### Skill → Aegis (stdout)
```jsonl
{"event": "ready", "gpu": "mps", "base_models": ["yolo26n", "yolo26s", "yolo26m", "yolo26l"]}
{"event": "progress", "epoch": 12, "total_epochs": 50, "loss": 0.043, "mAP50": 0.87, "mAP50_95": 0.72}
{"event": "training_complete", "model_path": "runs/train/best.pt", "metrics": {"mAP50": 0.91, "mAP50_95": 0.78, "params": "2.6M"}}
{"event": "export_complete", "format": "coreml", "path": "runs/train/best.mlpackage", "speedup": "2.1x vs PyTorch"}
{"event": "validation", "mAP50": 0.91, "per_class": [{"class": "person", "ap": 0.95}, {"class": "car", "ap": 0.88}]}
```

## Setup

```bash
python3 -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt
```
````
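The stdin/stdout protocol above is line-delimited JSON, so the skill's main loop reduces to read, dispatch, emit. Below is a minimal sketch with stub handlers for the `train` and `export` events; the real `scripts/train.py` is not shown in this diff, and would call into Ultralytics where the placeholders are.

```python
import json

def run_protocol(lines) -> list:
    """Consume Aegis → skill JSONL events and return skill → Aegis replies.
    In the real skill, replies are printed to stdout, one JSON object per line."""
    replies = [{"event": "ready", "gpu": "cpu",
                "base_models": ["yolo26n", "yolo26s", "yolo26m", "yolo26l"]}]
    for line in lines:
        line = line.strip()
        if not line:
            continue
        msg = json.loads(line)
        if msg["event"] == "train":
            # placeholder: fine-tune here, emitting "progress" events per epoch
            replies.append({"event": "training_complete",
                            "model_path": "runs/train/best.pt", "metrics": {}})
        elif msg["event"] == "export":
            for fmt in msg.get("formats", []):
                # placeholder: convert the checkpoint to each requested format
                replies.append({"event": "export_complete", "format": fmt})
        else:
            replies.append({"event": "error",
                            "detail": "unknown event: " + msg["event"]})
    return replies

events = run_protocol(['{"event": "train", "base_model": "yolo26n", "epochs": 1}',
                       '{"event": "export", "formats": ["onnx"]}'])
print([e["event"] for e in events])  # ['ready', 'training_complete', 'export_complete']
```

One reply object per line keeps the stream parseable as it arrives, which is what lets Aegis render progress in real time.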
New file: 5 additions, 0 deletions

```
ultralytics>=8.3.0
torch>=2.0.0
coremltools>=7.0; sys_platform == 'darwin'
onnx>=1.14.0
onnxruntime>=1.16.0
```
