DeepCamera's open-source skills add AI to your cameras: VLM scene analysis, object detection, and person re-identification, using models like Qwen, DeepSeek, SmolVLM, and LLaVA. Built on proven facial recognition, ReID, fall detection, and CCTV/NVR surveillance monitoring, the skill catalog extends these machine learning capabilities with modern AI. All inference runs locally for maximum privacy.
Each skill is a self-contained module with its own model, parameters, and communication protocol. See the Skill Development Guide and Platform Parameters to build your own.
| Category | Skill | What It Does | Status |
|---|---|---|---|
| Detection | yolo-detection-2026 | Real-time 80+ class object detection | ✅ |
| Detection | dinov3-grounding | Open-vocabulary detection — describe what to find | 📐 |
| Detection | person-recognition | Re-identify individuals across cameras | 📐 |
| Analysis | home-security-benchmark | 131-test evaluation suite for LLM & VLM security performance | ✅ |
| Analysis | vlm-scene-analysis | Describe what happened in recorded clips | 📐 |
| Analysis | sam2-segmentation | Click-to-segment with pixel-perfect masks | 📐 |
| Transformation | depth-estimation | Monocular depth maps with Depth Anything v2 | 📐 |
| Annotation | dataset-annotation | AI-assisted labeling → COCO export | 📐 |
| Camera Providers | eufy · reolink · tapo | Direct camera integrations via RTSP | 📐 |
| Streaming | go2rtc-cameras | RTSP → WebRTC live view | 📐 |
| Channels | matrix · line · signal | Messaging channels for Clawdbot agent | 📐 |
| Automation | mqtt · webhook · ha-trigger | Event-driven automation triggers | 📐 |
| Integrations | homeassistant-bridge | HA cameras in ↔ detection results out | 📐 |
✅ Ready · 🧪 Testing · 📐 Planned
Registry: All skills are indexed in skills.json for programmatic discovery.
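As a rough illustration of programmatic discovery, the sketch below filters a registry by status or category. The schema shown here (a top-level `skills` list whose entries carry `name`, `category`, and `status` fields) and the status strings are assumptions made for this example, not the documented format; check the actual skills.json before relying on them.

```python
import json

# Hypothetical registry excerpt; the real skills.json schema may differ.
SAMPLE_REGISTRY = """
{
  "skills": [
    {"name": "yolo-detection-2026", "category": "Detection", "status": "ready"},
    {"name": "dinov3-grounding", "category": "Detection", "status": "planned"},
    {"name": "home-security-benchmark", "category": "Analysis", "status": "ready"}
  ]
}
"""


def find_skills(registry_text, status=None, category=None):
    """Return names of skills matching the optional status/category filters."""
    registry = json.loads(registry_text)
    matches = []
    for skill in registry.get("skills", []):
        if status is not None and skill.get("status") != status:
            continue
        if category is not None and skill.get("category") != category:
            continue
        matches.append(skill["name"])
    return matches


# List every skill marked ready in the sample registry.
ready_skills = find_skills(SAMPLE_REGISTRY, status="ready")
print(ready_skills)
```

To run this against a real checkout, replace `SAMPLE_REGISTRY` with the contents of the file and adjust the field names to match.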
- Skill architecture — pluggable SKILL.md interface for all capabilities
- Full skill catalog — 18 skills across 9 categories with working scripts
- Skill Store UI — browse, install, and configure skills from Aegis
- Custom skill packaging — community-contributed skills via GitHub
- GPU-optimized containers — one-click Docker deployment per skill
🚀 Getting Started with SharpAI Aegis
The easiest way to run DeepCamera's AI skills. Aegis connects everything — cameras, models, skills, and you.
- 📷 Connect cameras in seconds — add RTSP/ONVIF cameras, webcams, or iPhone cameras for a quick test
- 🤖 Built-in local LLM & VLM — llama-server included, no separate setup needed
- 📦 One-click skill deployment — install skills from the catalog with AI-assisted troubleshooting
- 🔽 One-click HuggingFace downloads — browse and run Qwen, DeepSeek, SmolVLM, LLaVA, MiniCPM-V
- 📊 Find the best VLM for your machine — benchmark models on your own hardware with HomeSec-Bench
- 💬 Talk to your guard — via Telegram, Discord, or Slack. Ask what happened, tell it what to watch for, get AI-reasoned answers with footage.
HomeSec-Bench is a 131-test security benchmark that measures how well your local AI performs as a security guard. It tests what matters: Can it detect a person in fog? Classify a break-in vs. a delivery? Resist prompt injection? Route alerts correctly at 3 AM?
Run it on your own hardware to know exactly where your setup stands.
| Area | Tests | What's at Stake |
|---|---|---|
| Scene Understanding | 35 | Person detection in fog, rain, night IR, sun glare |
| Security Classification | 12 | Telling a break-in from a raccoon |
| Tool Use & Reasoning | 16 | Correct tool calls with accurate parameters |
| Prompt Injection Resistance | 4 | Adversarial attacks that try to disable your guard |
| Privacy Compliance | 3 | PII leak prevention, illegal surveillance refusal |
| Alert Routing | 5 | Right message, right channel, right time |
Running on a Mac M1 Mini 8GB: local Qwen3.5-4B scores 39/54 (72%), cloud GPT-5.2 scores 46/48 (96%), and the hybrid config reaches 53/54 (98%). All 35 VLM test images are AI-generated — no real footage, fully privacy-compliant.
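The percentages above follow directly from the pass counts. A minimal sketch of that arithmetic, assuming each percentage is simply passed tests divided by tests attempted and rounded to the nearest whole number (the function and dictionary names are illustrative, not part of HomeSec-Bench):

```python
# Scores as reported above: (passed, attempted) per configuration.
RESULTS = {
    "local Qwen3.5-4B": (39, 54),
    "cloud GPT-5.2": (46, 48),
    "hybrid": (53, 54),
}


def pass_rate(passed, attempted):
    """Pass rate as a whole-number percentage, rounded to nearest."""
    if attempted <= 0:
        raise ValueError("attempted must be positive")
    return round(100 * passed / attempted)


for name, (passed, attempted) in RESULTS.items():
    print(f"{name}: {passed}/{attempted} = {pass_rate(passed, attempted)}%")
```

Note that the cloud result is reported out of 48 rather than 54, so the three percentages are not computed over the same number of tests.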
📄 Read the Paper · 🔬 Run It Yourself · 📋 Test Scenarios
Legacy Applications (SharpAI-Hub CLI)
These applications use the sharpai-cli Docker-based workflow.
For the modern experience, use SharpAI Aegis.
| Application | CLI Command | Platforms |
|---|---|---|
| Person Recognition (ReID) | sharpai-cli yolov7_reid start | Jetson/Windows/Linux/macOS |
| Person Detector | sharpai-cli yolov7_person_detector start | Jetson/Windows/Linux/macOS |
| Facial Recognition | sharpai-cli deepcamera start | Jetson/Windows/Linux/macOS |
| Local Facial Recognition | sharpai-cli local_deepcamera start | Windows/Linux/macOS |
| Screen Monitor | sharpai-cli screen_monitor start | Windows/Linux/macOS |
| Parking Monitor | sharpai-cli yoloparking start | Jetson AGX |
| Fall Detection | sharpai-cli falldetection start | Jetson AGX |
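Every command in the table above has the same shape: `sharpai-cli <application> start`. If you want to launch these from a script, one possible approach is a small helper that validates the application name before building the command. This is only a sketch: the helper name `build_start_command` is hypothetical, and only the `start` form shown in the table is assumed.

```python
import shlex

# Application names taken from the table above.
LEGACY_APPS = {
    "yolov7_reid",
    "yolov7_person_detector",
    "deepcamera",
    "local_deepcamera",
    "screen_monitor",
    "yoloparking",
    "falldetection",
}


def build_start_command(app):
    """Return the argv list for starting a legacy application."""
    if app not in LEGACY_APPS:
        raise ValueError(f"unknown legacy application: {app}")
    return ["sharpai-cli", app, "start"]


command = build_start_command("yolov7_reid")
print(shlex.join(command))
# To actually launch it (requires sharpai-cli and Docker to be installed):
#   import subprocess; subprocess.run(command, check=True)
```

Building the command as a list rather than a string avoids shell quoting issues when it is later passed to `subprocess.run`.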
- Edge: Jetson Nano, Xavier AGX, Raspberry Pi 4/8GB
- Desktop: macOS, Windows 11, Ubuntu 20.04
- MCU: ESP32 CAM, ESP32-S3-Eye
- RTSP: DaHua, Lorex, Amcrest
- Cloud: Blink, Nest (via Home Assistant)
- Mobile: IP Camera Lite (iOS)
- 💬 Slack Community — help, discussions, and camera setup assistance
- 🐛 GitHub Issues — technical support and bug reports
- 🏢 Commercial Support — pipeline optimization, custom models, edge deployment



