DeepCamera — Open-Source AI Camera Skills Platform

DeepCamera's open-source skills add AI to your cameras: VLM scene analysis, object detection, and person re-identification, using models like Qwen, DeepSeek, SmolVLM, and LLaVA. The skill catalog builds on DeepCamera's established facial recognition, RE-ID, fall detection, and CCTV/NVR surveillance monitoring, and extends these machine learning capabilities with modern AI. All inference runs locally for maximum privacy.

🧩 Skill Catalog

Each skill is a self-contained module with its own model, parameters, and communication protocol. See the Skill Development Guide and Platform Parameters to build your own.

Category	Skill	What It Does	Status
Detection	`yolo-detection-2026`	Real-time 80+ class object detection	✅
	`dinov3-grounding`	Open-vocabulary detection — describe what to find	📐
	`person-recognition`	Re-identify individuals across cameras	📐
Analysis	`home-security-benchmark`	131-test evaluation suite for LLM & VLM security performance	✅
	`vlm-scene-analysis`	Describe what happened in recorded clips	📐
	`sam2-segmentation`	Click-to-segment with pixel-perfect masks	📐
Transformation	`depth-estimation`	Monocular depth maps with Depth Anything v2	📐
Annotation	`dataset-annotation`	AI-assisted labeling → COCO export	📐
Camera Providers	`eufy` · `reolink` · `tapo`	Direct camera integrations via RTSP	📐
Streaming	`go2rtc-cameras`	RTSP → WebRTC live view	📐
Channels	`matrix` · `line` · `signal`	Messaging channels for Clawdbot agent	📐
Automation	`mqtt` · `webhook` · `ha-trigger`	Event-driven automation triggers	📐
Integrations	`homeassistant-bridge`	HA cameras in ↔ detection results out	📐

✅ Ready · 🧪 Testing · 📐 Planned

Registry: All skills are indexed in skills.json for programmatic discovery.
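As a rough illustration of programmatic discovery, the sketch below parses a registry and lists skills by status. The schema shown here (a top-level `skills` array with `id`, `category`, and `status` fields, and lowercase status strings) is an assumption made for this example, not the documented layout of `skills.json`; the helper names are hypothetical too. Check the real file before relying on any field name.

```python
import json

# Hypothetical sample mirroring a few catalog rows above. The real skills.json
# schema may differ: "skills", "id", "category", and "status" are assumed names.
SAMPLE_REGISTRY = """
{
  "skills": [
    {"id": "yolo-detection-2026", "category": "Detection", "status": "ready"},
    {"id": "dinov3-grounding", "category": "Detection", "status": "planned"},
    {"id": "home-security-benchmark", "category": "Analysis", "status": "ready"},
    {"id": "depth-estimation", "category": "Transformation", "status": "planned"}
  ]
}
"""


def load_skills(registry_text):
    """Parse registry JSON text and return the list of skill entries."""
    return json.loads(registry_text).get("skills", [])


def skills_by_status(skills, status):
    """Return the ids of skills whose status matches, in registry order."""
    return [s["id"] for s in skills if s.get("status") == status]


skills = load_skills(SAMPLE_REGISTRY)
ready = skills_by_status(skills, "ready")
print(ready)
```

To try this against a checked-out copy of the repository, replace `SAMPLE_REGISTRY` with the contents of the file and adjust the field names to match.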

🗺️ Roadmap

Skill architecture — pluggable SKILL.md interface for all capabilities
Full skill catalog — 18 skills across 9 categories with working scripts
Skill Store UI — browse, install, and configure skills from Aegis
Custom skill packaging — community-contributed skills via GitHub
GPU-optimized containers — one-click Docker deployment per skill

🚀 Getting Started with SharpAI Aegis

The easiest way to run DeepCamera's AI skills. Aegis connects everything — cameras, models, skills, and you.

📷 Connect cameras in seconds — add RTSP/ONVIF cameras, webcams, or iPhone cameras for a quick test
🤖 Built-in local LLM & VLM — llama-server included, no separate setup needed
📦 One-click skill deployment — install skills from the catalog with AI-assisted troubleshooting
🔽 One-click HuggingFace downloads — browse and run Qwen, DeepSeek, SmolVLM, LLaVA, MiniCPM-V
📊 Find the best VLM for your machine — benchmark models on your own hardware with HomeSec-Bench
💬 Talk to your guard — via Telegram, Discord, or Slack. Ask what happened, tell it what to watch for, get AI-reasoned answers with footage.

📦 Download SharpAI Aegis →

Run Local VLMs from HuggingFace — Even on Mac Mini 8GB

SharpAI Aegis — Browse and run local VLM models for AI camera video analysis

Download and run SmolVLM2, Qwen-VL, LLaVA, MiniCPM-V locally. Your AI security camera agent sees through these eyes.

Chat with Your AI Camera Agent

SharpAI Aegis — LLM-powered agentic security camera chat

"Who was at the door?" — Your agent searches footage, reasons about what happened, and answers with timestamps and clips.

📊 HomeSec-Bench — How Secure Is Your Local AI?

HomeSec-Bench is a 131-test security benchmark that measures how well your local AI performs as a security guard. It tests what matters: Can it detect a person in fog? Classify a break-in vs. a delivery? Resist prompt injection? Route alerts correctly at 3 AM?

Run it on your own hardware to know exactly where your setup stands.

Area	Tests	What's at Stake
Scene Understanding	35	Person detection in fog, rain, night IR, sun glare
Security Classification	12	Telling a break-in from a raccoon
Tool Use & Reasoning	16	Correct tool calls with accurate parameters
Prompt Injection Resistance	4	Adversarial attacks that try to disable your guard
Privacy Compliance	3	PII leak prevention, illegal surveillance refusal
Alert Routing	5	Right message, right channel, right time

Results: Local vs. Cloud vs. Hybrid

Running on a Mac M1 Mini 8GB: local Qwen3.5-4B scores 39/54 (72%), cloud GPT-5.2 scores 46/48 (96%), and the hybrid config reaches 53/54 (98%). All 35 VLM test images are AI-generated — no real footage, fully privacy-compliant.
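The percentages above are plain pass rates (tests passed divided by tests run, rounded to the nearest whole percent). The short check below reproduces them from the reported counts. Note that the cloud result is reported out of 48 tests rather than 54, so the three rates are not computed over an identical test set. The dictionary and function names are illustrative only.

```python
# Reported (passed, total) pairs, copied from the paragraph above.
RESULTS = {
    "local Qwen3.5-4B": (39, 54),
    "cloud GPT-5.2": (46, 48),
    "hybrid": (53, 54),
}


def pass_rate(passed, total):
    """Return the pass rate as a whole-number percentage."""
    return round(100 * passed / total)


rates = {name: pass_rate(passed, total) for name, (passed, total) in RESULTS.items()}

for name, rate in rates.items():
    passed, total = RESULTS[name]
    print(f"{name}: {passed}/{total} = {rate}%")
```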

📄 Read the Paper · 🔬 Run It Yourself · 📋 Test Scenarios

📦 More Applications

Legacy Applications (SharpAI-Hub CLI)

These applications use the `sharpai-cli` Docker-based workflow. For the modern experience, use SharpAI Aegis.

Application	CLI Command	Platforms
Person Recognition (ReID)	`sharpai-cli yolov7_reid start`	Jetson/Windows/Linux/macOS
Person Detector	`sharpai-cli yolov7_person_detector start`	Jetson/Windows/Linux/macOS
Facial Recognition	`sharpai-cli deepcamera start`	Jetson/Windows/Linux/macOS
Local Facial Recognition	`sharpai-cli local_deepcamera start`	Windows/Linux/macOS
Screen Monitor	`sharpai-cli screen_monitor start`	Windows/Linux/macOS
Parking Monitor	`sharpai-cli yoloparking start`	Jetson AGX
Fall Detection	`sharpai-cli falldetection start`	Jetson AGX

📖 Detailed setup guides →
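If you script these legacy applications, it can help to check the platform column before launching anything. The sketch below encodes the table as a lookup and returns the argument list for a given application. The command strings and platform lists are copied from the table; `LEGACY_APPS` and `command_for` are hypothetical names introduced for this example and are not part of `sharpai-cli`.

```python
import shlex

# Rows copied from the table above: application -> (CLI command, listed platforms).
ALL_DESKTOP = {"Windows", "Linux", "macOS"}
LEGACY_APPS = {
    "Person Recognition (ReID)": ("sharpai-cli yolov7_reid start", ALL_DESKTOP | {"Jetson"}),
    "Person Detector": ("sharpai-cli yolov7_person_detector start", ALL_DESKTOP | {"Jetson"}),
    "Facial Recognition": ("sharpai-cli deepcamera start", ALL_DESKTOP | {"Jetson"}),
    "Local Facial Recognition": ("sharpai-cli local_deepcamera start", ALL_DESKTOP),
    "Screen Monitor": ("sharpai-cli screen_monitor start", ALL_DESKTOP),
    "Parking Monitor": ("sharpai-cli yoloparking start", {"Jetson AGX"}),
    "Fall Detection": ("sharpai-cli falldetection start", {"Jetson AGX"}),
}


def command_for(app_name, platform):
    """Return the argv list for an application, or raise if the platform is not listed."""
    command, platforms = LEGACY_APPS[app_name]
    if platform not in platforms:
        raise ValueError(f"{app_name} is not listed for {platform}")
    return shlex.split(command)


argv = command_for("Screen Monitor", "Linux")
print(argv)
```

The returned list can be passed to `subprocess.run(argv, check=True)` once `sharpai-cli` is installed; the sketch itself never launches anything.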

Tested Devices

Edge: Jetson Nano, Xavier AGX, Raspberry Pi 4/8GB
Desktop: macOS, Windows 11, Ubuntu 20.04
MCU: ESP32 CAM, ESP32-S3-Eye

Tested Cameras

RTSP: DaHua, Lorex, Amcrest
Cloud: Blink, Nest (via Home Assistant)
Mobile: IP Camera Lite (iOS)

🏗️ Architecture

Complete Feature List →

🤝 Support & Community

💬 Slack Community — help, discussions, and camera setup assistance
🐛 GitHub Issues — technical support and bug reports
🏢 Commercial Support — pipeline optimization, custom models, edge deployment
