Commit f9073ad

Merge pull request #182 from SharpAI/develop
Develop
2 parents a275e40 + c575193 commit f9073ad

21 files changed: +1482 −429 lines


README.md

Lines changed: 25 additions & 20 deletions
@@ -58,8 +58,10 @@ Each skill is a self-contained module with its own model, parameters, and [commu
 | Category | Skill | What It Does | Status |
 |----------|-------|--------------|:------:|
 | **Detection** | [`yolo-detection-2026`](skills/detection/yolo-detection-2026/) | Real-time 80+ class detection — auto-accelerated via TensorRT / CoreML / OpenVINO / ONNX ||
-| | [`yolo-detection-2026-coral-tpu`](skills/detection/yolo-detection-2026-coral-tpu/) | Google Coral Edge TPU — ~4ms inference via USB accelerator ([Docker-based](#detection--segmentation-skills)) | 🧪 |
-| | [`yolo-detection-2026-openvino`](skills/detection/yolo-detection-2026-openvino/) | Intel NCS2 USB / Intel GPU / CPU — multi-device via OpenVINO ([Docker-based](#detection--segmentation-skills)) | 🧪 |
+| | [`yolo-detection-2026-coral-tpu`](skills/detection/yolo-detection-2026-coral-tpu/) | Google Coral Edge TPU — ~4ms inference via USB accelerator ([LiteRT](#detection--segmentation-skills)) ||
+| | [`yolo-detection-2026-openvino`](skills/detection/yolo-detection-2026-openvino/) | Intel NCS2 USB / Intel GPU / CPU — multi-device via OpenVINO ([architecture](#detection--segmentation-skills)) | 🧪 |
+| | `face-detection-recognition` | Face detection & recognition — identify known faces from camera feeds | 📐 |
+| | `license-plate-recognition` | License plate detection & recognition — read plate numbers from camera feeds | 📐 |
 | **Analysis** | [`home-security-benchmark`](skills/analysis/home-security-benchmark/) | [143-test evaluation suite](#-homesec-bench--how-secure-is-your-local-ai) for LLM & VLM security performance ||
 | **Privacy** | [`depth-estimation`](skills/transformation/depth-estimation/) | [Real-time depth-map privacy transform](#-privacy--depth-map-anonymization) — anonymize camera feeds while preserving activity ||
 | **Segmentation** | [`sam2-segmentation`](skills/segmentation/sam2-segmentation/) | Interactive click-to-segment with Segment Anything 2 — pixel-perfect masks, point/box prompts, video tracking ||
@@ -74,38 +76,41 @@ Each skill is a self-contained module with its own model, parameters, and [commu
 
 ### Detection & Segmentation Skills
 
-Detection and segmentation skills process visual data from camera feeds — detecting objects, segmenting regions, or analyzing scenes. All skills use the same **JSONL stdin/stdout protocol**: Aegis writes a frame to a shared volume, sends a `frame` event on stdin, and reads `detections` from stdout. This means every detection skill — whether running natively or inside Docker — is interchangeable from Aegis's perspective.
+Detection and segmentation skills process visual data from camera feeds — detecting objects, segmenting regions, or analyzing scenes. All skills use the same **JSONL stdin/stdout protocol**: Aegis writes a frame to a shared volume, sends a `frame` event on stdin, and reads `detections` from stdout. Every detection skill is interchangeable from Aegis's perspective.
 
 ```mermaid
 graph TB
     CAM["📷 Camera Feed"] --> GOV["Frame Governor (5 FPS)"]
     GOV --> |"frame.jpg → shared volume"| PROTO["JSONL stdin/stdout Protocol"]
 
-    PROTO --> NATIVE["🖥️ Native: yolo-detection-2026"]
-    PROTO --> DOCKER["🐳 Docker: Coral TPU / OpenVINO"]
+    PROTO --> YOLO["yolo-detection-2026"]
+    PROTO --> CORAL["yolo-detection-2026-coral-tpu"]
+    PROTO --> OV["yolo-detection-2026-openvino"]
 
-    subgraph Native["Native Skill (runs on host)"]
-        NATIVE --> ENV["env_config.py auto-detect"]
+    subgraph Backends["Skill Backends"]
+        YOLO --> ENV["env_config.py auto-detect"]
         ENV --> TRT["NVIDIA → TensorRT"]
         ENV --> CML["Apple Silicon → CoreML"]
-        ENV --> OV["Intel → OpenVINO IR"]
+        ENV --> OVIR["Intel → OpenVINO IR"]
         ENV --> ONNX["AMD / CPU → ONNX"]
-    end
 
-    subgraph Container["Docker Container"]
-        DOCKER --> CORAL["Coral TPU → pycoral"]
-        DOCKER --> OVIR["OpenVINO → Ultralytics OV"]
-        DOCKER --> CPU["CPU fallback"]
-        CORAL -.-> USB["USB/IP passthrough"]
-        OVIR -.-> DRI["/dev/dri · /dev/bus/usb"]
+        CORAL --> LITERT["ai-edge-litert + libedgetpu"]
+        LITERT --> TPU["Coral USB → Edge TPU delegate"]
+        LITERT --> CPU1["No TPU → CPU fallback"]
+
+        OV --> OVSDK["OpenVINO SDK"]
+        OVSDK --> NCS2["Intel NCS2 USB"]
+        OVSDK --> IGPU["Intel iGPU / Arc"]
+        OVSDK --> CPU2["CPU fallback"]
     end
 
-    NATIVE --> |"stdout: detections"| AEGIS["Aegis IPC → Live Overlay + Alerts"]
-    DOCKER --> |"stdout: detections"| AEGIS
+    YOLO --> |"stdout: detections"| AEGIS["Aegis IPC → Live Overlay + Alerts"]
+    CORAL --> |"stdout: detections"| AEGIS
+    OV --> |"stdout: detections"| AEGIS
 ```
 
-- **Native skills** run directly on the host — [`env_config.py`](skills/lib/env_config.py) auto-detects the GPU and converts models to the fastest format (TensorRT, CoreML, OpenVINO IR, ONNX)
-- **Docker skills** wrap hardware-specific runtimes in a container — cross-platform USB/device access without native driver installation
+- **Unified protocol** — each skill creates its own Python venv or Docker container, but Aegis sees the same JSONL interface regardless of backend
+- **Coral TPU** uses [ai-edge-litert](https://pypi.org/project/ai-edge-litert/) (LiteRT) with the `libedgetpu` delegate — supports Python 3.9–3.13 on Linux, macOS, and Windows
 - **Same output** — Aegis sees identical JSONL from all skills, so detection overlays, alerts, and forensic analysis work with any backend
 
 #### LLM-Assisted Skill Installation
@@ -114,7 +119,7 @@ Skills are installed by an **autonomous LLM deployment agent** — not by brittl
 
 1. **Probe** — reads `SKILL.md`, `requirements.txt`, and `package.json` to understand what the skill needs
 2. **Detect hardware** — checks for NVIDIA (CUDA), AMD (ROCm), Apple Silicon (MPS), Intel (OpenVINO), or CPU-only
-3. **Install** — runs the right commands (`pip install`, `npm install`, `docker build`) with the correct backend-specific dependencies
+3. **Install** — runs the right commands (`pip install`, `npm install`, system packages) with the correct backend-specific dependencies
4. **Verify** — runs a smoke test to confirm the skill loads before marking it complete
 5. **Determine launch command** — figures out the exact `run_command` to start the skill and saves it to the registry

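The JSONL stdin/stdout protocol the README changes describe can be sketched in a few lines. This is a minimal, hypothetical skill loop, not DeepCamera's actual implementation: only the `frame` event and `detections` output are named in the README; the `type` and `path` field names below are illustrative assumptions.

```python
import json
import sys

def handle_event(line, detect):
    """Process one JSONL line; return a response line, or None to stay silent.

    Only the `frame` event and `detections` output are named in the README;
    the `type` and `path` field names are this sketch's assumptions.
    """
    event = json.loads(line)
    if event.get("type") != "frame":
        return None
    # Aegis has already written the frame to a shared volume;
    # the event carries its path.
    boxes = detect(event["path"])
    return json.dumps({"type": "detections", "detections": boxes})

def run_skill(detect, stdin=sys.stdin, stdout=sys.stdout):
    # Stream loop: one JSON object per line in, one per line out.
    for line in stdin:
        if line.strip():
            response = handle_event(line, detect)
            if response is not None:
                print(response, file=stdout, flush=True)
```

Because every backend speaks this same line-oriented interface, swapping the `detect` callable (TensorRT, LiteRT, OpenVINO, …) changes nothing on Aegis's side.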
skills.json

Lines changed: 48 additions & 0 deletions
@@ -100,6 +100,54 @@
       "large"
     ]
   },
+  {
+    "id": "yolo-detection-2026-coral-tpu",
+    "name": "YOLO 2026 Coral TPU",
+    "description": "Google Coral Edge TPU — real-time object detection with LiteRT (INT8, ~4ms inference at 320×320).",
+    "version": "2.0.0",
+    "category": "detection",
+    "path": "skills/detection/yolo-detection-2026-coral-tpu",
+    "tags": [
+      "detection",
+      "yolo",
+      "coral",
+      "edge-tpu",
+      "litert",
+      "real-time",
+      "coco"
+    ],
+    "platforms": [
+      "linux-x64",
+      "linux-arm64",
+      "darwin-arm64",
+      "darwin-x64",
+      "win-x64"
+    ],
+    "requirements": {
+      "python": ">=3.9",
+      "system": "libedgetpu",
+      "hardware": "Google Coral USB Accelerator"
+    },
+    "capabilities": [
+      "live_detection",
+      "bbox_overlay"
+    ],
+    "ui_unlocks": [
+      "detection_overlay",
+      "detection_results"
+    ],
+    "fps_presets": [
+      0.2,
+      0.5,
+      1,
+      3,
+      5,
+      15
+    ],
+    "model_sizes": [
+      "nano"
+    ]
+  },
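The new registry entry suggests a simple sanity check an installer could run before launching a skill. A sketch only: the `REQUIRED_FIELDS` list is an assumption drawn from the entries shown here, not a documented skills.json schema.

```python
import json

# Fields present in the skills.json entries shown above; treating them as
# required is this sketch's assumption, not a documented schema.
REQUIRED_FIELDS = ("id", "name", "version", "category", "path", "platforms")

def missing_fields(entry):
    """Return the required fields absent from one registry entry."""
    return [field for field in REQUIRED_FIELDS if field not in entry]

entry = json.loads("""
{
  "id": "yolo-detection-2026-coral-tpu",
  "name": "YOLO 2026 Coral TPU",
  "version": "2.0.0",
  "category": "detection",
  "path": "skills/detection/yolo-detection-2026-coral-tpu",
  "platforms": ["linux-x64", "linux-arm64", "darwin-arm64", "darwin-x64", "win-x64"]
}
""")
print(missing_fields(entry))  # → []
```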
   {
     "id": "camera-claw",
     "name": "Camera Claw",
Lines changed: 53 additions & 0 deletions
@@ -0,0 +1,53 @@
+# ─── Python ───────────────────────────────────────────
+__pycache__/
+*.py[cod]
+*$py.class
+*.so
+*.egg-info/
+*.egg
+dist/
+.eggs/
+.venv/
+venv/
+env/
+
+# ─── ML Models & Data ────────────────────────────────
+*.pt
+*.pth
+*.onnx
+*.tflite
+*.pb
+*.h5
+*.safetensors
+model/
+model.tgz
+*.tgz
+weights/
+runs/
+
+# ─── Build ────────────────────────────────────────────
+build/build/*
+build/dist/*
+build/runtime_arch
+build/*
+runtime
+
+# ─── Node ─────────────────────────────────────────────
+node_modules/
+
+# ─── Docker ───────────────────────────────────────────
+docker/db
+volumes
+
+# ─── IDE & OS ─────────────────────────────────────────
+.DS_Store
+.vscode/
+.idea/
+*.swp
+*.swo
+*~
+.env.local
+
+# ─── Project Specific ────────────────────────────────
+src/yolov7_reid/src/models/mgn_R50-ibn.onnx
+*.tflite
Lines changed: 34 additions & 0 deletions
@@ -0,0 +1,34 @@
+arch:
+  - arm64
+os: linux
+dist: xenial
+language: shell
+services:
+  - docker
+env:
+  global:
+    - DOCKER_CACHE_FILE=/home/travis/docker/cache.tar.gz
+before_script:
+  # Every 30 seconds, look for the build log file. If it exists, then
+  # start watching its contents and printing them to stdout as they
+  # change. This has two effects:
+  #   1. it avoids Travis timing out because the build outputs nothing
+  #   2. it makes it more obvious what part of the build, if any, gets stuck
+  - while sleep 30; do tail $TRAVIS_BUILD_DIR/log -f ; done &
+script:
+  #- cd $TRAVIS_BUILD_DIR/docker/build/tensorflow && docker build -f Dockerfile.arm64v8 -t shareai/tensorflow:arm64v8_latest .
+  #- cd $TRAVIS_BUILD_DIR/docker/build/od && docker build -f Dockerfile.arm64v8 -t shareai/od:arm64v8_latest .
+  - "travis_wait 50 sleep 3000 &"
+  - docker-compose -f $TRAVIS_BUILD_DIR/docker/build/docker-compose-arm64v8.yml build > $TRAVIS_BUILD_DIR/log
+after_success:
+  - docker login --username shareai --password $DOCKER_HUB_TOKEN
+  - docker push shareai/tensorflow:arm64v8_latest
+  #- docker push shareai/od:arm64v8_latest
+  - cd $TRAVIS_BUILD_DIR/docker/build && docker-compose -f docker-compose-arm64v8.yml push
+cache:
+  directories:
+    - /home/travis/docker/
+before_install:
+  - if [ -f ${DOCKER_CACHE_FILE} ]; then gunzip -c ${DOCKER_CACHE_FILE} | docker load; fi
+before_cache:
+  - if [[ ${TRAVIS_BRANCH} == "master" ]] && [[ ${TRAVIS_PULL_REQUEST} == "false" ]]; then docker save $(docker images -a -q) | gzip > ${DOCKER_CACHE_FILE}; fi
Lines changed: 49 additions & 0 deletions
@@ -0,0 +1,49 @@
+# Contributor Covenant Code of Conduct
+
+## Our Pledge
+
+We as members, contributors, and leaders pledge to make participation in our
+community a harassment-free experience for everyone, regardless of age, body
+size, visible or invisible disability, ethnicity, sex characteristics, gender
+identity and expression, level of experience, education, socio-economic status,
+nationality, personal appearance, race, caste, color, religion, or sexual
+identity and orientation.
+
+We pledge to act and interact in ways that contribute to an open, welcoming,
+diverse, inclusive, and healthy community.
+
+## Our Standards
+
+Examples of behavior that contributes to a positive environment for our
+community include:
+
+* Demonstrating empathy and kindness toward other people
+* Being respectful of differing opinions, viewpoints, and experiences
+* Giving and gracefully accepting constructive feedback
+* Accepting responsibility and apologizing to those affected by our mistakes,
+  and learning from the experience
+* Focusing on what is best not just for us as individuals, but for the overall
+  community
+
+Examples of unacceptable behavior include:
+
+* The use of sexualized language or imagery, and sexual attention or advances of
+  any kind
+* Trolling, insulting or derogatory comments, and personal or political attacks
+* Public or private harassment
+* Publishing others' private information, such as a physical or electronic
+  address, without their explicit permission
+* Other conduct which could reasonably be considered inappropriate in a
+  professional setting
+
+## Enforcement
+
+Instances of abusive, harassing, or otherwise unacceptable behavior may be
+reported to the project maintainers. All complaints will be reviewed and
+investigated promptly and fairly.
+
+## Attribution
+
+This Code of Conduct is adapted from the [Contributor Covenant](https://www.contributor-covenant.org),
+version 2.1, available at
+https://www.contributor-covenant.org/version/2/1/code_of_conduct.html
Lines changed: 54 additions & 0 deletions
@@ -0,0 +1,54 @@
+# Contributing to DeepCamera
+
+Thank you for your interest in contributing to DeepCamera! This project is evolving into an open-source AI skill platform for [SharpAI Aegis](https://sharpai.org).
+
+## How to Contribute
+
+### 🛠️ Build a New Skill
+
+The best way to contribute is by building a new skill. Each skill is a self-contained folder under `skills/` with:
+
+1. **`SKILL.md`** — declares parameters (rendered as UI in Aegis) and capabilities
+2. **`requirements.txt`** — Python dependencies
+3. **`scripts/`** — entry point using JSON-lines stdin/stdout protocol
+
+See [`skills/detection/yolo-detection-2026/`](skills/detection/yolo-detection-2026/) for a complete reference implementation.
+
+### 📋 Skill Ideas We Need
+
+- Camera providers: Eufy, Reolink, Tapo, Ring
+- Messaging channels: Matrix, LINE, Signal
+- Automation triggers: MQTT, webhooks
+- AI models: VLM scene analysis, SAM2 segmentation, depth estimation
+
+### 🐛 Report Issues
+
+- Use [GitHub Issues](https://github.com/SharpAI/DeepCamera/issues)
+- Include your platform, Python version, and steps to reproduce
+
+### 📝 Improve Documentation
+
+- Fix typos, improve clarity, add examples
+- Add platform-specific setup guides under `docs/`
+
+## Development Setup
+
+```bash
+git clone https://github.com/SharpAI/DeepCamera.git
+cd DeepCamera
+
+# Work on a skill
+cd skills/detection/yolo-detection-2026
+python3 -m venv .venv && source .venv/bin/activate
+pip install -r requirements.txt
+```
+
+## Code Style
+
+- Python: follow PEP 8
+- Use type hints where practical
+- Add docstrings to public functions
+
+## License
+
+By contributing, you agree that your contributions will be licensed under the [MIT License](LICENSE).
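The three-file skill layout CONTRIBUTING.md describes (`SKILL.md`, `requirements.txt`, `scripts/`) can be scaffolded in a few lines. This helper is a hypothetical convenience, not part of the repo, and the file contents it writes are placeholders rather than the real `SKILL.md` schema.

```python
from pathlib import Path
import tempfile

def scaffold_skill(root, category, name):
    """Create the skill folder layout described in CONTRIBUTING.md:
    SKILL.md, requirements.txt, and a scripts/ entry-point directory.
    File contents are placeholders, not the real SKILL.md schema."""
    skill = Path(root) / "skills" / category / name
    (skill / "scripts").mkdir(parents=True, exist_ok=True)
    (skill / "SKILL.md").write_text(f"# {name}\n\nDeclare parameters and capabilities here.\n")
    (skill / "requirements.txt").write_text("# Python dependencies\n")
    return skill

# Example: scaffold into a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    skill = scaffold_skill(tmp, "detection", "my-new-skill")
    print(sorted(p.name for p in skill.iterdir()))  # → ['SKILL.md', 'requirements.txt', 'scripts']
```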
Lines changed: 23 additions & 0 deletions
@@ -0,0 +1,23 @@
+# Third-Party Licenses & Acknowledgments
+
+This project uses or was inspired by the following open-source projects:
+
+## AI & ML Frameworks
+* [Ultralytics](https://github.com/ultralytics/ultralytics) — YOLOv8/v10/v11 (AGPL-3.0)
+* [Insightface](https://github.com/deepinsight/insightface) — Face recognition (MIT)
+* [FastReID](https://github.com/JDAI-CV/fast-reid) — Person re-identification (Apache-2.0)
+
+## Legacy Dependencies (src/)
+* [TensorFlow](https://github.com/tensorflow/tensorflow) — Apache License 2.0
+* [MXNet](https://github.com/apache/incubator-mxnet) — Apache License 2.0
+* [TVM](https://github.com/dmlc/tvm) — Apache License 2.0
+
+## Infrastructure
+* [Milvus](https://github.com/milvus-io/milvus) — Vector database (Apache-2.0)
+* [go2rtc](https://github.com/AlexxIT/go2rtc) — RTSP/WebRTC streaming (MIT)
+* [Node.js](https://nodejs.org) — MIT
+* [Python](https://www.python.org) — PSF License
+
+## Historical
+* Shinobi — https://gitlab.com/Shinobi-Systems/Shinobi/
+* Termux — https://github.com/termux/termux-app (GPLv3)
