Multi-object detection and tracking optimised from YOLOv8 + DeepSORT to pure ONNX + IoU tracking — runs fully on CPU.
This project started as a straightforward YOLOv8 + DeepSORT tracker and evolved through three major iterations into a high-performance, multi-interface tracking system. Each version targeted specific bottlenecks until we achieved 13.6× faster real-time tracking — all on CPU.
┌─────────────────────────────────────────────────────────────────┐
│ v1 │ YOLOv8 (.pt) + DeepSORT │ ~4.7 FPS │ Baseline │
├──────┼────────────────────────────────┼────────────┼──────────────┤
│ v2 │ ONNX Runtime + DeepSORT │ ~12 FPS │ 2.5× faster │
├──────┼────────────────────────────────┼────────────┼──────────────┤
│ v3 │ ONNX Runtime + IoU Tracker │ ~64 FPS │ 13.6× faster │
└─────────────────────────────────────────────────────────────────┘
Benchmarked on a minimal CPU-only laptop (no GPU, no CUDA):
| Metric | v1 — YOLOv8 + DeepSORT | v3 — ONNX + IoU Tracker | Improvement |
|---|---|---|---|
| FPS | 4.7 | 64.1 | 13.6× |
| Detection latency | 210 ms (PyTorch @640px) | 36 ms (ONNX @320px) | 5.8× |
| Tracker latency | ~30 ms (CNN re-ID + Kalman) | 0.01 ms (pure IoU geometry) | 3000× |
| Per-frame total | ~240 ms | 15.6 ms | 15.4× |
| Runtime deps | 6 (torch, torchvision, ultralytics, deep-sort-realtime, opencv, numpy) | 3 (onnxruntime, opencv, numpy) | 50% fewer |
| Install size | ~2.1 GB (PyTorch + CUDA stubs) | ~85 MB | 25× smaller |

Where the speedup comes from, bottleneck by bottleneck:

| Bottleneck | v1 Approach | v3 Fix | FPS Impact |
|---|---|---|---|
| Inference engine | Ultralytics PyTorch | ONNX Runtime (graph optimization, op fusion) | +50–200% |
| Input resolution | 640×640 | 320×320 (4× fewer pixels) | +100–150% |
| Tracker | DeepSORT (CNN feature extractor + Kalman filter) | IoU Tracker (pure bounding-box geometry) | +30% |
| Frame skip | Every 2nd frame | Every 3rd frame (configurable 1–4) | +50% |
| Capture threading | Synchronous read + process | Background QThread with drop-old queue | +10–20% |
| Model loading | PyTorch → Python overhead per forward pass | ONNX → compiled execution graph | +20–30% |
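
The drop-old queue in the capture-threading row can be illustrated with the standard library alone. This is a minimal sketch, not the project's capture code: `put_drop_old` is a hypothetical helper name, and the real implementation lives in a QThread that is not shown here. The only detail taken from this README is the bounded queue of size 2.

```python
import queue


def put_drop_old(q: queue.Queue, frame) -> None:
    """Enqueue a frame; when the queue is full, discard the oldest frame first."""
    while True:
        try:
            q.put_nowait(frame)
            return
        except queue.Full:
            try:
                q.get_nowait()  # drop the stalest frame to make room
            except queue.Empty:
                pass  # the consumer drained the queue in the meantime; retry


# Simulate a producer that is faster than the consumer.
frames = queue.Queue(maxsize=2)
for frame_id in range(5):
    put_drop_old(frames, frame_id)

latest = [frames.get_nowait(), frames.get_nowait()]
print(latest)  # [3, 4]
```

Because stale frames are discarded rather than buffered, the consumer always works on recent input, at the cost of never seeing the dropped frames.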
The project has three interfaces — all sharing the same optimized pipeline:
track_project/
│
├── 🟢 final.py # Max-FPS PySide6 GUI (ONNX + IoU) — self-contained
├── 🔵 main.py # Standard PySide6 GUI (YOLOv8 + DeepSORT via core.py)
├── 🔵 track.py # CLI tracker (YOLOv8/ONNX + DeepSORT via core.py)
├── 🔵 core.py # Shared TrackerCore: dual-backend detection + DeepSORT
├── 🔧 export_onnx.py # One-time YOLOv8 .pt → ONNX converter
│
├── backend/ # FastAPI web server (MJPEG streaming + JWT auth)
│ ├── main.py # Route handlers (/api/*, /stream)
│ ├── auth.py # JWT + bcrypt authentication
│ ├── pipeline.py # Streaming pipeline (CaptureThread → ONNX → IoU)
│ ├── capture.py # Threaded video capture
│ ├── tracker.py # ONNX detector + IoU tracker (server-side)
│ ├── video_manager.py # Per-user pipeline management
│ └── database.py # SQLite user store
│
├── frontend/ # Web dashboard (HTML/CSS/JS)
│ ├── dashboard.html # Live tracking controls + MJPEG viewer
│ ├── login.html # Auth pages
│ ├── index.html # Landing page
│ ├── style.css # Dark-theme CSS
│ └── script.js # Dashboard JS (API calls, token management)
│
├── requirements.txt
├── .gitignore
├── LICENSE
└── README.md
┌──────────────┐ ┌──────────────────┐ ┌────────────────┐
│ Capture │ │ Frame Skip │ │ ONNX Detector │
│ Thread │────▶│ (every Nth) │────▶│ 320×320 │
│ (QThread) │ │ │ │ ~36ms │
└──────────────┘ └──────────────────┘ └────────────────┘
│ │
│ Queue(maxsize=2) │ [x1,y1,x2,y2] boxes
│ drop-old policy ▼
│ ┌────────────────┐
│ │ NMS Filter │
│ │ IoU ≤ 0.45 │
│ └────────────────┘
│ │
│ ▼
│ ┌────────────────┐
│ reuse last_boxes on skipped │ IoU Tracker │
│ frames ─────────────────────────▶ │ ~0.01ms │
│ │ greedy match │
│ └────────────────┘
│ │
▼ ▼
┌──────────────────────────────────────────────────────────┐
│ Draw boxes + IDs + FPS → Display │
└──────────────────────────────────────────────────────────┘
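
The frame-skip stage of the diagram above can be sketched as a small generator. It is an illustration under stated assumptions: `run_pipeline` and `fake_detect` are hypothetical names, and the detector is a stand-in rather than the ONNX model. It shows only the idea that detection runs on every Nth frame while skipped frames reuse the last boxes.

```python
def run_pipeline(frames, detect, frame_skip=3):
    """Yield (frame, boxes), calling detect only on every frame_skip-th frame."""
    last_boxes = []
    for index, frame in enumerate(frames):
        if index % frame_skip == 0:
            last_boxes = detect(frame)  # expensive step, run sparsely
        # Skipped frames fall through and reuse the most recent boxes.
        yield frame, last_boxes


detector_calls = []


def fake_detect(frame):
    """Stand-in detector: records the call and returns one dummy box."""
    detector_calls.append(frame)
    return [[frame, frame, frame + 10, frame + 10]]


results = list(run_pipeline(range(7), fake_detect, frame_skip=3))
print(detector_calls)  # [0, 3, 6]
print(results[4])      # (4, [[3, 3, 13, 13]])
```

With `frame_skip=3` the detector runs on roughly a third of the frames, which is why the average per-frame cost can sit below the latency of a single detection.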
The custom IoU tracker replaces DeepSORT's heavy CNN + Kalman pipeline with pure bounding-box geometry:
New Detections → Compute IoU Matrix (tracks × detections)
│
▼
Greedy Matching
(highest IoU first)
│
┌────────┴────────┐
│ │
IoU ≥ 0.3 IoU < 0.3
│ │
Match: update Stop matching
track bbox, │
reset age │
┌───────┴───────┐
│ │
Unmatched Unmatched
tracks → detections →
age++ NEW track
(delete if (assign ID)
age > 15)
Why it works for real-time: When running at 30–60+ FPS, objects move very little between frames. Simple IoU overlap is sufficient to maintain identity — no expensive appearance features needed.
Trade-off: The IoU tracker may lose track during fast object motion or heavy occlusion (where DeepSORT's appearance model excels). For most real-time surveillance and monitoring use cases, this trade-off is well worth the 3000× speed improvement.
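
The matching flow described above can be sketched in plain Python. This is a minimal sketch and not the code in final.py: the class name `IoUTracker`, the helper `iou`, and the dictionary layout are assumptions made for illustration. The thresholds (IoU ≥ 0.3 to match, delete after age > 15) come from this README.

```python
from itertools import count


def iou(a, b):
    """Intersection over union of two [x1, y1, x2, y2] boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


class IoUTracker:
    def __init__(self, iou_threshold=0.3, max_age=15):
        self.iou_threshold = iou_threshold
        self.max_age = max_age
        self.tracks = {}  # track id -> {"box": [x1, y1, x2, y2], "age": int}
        self._ids = count(1)

    def update(self, detections):
        # Score every (track, detection) pair, best overlap first.
        pairs = sorted(
            ((iou(t["box"], d), tid, di)
             for tid, t in self.tracks.items()
             for di, d in enumerate(detections)),
            key=lambda p: p[0], reverse=True)
        used_tracks, used_dets = set(), set()
        for score, tid, di in pairs:
            if score < self.iou_threshold:
                break  # remaining pairs overlap even less
            if tid in used_tracks or di in used_dets:
                continue
            self.tracks[tid] = {"box": list(detections[di]), "age": 0}
            used_tracks.add(tid)
            used_dets.add(di)
        # Unmatched tracks age and are deleted once age exceeds max_age.
        for tid in list(self.tracks):
            if tid not in used_tracks:
                self.tracks[tid]["age"] += 1
                if self.tracks[tid]["age"] > self.max_age:
                    del self.tracks[tid]
        # Unmatched detections start new tracks with fresh IDs.
        for di, det in enumerate(detections):
            if di not in used_dets:
                self.tracks[next(self._ids)] = {"box": list(det), "age": 0}
        # Report only tracks seen in this frame.
        return {tid: t["box"] for tid, t in self.tracks.items() if t["age"] == 0}


tracker = IoUTracker()
first = tracker.update([[0, 0, 10, 10], [50, 50, 60, 60]])
second = tracker.update([[51, 50, 61, 60], [1, 0, 11, 10]])
print(second)  # {1: [1, 0, 11, 10], 2: [51, 50, 61, 60]}
```

Both objects keep their IDs in the second frame even though the detections arrive in a different order, because each detection is paired with the track it overlaps most.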
- Python 3.10+
- A webcam or a video file (.mp4, .avi, .mkv, .mov)
- No GPU required
git clone https://github.com/Adhirajsingh2507/track-object.git
cd track-object

python3 -m venv venv
source venv/bin/activate # Linux / macOS
# .\venv\Scripts\Activate.ps1 # Windows PowerShell

pip install -r requirements.txt

# Download YOLOv8n weights (auto-download) and export to ONNX
pip install ultralytics
python export_onnx.py --imgsz 320

This creates yolov8n.onnx (~12.7 MB). The .pt weights are only needed for this export step.

python final.py

Features:
- 📂 Upload video or 📷 use webcam
- ▶ Start / ⏸ Stop tracking
- ⚡ Quality selector (frame skip 1–4)
- Live FPS counter overlay
python track.py # webcam (default)
python track.py --source video.mp4 # video file
python track.py --source 0 --conf 0.5 # custom confidence
python track.py --frame-skip 3 --process-width 320 # speed mode

uvicorn backend.main:app --reload --host 0.0.0.0 --port 8000

Then open http://localhost:8000, register, log in, and access the live tracking dashboard with:
- Video upload / webcam streaming via MJPEG
- Real-time tracking controls
- Quality/frame-skip selector
- JWT-based multi-user authentication
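
MJPEG streaming sends each JPEG frame as one part of a `multipart/x-mixed-replace` HTTP response. The sketch below shows only the byte framing and is not taken from backend/main.py: the function name `mjpeg_stream` and the boundary string `frame` are assumptions chosen for illustration.

```python
def mjpeg_stream(jpeg_frames, boundary=b"frame"):
    """Wrap each JPEG payload as one part of a multipart/x-mixed-replace body."""
    for jpeg in jpeg_frames:
        yield b"".join([
            b"--", boundary, b"\r\n",
            b"Content-Type: image/jpeg\r\n",
            b"Content-Length: ", str(len(jpeg)).encode("ascii"), b"\r\n",
            b"\r\n",   # blank line separates part headers from the payload
            jpeg,
            b"\r\n",
        ])


# Dummy payload standing in for an encoded frame.
chunk = next(mjpeg_stream([b"\xff\xd8abc"]))
print(chunk)
```

In a FastAPI app, a generator like this would typically be passed to `StreamingResponse` with `media_type="multipart/x-mixed-replace; boundary=frame"`, so that the browser replaces the displayed image each time a new part arrives.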

Command-line flags for track.py:

| Flag | Default | Description |
|---|---|---|
| --source | 0 (webcam) | Video source: 0 for webcam, or path to video file |
| --model | auto-detect | Path to .pt or .onnx model (prefers ONNX if available) |
| --conf | 0.4 | Minimum detection confidence (0.0–1.0) |
| --process-width | 480 | Resize width for inference (smaller = faster) |
| --frame-skip | 2 | Process every Nth frame (1 = every frame) |
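
The flags in the table map naturally onto `argparse`. The parser below is a sketch built from the table's names and defaults, not a copy of track.py; `build_parser` and `parse_source` are hypothetical helpers, and representing "auto-detect" as `None` is an assumption.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented flags and defaults."""
    parser = argparse.ArgumentParser(description="Multi-object tracker (CLI sketch)")
    parser.add_argument("--source", default="0",
                        help="0 for webcam, or path to a video file")
    parser.add_argument("--model", default=None,
                        help="path to a .pt or .onnx model (None = auto-detect)")
    parser.add_argument("--conf", type=float, default=0.4,
                        help="minimum detection confidence (0.0-1.0)")
    parser.add_argument("--process-width", type=int, default=480,
                        help="resize width for inference (smaller = faster)")
    parser.add_argument("--frame-skip", type=int, default=2,
                        help="process every Nth frame (1 = every frame)")
    return parser


def parse_source(value):
    """Webcam indices arrive as digit strings; anything else is a file path."""
    return int(value) if value.isdigit() else value


args = build_parser().parse_args(
    ["--source", "video.mp4", "--frame-skip", "3", "--process-width", "320"])
print(parse_source(args.source), args.frame_skip, args.process_width, args.conf)
```

`argparse` converts the dashes in `--process-width` and `--frame-skip` to underscores, so the values are read as `args.process_width` and `args.frame_skip`.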

Key parameters of the v3 pipeline:

| Parameter | Value | Effect |
|---|---|---|
| ONNX input size | 320×320 | 4× fewer pixels than 640 → ~6× faster inference |
| conf_threshold | 0.5 | Filters weak detections |
| nms_threshold | 0.45 | Merges overlapping boxes |
| frame_skip | 3 (default) | YOLO runs every 3rd frame |
| iou_threshold | 0.3 | Min overlap to match track ↔ detection |
| max_age | 15 | Frames before unmatched track is deleted |
| Queue maxsize | 2 | Drops old frames to stay real-time |
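
To show how conf_threshold and nms_threshold interact, here is a pure-Python sketch of greedy non-maximum suppression. It is an assumption-laden illustration: the project may use a library routine instead, and the names `box_iou` and `nms` are hypothetical. Only the two threshold values come from the table.

```python
def box_iou(a, b):
    """Intersection over union of two [x1, y1, x2, y2] boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, conf_threshold=0.5, nms_threshold=0.45):
    """Return indices of boxes kept after confidence filtering and greedy NMS."""
    # Discard weak detections, then visit the rest from most to least confident.
    order = sorted((i for i, s in enumerate(scores) if s >= conf_threshold),
                   key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it overlaps every kept box by at most the threshold.
        if all(box_iou(boxes[i], boxes[k]) <= nms_threshold for k in keep):
            keep.append(i)
    return keep


boxes = [[0, 0, 10, 10], [1, 1, 11, 11], [50, 50, 60, 60], [0, 0, 5, 5]]
scores = [0.9, 0.8, 0.7, 0.3]
kept = nms(boxes, scores)
print(kept)  # [0, 2]
```

Box 1 is suppressed because it overlaps the higher-scoring box 0 with an IoU of about 0.68, and box 3 never reaches NMS because its score is below conf_threshold.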

Technology stack, v1 versus v3:

| Component | v1 (Original) | v3 (Optimised) | Purpose |
|---|---|---|---|
| Detection | Ultralytics YOLOv8 (.pt) | ONNX Runtime (direct inference) | Per-frame object detection |
| Tracking | deep-sort-realtime (CNN + Kalman) | Custom IoU Tracker (pure geometry) | Multi-object ID assignment |
| Video I/O | OpenCV | OpenCV | Frame capture + rendering |
| GUI | PySide6 | PySide6 | Desktop application |
| Web Backend | — | FastAPI + MJPEG | Multi-user web streaming |
| Auth | — | JWT + bcrypt + SQLite | Web authentication |
| Numerical | NumPy | NumPy | Bounding box math |

Source files:

| File | Lines | Description |
|---|---|---|
| final.py | 463 | ⚡ Max-FPS tracker: self-contained ONNX + IoU pipeline with PySide6 GUI. The target optimized version. |
| main.py | 264 | 🔵 Standard GUI tracker using TrackerCore from core.py (YOLOv8/ONNX + DeepSORT) |
| core.py | 272 | 🔵 Shared TrackerCore class: dual-backend (PyTorch/.pt or ONNX) detection + DeepSORT tracking |
| track.py | 114 | 🔵 CLI tracker: uses TrackerCore, supports all model formats |
| export_onnx.py | 33 | 🔧 One-time script to convert yolov8n.pt → yolov8n.onnx |
| backend/ | ~450 | 🌐 FastAPI web server: auth, pipeline management, MJPEG streaming |
| frontend/ | ~400 | 🌐 Web dashboard: dark-theme HTML/CSS/JS, MJPEG viewer, tracking controls |

- YOLOv8 + DeepSORT baseline tracker
- ONNX Runtime export for faster CPU inference
- Shared TrackerCore module (CLI + GUI)
- Custom IoU tracker (3000× faster than DeepSORT)
- Threaded capture with drop-old queue
- Configurable frame-skip and quality modes
- PySide6 GUI with dark theme
- Full-stack web app (FastAPI + MJPEG + JWT auth)
- Class-label display alongside track IDs
- Video output recording (--save flag)
- Docker containerisation for zero-setup deployment
- WebSocket streaming for lower latency than MJPEG
- INT8 quantization for further ONNX speed gains

Author: Adhiraj Singh, B.Tech, Computer Science Engineering (AI/ML)

This project is licensed under the MIT License — see the LICENSE file for details.