This roadmap targets v1.4.2 as the JOSS (Journal of Open Source Software) submission release. It focuses on JOSS requirements, reproducibility assets, and reviewer-friendly onboarding.
Target Release: Q1 2026
Current Status: PLANNING (v1.4.1 released Dec 18, 2025)
Main Goal: JOSS paper submission + acceptance, backed by a tagged v1.4.2 release
Duration: 8-10 weeks (focused scope)
Note: VideoAnnotator already has public releases; this roadmap is about making the JOSS submission and review as low-friction as possible.
Author order for the JOSS paper:
- Caspar Addyman (ILCHR, Stellenbosch)
- Jeremiah Ishaya (ILCHR, Stellenbosch)
- Irene Uwerikowe (ILCHR, Stellenbosch)
- Daniel Stamate (Computing, Goldsmiths, University of London)
- Jamie Lachman (DISP, University of Oxford)
- Mark Tomlinson (ILCHR, Stellenbosch)
These items map directly to JOSS submission/review expectations and should be treated as release gates:
- Open repository: public Git repository with issues and PRs enabled
- OSI license: license file present and referenced in docs
- Installation: clear install path for reviewers (CPU-only, local), plus Docker as optional reproducibility
- Automated tests: documented commands and a “reviewer smoke test” path
- Documentation: README + quick start + troubleshooting sufficient for first-time users
- Paper artifacts: `paper/paper.md` and `paper/paper.bib` in the repository
- Paper format: short JOSS format, with proper citations and statement of need
- Archival DOI: tag v1.4.2, archive release (e.g., Zenodo), and reference the DOI consistently (paper + CITATION + README)
- Disclosure: conflicts of interest and funding statements in the paper
- AI/provenance: brief statement describing AI assistance boundaries and confirming authors’ responsibility/maintainership
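The archival-DOI gate above implies keeping `CITATION.cff` in sync with the tag and the Zenodo record. A minimal sketch of what that file could look like (all values are placeholders to be filled at release time; this is not the project's actual citation file):

```yaml
# Hypothetical CITATION.cff sketch; version, date, and DOI are placeholders.
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
title: "VideoAnnotator"
version: "1.4.2"
date-released: "2026-01-01"    # placeholder: set at tag time
doi: "10.5281/zenodo.XXXXXXX"  # placeholder: Zenodo DOI minted for the v1.4.2 tag
authors:
  - family-names: "Addyman"
    given-names: "Caspar"
```

The same DOI string should then appear verbatim in the paper and the README, per the consistency requirement above.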
Prerequisites - ALL COMPLETED in v1.3.0:
- ✅ Persistent storage with retention policies
- ✅ Job cancellation and concurrency control
- ✅ Schema-based config validation
- ✅ Secure-by-default configuration
- ✅ Standardized error envelope
- ✅ Modern videoannotator package namespace
- ✅ Test suite at 94.4% passing (720/763)
- ✅ Real test fixtures infrastructure
- ✅ ffmpeg across all platforms
This release has a tight scope focused on JOSS requirements ONLY:
- ✅ JOSS Paper - Complete manuscript meeting all JOSS requirements
- ✅ Research Reproducibility - Example datasets with reproducible workflows
- ✅ Publication Documentation - Method descriptions, validation, benchmarks
- ✅ Community Onboarding - Quick start, tutorials, clear installation
- ✅ PyPI Release - `pip install videoannotator` for easy adoption
- ✅ Minor Polish - Only v1.3.0 deferred items (16 hours total)
Explicitly OUT OF SCOPE (moved to v1.5.0):
- ❌ Advanced features (quality assessment, batch optimization, pipeline comparison)
- ❌ Enhanced progress indicators and notifications
- ❌ Alternative export formats (FiftyOne, Label Studio, custom CSV)
- ❌ Structured logging and log analysis tools
- ❌ Interactive config wizard and templates
- ❌ Model auto-download and setup wizard
- ❌ Resource monitoring and advanced health metrics
- ❌ All advanced ML features, plugins, enterprise features
- Problem Definition: Why video annotation for research is hard
- Existing Tools: Comparison with ELAN, BORIS, Anvil, commercial solutions
- Our Contribution: Modular pipelines, standardized outputs, reproducibility
- Target Audience: Developmental psychologists, interaction researchers, behavioral scientists
- Pipeline Architecture: Modular design, registry system, standard outputs
- Supported Models: Person tracking, face analysis, audio diarization, emotion, gaze
- Output Formats: COCO, WebVTT, RTTM - rationale and conversion tools
- Extensibility: How researchers can adapt pipelines for their needs
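Of the standard output formats listed above, RTTM is the simplest to illustrate: each diarization segment is a single space-separated `SPEAKER` record. A minimal writer sketch (pure Python; the field layout follows the standard RTTM record, not VideoAnnotator's actual exporter, and the file/speaker names are invented):

```python
# Minimal RTTM writer sketch: one SPEAKER record per diarization segment.
# Illustrates the format only; this is not VideoAnnotator's exporter code.

def to_rttm(file_id, segments):
    """segments: iterable of (onset_seconds, duration_seconds, speaker_label)."""
    lines = []
    for onset, dur, speaker in segments:
        # RTTM SPEAKER record: type, file id, channel, onset, duration,
        # then <NA> placeholder fields surrounding the speaker label.
        lines.append(
            f"SPEAKER {file_id} 1 {onset:.3f} {dur:.3f} <NA> <NA> {speaker} <NA> <NA>"
        )
    return "\n".join(lines)

print(to_rttm("session01", [(0.0, 2.5, "parent"), (2.5, 1.2, "infant")]))
# → SPEAKER session01 1 0.000 2.500 <NA> <NA> parent <NA> <NA>
#   SPEAKER session01 1 2.500 1.200 <NA> <NA> infant <NA> <NA>
```

Conversion tools between COCO, WebVTT, and RTTM can then be framed as translations between such per-segment records.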
- Parent-Child Interaction: Multi-person tracking + speech diarization
- Clinical Assessment: Face analysis + emotion recognition over time
- Infant Attention: Gaze tracking + object detection coordination
- Group Dynamics: Social interaction patterns across multiple individuals
- Reproducible Tutorial: Step-by-step walkthrough with provided data
- Benchmark Results: Accuracy on standard datasets (if available)
- Performance Metrics: Speed benchmarks (CPU/GPU, various video lengths)
- Comparison Study: Side-by-side with manual coding or other tools
- Limitations: Known issues, edge cases, when NOT to use VideoAnnotator
- Documentation: Comprehensive user and developer guides
- Installation: Multiple paths (pip, Docker, conda)
- Support: GitHub issues, discussion forum, contribution guidelines
- Roadmap: Future development plans and feature requests
Complete the minimal deferred items from v1.3.0 (16 hours total):
- Add computed `queue_position` property for PENDING jobs
- Include in API responses
- Add tests
- Effort: 2 hours
Acceptance criteria:
- `queue_position` is 1-based and only set for `pending` jobs
- Ordering matches the worker dequeue order (FIFO by `created_at`)
- Verified by: `uv run pytest -q tests/api/test_api_server.py::TestJobEndpoints::test_queue_position_pending_jobs`
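The intended semantics can be sketched in a few lines. Class and field names here (`Job`, `status`, `created_at`) are illustrative assumptions, not VideoAnnotator's real API:

```python
# Sketch: 1-based queue_position for pending jobs, FIFO by created_at.
# Names are illustrative only, not the project's actual job model.
from dataclasses import dataclass

@dataclass
class Job:
    id: str
    status: str        # e.g. "pending", "running"
    created_at: float  # epoch seconds

def queue_position(job, all_jobs):
    """1-based position among pending jobs in FIFO order; None otherwise."""
    if job.status != "pending":
        return None
    pending = sorted(
        (j for j in all_jobs if j.status == "pending"),
        key=lambda j: j.created_at,
    )
    return 1 + [j.id for j in pending].index(job.id)

jobs = [Job("a", "running", 1.0), Job("b", "pending", 2.0), Job("c", "pending", 3.0)]
print(queue_position(jobs[1], jobs))  # → 1 (oldest pending job)
print(queue_position(jobs[0], jobs))  # → None (not pending)
```

This captures both acceptance criteria at once: 1-based numbering restricted to pending jobs, and ordering derived from `created_at`.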
- Synthetic video generator with known properties
- Mock OpenCV capture for unit tests
- Eliminate flaky tests
- Effort: 8 hours
Acceptance criteria:
- Tests stop relying on invalid “fake video bytes” and use deterministic, OpenCV-readable fixtures
- Default fixture avoids codec edge cases (prefer AVI/MJPG for broad OpenCV support)
- Verified by at least one targeted smoke run (e.g., `uv run pytest -q tests/pipelines/test_scene_detection.py::TestSceneDetectionPipeline::test_pipeline_initialization`)
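One way to get deterministic, codec-free unit tests is a stand-in object exposing the same read-loop surface as `cv2.VideoCapture`. A sketch under simplified assumptions (frames are plain nested lists rather than arrays, and this is not the project's actual fixture code):

```python
# Fake capture mimicking the cv2.VideoCapture read loop for unit tests.
# No codec is involved: frames are generated, so results are deterministic.
class FakeVideoCapture:
    def __init__(self, frame_count=10, width=64, height=48):
        self.frame_count = frame_count
        self.width, self.height = width, height
        self._pos = 0

    def isOpened(self):
        return True

    def read(self):
        # Mirrors cv2 semantics: returns (ok, frame); frame is None when exhausted.
        if self._pos >= self.frame_count:
            return False, None
        # Solid-gray "frame" whose pixel value encodes the frame index,
        # so tests can assert on exactly which frame they received.
        frame = [[self._pos] * self.width for _ in range(self.height)]
        self._pos += 1
        return True, frame

    def release(self):
        pass

cap = FakeVideoCapture(frame_count=3)
frames = []
while True:
    ok, frame = cap.read()
    if not ok:
        break
    frames.append(frame)
print(len(frames))  # → 3
```

In a test suite this object would typically be injected via `unittest.mock.patch` over the capture constructor, leaving the pipeline code itself unchanged.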
- Enhance contributor workflow docs
- Expand troubleshooting based on feedback
- Effort: 6 hours
Acceptance criteria:
- Reviewer-friendly commands use `uv` consistently (install + run + tests)
- At least one “smoke test” flow documented (CPU-only, local)
Checkpoint: v1.3.0 polish complete (16 hours)
Complete, reproducible research examples (JOSS requirement):
- Example 1: Classroom Interaction - Multi-person tracking + speech diarization
- Example 2: Clinical Assessment - Face analysis + emotion recognition
- Example 3: Infant Attention - Gaze tracking coordination example
- Example 4: Group Dynamics - Multi-person interaction patterns
Each example includes:
- Jupyter notebook with narrative
- Sample video (5-10 minutes, properly licensed)
- Ground truth annotations (where applicable)
- Expected outputs
- README with clear instructions
Deliverables: examples/research_workflows/ with 4 complete examples
Effort: 40 hours (10 hours per example)
Minimal reproducibility requirements for JOSS:
- Docker Images - CPU and GPU images with pinned dependencies
  - `videoannotator:1.4.2-cpu` (lightweight)
  - `videoannotator:1.4.2-gpu` (CUDA 12.1)
  - Published to Docker Hub
- Benchmark Data - Basic performance metrics
  - Processing time vs video length (CPU/GPU)
  - Memory usage profiles
  - Hardware specifications used
- Validation Scripts - Compare outputs to ground truth
  - Basic accuracy calculation where applicable
  - Simple visualization of results
Deliverables:
- `docker/` with Dockerfiles
- `benchmarks/results.md` with performance data
- `validation/` with comparison scripts
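A hedged sketch of what the CPU image could look like (base image, versions, install commands, and the `videoannotator` entrypoint are all assumptions, not the project's actual Dockerfile):

```dockerfile
# Hypothetical CPU-only image sketch; everything here is a placeholder
# choice, not the project's real Dockerfile.
FROM python:3.11-slim

# ffmpeg is listed as a prerequisite across all platforms.
RUN apt-get update && apt-get install -y --no-install-recommends ffmpeg \
    && rm -rf /var/lib/apt/lists/*

# Pin the exact release for reproducibility.
RUN pip install --no-cache-dir videoannotator==1.4.2

ENTRYPOINT ["videoannotator"]
```

Pinning the package version in the image is what makes the published tag (`videoannotator:1.4.2-cpu`) a stable reproducibility artifact for reviewers.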
Effort: 24 hours (8 hours Docker, 8 hours benchmarks, 8 hours validation)
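The validation-script idea reduces, in its simplest form, to frame-level agreement between predicted and ground-truth labels. A sketch (the per-frame label representation is an assumption for illustration, not VideoAnnotator's schema):

```python
# Frame-level agreement between predicted and ground-truth label sequences.
# The per-frame label format is assumed for illustration only.

def frame_accuracy(predicted, ground_truth):
    """Fraction of frames where labels agree; sequences must be equal length."""
    if len(predicted) != len(ground_truth):
        raise ValueError("sequences must cover the same frames")
    if not predicted:
        return 0.0
    agree = sum(p == g for p, g in zip(predicted, ground_truth))
    return agree / len(predicted)

pred = ["speech", "speech", "silence", "speech"]
gold = ["speech", "silence", "silence", "speech"]
print(frame_accuracy(pred, gold))  # → 0.75
```

More pipeline-specific metrics (e.g., diarization error rate or detection IoU) would build on the same compare-against-ground-truth loop.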
Checkpoint: Research examples complete (64 hours total)
Complete JOSS paper (JOSS requirement):
- Abstract & Introduction - Problem statement and contribution
- Implementation Section - Architecture and pipeline descriptions
- Example Usage - Reference to the 4 research examples
- Quality & Performance - Benchmark results and validation
- Acknowledgments & References - Citations and funding
Deliverables: paper/paper.md (JOSS format) + paper/paper.bib
Effort: 40 hours
Scientific documentation for each major pipeline:
- Pipeline Methods - Brief descriptions of:
  - Person tracking (YOLO-based detection + tracking)
  - Face analysis (detection + recognition pipelines)
  - Audio diarization (speaker segmentation)
  - Speech recognition (Whisper-based transcription)
- Model Citations - Proper attribution for all models used
- Limitations - Known issues and edge cases
- Comparison Table - VideoAnnotator vs alternatives (ELAN, BORIS, OpenPose)
Deliverables:
- `docs/methods/` with pipeline documentation
- `CITATION.cff` updated with complete citation info
- `docs/comparison.md` with ecosystem positioning
Effort: 24 hours
Checkpoint: JOSS paper and academic docs complete (64 hours)
Essential documentation for community onboarding (JOSS requirement):
- 5-minute quick start (`docs/quickstart.md`)
  - Installation on Linux/macOS/Windows
  - Test video processing
  - View results
  - Next steps
Effort: 8 hours
- Tutorial 1: Single pipeline (person tracking)
- Tutorial 2: Multi-pipeline workflow (person + audio)
- Tutorial 3: Configuration and customization
Effort: 16 hours (5-6 hours per tutorial)
- Update `CONTRIBUTING.md` with:
  - Development setup walkthrough
  - Adding a new pipeline (basic example)
  - Writing tests
  - Submitting PRs
Effort: 8 hours
Checkpoint: Core user documentation complete (32 hours)
- Package Setup - Prepare for PyPI
- Update `pyproject.toml` for PyPI
- Build wheels for major platforms
- Test installation from test.pypi.org
- Publish to PyPI: `pip install videoannotator`
- Update
Effort: 16 hours
- Test installation on:
  - Ubuntu 22.04, 24.04
  - macOS (Intel and Apple Silicon)
  - Windows 11 (WSL2)
  - Docker (both CPU and GPU images)
Effort: 16 hours
- Run all examples on fresh installs
- Verify benchmark reproducibility
- Test documentation accuracy
- Fix any critical issues found
Effort: 16 hours
Checkpoint: PyPI release tested (48 hours)
- Finalize Paper - Review and polish manuscript
- Submit to JOSS - Follow JOSS submission process
- Address Reviewer Feedback - Respond to initial review
Effort: 24 hours
- Release Notes - Comprehensive changelog since v1.3.0
- Migration Guide - v1.4.1 → v1.4.2 changes
- GitHub Release - Tag v1.4.2 and create release
- Zenodo Archive - Archive the v1.4.2 release on Zenodo and mint the DOI referenced in the paper
Effort: 16 hours
- README Polish - Ensure GitHub README is excellent
- GitHub Discussions - Enable and create initial topics
- Issue Templates - Ensure issue templates are present and current
- Demo Video - 5-minute overview screencast
Effort: 16 hours
Checkpoint: v1.4.2 released, JOSS submitted (56 hours)
- JOSS paper submitted and under review
- 4 complete research workflow examples (reproducible)
- PyPI package published (`pip install videoannotator`)
- Docker images published (CPU + GPU)
- Method documentation for major pipelines
- Benchmark data published
- Quick start + 3 tutorials minimum
- Enhanced contributing guide
- Multi-platform testing passed
- All examples verified on fresh installs
- Example reproducibility: 100% success rate
- Documentation clarity: Clear for JOSS reviewers
- API stability: No breaking changes from v1.3.0
- Installation: Works on Linux, macOS, Windows
- Test coverage: Maintain ≥80% from v1.3.0
- Paper meets all JOSS submission requirements
- Responds to reviewer feedback within 2 weeks
- Paper accepted for publication
All "nice-to-have" features are moved to v1.5.0 to keep v1.4.2 focused on JOSS:
- Model auto-download with progress bars
- Setup wizard and first-run configuration
- Real-time progress indicators and ETA
- Resource usage monitoring
- Job notifications (email, webhook, desktop)
- Interactive config wizard
- Config templates library
- Structured logging (JSON format)
- Log analysis tools
- FiftyOne export integration
- Label Studio import/export
- Custom CSV templates
- Advanced export formats
- Quality assessment pipeline
- Batch processing optimization
- Smart job scheduling
- Pipeline comparison tools
- Parameter optimization framework
- Active learning system
- Multi-modal correlation analysis
- Plugin system architecture
- Real-time streaming
- GraphQL API
- Enterprise features (SSO, RBAC, multi-tenancy)
- Advanced analytics dashboard
- Cloud provider integration
- Queue position display
- Deterministic test fixtures
- Documentation touch-ups
- 4 complete research workflow examples
- Docker images (CPU/GPU)
- Benchmark data and validation scripts
- Complete JOSS paper manuscript
- Method documentation for pipelines
- Comparison with alternatives
- Citations and references
- Quick start guide
- 3 progressive tutorials
- Enhanced contributing guide
- PyPI package preparation and publication
- Multi-platform testing
- Example verification on fresh installs
- Critical bug fixes
- JOSS paper submission
- Release preparation (notes, migration guide)
- Minimal community setup (README, discussions, demo video)
- v1.4.2 release and announcement
Total Effort: ~280 hours (8-10 weeks with focused effort)
| Phase | Duration | Effort (hours) | Key Deliverables |
|---|---|---|---|
| Phase 1: Quick Polish | Week 1 | 16 | v1.3.0 deferred items complete |
| Phase 2: Research Examples | Weeks 2-3 | 64 | 4 examples + Docker + benchmarks |
| Phase 3: JOSS Paper | Weeks 4-5 | 64 | Paper manuscript + methods docs |
| Phase 4: User Docs | Week 6 | 32 | Quick start + 3 tutorials |
| Phase 5: PyPI & Testing | Weeks 7-8 | 48 | PyPI release + testing |
| Phase 6: Launch | Weeks 9-10 | 56 | JOSS submission + release |
| TOTAL | 10 weeks | 280 hours | JOSS-ready release |
v1.4.2 is intentionally streamlined to focus on JOSS acceptance:
Core Goals:
- ✅ Complete, reproducible research examples
- ✅ JOSS paper manuscript and submission
- ✅ Essential documentation (methods, tutorials, quick start)
- ✅ PyPI release for easy installation
- ✅ Multi-platform testing
Deferred to v1.5.0:
- All usability enhancements (progress bars, wizards, notifications)
- All advanced features (quality assessment, batch optimization)
- All alternative integrations (FiftyOne, Label Studio)
- All advanced tooling (log analysis, parameter optimization)
This tight scope ensures we can:
- Complete v1.4.2 in 10 weeks (~280 hours)
- Get JOSS paper submitted and accepted
- Make the JOSS review experience reviewer-friendly
- Build momentum for v1.5.0 feature enhancements
Last Updated: December 18, 2025
Target Release: Q1 2026
Duration: 8-10 weeks
Status: Planning Phase - v1.4.2 JOSS Focused