- Wire a managed S3-compatible bucket into production and disable the legacy upload token path.
- Add an authenticated write flow to the web dashboard for submissions and run management.
- Expand seeded/demo benchmark packs with richer trace visualizations and comparison views.
- Add benchmark submission jobs and asynchronous runner orchestration.
- Add richer observability dashboards covering request latency, leaderboard freshness, and upload failures.
- Publish benchmark result baselines for multiple planner families.
- Multi-project or multi-tenant benchmark hosting.
- Signed artifact URLs and background ingestion pipelines.
- Paper/demo site integration with live experiment galleries.