This Django app orchestrates collector work: it reads a YAML schedule, decides which `manage.py` commands should run for a given trigger (daily, weekly, monthly, interval, or on Boost release), and runs them in-process via Django’s `call_command`. It does not fetch remote data, write to your tracker models, or define collectors; that logic lives in the other apps whose management commands you list in the schedule.
- **Schedule file:** The canonical file is `config/boost_collector_schedule.yaml`. You can point elsewhere with the `BOOST_COLLECTOR_SCHEDULE_YAML` setting (see `schedule_config.py`).
- **Load and validate:** `schedule_config.load_config` reads the YAML and validates structure: each group has a `default_time` (UTC `HH:MM`) and a list of tasks. Each task has at least `command` (a Django management command name) and `schedule` (`daily`, `weekly`, `monthly`, `on_release`, or `interval`), plus optional `args`, `enabled`, and schedule-specific fields (`on` / `day_of_week`, `on` / `day_of_month`, `minutes` for interval).
- **Which tasks run:** `get_tasks_for_schedule` filters tasks to those matching the current invocation: schedule kind, optional group id, weekday for weekly, day for monthly, or interval length. The `default` schedule kind is special: it is used for the group batch (daily + weekly-for-today + monthly-for-today + optional `on_release` in one run); interval tasks are excluded from that batch and are triggered separately.
- **`run_scheduled_collectors`:** The management command calls `get_tasks_for_schedule`, then runs each selected task sequentially with `call_command(command, *args)`. Exit status is 0 only if every command succeeds. With `--stop-on-failure`, the run stops at the first failure, and each collector that was not run is logged at WARNING with the failed predecessor and reason. For `on_release` (and for `default` when release tasks are included), it consults `boost_library_tracker.release_check.has_new_boost_release()`; if there is no new Boost release, `on_release` tasks are skipped.
- **Celery Beat:** `get_beat_schedule` builds `CELERY_BEAT_SCHEDULE` entries: one crontab per group at that group’s `default_time` (running the group batch via kwargs `schedule_kind=default` and `group_id=...`), and separate interval entries per distinct `minutes` value that invoke the same command with `schedule_kind=interval`. With `DEBUG=False` or `BOOST_COLLECTOR_SCHEDULE_STRICT=True`, missing or invalid YAML raises `ScheduleConfigurationError` at Django import time so Beat cannot start with an empty schedule. The Celery entry point is `run_scheduled_collectors_task`, which forwards to `run_scheduled_collectors` with the right CLI flags. At app startup, `BoostCollectorRunnerConfig.ready()` logs a short schedule summary (path, counts, Beat entry keys) or an error, and may set `BOOST_COLLECTOR_SCHEDULE_STARTUP_OK` on `settings` for diagnostics.
- **No app-owned data:** This package has no models; it only wires configuration to management commands.
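To make the `default` group batch concrete, here is a minimal sketch of the selection rule described above. It is not the app's `get_tasks_for_schedule`; the function name `select_default_batch`, the top-level `groups` and `tasks` keys, the `day_of_week` / `day_of_month` field spelling, the `collect_*` command names, and the assumption that `enabled` defaults to true are all hypothetical and chosen only for illustration.

```python
from datetime import date

# Hypothetical parsed schedule. The key layout and command names are assumed,
# not taken from the real YAML file.
SCHEDULE = {
    "groups": {
        "github": {
            "default_time": "02:00",  # UTC HH:MM
            "tasks": [
                {"command": "collect_a", "schedule": "daily"},
                {"command": "collect_b", "schedule": "weekly", "day_of_week": "monday"},
                {"command": "collect_c", "schedule": "monthly", "day_of_month": 1},
                {"command": "collect_d", "schedule": "interval", "minutes": 30},
                {"command": "collect_e", "schedule": "on_release"},
                {"command": "collect_f", "schedule": "daily", "enabled": False},
            ],
        }
    }
}

# date.weekday() returns 0 for Monday through 6 for Sunday.
WEEKDAYS = ["monday", "tuesday", "wednesday", "thursday",
            "friday", "saturday", "sunday"]


def select_default_batch(schedule, group_id, today, new_release):
    """Return the commands in one group's batch for the given day.

    Daily tasks always qualify, weekly and monthly tasks qualify when they
    are due today, on_release tasks qualify only when new_release is True,
    and interval tasks never qualify because they are triggered separately.
    """
    selected = []
    for task in schedule["groups"][group_id]["tasks"]:
        if not task.get("enabled", True):  # assumed default: enabled
            continue
        kind = task["schedule"]
        if kind == "daily":
            due = True
        elif kind == "weekly":
            due = task.get("day_of_week") == WEEKDAYS[today.weekday()]
        elif kind == "monthly":
            due = task.get("day_of_month") == today.day
        elif kind == "on_release":
            due = new_release
        else:  # "interval" is excluded from the group batch
            due = False
        if due:
            selected.append(task["command"])
    return selected
```

In this sketch the release check is passed in as a plain boolean so the selection logic can be exercised without the tracker app; the real command obtains that answer from `has_new_boost_release()`.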
For broader platform context (databases, deployment, other services), see the repo root README and docs/Workflow.md where the schedule is documented.
- Run one schedule group once (smoke test): `python manage.py run_scheduled_collectors --schedule daily --group github` (more examples in the root README).
- Change what runs when: edit the YAML schedule and redeploy; keep `command` values aligned with real commands under each app’s `management/commands/`.
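One way to catch a `command` value that has drifted from the real command names is a small pre-deploy check. The sketch below is illustrative only: `find_unknown_commands` is a hypothetical helper, not part of this app, and it assumes the parsed schedule is a dict with a top-level `groups` mapping whose values hold a `tasks` list.

```python
def find_unknown_commands(schedule, known_commands):
    """Return sorted (group_id, command) pairs not present in known_commands.

    `schedule` is an already-parsed schedule dict (layout assumed above).
    `known_commands` is any container of valid management command names.
    """
    unknown = []
    for group_id, group in schedule.get("groups", {}).items():
        for task in group.get("tasks", []):
            if task["command"] not in known_commands:
                unknown.append((group_id, task["command"]))
    return sorted(unknown)
```

Inside a configured Django project, the set of valid names could be taken from `django.core.management.get_commands()`, which returns a dict keyed by command name. An empty result means every scheduled command resolves to something Django can find.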
Runs tasks from the schedule file for the selected schedule type. Exits with status 0 only when all invoked collectors succeed (see `--stop-on-failure`).

| Option | Description |
|---|---|
| `--schedule` | Required. One of `daily`, `weekly`, `monthly`, `on_release`, `interval`, `default`. `default` runs the group batch (daily + weekly for today + monthly for today + `on_release` when applicable); `default` requires `--group`. |
| `--day-of-week` | For weekly: weekday name (e.g. `monday`). Required when `--schedule weekly`. |
| `--day-of-month` | For monthly: day 1–31. Required when `--schedule monthly`. |
| `--interval-minutes` | For interval: repeat every N minutes (1–180). Required when `--schedule interval`. |
| `--group` | Limit to one YAML group. Required with `--schedule default`. For other schedule kinds, omit to run every group. |
| `--stop-on-failure` | Stop after the first failing collector instead of continuing; log WARNING for each skipped collector with the failed predecessor and reason. |
| `--strict` | Require the schedule YAML to exist and parse before resolving tasks (fails even when `DEBUG` is `True`). |

Run `python manage.py run_scheduled_collectors --help` for the full CLI.
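The exit-status and `--stop-on-failure` behaviour can be pictured with a short sketch. This is not the command's actual code: `run_all` and the injected `run_one` callable are hypothetical, and the non-zero value `1` is an assumption, since the documentation only states that 0 means every collector succeeded.

```python
import logging

logger = logging.getLogger("collector_sketch")


def run_all(commands, run_one, stop_on_failure=False):
    """Run commands in order and return 0 only if every one succeeds.

    `run_one` is any callable that raises on failure. In a Django project it
    could wrap django.core.management.call_command.
    """
    exit_code = 0
    for index, command in enumerate(commands):
        try:
            run_one(command)
        except Exception as exc:
            exit_code = 1  # assumed non-zero value
            logger.error("%s failed: %s", command, exc)
            if stop_on_failure:
                # Report every collector that will not run, naming the
                # failed predecessor and the reason.
                for skipped in commands[index + 1:]:
                    logger.warning(
                        "skipped %s: predecessor %s failed (%s)",
                        skipped, command, exc,
                    )
                break
    return exit_code
```

Without `stop_on_failure`, later collectors still run after a failure and the result is still non-zero; with it, the remaining collectors are only logged as skipped.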
- Django app label: `boost_collector_runner`
- Path (from repo root): `boost_collector_runner/`
- Registration: Listed under `INSTALLED_APPS` in `config/settings.py`.

| Command | Description |
|---|---|
| `run_scheduled_collectors` | Run collectors from the YAML schedule for a given schedule type and optional group. |
From the repo root (after README prerequisites):
`python -m pytest boost_collector_runner/tests/ -v`