Fix: rebuild missing dataset files #1417

Merged: jcpitre merged 2 commits into main from fix-rebuild_missing-dataset-files on Oct 24, 2025
@jcpitre jcpitre commented Oct 24, 2025

Summary:

Added a dataset_id parameter to rebuild_missing_dataset_files to rebuild a specific dataset.
Corrected a problem where the last datasets were not processed if the count was not a multiple of 5.
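
A minimal sketch of the kind of batching that avoids the second problem, not the code from this PR. The function name `publish_in_batches` and the `publish` callback are hypothetical; only the batch size of 5 comes from the summary above. Stepping through the list by slice start means the final, shorter batch is published too:

```python
from typing import Callable, List, Sequence

# Batch size of 5 is taken from the PR summary; all names here are hypothetical.
BATCH_SIZE = 5


def publish_in_batches(
    dataset_ids: Sequence[str],
    publish: Callable[[List[str]], None],
    batch_size: int = BATCH_SIZE,
) -> int:
    """Publish every dataset ID in batches, including a final partial batch.

    Returns the number of dataset IDs handed to `publish`.
    """
    published = 0
    # range() with a step yields the start index of every batch, so a
    # trailing batch with fewer than `batch_size` items is still visited.
    for start in range(0, len(dataset_ids), batch_size):
        batch = list(dataset_ids[start:start + batch_size])
        publish(batch)
        published += len(batch)
    return published
```

With 12 dataset IDs this publishes batches of 5, 5, and 2; with a single ID it publishes one batch of 1, which is also the shape needed when a specific dataset is targeted.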

From our AI friend

This pull request adds support for processing a specific GTFS dataset by its stable ID in the dataset files rebuilding workflow. The changes update both the main logic and documentation, and introduce new tests to verify this functionality.

Feature: Process a specific dataset by ID

  • Added a new dataset_id parameter to the payload for the dataset files rebuild function, allowing the process to target a single dataset and supersede the after_date and latest_only parameters. The documentation in README.md was updated to explain usage and behavior.
  • Updated the handler and main function in rebuild_missing_dataset_files.py to accept and forward the dataset_id parameter, and changed the dataset selection logic to query by stable ID when provided.
  • Modified message publishing logic to handle the case where only one dataset is processed, ensuring correct batching and logging.
  • Updated the result summary to include the dataset_id parameter for clarity.
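
The precedence rule described in the bullets above could look roughly like the sketch below. This is an illustration under assumptions, not the handler from rebuild_missing_dataset_files.py: the `RebuildParams` class, the `parse_payload` function, and the default value for `latest_only` are hypothetical; only the three payload keys and the rule that dataset_id supersedes the other two come from the description.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class RebuildParams:
    """Hypothetical container for the parsed payload."""
    dataset_id: Optional[str]
    after_date: Optional[str]
    latest_only: bool


def parse_payload(payload: Dict[str, Any]) -> RebuildParams:
    """Parse the request payload, letting dataset_id supersede the filters."""
    dataset_id = payload.get("dataset_id")
    if dataset_id:
        # A specific dataset was requested: ignore after_date and latest_only.
        return RebuildParams(dataset_id=dataset_id, after_date=None, latest_only=False)
    return RebuildParams(
        dataset_id=None,
        after_date=payload.get("after_date"),
        # Assumed default; the real default is not stated in this PR.
        latest_only=bool(payload.get("latest_only", True)),
    )
```

Resolving the precedence once, at parse time, keeps the later selection logic down to a single check: query by stable ID if `dataset_id` is set, otherwise apply the date and latest-only filters.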

Testing improvements

  • Added new unit tests in test_rebuild_missing_dataset_files.py to verify that the handler and main function correctly process the dataset_id parameter, including tests for payload forwarding and message publishing behavior when a specific dataset is targeted.
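
A payload-forwarding test of the kind described above might be shaped like this. It is a sketch only: the `handler` shown here is a hypothetical stand-in that takes the main function as an argument, the dataset ID value is made up, and the real tests in test_rebuild_missing_dataset_files.py may be structured differently.

```python
import unittest
from unittest.mock import MagicMock


def handler(payload, main):
    """Hypothetical stand-in for the real handler: forwards payload fields to main."""
    return main(
        dataset_id=payload.get("dataset_id"),
        after_date=payload.get("after_date"),
        latest_only=payload.get("latest_only", True),  # assumed default
    )


class TestDatasetIdForwarding(unittest.TestCase):
    def test_dataset_id_is_forwarded(self):
        # Replace main with a mock so only the forwarding is exercised.
        main = MagicMock(return_value={"dataset_id": "example-dataset-id"})

        result = handler({"dataset_id": "example-dataset-id"}, main)

        main.assert_called_once_with(
            dataset_id="example-dataset-id", after_date=None, latest_only=True
        )
        self.assertEqual(result, {"dataset_id": "example-dataset-id"})

    def test_missing_dataset_id_forwards_none(self):
        main = MagicMock(return_value={})

        handler({"after_date": "2025-01-01"}, main)

        main.assert_called_once_with(
            dataset_id=None, after_date="2025-01-01", latest_only=True
        )
```

Mocking the main function keeps the test focused on what the handler passes along, without touching the database or the message publisher.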

These changes make the dataset files rebuilding workflow more flexible and testable, allowing targeted processing for debugging or special cases.

Please make sure these boxes are checked before submitting your pull request - thanks!

  • Run the unit tests with ./scripts/api-tests.sh to make sure you didn't break anything
  • Add or update any needed documentation to the repo
  • Format the title like "feat: [new feature short description]". The title must follow the Conventional Commits specification (https://www.conventionalcommits.org/en/v1.0.0/).
  • Linked all relevant issues
  • Include screenshot(s) showing how this pull request works and fixes the issue(s)

@jcpitre jcpitre requested a review from cka-y October 24, 2025 13:29
@cka-y cka-y left a comment

LGTM

@jcpitre jcpitre merged commit d040e10 into main Oct 24, 2025
3 checks passed
@jcpitre jcpitre deleted the fix-rebuild_missing-dataset-files branch October 24, 2025 14:24
2 participants