feat: add feed availability task and scheduler#1705

Draft
davidgamez wants to merge 13 commits into main from feed_availability
davidgamez commented May 19, 2026

Summary:

This PR adds the cloud function and scheduler that check and persist GTFS feed availability. It expands the scope of the issue to include a quick ZIP content check. The check was initially intended to use only HTTP HEAD requests, but while testing in DEV I found that some servers (~160) don't support HEAD. As a workaround, the check issues a HEAD request and, if that fails, falls back to a GET request.

From our AI friend

This pull request introduces a new feature for checking the availability of GTFS feeds via HTTP HEAD/GET requests, along with several supporting refactors and documentation updates. The main changes include implementing robust HTTP request logic for feed checks, refactoring SSL context and HTTP header handling, and adding a new task handler and documentation for this feature.

New GTFS Feed Availability Check Feature:

  • Added perform_request and supporting functions to utils.py for checking GTFS feed availability using HTTP HEAD requests, with optional GET fallback and ZIP file detection via magic bytes. This includes robust error handling and content-type inference.
  • Added a new task handler, check_gtfs_feed_availability_handler, and registered it in the main task executor, enabling the new feed availability check to be triggered as a task.
  • Updated the task executor documentation to describe the new check_gtfs_feed_availability task, its parameters, and the expected response format, including verbose error reporting.
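
For readers who want a concrete picture of the HEAD-then-GET fallback and the magic-byte ZIP check described above, here is a minimal sketch. It is not the code in this PR: the names `check_availability`, `fetch`, and `looks_like_zip`, the result keys, and the 10-second timeout are assumptions made for illustration, and it uses only the standard library rather than whatever HTTP client utils.py relies on.

```python
import urllib.error
import urllib.request
from typing import Callable, Tuple

# ZIP signatures: local file header, empty archive, spanned archive.
ZIP_MAGIC = (b"PK\x03\x04", b"PK\x05\x06", b"PK\x07\x08")

# (method, url) -> (status code, first bytes of the body)
Fetch = Callable[[str, str], Tuple[int, bytes]]


def looks_like_zip(first_bytes: bytes) -> bool:
    """Return True if the payload starts with a known ZIP signature."""
    return first_bytes.startswith(ZIP_MAGIC)


def fetch(method: str, url: str, timeout: float = 10.0) -> Tuple[int, bytes]:
    """Issue one request; for GET, read only the first 4 body bytes."""
    request = urllib.request.Request(url, method=method)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = b"" if method == "HEAD" else response.read(4)
        return response.status, body


def check_availability(url: str, fetch_fn: Fetch = fetch) -> dict:
    """Try HEAD first; if it fails or is rejected, fall back to GET."""
    try:
        status, _ = fetch_fn("HEAD", url)
        if 200 <= status < 300:
            # HEAD carries no body, so the ZIP check cannot run here.
            return {"available": True, "method": "HEAD", "is_zip": None}
    except (urllib.error.URLError, OSError):
        pass  # Some servers do not support HEAD; fall through to GET.

    try:
        status, first_bytes = fetch_fn("GET", url)
    except (urllib.error.URLError, OSError) as exc:
        return {"available": False, "method": "GET", "is_zip": None, "error": str(exc)}

    ok = 200 <= status < 300
    return {
        "available": ok,
        "method": "GET",
        "is_zip": looks_like_zip(first_bytes) if ok else None,
    }
```

Passing the fetch function as a parameter keeps the fallback logic testable without network access; a real handler would also need to pass per-feed headers and an SSL context.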

Refactoring and Helper Improvements:

  • Refactored SSL context creation for HTTP requests into a reusable create_feed_ssl_context function, with improved handling for legacy servers and optional disabling of certificate checks for problematic feeds.
  • Extracted and improved HTTP header and authentication logic into build_feed_request_params, supporting per-feed header overrides and multiple authentication schemes.
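
The two helpers above could look roughly like the sketch below. This is an illustration under stated assumptions, not the PR's implementation: `make_ssl_context` and `build_request_params` are hypothetical stand-ins for create_feed_ssl_context and build_feed_request_params, and the `"query"` / `"header"` authentication labels, the default User-Agent, and the parameter names are invented for the example.

```python
import ssl
from typing import Dict, Optional, Tuple
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical default headers; the real defaults are not shown in the PR text.
DEFAULT_HEADERS = {"User-Agent": "feed-availability-check/0.1 (illustrative)"}


def make_ssl_context(verify: bool = True) -> ssl.SSLContext:
    """Build a context that tolerates legacy servers, optionally unverified."""
    context = ssl.create_default_context()
    # Allow servers without secure renegotiation. The named constant exists
    # in Python 3.12+; 0x4 is the underlying OpenSSL option value.
    context.options |= getattr(ssl, "OP_LEGACY_SERVER_CONNECT", 0x4)
    if not verify:
        # check_hostname must be cleared before verify_mode can be CERT_NONE.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    return context


def build_request_params(
    url: str,
    auth_type: str = "none",  # assumed labels: "none", "query", "header"
    auth_name: Optional[str] = None,
    credential: Optional[str] = None,
    header_overrides: Optional[Dict[str, str]] = None,
) -> Tuple[str, Dict[str, str]]:
    """Return the final URL and headers for one feed request."""
    headers = dict(DEFAULT_HEADERS)
    headers.update(header_overrides or {})  # per-feed overrides win

    if auth_type == "query" and auth_name and credential:
        # Append the credential as a query parameter, keeping existing ones.
        parts = urlsplit(url)
        query = parse_qsl(parts.query, keep_blank_values=True)
        query.append((auth_name, credential))
        url = urlunsplit(parts._replace(query=urlencode(query)))
    elif auth_type == "header" and auth_name and credential:
        headers[auth_name] = credential

    return url, headers
```

Disabling certificate verification removes protection against man-in-the-middle attacks, so in a sketch like this it would be enabled only for specific feeds known to have broken certificates.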

Dependency and Import Updates:

  • Added imports for time, urllib3.exceptions, and timezone to support the new HTTP request and datetime logic.

Expected behavior:

The GTFS availability is persisted in the DB.

Testing tips:

Internal team: Can be tested via retool in the dev environment

Please make sure these boxes are checked before submitting your pull request - thanks!

  • Run the unit tests with ./scripts/api-tests.sh to make sure you didn't break anything
  • Add or update any needed documentation to the repo
  • Format the title like "feat: [new feature short description]". Title must follow the Conventional Commit Specification (https://www.conventionalcommits.org/en/v1.0.0/).
  • Linked all relevant issues
  • Include screenshot(s) showing how this pull request works and fixes the issue(s)

@davidgamez davidgamez linked an issue May 21, 2026 that may be closed by this pull request

Development

Successfully merging this pull request may close these issues.

Create scheduled Cloud Function to check GTFS feed availability
