Migrate versioning workflow from PAT to GitHub App token#640
Conversation
Fixes #638. The versioning workflow used a PAT (POLICYENGINE_GITHUB) to push the "Update package version" commit, which broke when the token expired. Switch to a GitHub App token via actions/create-github-app-token@v1, matching the pattern used in policyengine-api-v2-alpha. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Claude aggressively posted this comment on my behalf:
One thing I initially flagged: the `EndBug/add-and-commit@v9` step doesn't explicitly receive the app token — so I worried it might fall back to the default `GITHUB_TOKEN` for the push, which wouldn't trigger downstream workflows (like PyPI publish via `code_changes.yaml`).
After digging in, this is not a concern. Here's why:
- `actions/checkout@v4` with `token: <app-token>` persists that token into the local git config via an `http.extraheader` credential (this is the default `persist-credentials: true` behavior).
- `EndBug/add-and-commit@v9` just shells out to `git push` — it uses whatever credentials the local repo already has configured. Its `github_token` input is only used for GitHub API calls (fetching user info for commit metadata), not for git operations.
- Their README confirms: *"When pushing, the action uses the token that the local git repository has been configured with."*
So the version-bump commit will be pushed with the GitHub App token, which will correctly trigger subsequent workflow runs. LGTM.
So an LGTM is coming and I'll try to get this in before I go to sleep tonight.
baogorek
left a comment
There was a problem hiding this comment.
Left a note about Claude's anxiousness about a token carrying over, but an LGTM from the human.
|
There are other workflow files like reusable_tests.yaml that use similar tokens like (Eg, the test suite was also responsible for publishing datasets, but now the pipeline run also does that, we might want to decide on a single responsible step -- we are overwriting them anyway otherwise). What do you think @baogorek @anth-volk ? |
|
@juaristi22 This was just a minimal fix, as I figured PyPI publishing might be important to have working again. I don't think the current CI/CD pipeline is sufficient. I could transform this PR into that, or do separately, but would be happy to propose a solution unless anyone else is super eager to do the plumbing. |
Split tests into unit/ (self-contained, synthetic data, mocks) and integration/ (requires built H5 datasets). Unit sub-folders use no test_ prefix (datasets/, calibration/) to avoid confusion with integration tests. - Move 30+ unit test files to tests/unit/ with calibration/ and datasets/ sub-directories - Move 11 integration test files to tests/integration/ - Merge test_dataset_sanity.py into per-dataset integration files (test_cps.py, test_enhanced_cps.py, test_sparse_enhanced_cps.py) - Rename calibration integration tests to match dataset names (test_source_imputed_cps_masking.py, _consistency.py) - Move top-level tests/ files into appropriate unit/ or integration/ - Add integration conftest.py with skip logic and shared fixtures - Update pyproject.toml testpaths and add pytest-cov dependency - Update modal_app/data_build.py TEST_MODULES - Add make test-unit and make test-integration targets - Fix Path(__file__) references in moved test files Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Delete 7 deprecated/redundant workflow files and replace with 3 consolidated workflows matching the policyengine-api-v2-alpha pattern. pr.yaml: fork check, lint, uv.lock, changelog, unit tests with Codecov (informational), smoke test. Runs in ~2-3 minutes. push.yaml: two paths — version bump commits publish to PyPI; all other commits run per-dataset Modal builds with integration tests after each stage, then manual approval gate, then pipeline dispatch. pipeline.yaml: remove push trigger (now dispatched from push.yaml), add scope input (all/national/state/congressional/local/test), add summary with Modal dashboard link. Also adds: - .codecov.yml with informational-only coverage reporting - --script mode to data_build.py for per-dataset Modal execution - SCRIPT_SHORT_NAMES mapping for human-friendly script names - run_single_script() Modal function for single-dataset builds Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
New queue-based H5 build system that replaces the partition-based N-worker model with one-container-per-item processing. - generate_work_items(scope, db_path): auto-generates work item lists filtered by scope (all/national/state/congressional/local/test). Test scope builds national + NY + NV-01 only. - build_single_area(): Modal function (1 CPU, 16GB) that processes exactly one work item per container via worker_script.py - queue_coordinator(): spawns up to 50 single-item workers, collects results. No multi-threading, no chunking. - main_queue entrypoint for CLI access - Wire scope parameter from pipeline.yaml through run_pipeline() - Fall back to legacy coordinate_publish for scope=all Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add testing standards: unit vs integration placement rules, per-dataset naming convention, make test-unit/test-integration commands. Add CI/CD overview documenting the four workflow files and their triggers. Update Python version from 3.11 to 3.12-3.13. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
Closing in favor of a new PR that supersedes this with a broader CI/CD restructuring (test reorganization, workflow consolidation, queue-based Modal architecture). The PAT→App token fix from this PR is included in the new one. |
Fixes #638
Summary
POLICYENGINE_GITHUB) with a GitHub App token in the versioning workflowactions/create-github-app-token@v1withAPP_IDandAPP_PRIVATE_KEYsecrets (already configured in the repo)policyengine-api-v2-alphaTest plan
code_changes.yamland the PyPI publish step runs🤖 Generated with Claude Code