Migrate versioning workflow from PAT to GitHub App token by anth-volk · Pull Request #640 · PolicyEngine/policyengine-us-data

anth-volk · 2026-03-26T22:41:06Z

Fixes #638

Summary

Replaced expired PAT (POLICYENGINE_GITHUB) with a GitHub App token in the versioning workflow
Uses actions/create-github-app-token@v1 with APP_ID and APP_PRIVATE_KEY secrets (already configured in the repo)
Matches the pattern used in policyengine-api-v2-alpha

Test plan

Merge a PR with a changelog fragment and verify the versioning workflow successfully pushes the "Update package version" commit
Verify that commit triggers code_changes.yaml and the PyPI publish step runs

🤖 Generated with Claude Code

Fixes #638. The versioning workflow used a PAT (POLICYENGINE_GITHUB) to push the "Update package version" commit, which broke when the token expired. Switch to a GitHub App token via actions/create-github-app-token@v1, matching the pattern used in policyengine-api-v2-alpha. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

baogorek

Claude aggressively posted this comment on my behalf:

One thing I initially flagged: the `EndBug/add-and-commit@v9` step doesn't explicitly receive the app token — so I worried it might fall back to the default `GITHUB_TOKEN` for the push, which wouldn't trigger downstream workflows (like PyPI publish via `code_changes.yaml`).

After digging in, this is not a concern. Here's why:

- `actions/checkout@v4` with `token: <app-token>` persists that token into the local git config via an `http.extraheader` credential (this is the default `persist-credentials: true` behavior).
- `EndBug/add-and-commit@v9` just shells out to `git push` — it uses whatever credentials the local repo already has configured. Its `github_token` input is only used for GitHub API calls (fetching user info for commit metadata), not for git operations.
- Their README confirms: *"When pushing, the action uses the token that the local git repository has been configured with."*

So the version-bump commit will be pushed with the GitHub App token, which will correctly trigger subsequent workflow runs. LGTM.

So an LGTM is coming and I'll try to get this in before I go to sleep tonight.

baogorek

Left a note about Claude's anxiousness about a token carrying over, but an LGTM from the human.

juaristi22 · 2026-03-27T06:57:38Z

There are other workflow files like reusable_tests.yaml that use similar tokens like secrets.POLICYENGINE_US_DATA_GITHUB_TOKEN or code_changes.yaml (which also seems responsible for publishing), that might also be affected by this. I'm not sure whether we want to trigger the entire test suite again on main (considering costs) now that we have the constraint that PRs need to be approved before merging. It might be worth reviewing the entire workflow now that running the pipeline has come in?

(Eg, the test suite was also responsible for publishing datasets, but now the pipeline run also does that, we might want to decide on a single responsible step -- we are overwriting them anyway otherwise). What do you think @baogorek @anth-volk ?

anth-volk · 2026-03-27T14:17:36Z

@juaristi22 This was just a minimal fix, as I figured PyPI publishing might be important to have working again. I don't think the current CI/CD pipeline is sufficient. I could transform this PR into that, or do separately, but would be happy to propose a solution unless anyone else is super eager to do the plumbing.

Split tests into unit/ (self-contained, synthetic data, mocks) and integration/ (requires built H5 datasets). Unit sub-folders use no test_ prefix (datasets/, calibration/) to avoid confusion with integration tests. - Move 30+ unit test files to tests/unit/ with calibration/ and datasets/ sub-directories - Move 11 integration test files to tests/integration/ - Merge test_dataset_sanity.py into per-dataset integration files (test_cps.py, test_enhanced_cps.py, test_sparse_enhanced_cps.py) - Rename calibration integration tests to match dataset names (test_source_imputed_cps_masking.py, _consistency.py) - Move top-level tests/ files into appropriate unit/ or integration/ - Add integration conftest.py with skip logic and shared fixtures - Update pyproject.toml testpaths and add pytest-cov dependency - Update modal_app/data_build.py TEST_MODULES - Add make test-unit and make test-integration targets - Fix Path(__file__) references in moved test files Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Delete 7 deprecated/redundant workflow files and replace with 3 consolidated workflows matching the policyengine-api-v2-alpha pattern. pr.yaml: fork check, lint, uv.lock, changelog, unit tests with Codecov (informational), smoke test. Runs in ~2-3 minutes. push.yaml: two paths — version bump commits publish to PyPI; all other commits run per-dataset Modal builds with integration tests after each stage, then manual approval gate, then pipeline dispatch. pipeline.yaml: remove push trigger (now dispatched from push.yaml), add scope input (all/national/state/congressional/local/test), add summary with Modal dashboard link. Also adds: - .codecov.yml with informational-only coverage reporting - --script mode to data_build.py for per-dataset Modal execution - SCRIPT_SHORT_NAMES mapping for human-friendly script names - run_single_script() Modal function for single-dataset builds Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

New queue-based H5 build system that replaces the partition-based N-worker model with one-container-per-item processing. - generate_work_items(scope, db_path): auto-generates work item lists filtered by scope (all/national/state/congressional/local/test). Test scope builds national + NY + NV-01 only. - build_single_area(): Modal function (1 CPU, 16GB) that processes exactly one work item per container via worker_script.py - queue_coordinator(): spawns up to 50 single-item workers, collects results. No multi-threading, no chunking. - main_queue entrypoint for CLI access - Wire scope parameter from pipeline.yaml through run_pipeline() - Fall back to legacy coordinate_publish for scope=all Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Add testing standards: unit vs integration placement rules, per-dataset naming convention, make test-unit/test-integration commands. Add CI/CD overview documenting the four workflow files and their triggers. Update Python version from 3.11 to 3.12-3.13. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

anth-volk · 2026-03-30T19:59:03Z

Closing in favor of a new PR that supersedes this with a broader CI/CD restructuring (test reorganization, workflow consolidation, queue-based Modal architecture). The PAT→App token fix from this PR is included in the new one.

anth-volk marked this pull request as ready for review March 26, 2026 22:47

anth-volk requested a review from juaristi22 March 26, 2026 22:47

baogorek reviewed Mar 27, 2026

View reviewed changes

baogorek approved these changes Mar 27, 2026

View reviewed changes

anth-volk marked this pull request as draft March 27, 2026 16:38

anth-volk and others added 5 commits March 30, 2026 20:30

Update changelog fragment to cover full restructuring scope

48abbd4

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

anth-volk closed this Mar 30, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Migrate versioning workflow from PAT to GitHub App token#640

Migrate versioning workflow from PAT to GitHub App token#640
anth-volk wants to merge 6 commits into
mainfrom
fix/fix-us-data-pypi

anth-volk commented Mar 26, 2026

Uh oh!

baogorek left a comment •

edited

Loading

Uh oh!

baogorek left a comment

Uh oh!

juaristi22 commented Mar 27, 2026

Uh oh!

anth-volk commented Mar 27, 2026

Uh oh!

anth-volk commented Mar 30, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

anth-volk commented Mar 26, 2026

Summary

Test plan

Uh oh!

baogorek left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

baogorek left a comment

Choose a reason for hiding this comment

Uh oh!

juaristi22 commented Mar 27, 2026

Uh oh!

anth-volk commented Mar 27, 2026

Uh oh!

anth-volk commented Mar 30, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

baogorek left a comment •

edited

Loading