Commit 9b7c18c

baogorek and claude committed
Address PR #681 review feedback

- Replace label-based integration gating with path-based triggers (policyengine_us_data/, modal_app/, tests/integration/)
- Add concurrency group to cancel in-flight PR CI on new pushes
- Add docs build job to PR checks and docs deploy to push workflow
- Remove unused scope input from pipeline.yaml, surface fc.object_id in step summary
- Restore local_area_promote.yaml and local_area_publish.yaml
- Rename integration tests to descriptive names (test_no_formula_variables_stored, test_build_matrix_masking, test_xw_consistency)
- Move test_stacked_dataset_builder to integration/test_build_h5 (uses Microsimulation + H5 fixture)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent 084cb0f commit 9b7c18c

11 files changed

Lines changed: 258 additions & 36 deletions
.github/workflows/local_area_promote.yaml

Lines changed: 43 additions & 0 deletions
@@ -0,0 +1,43 @@
+name: Promote Local Area H5 Files
+
+on:
+  workflow_dispatch:
+    inputs:
+      version:
+        description: 'Version to promote (e.g. 1.23.0)'
+        required: true
+        type: string
+      branch:
+        description: 'Branch to use for repo setup'
+        required: false
+        default: 'main'
+        type: string
+
+jobs:
+  promote-local-area:
+    runs-on: ubuntu-latest
+    permissions:
+      contents: read
+    env:
+      HUGGING_FACE_TOKEN: ${{ secrets.HUGGING_FACE_TOKEN }}
+      MODAL_TOKEN_ID: ${{ secrets.MODAL_TOKEN_ID }}
+      MODAL_TOKEN_SECRET: ${{ secrets.MODAL_TOKEN_SECRET }}
+
+    steps:
+      - name: Checkout repo
+        uses: actions/checkout@v4
+
+      - name: Set up Python
+        uses: actions/setup-python@v5
+        with:
+          python-version: '3.14'
+
+      - name: Install Modal CLI
+        run: pip install modal
+
+      - name: Promote staged files to production
+        run: |
+          VERSION="${{ github.event.inputs.version }}"
+          BRANCH="${{ github.event.inputs.branch }}"
+          echo "Promoting version ${VERSION} from branch ${BRANCH}"
+          modal run modal_app/local_area.py::main_promote --version="${VERSION}" --branch="${BRANCH}"

.github/workflows/local_area_publish.yaml

Lines changed: 131 additions & 0 deletions
@@ -0,0 +1,131 @@
+name: Publish Local Area H5 Files
+
+on:
+  # TEMPORARILY DISABLED - re-enable push/repository_dispatch triggers when ready
+  # push:
+  # branches: [main]
+  # paths:
+  # - 'policyengine_us_data/calibration/**'
+  # - '.github/workflows/local_area_publish.yaml'
+  # - 'modal_app/**'
+  # repository_dispatch:
+  # types: [calibration-updated]
+  workflow_dispatch:
+    inputs:
+      num_workers:
+        description: 'Number of parallel workers'
+        required: false
+        default: '8'
+        type: string
+      skip_upload:
+        description: 'Skip upload (build only)'
+        required: false
+        default: false
+        type: boolean
+
+# Trigger strategy:
+# 1. Automatic: Code changes to calibration/ pushed to main
+# 2. repository_dispatch: Calibration workflow triggers after uploading new weights
+# 3. workflow_dispatch: Manual trigger with optional parameters
+
+jobs:
+  publish-local-area:
+    runs-on: ubuntu-latest
+    permissions:
+      contents: read
+    env:
+      HUGGING_FACE_TOKEN: ${{ secrets.HUGGING_FACE_TOKEN }}
+      MODAL_TOKEN_ID: ${{ secrets.MODAL_TOKEN_ID }}
+      MODAL_TOKEN_SECRET: ${{ secrets.MODAL_TOKEN_SECRET }}
+
+    steps:
+      - name: Checkout repo
+        uses: actions/checkout@v4
+
+      - name: Set up Python
+        uses: actions/setup-python@v5
+        with:
+          python-version: '3.14'
+
+      - name: Install Modal CLI
+        run: pip install modal
+
+      - name: Run local area build and stage on Modal
+        run: |
+          NUM_WORKERS="${{ github.event.inputs.num_workers || '8' }}"
+          SKIP_UPLOAD="${{ github.event.inputs.skip_upload || 'false' }}"
+          BRANCH="${{ github.head_ref || github.ref_name }}"
+
+          CMD="modal run modal_app/local_area.py::main --branch=${BRANCH} --num-workers=${NUM_WORKERS}"
+
+          if [ "$SKIP_UPLOAD" = "true" ]; then
+            CMD="${CMD} --skip-upload"
+          fi
+
+          echo "Running: $CMD"
+          $CMD
+
+      - name: Post-build summary
+        if: success()
+        run: |
+          echo "## Build + Stage Complete" >> $GITHUB_STEP_SUMMARY
+          echo "" >> $GITHUB_STEP_SUMMARY
+          echo "Files have been uploaded to GCS and staged on HuggingFace." >> $GITHUB_STEP_SUMMARY
+          echo "" >> $GITHUB_STEP_SUMMARY
+          echo "### Next step: Validation runs automatically" >> $GITHUB_STEP_SUMMARY
+          echo "The validate-staging job will now check all staged H5s." >> $GITHUB_STEP_SUMMARY
+
+  validate-staging:
+    needs: publish-local-area
+    runs-on: ubuntu-latest
+    env:
+      HUGGING_FACE_TOKEN: ${{ secrets.HUGGING_FACE_TOKEN }}
+    steps:
+      - name: Checkout repo
+        uses: actions/checkout@v4
+
+      - name: Set up Python
+        uses: actions/setup-python@v5
+        with:
+          python-version: '3.14'
+
+      - name: Set up uv
+        uses: astral-sh/setup-uv@v5
+
+      - name: Install dependencies
+        run: uv sync
+
+      - name: Validate staged H5s
+        run: |
+          uv run python -m policyengine_us_data.calibration.validate_staging \
+            --area-type states --output validation_results.csv
+
+      - name: Upload validation results to HF
+        run: |
+          uv run python -c "
+          from policyengine_us_data.utils.huggingface import upload
+          upload('validation_results.csv',
+                 'policyengine/policyengine-us-data',
+                 'calibration/logs/validation_results.csv')
+          "
+
+      - name: Post validation summary
+        if: always()
+        run: |
+          echo "## Validation Results" >> $GITHUB_STEP_SUMMARY
+          if [ -f validation_results.csv ]; then
+            TOTAL=$(tail -n +2 validation_results.csv | wc -l)
+            FAILS=$(grep -c ',FAIL,' validation_results.csv || true)
+            echo "- **${TOTAL}** targets validated" >> $GITHUB_STEP_SUMMARY
+            echo "- **${FAILS}** sanity failures" >> $GITHUB_STEP_SUMMARY
+            echo "" >> $GITHUB_STEP_SUMMARY
+            echo "Review in dashboard, then trigger **Promote** workflow." >> $GITHUB_STEP_SUMMARY
+          else
+            echo "Validation did not produce output." >> $GITHUB_STEP_SUMMARY
+          fi
+
+      - name: Upload validation artifact
+        uses: actions/upload-artifact@v4
+        with:
+          name: validation-results
+          path: validation_results.csv
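
The "Post validation summary" step in the workflow above tallies results with `tail`, `wc` and `grep`. As a rough illustration of that counting logic, here is a minimal Python sketch. The function name `summarize_validation`, the column names, and the sample rows are hypothetical; the real layout of `validation_results.csv` is not shown in this commit.

```python
def summarize_validation(csv_text: str) -> tuple[int, int]:
    """Return (data rows, lines containing ',FAIL,') for a CSV given as text."""
    lines = csv_text.splitlines()
    # Mirrors `tail -n +2 ... | wc -l`: every line after the header row.
    total = len(lines[1:])
    # Mirrors `grep -c ',FAIL,'`: counts matching lines across the whole file.
    fails = sum(1 for line in lines if ",FAIL," in line)
    return total, fails


# Hypothetical sample data; the column names are assumptions for illustration.
sample = "target,status,value\nAL_population,PASS,1.0\nAK_population,FAIL,0.2\n"
total, fails = summarize_validation(sample)
print(f"- **{total}** targets validated")
print(f"- **{fails}** sanity failures")
```

Like the shell version, this matches the literal substring `,FAIL,`, so a status value in the first or last column would not be counted.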

.github/workflows/pipeline.yaml

Lines changed: 10 additions & 27 deletions
@@ -3,17 +3,6 @@ name: Run Pipeline
 on:
   workflow_dispatch:
     inputs:
-      scope:
-        description: "Dataset scope to build"
-        default: "all"
-        type: choice
-        options:
-          - all
-          - national
-          - state
-          - congressional
-          - local
-          - test
       gpu:
         description: "GPU type for regional calibration"
         default: "T4"
@@ -56,7 +45,6 @@ jobs:
         run: |
           modal deploy modal_app/pipeline.py

-          SCOPE="${{ inputs.scope || 'all' }}"
           GPU="${{ inputs.gpu || 'T4' }}"
           EPOCHS="${{ inputs.epochs || '1000' }}"
           NATIONAL_EPOCHS="${{ inputs.national_epochs || '4000' }}"
@@ -74,20 +62,15 @@ jobs:
               num_workers=int('${NUM_WORKERS}'),
               skip_national='${SKIP_NATIONAL}' == 'true',
           )
-          print(f'Pipeline spawned with scope=${SCOPE}.')
+          print(f'Pipeline spawned.')
           print(f'Function call ID: {fc.object_id}')
-          "
-
-      - name: Write summary
-        run: |
-          cat >> $GITHUB_STEP_SUMMARY << 'EOF'
-          ## Pipeline Launched

-          | Field | Value |
-          |-------|-------|
-          | Scope | `${{ inputs.scope || 'all' }}` |
-          | GPU | `${{ inputs.gpu || 'T4' }}` |
-          | Epochs | `${{ inputs.epochs || '1000' }}` / `${{ inputs.national_epochs || '4000' }}` |
-
-          **[Monitor on Modal Dashboard](https://modal.com/apps)**
-          EOF
+          with open('$GITHUB_STEP_SUMMARY', 'a') as f:
+              f.write('## Pipeline Launched\n\n')
+              f.write('| Field | Value |\n')
+              f.write('|-------|-------|\n')
+              f.write(f'| GPU | \`${GPU}\` |\n')
+              f.write(f'| Epochs | \`${EPOCHS}\` / \`${NATIONAL_EPOCHS}\` |\n')
+              f.write(f'| Function call ID | \`{fc.object_id}\` |\n\n')
+              f.write('**[Monitor on Modal Dashboard](https://modal.com/apps)**\n')
+          "
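
The hunk above moves the step summary from a shell heredoc into the inline Python, so the Modal function call ID can appear in the table. A standalone sketch of that summary writer might look like the following. The helper name `write_summary` and the placeholder ID `fc-123` are hypothetical; in the workflow the values come from shell variables and `fc.object_id`.

```python
import os
import tempfile


def write_summary(path, gpu, epochs, national_epochs, call_id):
    """Append a 'Pipeline Launched' Markdown table to the file at `path`."""
    rows = [
        ("GPU", f"`{gpu}`"),
        ("Epochs", f"`{epochs}` / `{national_epochs}`"),
        ("Function call ID", f"`{call_id}`"),
    ]
    # Append mode, since other steps may already have written to the summary.
    with open(path, "a") as f:
        f.write("## Pipeline Launched\n\n")
        f.write("| Field | Value |\n")
        f.write("|-------|-------|\n")
        for field, value in rows:
            f.write(f"| {field} | {value} |\n")
        f.write("\n**[Monitor on Modal Dashboard](https://modal.com/apps)**\n")


# Outside GitHub Actions GITHUB_STEP_SUMMARY is unset, so fall back to a temp file.
summary_path = os.environ.get("GITHUB_STEP_SUMMARY") or os.path.join(
    tempfile.mkdtemp(), "summary.md"
)
write_summary(summary_path, "T4", "1000", "4000", "fc-123")  # "fc-123" is a placeholder
```

Writing the table from the same process that spawns the pipeline avoids passing the call ID between steps.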

.github/workflows/pr.yaml

Lines changed: 41 additions & 4 deletions
@@ -4,6 +4,10 @@ on:
   pull_request:
     branches: [main]

+concurrency:
+  group: pr-checks-${{ github.event.pull_request.number }}
+  cancel-in-progress: true
+
 jobs:
   check-fork:
     runs-on: ubuntu-latest
@@ -92,12 +96,45 @@ jobs:
       - run: python -c "import policyengine_us_data; print('OK')"
       - run: python -c "from policyengine_core.data import Dataset; print('OK')"

+  docs-build:
+    runs-on: ubuntu-latest
+    needs: [check-fork]
+    steps:
+      - uses: actions/checkout@v4
+      - uses: actions/setup-python@v5
+        with:
+          python-version: "3.14"
+      - uses: actions/setup-node@v4
+        with:
+          node-version: "24"
+      - uses: astral-sh/setup-uv@v5
+      - run: uv sync --dev
+      - name: Test documentation builds
+        run: uv run make documentation
+
+  decide-test-scope:
+    runs-on: ubuntu-latest
+    needs: check-fork
+    outputs:
+      run_integration: ${{ steps.check.outputs.run_integration }}
+    steps:
+      - uses: actions/checkout@v4
+        with:
+          fetch-depth: 0
+      - name: Check changed files for integration scope
+        id: check
+        run: |
+          CHANGED=$(git diff --name-only origin/main...HEAD)
+          if echo "$CHANGED" | grep -qE '^(policyengine_us_data/|modal_app/|tests/integration/)'; then
+            echo "run_integration=true" >> "$GITHUB_OUTPUT"
+          else
+            echo "run_integration=false" >> "$GITHUB_OUTPUT"
+          fi
+
   integration-tests:
     runs-on: ubuntu-latest
-    needs: [check-fork, lint]
-    if: >-
-      contains(github.event.pull_request.labels.*.name, 'run-integration') ||
-      github.event.pull_request.head.ref == 'main'
+    needs: [check-fork, lint, decide-test-scope]
+    if: needs.decide-test-scope.outputs.run_integration == 'true'
     env:
       MODAL_TOKEN_ID: ${{ secrets.MODAL_TOKEN_ID }}
       MODAL_TOKEN_SECRET: ${{ secrets.MODAL_TOKEN_SECRET }}
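
The `decide-test-scope` job above gates integration tests on whether any changed path starts with one of three directory prefixes. A small Python sketch of the same decision can help when checking which file lists would trigger the job. The function name `should_run_integration` is hypothetical; the pattern is copied from the `grep -qE` expression in the workflow.

```python
import re

# Same anchored pattern the workflow passes to `grep -qE`.
INTEGRATION_PATHS = re.compile(
    r"^(policyengine_us_data/|modal_app/|tests/integration/)"
)


def should_run_integration(changed_files):
    """True if any changed path starts with one of the gated prefixes."""
    return any(INTEGRATION_PATHS.match(path) for path in changed_files)


print(should_run_integration(["modal_app/pipeline.py", "README.md"]))
print(should_run_integration(["docs/index.md", "tests/unit/test_utils.py"]))
```

Because the pattern is anchored at the start of each path, a file such as `scripts/modal_app/helper.py` would not trigger integration tests.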

.github/workflows/push.yaml

Lines changed: 27 additions & 2 deletions
@@ -57,9 +57,34 @@ jobs:
               repo: context.repo.repo,
               workflow_id: 'pipeline.yaml',
               ref: 'main',
-              inputs: { scope: 'all' }
             })
-            console.log('Pipeline dispatched with scope=all')
+            console.log('Pipeline dispatched')
+
+  # ── Documentation ──────────────────────────────────────────
+  docs:
+    runs-on: ubuntu-latest
+    permissions:
+      contents: write
+    steps:
+      - uses: actions/checkout@v4
+      - uses: actions/setup-python@v5
+        with:
+          python-version: "3.14"
+      - uses: actions/setup-node@v4
+        with:
+          node-version: "24"
+      - uses: astral-sh/setup-uv@v5
+      - run: uv sync --dev
+      - name: Build documentation
+        run: uv run make documentation
+        env:
+          BASE_URL: /policyengine-us-data
+      - name: Deploy to GitHub Pages
+        uses: JamesIves/github-pages-deploy-action@v4
+        with:
+          branch: gh-pages
+          folder: docs/_build/html
+          clean: true

   # ── PyPI publish (version bump commits only) ────────────────
   publish:

CLAUDE.md

Lines changed: 6 additions & 3 deletions
@@ -45,14 +45,17 @@ Tests are in the top-level `tests/` directory, split into two sub-directories:
 - **Python Version**: Targeting Python 3.12-3.14

 ## CI/CD Structure
-Four workflow files in `.github/workflows/`:
+Six workflow files in `.github/workflows/`:

-- **`pr.yaml`** — Runs on every PR to main: fork check, lint, uv.lock freshness, changelog fragment, unit tests with Codecov, smoke test. Integration tests run only with `run-integration` label. ~2-3 minutes.
+- **`pr.yaml`** — Runs on every PR to main: fork check, lint, uv.lock freshness, changelog fragment, unit tests with Codecov, smoke test, and docs build. Integration tests trigger automatically when the PR changes files in `policyengine_us_data/`, `modal_app/`, or `tests/integration/`. ~2-3 minutes for unit tests.
 - **`push.yaml`** — Runs on push to main. Two paths:
   - Version bump commits (`Update package version`): build and publish to PyPI
   - All other commits: full Modal data build with integration tests → manual approval gate → pipeline dispatch
-- **`pipeline.yaml`** — Dispatch only. Spawns the H5 generation pipeline on Modal with scope filtering (all/national/state/congressional/local/test).
+  - Docs build and deploy to gh-pages runs unconditionally on every push.
+- **`pipeline.yaml`** — Dispatch only. Spawns the H5 generation pipeline on Modal with configurable GPU, epochs, and worker count.
 - **`versioning.yaml`** — Auto-bumps version when changelog.d fragments are merged. Commits `Update package version` which triggers the publish path in push.yaml.
+- **`local_area_publish.yaml`** — Manual dispatch. Builds and stages local area H5 files on Modal, then validates staged files.
+- **`local_area_promote.yaml`** — Manual dispatch. Promotes staged local area H5 files to production.

 ## Git and PR Guidelines
 - **CRITICAL**: NEVER create PRs from personal forks - ALL PRs MUST be created from branches pushed to the upstream PolicyEngine repository
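
The `push.yaml` description in the CI/CD Structure section above amounts to a small routing rule keyed on the commit message, with docs deployed on every push. The sketch below models that rule in Python purely for illustration. The function name `jobs_for_push`, the job labels, and the choice to match the marker against the first line of the message are assumptions; the actual condition in `push.yaml` is not shown in this commit.

```python
VERSION_BUMP_MARKER = "Update package version"


def jobs_for_push(commit_message: str) -> list[str]:
    """Return illustrative job labels for a push to main."""
    lines = commit_message.splitlines()
    first_line = lines[0] if lines else ""
    jobs = ["docs"]  # docs build and deploy runs on every push
    if VERSION_BUMP_MARKER in first_line:
        # Assumption: version bump commits are detected by a substring match.
        jobs.append("publish")
    else:
        jobs.append("data-build")
    return jobs


print(jobs_for_push("Update package version"))
print(jobs_for_push("Address PR #681 review feedback"))
```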
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
