| 1 | +# Release Promotion Stage AI Guide |
| 2 | + |
| 3 | +This guide is for AI agents and maintainers modifying Stage 5 |
| 4 | +(`5_validate_and_promote_release`) code. Stage 5 validates a staged release |
| 5 | +candidate, promotes the exact candidate to public Hugging Face and GCS |
| 6 | +destinations, writes release/version/completion metadata, and cleans staging |
| 7 | +only after completion is certified. |
| 8 | + |
| 9 | +## Candidate Identity |
| 10 | + |
| 11 | +Use `policyengine_us_data.release_promotion.ReleasePromotionContext` as the |
| 12 | +typed Stage 5 identity boundary. The context must keep these values distinct: |
| 13 | + |
| 14 | +- `run_id`: the canonical publication run correlation key. |
| 15 | +- `candidate_version`: the candidate staging scope used in Hugging Face staging |
| 16 | + paths such as `staging/{candidate_version}-{run_id}/...`. |
| 17 | +- `release_version`: the final stable public release version. |
| 18 | +- `base_release_version` and `release_bump`: optional provenance for how the |
| 19 | + candidate scope was chosen. |
| 20 | + |
| 21 | +Do not resolve a different run ID from the environment inside lower-level |
| 22 | +release-promotion logic. Environment resolution belongs at orchestration edges; |
| 23 | +Stage 5 library code should receive explicit context. |
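
As a minimal sketch of the identity boundary described above, the following uses a hypothetical `ExamplePromotionContext` dataclass. It is not the real `ReleasePromotionContext`, whose fields and methods may differ; the `staging_prefix` and `staged_path` helpers and all sample values are assumptions made for illustration. The point it shows is that library-style code receives the context explicitly and never reads the environment.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ExamplePromotionContext:
    """Hypothetical stand-in for illustration; not the real ReleasePromotionContext."""

    run_id: str
    candidate_version: str
    release_version: str
    base_release_version: Optional[str] = None  # optional provenance
    release_bump: Optional[str] = None  # optional provenance

    def __post_init__(self) -> None:
        # The three identity values must all be present and are kept distinct.
        for name in ("run_id", "candidate_version", "release_version"):
            if not getattr(self, name):
                raise ValueError(f"{name} must be a non-empty string")

    def staging_prefix(self) -> str:
        # Follows the staging/{candidate_version}-{run_id}/... layout from the guide.
        return f"staging/{self.candidate_version}-{self.run_id}"


def staged_path(context: ExamplePromotionContext, relative_path: str) -> str:
    """Library-style helper: takes explicit context and never reads os.environ."""
    return f"{context.staging_prefix()}/{relative_path.lstrip('/')}"


# Sample values are made up for the example.
context = ExamplePromotionContext(
    run_id="run-123",
    candidate_version="1.2.0rc1",
    release_version="1.2.0",
)
print(staged_path(context, "datasets/example.h5"))
```

In this sketch only an orchestration entry point would construct the context (for example from environment variables), and everything below it takes the context as an argument.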
| 24 | + |
| 25 | +## Release Candidate Bundles |
| 26 | + |
| 27 | +Use `ReleaseCandidateInputBundle` to describe the artifacts Stage 5 is allowed |
| 28 | +to validate and promote. Each artifact should be represented by a |
| 29 | +`ReleaseArtifactSpec` with a production-relative path, artifact family, source |
| 30 | +stage, and optional checksum/size metadata. |
| 31 | + |
| 32 | +The current compatibility path may build a bundle from the legacy staged path |
| 33 | +set produced by Modal orchestration. Mark that reader as compatibility-only and |
| 34 | +keep it retirable. |
| 35 | + |
| 36 | +The Stage 4 contract/inventory reader API now exists for migration work: |
| 37 | +`build_release_candidate_bundle_from_stage4_contract()` accepts an in-memory |
| 38 | +Stage 4 contract plus inventory records, and |
| 39 | +`read_stage4_release_candidate_bundle()` reads the same shape from files. |
| 40 | +Production Stage 5 code should not depend on Stage 4 contracts until the |
| 41 | +contract and inventory are canonical, complete, and populated with semantic |
| 42 | +artifact identity plus checksum/size material. |
| 43 | + |
| 44 | +Candidate bundles may record validation reports as path-only |
| 45 | +`validation_report_paths` for compatibility. When Stage 4 or another upstream |
| 46 | +producer can provide report checksums, prefer `validation_report_refs` with |
| 47 | +canonical `DiagnosticRef` / `ArtifactRef` identity so rerun comparison can |
| 48 | +distinguish an overwritten report at the same diagnostics path. |
| 49 | + |
| 50 | +## Validation Reports |
| 51 | + |
| 52 | +Stage 5 must use the shared validation schema for durable validation output: |
| 53 | + |
| 54 | +- `policyengine_us_data.stage_contracts.ValidationReport` |
| 55 | +- `policyengine_us_data.stage_contracts.ValidationFinding` |
| 56 | +- `policyengine_us_data.stage_contracts.DiagnosticRef` |
| 57 | + |
| 58 | +Do not create a Stage 5-specific durable validation report, check, finding, or |
| 59 | +error schema for contracts, diagnostics, release candidates, status endpoints, |
| 60 | +or step manifests. Release-specific details such as missing staged artifacts, |
| 61 | +missing validation reports, finalized-release conflicts, version mismatches, or |
| 62 | +destination conflicts should live in canonical finding metadata. |
| 63 | + |
| 64 | +## Rerun Comparison Material |
| 65 | + |
| 66 | +Before public writes, rerun and reuse decisions should compare semantic |
| 67 | +candidate identity rather than only checking whether output files exist. The |
| 68 | +comparison material should include: |
| 69 | + |
| 70 | +- run ID, candidate version, release version, HF repository, and GCS bucket; |
| 71 | +- Stage 4 output contract fingerprint when available; |
| 72 | +- output inventory paths/checksums when available; |
| 73 | +- validation report paths and `DiagnosticRef` checksum identities when |
| 74 | + available; |
| 75 | +- expected production-relative artifact paths; |
| 76 | +- the Stage 5 candidate bundle fingerprint. |
| 77 | + |
| 78 | +When required artifacts only have paths and no checksum/size identity, treat |
| 79 | +the bundle as path-only and do not use its fingerprint for promotion reuse |
| 80 | +decisions. |
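
The comparison rule above can be sketched roughly as follows. `ExampleArtifact`, `is_path_only`, and `comparison_fingerprint` are hypothetical names, not part of `policyengine_us_data`, and the sketch assumes that an artifact missing either its checksum or its size makes the whole bundle path-only. The real bundle fingerprint may be computed differently.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import Dict, Optional, Sequence


@dataclass(frozen=True)
class ExampleArtifact:
    """Hypothetical artifact record used only for this illustration."""

    path: str
    sha256: Optional[str] = None
    size_bytes: Optional[int] = None


def is_path_only(artifacts: Sequence[ExampleArtifact]) -> bool:
    # Assumption: any artifact lacking checksum or size makes the bundle path-only.
    return any(a.sha256 is None or a.size_bytes is None for a in artifacts)


def comparison_fingerprint(
    identity: Dict[str, str], artifacts: Sequence[ExampleArtifact]
) -> Optional[str]:
    """Return a stable digest of the comparison material, or None if path-only."""
    if is_path_only(artifacts):
        # Path-only bundles must not drive promotion reuse decisions.
        return None
    material = {
        "identity": identity,
        # Sort so the digest does not depend on artifact ordering.
        "artifacts": sorted([a.path, a.sha256, a.size_bytes] for a in artifacts),
    }
    encoded = json.dumps(material, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(encoded.encode("utf-8")).hexdigest()


def can_reuse(requested: Optional[str], finalized: Optional[str]) -> bool:
    # Reuse requires two real fingerprints that match; file existence is not enough.
    return requested is not None and requested == finalized
```

Here `identity` would carry values such as the run ID, candidate version, release version, HF repository, and GCS bucket. A `None` fingerprint forces the caller onto a non-reuse path instead of silently matching.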
| 81 | + |
| 82 | +Already-finalized releases are an idempotency case, not a shortcut around |
| 83 | +candidate identity. A finalized release can be reused only when its completion |
| 84 | +marker is valid and it matches the requested candidate. |
| 85 | + |
| 86 | +## Side Effects |
| 87 | + |
| 88 | +Candidate builders, schema adapters, and rerun comparison helpers should not |
| 89 | +perform Hugging Face writes, GCS uploads, Modal calls, staging cleanup, or |
| 90 | +release-manifest publication. Keep those operations behind explicit adapters or |
| 91 | +services so tests can exercise candidate shape and validation logic without |
| 92 | +credentials or network access. |
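
One way to keep side effects behind an explicit adapter is sketched below. The `ArtifactUploader` protocol, `RecordingUploader` fake, and `plan_promotion` / `promote` functions are hypothetical names chosen for illustration and do not describe the existing Stage 5 services. The planning step is pure, and only `promote` touches the adapter.

```python
from typing import Dict, List, Protocol


class ArtifactUploader(Protocol):
    """Hypothetical adapter interface; real implementations would wrap HF or GCS clients."""

    def upload(self, source_path: str, destination_path: str) -> None: ...


class RecordingUploader:
    """In-memory fake that records calls instead of touching the network."""

    def __init__(self) -> None:
        self.calls: List[Dict[str, str]] = []

    def upload(self, source_path: str, destination_path: str) -> None:
        self.calls.append({"source": source_path, "destination": destination_path})


def plan_promotion(staging_prefix: str, relative_paths: List[str]) -> Dict[str, str]:
    """Pure function: maps staged paths to production-relative paths, no I/O."""
    return {f"{staging_prefix}/{path}": path for path in relative_paths}


def promote(plan: Dict[str, str], uploader: ArtifactUploader) -> None:
    """The only function here with side effects, all routed through the adapter."""
    for source, destination in plan.items():
        uploader.upload(source, destination)


# A test can exercise both steps without credentials or network access.
fake = RecordingUploader()
promote(plan_promotion("staging/1.2.0rc1-run-123", ["datasets/example.h5"]), fake)
print(fake.calls)
```

Because the fake satisfies the same protocol as a real uploader, tests of candidate shape and promotion planning stay offline.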