Skip to content

Commit a00ff45

Browse files
committed
Initial plan
1 parent 76b89b7 commit a00ff45

1 file changed

Lines changed: 215 additions & 0 deletions

File tree

Lines changed: 215 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,215 @@
1+
# Sync API Review Branches on Working Branch Push
2+
3+
## Goal
4+
5+
When a commit is pushed to a working branch, detect whether the resulting API surface changed and, if so, update every associated API review branch with a consistent API surface. If the push does not affect any package API surface, the workflow should exit without touching review branches.
6+
7+
The sync workflow must not blindly copy committed `api.md` or `api.metadata.yml` files from the working branch. Those files may be stale or incorrect, and the existing API.md consistency check may fail for the same commit. Review branches should only receive API artifacts that have either passed consistency validation or were regenerated by the sync workflow itself.
8+
9+
The workflow must support one working branch having multiple associated review branches, for example when the same SDK PR has API review PRs for multiple packages or versions.
10+
11+
## Current Branch Context
12+
13+
The current API review prototype already has most of the branch mechanics in `scripts/api_md_workflow/create_api_review_pr.js`:
14+
15+
- Review branches use `apireview/review_<package>_<version>` with suffixes such as `_a`, `_b`, etc. when multiple branches are needed.
16+
- Baseline branches use `apireview/base_<package>_<version>`.
17+
- Review branches are created from the selected base branch and then receive the generated target `api.md` and `api.metadata.yml` files.
18+
- Existing review branches can be reused when their API state matches the desired generated output and, for review branches, the base branch is an ancestor.
19+
- The generated review PR body records a human-readable working branch, working PR, or target tag reference.
20+
21+
The missing piece is a durable, machine-readable association from a working branch or working PR to one or more review branches. The current PR body is useful for humans but is too brittle as the only lookup mechanism.
22+
23+
## Proposed Design
24+
25+
Add a new GitHub Actions workflow plus a small script layer under `scripts/api_md_workflow`.
26+
27+
There are two viable strategies:
28+
29+
1. **Gate on the consistency workflow.** Let the existing API.md consistency workflow prove that committed `api.md` and `api.metadata.yml` match regenerated output. Only after that check passes, copy the committed API files to associated review branches.
30+
2. **Regenerate in the sync workflow.** Detect packages affected by the push, regenerate `api.md` and `api.metadata.yml` from the pushed working commit, and push the regenerated artifacts to associated review branches when they differ from the review branch.
31+
32+
The safer initial implementation is option 1 because it reuses the existing pass/fail gate and avoids introducing a second generation path with slightly different behavior. Option 2 is attractive later if we want review branches to update even when the working PR has not committed refreshed API files yet.
33+
34+
The workflow should run on `push` events to non-review branches and use path filtering so it only starts when API surface files changed:
35+
36+
```yaml
37+
on:
38+
push:
39+
branches-ignore:
40+
- main
41+
- apireview/**
42+
paths:
43+
- sdk/**/api.md
44+
- sdk/**/api.metadata.yml
45+
```
46+
47+
Inside the workflow, still compute the exact changed files from `github.event.before..github.sha`. The `paths` filter prevents most unnecessary runs, while the explicit diff keeps behavior correct for force-pushes, unusual event payloads, and future manual dispatch support.
48+
49+
## Association Model
50+
51+
When `create_api_review_pr.js` creates or reuses a review PR, it should write a hidden metadata block into the PR body. This keeps the association with the PR, avoids creating repo-wide state files, and allows multiple review PRs to point back to the same working branch.
52+
53+
Suggested block:
54+
55+
```markdown
56+
<!-- api-md-review-sync
57+
{
58+
"schemaVersion": 1,
59+
"repository": "Azure/azure-sdk-for-python",
60+
"packageName": "azure-example",
61+
"packageDir": "sdk/service/azure-example",
62+
"baseBranch": "apireview/base_azure-example_1.0.0",
63+
"reviewBranch": "apireview/review_azure-example_1.1.0",
64+
"workingOwner": "Azure",
65+
"workingBranch": "feature/api-change",
66+
"workingPrNumber": 47203
67+
}
68+
-->
69+
```
70+
71+
Use the `workingOwner` and `workingBranch` pair as the primary lookup key. `workingPrNumber` is helpful context but should not be required, since the sync workflow runs from a branch push and may not have a PR context.
72+
73+
For review PRs already created before this metadata exists, the sync script can include a best-effort fallback that scans open API review PR bodies for the existing human-readable working branch or working PR line. That fallback should only be a migration bridge; new PRs should rely on the metadata block.
74+
75+
## Workflow Steps: Gated Copy Strategy
76+
77+
1. Checkout the pushed commit with enough history to compare `github.event.before` and `github.sha`.
78+
2. Build a list of committed API files changed in the pushed range:
79+
80+
```bash
81+
git diff --name-only "$BEFORE" "$SHA" -- 'sdk/**/api.md' 'sdk/**/api.metadata.yml'
82+
```
83+
84+
3. If the changed-file list is empty, exit successfully.
85+
4. Wait for, or query, the API.md consistency check for the same commit. If the consistency check has not completed successfully, exit without updating review branches. This may be implemented as:
86+
- a separate `workflow_run` trigger that runs only after `API.md Consistency` completes successfully, or
87+
- a push-triggered workflow step that queries the check suite for `github.sha` and proceeds only when the consistency check conclusion is `success`.
88+
5. Map changed files to package directories. A package is syncable only when at least one of these files changed:
89+
- `<packageDir>/api.md`
90+
- `<packageDir>/api.metadata.yml`
91+
6. Discover associated review PRs by querying open PRs in `Azure/azure-sdk-for-python` and parsing the hidden metadata block.
92+
7. Filter to metadata where:
93+
- `workingOwner` and `workingBranch` match the pushed branch.
94+
- `packageDir` is in the changed package set.
95+
- `reviewBranch` starts with `apireview/review_`.
96+
8. For each matched review branch, copy the now-validated API files from the pushed working commit onto the review branch and push the result.
97+
9. Summarize which review branches were updated, which changed packages had no associated review branch, and whether any update was skipped because consistency had not passed.
98+
99+
## Alternative Workflow: Regenerated Output Strategy
100+
101+
If we want review branches to update even when the working branch has stale committed API files, the sync workflow can regenerate API artifacts instead of copying them.
102+
103+
1. Trigger on pushes under `sdk/**`, not only changes to committed API files, because source changes may alter generated API output before `api.md` is updated.
104+
2. Reuse the package detection logic from the consistency workflow to find affected packages from `github.event.before..github.sha`.
105+
3. For each affected package with associated review branches, overlay the pushed working commit and run the adapter generation command, equivalent to `azpysdk apistub --md --extract-metadata <package-name> --dest-dir .` for Python.
106+
4. Compare regenerated `api.md` and `api.metadata.yml` to the current review branch files.
107+
5. If there is no diff, skip the branch.
108+
6. If there is a diff, commit the regenerated files to each associated review branch and push with `--force-with-lease`.
109+
110+
This avoids copying bad committed artifacts, but it is more expensive and means the review branch can show an API surface that the working branch has not committed yet. If we choose this path, the workflow summary should explicitly say when regenerated output differs from committed `api.md` so authors know the working PR still needs its consistency fix.
111+
112+
## Branch Update Mechanics
113+
114+
For each target review branch:
115+
116+
1. Fetch the review branch and the pushed working commit.
117+
2. Create or reset a local branch to the remote review branch:
118+
119+
```bash
120+
git fetch origin "$REVIEW_BRANCH"
121+
git checkout -B "$REVIEW_BRANCH" "origin/$REVIEW_BRANCH"
122+
```
123+
124+
3. Copy only the package's API files from the pushed commit after consistency has passed:
125+
126+
```bash
127+
git checkout "$WORKING_SHA" -- "$PACKAGE_DIR/api.md" "$PACKAGE_DIR/api.metadata.yml"
128+
```
129+
130+
4. If only one of the two files exists in the working commit, fail that branch update rather than producing a partial review branch. The existing consistency workflow treats `api.metadata.yml` as required alongside `api.md`, so the sync path should enforce the same invariant.
131+
5. If `git diff --quiet` shows no change on the review branch, skip the push.
132+
6. Otherwise commit and push with `--force-with-lease`:
133+
134+
```bash
135+
git add "$PACKAGE_DIR/api.md" "$PACKAGE_DIR/api.metadata.yml"
136+
git commit -m "[API Review] Sync api.md for <package> from <working branch>"
137+
git push --force-with-lease origin "$REVIEW_BRANCH"
138+
```
139+
140+
This intentionally copies committed API artifacts only after the consistency workflow has proven they match source changes on the working PR. If the consistency check fails or is still pending, the sync workflow should leave the review branch unchanged.
141+
142+
## Script Shape
143+
144+
Add a new script such as `scripts/api_md_workflow/sync_review_branches.js` with these responsibilities:
145+
146+
- Parse inputs from environment variables used by the workflow:
147+
- `GITHUB_REPOSITORY`
148+
- `GITHUB_REF_NAME`
149+
- `GITHUB_SHA`
150+
- `GITHUB_EVENT_BEFORE` or `github.event.before`
151+
- Detect changed API files and package directories.
152+
- Query open PRs through `gh pr list` or the REST API.
153+
- Parse the hidden `api-md-review-sync` metadata block.
154+
- Select all matching review branches, not just the first one.
155+
- Update each branch independently and continue processing other branches when one branch fails.
156+
- Return a non-zero exit code if any selected branch failed to update.
157+
158+
The script should reuse helper functions where practical from `create_api_review_pr.js`, but the first implementation can duplicate a small amount of git/gh plumbing if exporting the existing helpers would make the change larger. A follow-up cleanup can extract shared git helpers into a common module.
159+
160+
## Required Changes to PR Creation
161+
162+
Update `create_api_review_pr.js` so every new API review PR body includes the metadata block. For reused review PRs, update the body if the metadata block is missing or stale.
163+
164+
When the target is a tag, do not include sync metadata and do not associate the review PR with a working branch. Tag-to-tag review PRs are static and should not be updated by branch push events.
165+
166+
When the target is `owner:branch`, set `workingOwner` to `owner` and `workingBranch` to `branch`. When the target is an origin branch, set `workingOwner` to `Azure` and `workingBranch` to that branch. When no target is supplied and the script defaults to `main`, omit sync metadata because `main` should not drive synthetic review updates.
167+
168+
## Permissions and Safety
169+
170+
The workflow needs `contents: write` to push review branches and `pull-requests: read` to discover review PR metadata. If it updates missing metadata on review PRs, it also needs `pull-requests: write`.
171+
172+
Safety checks should include:
173+
174+
- Ignore pushes to `main` and `apireview/**`.
175+
- Only update branches whose metadata points to the pushed branch.
176+
- Only copy `api.md` and `api.metadata.yml` under the recorded package directory.
177+
- Require both API files to exist at the pushed commit.
178+
- Use `--force-with-lease`, never a plain force push.
179+
- Log and skip closed review PRs.
180+
- Avoid creating new review branches from this sync workflow. Branch creation remains the job of `create_api_review_pr.js`.
181+
182+
## Testing Plan
183+
184+
Add Node unit tests for the new script helpers:
185+
186+
- Changed-file detection ignores non-API files.
187+
- Changed-file detection groups `api.md` and `api.metadata.yml` by package directory.
188+
- Metadata parser accepts exactly one valid hidden block.
189+
- Metadata parser ignores malformed or unrelated comments.
190+
- Review branch selection returns multiple branches for one working branch when packages differ.
191+
- Tag-target review PRs are not selected.
192+
193+
Add tests around `create_api_review_pr.js` body generation:
194+
195+
- Branch targets include the hidden sync metadata.
196+
- Fork targets include the fork owner and branch.
197+
- Tag targets omit sync metadata.
198+
- Default `main` target omits sync metadata.
199+
200+
If practical, add a lightweight integration-style test with stubbed git and gh runners that verifies a review branch update copies only the two API files and skips the push when there is no diff.
201+
202+
## Open Questions
203+
204+
- Should the sync workflow update review PR bodies when metadata is absent, or should migration be handled only when `create_api_review_pr.js` is rerun?
205+
- Should a review branch be updated when `api.md` changed but `api.metadata.yml` did not, or should the workflow require both files in the same pushed range? The safer implementation is to require both files to exist at the pushed commit, not necessarily both changed in the same commit range.
206+
- Should branch updates be serialized with a concurrency group per review branch to avoid two rapid pushes racing each other?
207+
208+
## Recommended Implementation Order
209+
210+
1. Add metadata block generation to `create_api_review_pr.js` for branch and fork targets.
211+
2. Add tests for metadata generation and parsing.
212+
3. Add `sync_review_branches.js` with dry-run logging first.
213+
4. Add the push-triggered workflow with `contents: write` and `pull-requests: read`.
214+
5. Enable real pushes after dry-run output confirms the correct review branches are selected.
215+
6. Add fallback discovery for existing metadata-less review PRs if needed.

0 commit comments

Comments
 (0)