Add dev-feature preservation gate and change schedule #4773
Open
Phlip79 wants to merge 1 commit into
Two changes to the main-to-dev nightly sync workflow:
1. New workflow `.github/workflows/nightly-sync-dev-preservation-gate.yml`
that runs deterministically on every push to a `main2dev/*` PR (no
LLM in the loop). For each non-exempt file the sync touched, it
computes the set difference:
(lines on origin/dev) - (lines on origin/main) - (lines in tree)
and posts a sticky PR comment listing every line that meets all
three conditions: present on origin/dev, absent from origin/main,
and absent from the PR tree. The workflow fails if any non-exempt
file has a non-empty result, blocking the PR from being marked ready.
This catches the most common sync regression: a feature lands on dev
at T0 and on main at T1 > T0, the sync runs in between, and
`-X theirs` drops dev's feature wherever main happened to touch
nearby lines. Recent examples this would have caught:
- `_forward_mlp_router(input_ids=None)` in transformer_layer.py
- `num_sms_preprocessing_api=...` kwarg in token_dispatcher.py
- `self._maybe_record_overload_factor(...)` call in moe_layer.py
- `parse_and_validate_args` import in
gpt_dynamic_inference_with_coordinator.py
- `args.dynamic_context_parallel` references in
data_samplers.py / utils.py / training.py
- "Packing Scheduler" section in datasets/readme.md
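The per-file computation described above can be sketched roughly as follows. This is a minimal illustration, not the workflow's actual code: the function name `dropped_dev_lines`, the whitespace-stripped comparison, and the skipping of blank lines are all assumptions, and the toy file contents use hypothetical identifiers.

```python
def dropped_dev_lines(dev_text: str, main_text: str, tree_text: str) -> list[str]:
    """Return lines present on dev, absent from main, and absent from the PR tree.

    Assumption: lines are compared as whitespace-stripped strings, and blank
    lines are ignored. The real gate may normalize differently.
    """
    main_lines = {line.strip() for line in main_text.splitlines()}
    tree_lines = {line.strip() for line in tree_text.splitlines()}

    dropped = []
    seen = set()
    for raw in dev_text.splitlines():
        line = raw.strip()
        if not line or line in seen:
            continue  # skip blanks and report each distinct line once
        seen.add(line)
        # All three conditions: on dev (we are iterating dev),
        # not on main, and not in the synced tree.
        if line not in main_lines and line not in tree_lines:
            dropped.append(line)
    return dropped


# Toy example (hypothetical code): dev added a keyword argument that main lacks.
dev = (
    "def forward(self, hidden):\n"
    "    out = self.router(hidden, input_ids=None)\n"
    "    return out\n"
)
main = (
    "def forward(self, hidden):\n"
    "    out = self.router(hidden)\n"
    "    return out\n"
)
tree_after_bad_sync = main   # the sync took main's side and lost dev's change
tree_after_good_sync = dev   # the sync kept dev's change

print(dropped_dev_lines(dev, main, tree_after_bad_sync))
print(dropped_dev_lines(dev, main, tree_after_good_sync))
```

In the first call the dev-only line is reported because it survives on neither main nor the tree; in the second the result is empty because the tree still carries it.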
Files in the skill's "Files to Override from Main" list
(training.py, utils.py, data_samplers.py, initialize.py,
layer_wise_optimizer.py) report as `warning` rather than `error`,
matching the skill's intent that main may legitimately win there.
pyproject.toml / uv.lock / docker/Dockerfile.ci.dev and CODEOWNERS
are skipped entirely (always dev's by skill rule).
The job also publishes a prompt-addendum (on workflow_dispatch only)
that can be pasted into the sync-bot prompt so the agent fixes
violations proactively and the deterministic gate stays green.
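The exemption rules in item 1 could be expressed along these lines. This is a sketch under stated assumptions: the names `severity` and `gate_fails` are hypothetical, matching files by basename is a guess (the text above gives only file names for the override list), and it assumes that `warning` results do not fail the gate.

```python
from pathlib import PurePosixPath

# File names taken from the description above; basename matching is an assumption.
OVERRIDE_FROM_MAIN = {
    "training.py",
    "utils.py",
    "data_samplers.py",
    "initialize.py",
    "layer_wise_optimizer.py",
}
# Files that are always dev's and are skipped entirely.
SKIPPED_PATHS = {"pyproject.toml", "uv.lock", "docker/Dockerfile.ci.dev"}
SKIPPED_NAMES = {"CODEOWNERS"}


def severity(path: str, dropped_count: int) -> str:
    """Classify one file as 'skip', 'ok', 'warning', or 'error'."""
    name = PurePosixPath(path).name
    if path in SKIPPED_PATHS or name in SKIPPED_NAMES:
        return "skip"
    if dropped_count == 0:
        return "ok"
    if name in OVERRIDE_FROM_MAIN:
        return "warning"  # main may legitimately win in these files
    return "error"


def gate_fails(dropped_by_file: dict[str, int]) -> bool:
    """Fail only when some file is classified as 'error' (assumed behavior)."""
    return any(severity(p, n) == "error" for p, n in dropped_by_file.items())


# Illustrative paths only; the directory prefixes are placeholders.
results = {
    "some/dir/training.py": 3,
    "some/dir/moe_layer.py": 1,
    "uv.lock": 12,
}
for path, count in results.items():
    print(path, severity(path, count))
print("gate fails:", gate_fails(results))
```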
2. Schedule change in
`.github/workflows/nightly-sync-main-to-dev.yml`: from daily at
21:00 UTC to twice-weekly (Monday + Thursday) at 15:00 UTC, which
is 8 AM PDT (7 AM PST in winter, since GitHub Actions cron is
UTC-only and does not follow DST).
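The time-zone arithmetic behind the new schedule can be checked with a short script. The cron string shown is an assumption about how "Monday + Thursday at 15:00 UTC" would be written in standard five-field cron syntax; the workflow file itself is not quoted here. Fixed UTC offsets are used for PDT (UTC-7) and PST (UTC-8) so the check does not depend on a time-zone database.

```python
from datetime import datetime, timedelta, timezone

# Assumed cron expression: minute 0, hour 15, any day/month, weekdays 1 (Mon) and 4 (Thu).
ASSUMED_CRON = "0 15 * * 1,4"

PDT = timezone(timedelta(hours=-7), "PDT")  # Pacific Daylight Time
PST = timezone(timedelta(hours=-8), "PST")  # Pacific Standard Time


def local_hour(utc_hour: int, tz: timezone) -> int:
    """Convert an hour of day in UTC to the hour of day in a fixed-offset zone."""
    # The date is arbitrary because the offsets are fixed.
    run = datetime(2024, 1, 1, utc_hour, 0, tzinfo=timezone.utc)
    return run.astimezone(tz).hour


minute, hour, _, _, weekdays = ASSUMED_CRON.split()
print("new schedule, PDT:", local_hour(int(hour), PDT))  # 8
print("new schedule, PST:", local_hour(int(hour), PST))  # 7
print("old schedule (21:00 UTC), PDT:", local_hour(21, PDT))  # 14
```

Because the cron hour is fixed in UTC, the local start time shifts by one hour when Pacific time switches between PDT and PST, as the description notes.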
/ok to test ac2e39b