
feat(release-controller): add ignored_proposals to release-index.yaml #2015

Merged
pietrodimarco-dfinity merged 2 commits into main from pmarco/release-controller-ignore-proposal-id
May 18, 2026

Conversation

@pietrodimarco-dfinity
Contributor

@pietrodimarco-dfinity pietrodimarco-dfinity commented May 18, 2026

Motivation

The reconciler rebuilds its in-memory ReconcilerState on every cycle from the public dashboard (/proposals?limit=50&include_topic=TOPIC_IC_OS_VERSION_ELECTION) and the governance canister (dre proposals filter -t ic-os-version-election), and treats every matching election proposal as "already submitted" regardless of status. A rejected or failed proposal therefore prevents the reconciler from ever submitting a fresh one for the same version — with no operator escape hatch.

Concretely, NNS election proposal 141776 (a rejected GuestOS election proposal) is currently blocking resubmission for its version. This PR adds the lever to unblock that and any similar future incident.

Design

Configuration-driven, not flag-driven:

  • New top-level field on release-index.yaml: ignored_proposals: List[int].
  • The reconciler reads index.root.ignored_proposals on every cycle from the freshly-loaded index and drops any matching proposal from the dashboard + governance lookups before they reach ReconcilerState.
  • Edits to release-index.yaml on dfinity/dre@main are picked up by the running controller on the next cycle — no redeploy, no k8s PR, no image bump needed.
  • Configuration lives versioned alongside the releases it pertains to.
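A minimal sketch of the per-cycle flow described in the list above, under simplified assumptions. `Index`, `Proposal`, `reconcile_once`, and the two callables are hypothetical names chosen for illustration and are not the controller's actual API; only the `ignored_proposals` field name comes from this PR.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Index:
    """Hypothetical stand-in for the parsed release-index.yaml root."""

    # Defaults to an empty list, so an index without the field ignores nothing.
    ignored_proposals: List[int] = field(default_factory=list)


@dataclass(frozen=True)
class Proposal:
    """Hypothetical, simplified election proposal."""

    proposal_id: int
    version: str
    status: str


def reconcile_once(
    load_index: Callable[[], Index],
    fetch_proposals: Callable[[], List[Proposal]],
) -> Dict[str, Proposal]:
    """Run one cycle: reload the index, then drop ignored proposals before building state."""
    # The index is re-read on every cycle, so an edit takes effect without a redeploy.
    index = load_index()
    ignored = set(index.ignored_proposals)

    state: Dict[str, Proposal] = {}
    for proposal in fetch_proposals():
        if proposal.proposal_id in ignored:
            # Skipped proposals never reach the state, so their version
            # looks as if no proposal had been submitted for it.
            continue
        state[proposal.version] = proposal
    return state
```

In this sketch, a version that is missing from the returned mapping is one the reconciler would consider eligible for a new submission.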

This replaces an earlier iteration of this same PR that wired the same logic to a --ignore-proposal-id CLI flag (commit 72eaae5c); see the follow-up commit 09d9b76c for the pivot.

Changes

  • release-index-schema.json: add the optional top-level ignored_proposals array (integer[]).
  • release-controller/release_index.py: add ignored_proposals: List[int] = [] to the ReleaseIndex pydantic model (manual edit, mirroring the generator output — this file is already hand-tweaked).
  • release-controller/reconciler.py: drop the --ignore-proposal-id argparse flag and the Reconciler.ignore_proposal_ids constructor parameter. _filtered_proposals_retriever is now a static method taking the ignore set explicitly, and reconcile() pulls it from index.root.ignored_proposals before invoking the wrappers. Every dropped proposal still emits a warning log identifying the source, OS kind, proposal id, and version.
  • release-index.yaml: pin ignored_proposals: [141776] to force resubmission for the version that the rejected GuestOS proposal 141776 was blocking.
  • release-controller/README.md: rewrite the operator runbook to point at the release-index.yaml workflow; drop the stale --skip-preloading-state dev-mode docs (the flag was never wired into main's argparse anyway).
  • release-controller/tests/test_reconciler_integration.py: two unit-ish tests for the static filter wrapper (with / without ignore IDs), plus an end-to-end test asserting that an entry in release-index.yaml's ignored_proposals propagates through the loader into a fresh proposal submission via the dryrun DRECli.
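The static filter wrapper described above might look roughly like the following. This is a sketch, not the merged code: the class name `ReconcilerSketch`, the dict-based proposal shape, and the parameter names are assumptions, and the real `_filtered_proposals_retriever` signature may differ. The warning text follows the log line quoted in the test plan.

```python
import logging
from typing import Callable, Dict, Iterable, List

LOGGER = logging.getLogger("reconciler_sketch")

# Hypothetical proposal shape: a plain dict with "id" and "version" keys.
ProposalDict = Dict[str, object]
Retriever = Callable[[], List[ProposalDict]]


class ReconcilerSketch:
    @staticmethod
    def filtered_proposals_retriever(
        retriever: Retriever,
        ignored_ids: Iterable[int],
        source: str,
        os_kind: str,
    ) -> Retriever:
        """Wrap `retriever` so proposals whose id is in `ignored_ids` are dropped."""
        ignored = frozenset(ignored_ids)
        if not ignored:
            # Nothing to ignore: hand back the original retriever unchanged.
            return retriever

        def wrapped() -> List[ProposalDict]:
            kept: List[ProposalDict] = []
            for proposal in retriever():
                if proposal["id"] in ignored:
                    # One warning per dropped proposal, naming the source,
                    # OS kind, proposal id, and version.
                    LOGGER.warning(
                        "Ignoring %s %s election proposal %s for version %s",
                        source,
                        os_kind,
                        proposal["id"],
                        proposal["version"],
                    )
                    continue
                kept.append(proposal)
            return kept

        return wrapped
```

Taking the ignore set as an explicit argument, rather than reading it from instance state, lets a test exercise both branches (with and without ignore IDs) without constructing a full reconciler.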

Caveats

  • The IC governance canister still independently refuses to re-elect a version that is already blessed. An ignored_proposals entry therefore only helps when the prior proposal did not result in a blessing (typically REJECTED or FAILED).
  • An entry is meant to be temporary. Once the reconciler has run one cycle with 141776 ignored and submitted the replacement proposal, the entry should be removed from release-index.yaml so it does not silently swallow future state for that id.

Companion change

Test plan

  • bazel test //release-controller:integration_tests --test_filter='test_reconciler_filtered_proposals|test_reconciler_picks_up' — all 5 tests pass (51.5 s).
  • release-index.yaml validates against the updated release-index-schema.json (jsonschema).
  • mypy clean on touched files.
  • Once merged: confirm the running release-controller logs Ignoring dashboard guestos election proposal 141776 for version … on the next reconcile cycle, then submits a fresh GuestOS election proposal for the underlying version.
  • Follow-up PR to remove the 141776 entry from release-index.yaml once the replacement proposal has been submitted.

The reconciler rebuilds its in-memory `ReconcilerState` on every cycle
from the public dashboard (`/proposals?limit=50&...`) and the governance
canister (`dre proposals filter -t ic-os-version-election`), and treats
every matching election proposal as "already submitted" regardless of
status. A rejected or failed proposal therefore prevents the reconciler
from ever submitting a fresh one for the same version, with no operator
escape hatch.

Add a repeatable `--ignore-proposal-id=<id>` flag that the reconciler
applies as a filter on the result of both proposal-retriever sources
before they populate `ReconcilerState`, with a warning log per drop. The
affected version goes back to the "no proposal" state and a new proposal
is submitted on the next cycle. The IC governance canister still refuses
to re-elect an already blessed version, so this only helps when the
prior proposal did not result in a blessing (typically REJECTED or
FAILED).

Add integration tests covering both branches of the filter (with and
without ignore IDs configured) using a self-contained retriever fixture
that does not depend on the existing `MockDashboard` quirk where
`_fake_proposal` always emits `hostos_version_to_elect`. Refresh the
README "Operator runbook" with a "Forcing resubmission" section and
replace the stale `--skip-preloading-state` dev-mode docs (the flag is
not wired into `main`'s argparse).
@pietrodimarco-dfinity pietrodimarco-dfinity requested a review from a team as a code owner May 18, 2026 11:04
… flag

Move the operator escape hatch from a `--ignore-proposal-id` CLI flag on
the release controller binary into a top-level `ignored_proposals: List[int]`
field on `release-index.yaml`.  The reconciler reads the list per cycle
from the freshly-loaded index, so edits to `release-index.yaml` are
picked up live without a controller redeploy and the configuration lives
versioned alongside the releases it pertains to.

- release-index-schema.json: add the new top-level array.
- release-controller/release_index.py: add `ignored_proposals: List[int] = []`
  to `ReleaseIndex` (manual edit, mirroring the generator output).
- release-controller/reconciler.py: drop the `--ignore-proposal-id` argparse
  flag, drop the `Reconciler.ignore_proposal_ids` constructor parameter,
  turn `_filtered_proposals_retriever` into a static method that takes the
  ignore set explicitly, and have `reconcile()` pull it from
  `index.root.ignored_proposals` before invoking the wrappers.
- release-index.yaml: pin `ignored_proposals: [141776]` so the rejected
  GuestOS proposal stops blocking resubmission of its underlying version
  on the next reconcile cycle.
- README.md: rewrite the runbook to point at the release-index.yaml
  workflow and drop the stale CLI-flag dev-mode section.
- tests: switch the two filter wrapper tests to call the static method
  directly with an explicit ignore set; add a new end-to-end test
  asserting that an entry in `release-index.yaml`'s `ignored_proposals`
  propagates through the loader into a fresh proposal submission via
  the dryrun DRECli.
@pietrodimarco-dfinity pietrodimarco-dfinity changed the title from "feat(release-controller): add --ignore-proposal-id to force resubmission" to "feat(release-controller): add ignored_proposals to release-index.yaml" May 18, 2026
@pietrodimarco-dfinity pietrodimarco-dfinity merged commit 7d5355b into main May 18, 2026
9 checks passed
@pietrodimarco-dfinity pietrodimarco-dfinity deleted the pmarco/release-controller-ignore-proposal-id branch May 18, 2026 11:52
