CLOS-3716: inhibit elevate when target kernel minor outruns userland #66

Open
prilr wants to merge 1 commit into cloudlinux:cloudlinux from prilr:CLOS-3716-elevate-installed-wrong-kernel-for-cln-9-r2

@prilr prilr commented Jun 5, 2026

Collaborator

Defense-in-depth Checks-phase actor that compares the newest available
`kernel-core` minor against the newest available `cloudlinux-release`
minor in the target userspace's repositories.  When the kernel minor is
higher (the gradual-rollout-leak shape behind ZD 268790), an INHIBITOR
report stops the upgrade with a remediation pointing at re-running once
the rollout has finished promoting the rest of the userland.

The customer's CL8->CL9 elevate landed during the staged 9.7 rollout: the
CLN package channel `cloudlinux-x86_64-server-9` (the bare-major channel
`cln_switch` switches to) was already serving the 9.7 kernel for this
system's rollout slot while still serving 9.6 cloudlinux-release/kmod-lve.
`dnf` installed both kernels; `scaninstalledtargetkernelversion` +
`forcedefaultboottotargetkernelversion` then set default boot to the
newer (611.el9_7) kernel, which kmod-lve-2.1-52.el9 (built for el9_6)
couldn't load -> kmodlve missing -> LVE broken.

The actor consumes `TargetUserSpaceInfo` and runs
`dnf repoquery --installroot=...` against the target container.  Because
the container's repoquery returns the union of source+target repos
(empirically validated on a CL8.10 nopanel target: `cloudlinux-release-8.10`
rows surface alongside the 9.x set), the parsers filter both kernel
dist-tags (`elN_M`) and `cloudlinux-release` versions (`N.M`) by the
target major before taking the max.  The same actor protects el7->el8 and
el8->el9 - the regex is parametric on the target major.
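
A minimal sketch of the major-filtered parsing described above, not the actor's actual code. The function names and the sample release strings are hypothetical; the only behavior taken from the description is that rows are filtered by the target major before the maximum is taken, that tags without a minor are skipped, and that vendor suffixes are tolerated.

```python
import re


def max_kernel_minor(releases, target_major):
    """Highest M among `elN_M` dist-tags where N == target_major, else None."""
    # The literal "_" after the major keeps a shorter major from matching a
    # longer one (a hypothetical target major 1 would not match "el10_1").
    pattern = re.compile(r"(?:^|\.)el{}_(\d+)".format(int(target_major)))
    minors = []
    for release in releases:
        match = pattern.search(release)
        if match:  # rows with no minor tag or another major are skipped
            minors.append(int(match.group(1)))
    return max(minors) if minors else None


def max_release_minor(versions, target_major):
    """Highest M among `N.M` versions where N == target_major, else None."""
    pattern = re.compile(r"^{}\.(\d+)".format(int(target_major)))
    minors = []
    for version in versions:
        match = pattern.match(version)
        if match:
            minors.append(int(match.group(1)))
    return max(minors) if minors else None


# Illustrative rows only: a union of source (el8) and target (el9) packages.
kernel_rows = ["570.12.1.el9_6", "611.el9_7", "553.el8_10", "427.13.1.el9"]
release_rows = ["9.6", "8.10"]

print(max_kernel_minor(kernel_rows, 9))    # 7
print(max_release_minor(release_rows, 9))  # 6
```

Comparing minors as integers rather than strings matters for the el8_10 path, where a string comparison would rank "10" below "9".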

Phase: `TargetTransactionChecksPhaseTag` (same phase as
`missinggpgkeysinhibitor`).  `TargetUserSpaceInfo` is produced by
`target_userspace_creator` at `TargetTransactionFactsPhaseTag`, so an
earlier ChecksPhase actor wouldn't have a target userspace to query.

Validation:
- Unit tests cover parsing (vendor-suffix, no-minor-tag, cross-major
  rejection) and `process()` (match, leak, release-ahead, missing data,
  el8_10 path, cross-major rows in target userspace).
- Live CL8.10 nopanel VM (Flow C, template 27486): `leapp preupgrade`
  reaches the actor, repoqueries the target container, reports
  `kernel-core minor=el9_8, cloudlinux-release minor=9.8` (current target
  repo state, matched -> no inhibitor).  Phase + container query + major
  filter all confirmed end-to-end.  The exact mismatch branch is unit-test
  only - the 9.7 rollout window the customer hit is closed.
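
The `process()` cases listed above can be read as a small decision table. The sketch below is an assumption-laden illustration, not the actor's code: `should_inhibit` is a hypothetical helper, and treating missing data as "do not inhibit" is an assumption, since the description only says that the missing-data case is tested, not what the outcome is.

```python
def should_inhibit(kernel_minor, release_minor):
    """Return True only when the kernel minor is ahead of the userland minor.

    Both arguments are ints, or None when the repoquery produced no usable
    row for the target major.
    """
    if kernel_minor is None or release_minor is None:
        # Assumption: without both values there is nothing to compare,
        # so this sketch does not block the upgrade.
        return False
    return kernel_minor > release_minor


# The cases named in the validation list, with illustrative minors.
cases = {
    "match": (8, 8),
    "leak": (7, 6),
    "release-ahead": (6, 7),
    "missing data": (None, 6),
    "el8_10 path": (10, 10),
}
for name, (kernel, release) in cases.items():
    print(name, should_inhibit(kernel, release))
```

Under these assumptions only the leak case returns True, which corresponds to the INHIBITOR report described at the top of this message.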

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@azheregelya azheregelya self-requested a review June 10, 2026 09:30