Add Applier for upgrading workloads in place#5410
Conversation
2af4157 to
53115aa
Compare
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #5410 +/- ##
==========================================
+ Coverage 68.75% 68.83% +0.07%
==========================================
Files 634 635 +1
Lines 64183 64331 +148
==========================================
+ Hits 44132 44281 +149
+ Misses 16782 16774 -8
- Partials 3269 3276 +7 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Large PR Detected
This PR exceeds 1000 lines of changes and requires justification before it can be reviewed.
How to unblock this PR:
Add a section to your PR description with the following format:
## Large PR Justification
[Explain why this PR must be large, such as:]
- Generated code that cannot be split
- Large refactoring that must be atomic
- Multiple related changes that would break if separated
- Migration or data transformationAlternative:
Consider splitting this PR into smaller, focused changes (< 1000 lines each) for easier review and reduced risk.
See our Contributing Guidelines for more details.
This review will be automatically dismissed once you add the justification section.
Large PR justification has been provided. Thank you!
|
✅ Large PR justification has been provided. The size review has been dismissed and this PR can now proceed with normal review. |
53115aa to
a6b9a1d
Compare
a6b9a1d to
1029c5c
Compare
jhrozek
left a comment
There was a problem hiding this comment.
Two notes from re-reviewing the Applier (the rest of my findings I'm tracking elsewhere, or they belong to the stacked check PRs). Both are non-blocking — the verify-then-pull ordering and TOCTOU guard look solid.
Detecting an available upgrade is only useful if users can apply it while keeping their configuration. Add the apply path that the CLI and API will drive. Add upgrade.Applier: it reloads the workload's saved config, re-runs the check on fresh state (so a stale result can never drive an apply), resolves the candidate from the registry, and rebuilds the run config preserving the full user configuration — auth, authz, audit, telemetry, tools filters, volumes, secrets, ports, permission profile, and more — changing only the image, merged env/secrets, and re-resolved registry URLs. New required env vars surface through the injected validator. Crucially, the candidate image is verified and pulled (and the policy gate runs) before the destructive stop/delete/start, so a missing or unverifiable image leaves the running workload untouched — there is no rollback once UpdateWorkload begins. Verification uses the same path as thv run. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Summary
Detecting an available upgrade is only useful if users can apply it while keeping their configuration. This adds the apply core that the CLI and API drive in the next PR. Part of RFC THV-0068, local scope.
upgrade.Applier: it reloads the workload's saved config, re-runs the check on fresh state (so a stale result can never drive an apply — closes the TOCTOU window), resolves the candidate from the registry, and rebuilds the run config preserving the full user configuration (auth, authz, audit, telemetry, tools filters, volumes, secrets, ports, permission profile, …), changing only the image, merged env/secrets, and re-resolved registry URLs. Newly required env vars surface through the injected validator.UpdateWorkloadbegins. Verification uses the same path asthv run.UpdateWorkload's completion so the upgraded config is durably persisted before returning (both CLI and API are synchronous).Type of change
Test plan
task test) — no-op when up-to-date; resolve/verify/pull failure leaves the workload untouched (UpdateWorkloadnever called); ordering assertion (verify+pull before update, completion awaited); the exact gated+pulled config is the deployed one; full-config preservation; copy-before-mutate; completion-failure surfaces as an error.task lint-fix)Changes
pkg/workloads/upgrade/applier.goApplier,Apply, config merge/preservationpkg/workloads/upgrade/applier_test.goDoes this introduce a user-facing change?
No — internal apply path. The
thv upgrade applycommand andPOST .../upgradeendpoint that drive it land in PR 5.Large PR Justification
This PR is ~1,015 lines, but 576 are unit tests and 439 are a single source file (
applier.go).Applyis one sequence — load → re-check → resolve+verify → rebuild (preserving every user-owned field) → pull → recreate → await completion. Splitting it would create half-built intermediate states that neither compile meaningfully nor convey the security-critical ordering that is the whole point of the change.Special notes for reviewers
PR 4 of 6 in the RFC THV-0068 stack; based on #5409. This is the highest-risk change — the destructive recreate path. The verify-then-pull-before-destroy ordering and full-config preservation were the focus of the security/Go review (an earlier draft silently dropped auth/audit/tools-filter config on upgrade; now every user-owned field is preserved). On Kubernetes the byte-level pull is delegated to the kubelet, but verification + policy gate still precede destruction; local runtime is the RFC's scope.
🤖 Generated with Claude Code