
Add upgrade-check REST endpoints for workloads #5408

Merged
JAORMX merged 6 commits into main from upgrade-2-check-api
Jun 2, 2026

Collaborator

@JAORMX JAORMX commented Jun 1, 2026

Summary

The upgrade checker (PR #5407) is only useful if clients can reach it. CLI, Studio, and automation should share one backend source of truth for upgrade availability instead of each reimplementing registry drift detection. This exposes the checker over the existing workloads API. Part of RFC THV-0068, local scope.

  • Add GET /api/v1beta/workloads/upgrade-check (batch) and GET /api/v1beta/workloads/{name}/upgrade-check (single).
  • The batch handler reuses the exact group/all authorization scoping of the list endpoint and intersects it with the enumerated run configs, so it can never report a workload outside the caller's scope.
  • Responses carry only non-sensitive CheckResult metadata; secret env-var defaults are cleared. Both routes are read-only and skip image pulls, so they use the standard request timeout.
  • Regenerate the OpenAPI spec.
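
The scoping rule for the batch handler can be sketched as a set intersection. This is a minimal illustration, not the handler's actual code: `inScope` and the sample workload names are hypothetical, and it assumes both sides can be reduced to workload names.

```go
package main

import (
	"fmt"
	"sort"
)

// inScope returns the names present both in the caller's authorized workload
// list and in the run configs enumerated on disk, sorted for stable output.
// Hypothetical helper; the real handler works on richer types.
func inScope(authorized, runConfigs []string) []string {
	allowed := make(map[string]struct{}, len(authorized))
	for _, name := range authorized {
		allowed[name] = struct{}{}
	}
	out := make([]string, 0, len(runConfigs))
	for _, name := range runConfigs {
		if _, ok := allowed[name]; ok {
			out = append(out, name)
		}
	}
	sort.Strings(out)
	return out
}

func main() {
	// What the list endpoint would return for this caller.
	authorized := []string{"fetch", "github"}
	// Everything found on disk, including configs outside the caller's scope.
	runConfigs := []string{"github", "stale", "fetch", "other-group"}
	fmt.Println(inScope(authorized, runConfigs)) // [fetch github]
}
```

Because the result is built only from names that pass the authorization filter, a stale or out-of-scope run config on disk cannot appear in the response.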

Type of change

  • New feature

Test plan

  • Unit tests (task test) — single (up-to-date / available / 404 / 400), batch (mixed results, group filter, stale-on-disk config excluded, unloadable config skipped-not-fatal), a no-secret-leak assertion, and a chi route-ordering test (/upgrade-check does not resolve to getWorkload).
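
The property the route-ordering test guards is that the literal `/upgrade-check` path must not be captured as `{name}`. The sketch below uses a hand-rolled matcher as a stand-in for chi, so it shows the expected precedence rather than chi's implementation. Handler labels other than `getWorkload` are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

const prefix = "/api/v1beta/workloads/"

// resolve returns a label for the handler a GET path should reach.
// The literal segment is checked before the {name} wildcard.
func resolve(path string) string {
	if !strings.HasPrefix(path, prefix) {
		return "notFound"
	}
	rest := strings.TrimPrefix(path, prefix)
	const suffix = "/upgrade-check"
	switch {
	case rest == "upgrade-check":
		return "batchUpgradeCheck" // literal route wins over {name}
	case strings.HasSuffix(rest, suffix) && strings.Count(rest, "/") == 1 && len(rest) > len(suffix):
		return "singleUpgradeCheck" // {name}/upgrade-check with a non-empty name
	case rest != "" && !strings.Contains(rest, "/"):
		return "getWorkload" // plain {name}
	default:
		return "notFound"
	}
}

func main() {
	fmt.Println(resolve("/api/v1beta/workloads/upgrade-check"))
	fmt.Println(resolve("/api/v1beta/workloads/fetch/upgrade-check"))
	fmt.Println(resolve("/api/v1beta/workloads/fetch"))
}
```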
  • Linting (task lint-fix)

Changes

| File | Change |
| --- | --- |
| pkg/api/v1/workloads_upgrade.go | Batch + single check handlers, run-config enumeration |
| pkg/api/v1/workloads.go | Route registration + injected deps |
| pkg/api/v1/workload_types.go | Response types |
| docs/server/* | Regenerated OpenAPI |

Does this introduce a user-facing change?

Yes — two new read-only REST endpoints for querying upgrade availability. No existing endpoints change.
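
For a client, the two routes differ only in whether a workload name is in the path. A small sketch of building both URLs follows. The base address is a placeholder, `batchURL` and `singleURL` are hypothetical helpers, and the `group` query parameter is taken from the `?group=` filter discussed in this PR.

```go
package main

import (
	"fmt"
	"net/url"
)

// batchURL builds the batch upgrade-check URL, optionally scoped to a group.
func batchURL(base, group string) string {
	u := base + "/api/v1beta/workloads/upgrade-check"
	if group != "" {
		u += "?" + url.Values{"group": {group}}.Encode()
	}
	return u
}

// singleURL builds the per-workload URL, escaping the name as a path segment.
func singleURL(base, name string) string {
	return base + "/api/v1beta/workloads/" + url.PathEscape(name) + "/upgrade-check"
}

func main() {
	base := "http://localhost:8080" // placeholder address, not from the PR
	fmt.Println(batchURL(base, "default"))
	fmt.Println(singleURL(base, "fetch"))
}
```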

Large PR Justification

This PR is ~1,434 lines, but only 231 are hand-written source. The rest is mechanically generated or test code that cannot be meaningfully split out:

  • 764 lines are auto-generated OpenAPI (docs/server/docs.go, swagger.json, swagger.yaml), produced by task docs from the swag annotations on the two new handlers. These must be regenerated and committed in the same PR (a Verify Swagger Documentation CI check fails otherwise), and they cannot be split.
  • 439 lines are integration tests for the new endpoints, which must ship with the handlers they cover.
  • The 231 source lines are the two read-only handlers plus their response types, already the smallest coherent API slice (check endpoints only; the apply endpoint is deferred to PR #5411, "Add upgrade apply for the CLI and API").

Special notes for reviewers

PR 2 of 6 in the RFC THV-0068 stack; based on #5407. The dedicated apply endpoint (POST .../upgrade) lands in PR 5, reusing the response types added here.

🤖 Generated with Claude Code

@github-actions github-actions Bot added the size/XL Extra large PR: 1000+ lines changed label Jun 1, 2026
Contributor

@github-actions github-actions Bot left a comment

Large PR Detected

This PR exceeds 1000 lines of changes and requires justification before it can be reviewed.

How to unblock this PR:

Add a section to your PR description with the following format:

## Large PR Justification

[Explain why this PR must be large, such as:]
- Generated code that cannot be split
- Large refactoring that must be atomic
- Multiple related changes that would break if separated
- Migration or data transformation

Alternative:

Consider splitting this PR into smaller, focused changes (< 1000 lines each) for easier review and reduced risk.

See our Contributing Guidelines for more details.


This review will be automatically dismissed once you add the justification section.

CLI and API users have no way to discover when a newer version of a
registry-sourced MCP server is available; only Studio implements drift
detection, in its frontend. Introduce a backend package that all clients
can consume.

Add pkg/workloads/upgrade with a Checker that compares a running
workload's image tag against its registry entry (semver-aware, with a
string fallback) and reports environment-variable and configuration
(transport / permission-profile / network-isolation) drift. Comparison
degrades safely to "unknown" for :latest, digest refs, repository
changes, and non-registry-sourced workloads, so only a strictly-newer
tag on the same repository yields "upgrade-available".
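
The comparison rules above can be sketched roughly as follows. This is an illustration under stated assumptions, not the Checker's code: `check`, `splitRef`, and `parseVersion` are hypothetical names, only plain `MAJOR.MINOR.PATCH` tags with an optional `v` prefix are treated as semver, the `"up-to-date"` status string is a guess, and a registry tag older than the running one is assumed to count as up to date.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// splitRef splits "repo:tag". Digest and untagged references return ok=false
// because they carry no comparable tag.
func splitRef(ref string) (repo, tag string, ok bool) {
	if strings.Contains(ref, "@") {
		return "", "", false
	}
	slash := strings.LastIndex(ref, "/")
	colon := strings.LastIndex(ref, ":")
	if colon <= slash { // no tag, or the colon belongs to a registry port
		return "", "", false
	}
	return ref[:colon], ref[colon+1:], true
}

// parseVersion accepts "1.2.3" or "v1.2.3" only (assumption: no pre-release).
func parseVersion(tag string) ([3]int, bool) {
	var v [3]int
	parts := strings.Split(strings.TrimPrefix(tag, "v"), ".")
	if len(parts) != 3 {
		return v, false
	}
	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil || n < 0 {
			return v, false
		}
		v[i] = n
	}
	return v, true
}

// check compares the running image reference against the registry's.
func check(current, candidate string) string {
	curRepo, curTag, ok1 := splitRef(current)
	newRepo, newTag, ok2 := splitRef(candidate)
	if !ok1 || !ok2 || curRepo != newRepo || curTag == "latest" || newTag == "latest" {
		return "unknown"
	}
	cur, okCur := parseVersion(curTag)
	next, okNext := parseVersion(newTag)
	if !okCur || !okNext { // string fallback: only equality is decidable
		if curTag == newTag {
			return "up-to-date"
		}
		return "unknown"
	}
	for i := range cur {
		if next[i] > cur[i] {
			return "upgrade-available"
		}
		if next[i] < cur[i] {
			break // registry is older; assumed not to be an upgrade
		}
	}
	return "up-to-date"
}

func main() {
	fmt.Println(check("ghcr.io/org/srv:v1.2.0", "ghcr.io/org/srv:v1.3.0"))
	fmt.Println(check("ghcr.io/org/srv:latest", "ghcr.io/org/srv:v1.3.0"))
	fmt.Println(check("ghcr.io/org/srv:v1.2.0", "ghcr.io/other/srv:v1.3.0"))
}
```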

This is the read-only detection core (RFC THV-0068, phase A); the apply
path, API endpoints, and CLI follow in later changes.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@JAORMX JAORMX force-pushed the upgrade-1-detection branch from b1361f7 to 3350917 Compare June 1, 2026 09:31
@JAORMX JAORMX force-pushed the upgrade-2-check-api branch from 0224c57 to 4574212 Compare June 1, 2026 09:31
@github-actions github-actions Bot added size/XL Extra large PR: 1000+ lines changed and removed size/XL Extra large PR: 1000+ lines changed labels Jun 1, 2026

codecov Bot commented Jun 1, 2026

Codecov Report

❌ Patch coverage is 45.09804% with 56 lines in your changes missing coverage. Please review.
✅ Project coverage is 68.79%. Comparing base (db82aef) to head (6850600).

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| pkg/api/v1/workloads_upgrade.go | 56.79% | 23 Missing and 12 partials ⚠️ |
| pkg/api/v1/workloads.go | 0.00% | 21 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #5408      +/-   ##
==========================================
- Coverage   68.82%   68.79%   -0.03%     
==========================================
  Files         632      633       +1     
  Lines       64085    64181      +96     
==========================================
+ Hits        44105    44154      +49     
- Misses      16722    16760      +38     
- Partials     3258     3267       +9     

@JAORMX JAORMX marked this pull request as ready for review June 1, 2026 10:21
@JAORMX JAORMX requested a review from amirejaz as a code owner June 1, 2026 10:21
@github-actions github-actions Bot added size/XL Extra large PR: 1000+ lines changed and removed size/XL Extra large PR: 1000+ lines changed labels Jun 1, 2026
@github-actions github-actions Bot dismissed their stale review June 1, 2026 10:25

Large PR justification has been provided. Thank you!

Contributor

github-actions Bot commented Jun 1, 2026

✅ Large PR justification has been provided. The size review has been dismissed and this PR can now proceed with normal review.

Comment thread pkg/api/v1/workloads_upgrade.go Outdated
@JAORMX JAORMX force-pushed the upgrade-2-check-api branch from 4574212 to 6faa7be Compare June 1, 2026 13:16
@github-actions github-actions Bot added size/XL Extra large PR: 1000+ lines changed and removed size/XL Extra large PR: 1000+ lines changed labels Jun 1, 2026
JAORMX and others added 4 commits June 1, 2026 13:23
- Lowercase an uppercase "V" tag prefix so semver comparison works;
  "V1.2.0" vs "V1.3.0" no longer falls through to undecidable and
  hides a real upgrade.
- Drop the raw provider error from CheckResult.Reason (it is serialized
  into the API response and can leak internal addressing); log it at
  DEBUG and return a fixed string. Same for the CheckAll path.
- Add a defensive default to the comparison switch so an unexpected
  value yields StatusUnknown rather than the least-safe StatusUpToDate.
- Stop reporting network-isolation drift: the registry has no
  network-isolation field, so it fired for every isolated workload
  regardless of the candidate version. Remove the ConfigDrift field
  and the now-unused BoolChange type.
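
Two of the fixes above, the `V` prefix normalization and the defensive default, can be sketched as below. `StatusUnknown` and `StatusUpToDate` are named in the commit message. `StatusUpgradeAvailable`, the string values, `normalizeTag`, and `statusFor` are hypothetical, and the integer comparison result is an assumed representation.

```go
package main

import (
	"fmt"
	"strings"
)

type Status string

const (
	StatusUpToDate         Status = "up-to-date"        // value assumed
	StatusUpgradeAvailable Status = "upgrade-available" // name assumed
	StatusUnknown          Status = "unknown"
)

// normalizeTag lowercases a leading "V" so "V1.2.0" parses like "v1.2.0".
func normalizeTag(tag string) string {
	if strings.HasPrefix(tag, "V") {
		return "v" + tag[1:]
	}
	return tag
}

// statusFor maps a three-way comparison of candidate vs. current
// (-1 older, 0 equal, +1 newer) to a Status.
func statusFor(cmp int) Status {
	switch cmp {
	case 1:
		return StatusUpgradeAvailable
	case 0, -1:
		return StatusUpToDate
	default:
		// An unexpected value must not be reported as up to date.
		return StatusUnknown
	}
}

func main() {
	fmt.Println(normalizeTag("V1.2.0"), statusFor(1), statusFor(42))
}
```

Falling through to `StatusUnknown` keeps an unforeseen comparison result from hiding a real upgrade behind an up-to-date report.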

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

CLI, Studio, and automation need a single backend source of truth for
upgrade availability instead of each client reimplementing registry
drift detection. Expose the Phase A checker over the existing workloads
API.

Add GET /api/v1beta/workloads/upgrade-check (batch) and
GET /api/v1beta/workloads/{name}/upgrade-check (single). The batch
handler reuses the exact group/all authorization scoping of the list
endpoint and intersects it with the enumerated run configs, so it can
never report a workload outside the caller's scope. Responses carry only
non-sensitive CheckResult metadata; secret env-var defaults are cleared.
Both routes are read-only and skip image pulls, so they use the standard
timeout. Regenerate the OpenAPI spec.
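
Clearing secret defaults before serialization might look like the sketch below. The `EnvVarChange` type, its fields, and `redactSecrets` are hypothetical stand-ins for the real response types, which this page does not show.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// EnvVarChange is a hypothetical response entry for one environment variable.
type EnvVarChange struct {
	Name    string `json:"name"`
	Secret  bool   `json:"secret"`
	Default string `json:"default,omitempty"`
}

// redactSecrets returns a copy with the default cleared on every secret
// variable, leaving the caller's slice untouched.
func redactSecrets(vars []EnvVarChange) []EnvVarChange {
	out := make([]EnvVarChange, len(vars))
	copy(out, vars)
	for i := range out {
		if out[i].Secret {
			out[i].Default = ""
		}
	}
	return out
}

func main() {
	vars := []EnvVarChange{
		{Name: "API_TOKEN", Secret: true, Default: "s3cr3t"},
		{Name: "LOG_LEVEL", Default: "info"},
	}
	body, err := json.Marshal(redactSecrets(vars))
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))
}
```

Redacting on a copy keeps the in-memory check result intact for server-side use while the serialized response stays free of secret values.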

The dedicated apply endpoint (POST .../upgrade) follows once the Applier
lands (RFC THV-0068, phase B).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

FilterByGroup returns an empty slice (nil error) for a group that does
not exist, so a typo'd ?group= silently returned 200 with an empty list
instead of the documented 404. Check group existence explicitly via the
group manager before filtering, in both the upgrade-check and the
listWorkloads handlers, so the advertised 404 is real. Add a bulk
upgrade-check test covering the unknown-group path.
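
The fix amounts to checking existence before filtering. A minimal sketch with the standard library follows. The `knownGroups` map stands in for the group manager, and `upgradeCheck` and `statusCode` are hypothetical names rather than the PR's handlers.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// knownGroups is a stand-in for the group manager's existence lookup.
var knownGroups = map[string]bool{"default": true}

// upgradeCheck rejects an unknown ?group= with 404 before any filtering,
// so a typo cannot be mistaken for an empty but valid group.
func upgradeCheck(w http.ResponseWriter, r *http.Request) {
	group := r.URL.Query().Get("group")
	if group != "" && !knownGroups[group] {
		http.Error(w, "group not found", http.StatusNotFound)
		return
	}
	w.Header().Set("Content-Type", "application/json")
	// An existing group with no workloads is still a 200 with an empty list.
	_ = json.NewEncoder(w).Encode([]string{})
}

// statusCode runs the handler against an in-memory request.
func statusCode(target string) int {
	rec := httptest.NewRecorder()
	upgradeCheck(rec, httptest.NewRequest(http.MethodGet, target, nil))
	return rec.Code
}

func main() {
	fmt.Println(statusCode("/api/v1beta/workloads/upgrade-check?group=default")) // 200
	fmt.Println(statusCode("/api/v1beta/workloads/upgrade-check?group=defualt")) // 404
}
```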

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

The upgrade detection change removed the ConfigDrift.NetworkIsolation
field and the BoolChange type, so regenerate the committed OpenAPI spec
to drop the stale schema and property. Fixes the Verify Swagger
Documentation CI check.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@JAORMX JAORMX force-pushed the upgrade-1-detection branch from 0431342 to 6cb9506 Compare June 1, 2026 13:27
@JAORMX JAORMX requested a review from lujunsan as a code owner June 1, 2026 13:27
@JAORMX JAORMX force-pushed the upgrade-2-check-api branch from 6faa7be to 664e594 Compare June 1, 2026 13:27
@github-actions github-actions Bot added size/XL Extra large PR: 1000+ lines changed and removed size/XL Extra large PR: 1000+ lines changed labels Jun 1, 2026
jhrozek previously approved these changes Jun 1, 2026
Base automatically changed from upgrade-1-detection to main June 2, 2026 08:33
@JAORMX JAORMX dismissed jhrozek’s stale review June 2, 2026 08:33

The base branch was changed.

@github-actions github-actions Bot added size/XL Extra large PR: 1000+ lines changed and removed size/XL Extra large PR: 1000+ lines changed labels Jun 2, 2026
@JAORMX JAORMX merged commit 16ce897 into main Jun 2, 2026
45 checks passed
@JAORMX JAORMX deleted the upgrade-2-check-api branch June 2, 2026 11:28
Labels

size/XL Extra large PR: 1000+ lines changed

2 participants