| 1 | +--- |
| 2 | +Status: Draft |
| 3 | +Owner: HyperFleet Team |
| 4 | +Last Updated: 2026-04-03 |
| 5 | +--- |
| 6 | + |
| 7 | +# HyperFleet Release Contract |
| 8 | + |
| 9 | +## What & Why |
| 10 | + |
| 11 | +### What |
| 12 | + |
| 13 | +This document defines the formal release contract between the HyperFleet team and its consumer teams (GCP Offering Team and ROSA Regional Platform Team). It covers: |
| 14 | + |
| 15 | +- **Release handoff contract**: what artifacts are produced, how consumers are notified, and what SLAs apply |
| 16 | +- **Integration testing strategy**: how each team gates HyperFleet changes and how tests are coordinated across teams |
| 17 | +- **Test ownership map**: which team owns which layer of testing to eliminate redundancy and close coverage gaps |
| 18 | + |
| 19 | +### Why |
| 20 | + |
| 21 | +- Without a defined contract, release handoffs are ad-hoc and require manual coordination between teams, slowing delivery and increasing error risk |
| 22 | +- Unclear test ownership creates either coverage gaps (bugs reaching production) or overlapping test suites (longer pipelines without benefit) |
| 23 | +- Consumer teams deploying HyperFleet via Argo CD and Terraform need predictable, machine-consumable artifacts (OCI Helm charts) to automate rollout
| 24 | +- A shared testing strategy is a prerequisite for building confidence in end-to-end continuous delivery pipelines
| 25 | + |
| 26 | +### Out of scope |
| 27 | + |
| 28 | +- Automated deployment of HyperFleet releases to consumer integration environments
| 29 | + |
| 30 | +--- |
| 31 | + |
| 32 | +## Consumer Teams |
| 33 | + |
| 34 | +| Team | Platform | Deployment Method | |
| 35 | +|------|----------|-------------------| |
| 36 | +| GCP Offering Team | GCP | TBD (ref: GCP-334) | |
| 37 | +| ROSA Regional Platform | AWS ROSA | Argo CD + Terraform + AWS CodePipelines | |
| 38 | + |
| 39 | +### ROSA Platform Architecture |
| 40 | + |
| 41 | +The ROSA regional platform consumes HyperFleet as part of a GitOps deployment pipeline. Each deployment initiates three pipelines: |
| 42 | + |
| 43 | +```mermaid |
| 44 | +flowchart TD |
| 45 | + PR[HyperFleet PR / Release] --> OCI[OCI Helm Chart Registry] |
| 46 | + OCI --> BOM[Bill of Materials\nenvironment default config file] |
| 47 | + BOM --> P1[Pipeline 1\nEntry Point] |
| 48 | + P1 --> P2[Pipeline 2\nRegional Cluster Provisioning\nTerraform] |
| 49 | + P1 --> P3[Pipeline 3\nManagement Cluster Provisioning\nArgo CD] |
| 50 | + P2 --> ENV[Full Environment] |
| 51 | + P3 --> ENV |
| 52 | +``` |
| 53 | + |
| 54 | +Environment configuration is centralized in a `default` file that acts as the bill of materials for Argo CD reconciliation. Component versions, Git revisions, and domain names are defined there and can be overridden per environment. |
| 55 | + |
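The per-environment override described above can be pictured as a layered merge. The following is a minimal sketch in Python, not ROSA's actual mechanism: the real format of the `default` file is not shown in this document, so the key names (`components`, `chart_version`, `git_revision`, `domain`), the sample values, and the helper `resolve_bom` are all hypothetical.

```python
from copy import deepcopy


def resolve_bom(defaults: dict, overrides: dict) -> dict:
    """Return a copy of `defaults` with per-environment `overrides` applied.

    Nested dictionaries are merged key by key; any other value in
    `overrides` replaces the default outright.
    """
    resolved = deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(resolved.get(key), dict):
            resolved[key] = resolve_bom(resolved[key], value)
        else:
            resolved[key] = deepcopy(value)
    return resolved


# Hypothetical contents of the `default` bill of materials.
DEFAULTS = {
    "domain": "example.test",
    "components": {
        "hyperfleet-adapter": {"chart_version": "0.1.1", "git_revision": "v0.1.1"},
    },
}

# Hypothetical override for one environment: only the chart version changes.
INTEGRATION_OVERRIDES = {
    "components": {"hyperfleet-adapter": {"chart_version": "0.2.0"}},
}

integration_bom = resolve_bom(DEFAULTS, INTEGRATION_OVERRIDES)
```

In this sketch an environment only states what differs from the defaults, so a nightly run that swaps in a single HyperFleet component leaves every other pinned version untouched.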
| 56 | +### GCP Platform Architecture |
| 57 | + |
| 58 | +TBD |
| 59 | + |
| 60 | +--- |
| 61 | + |
| 62 | +## Release Handoff Contract |
| 63 | + |
| 64 | +### Release Artifacts |
| 65 | + |
| 66 | +For each HyperFleet release, the following artifacts are produced and made available to consumer teams: |
| 67 | + |
| 68 | +| Artifact | Location | Format | Notes | |
| 69 | +|----------|----------|--------|-------| |
| 70 | +| Container images | `quay.io/openshift-hyperfleet/hyperfleet-{component}:{version}` | OCI image | Built automatically by Prow on GA tag | |
| 71 | +| Helm charts | OCI registry (see [Helm Chart Distribution](#helm-chart-distribution)) | OCI artifact | Required for ROSA/Argo CD consumption | |
| 72 | +| Release notes | `hyperfleet-release` repo, `releases/release-X.Y/` | Markdown | Compatibility matrix, breaking changes, upgrade guide | |
| 73 | +| Compatibility matrix | `hyperfleet-release` repo | Markdown table | Maps validated component version combinations | |
| 74 | +| Git tags | Per-component repos + `hyperfleet-release` | `vX.Y.Z` / `release-X.Y` | See [Release Process](hyperfleet-release-process.md) | |
| 75 | + |
| 76 | + |
| 77 | +Each GA release records which ROSA and GCP versions passed the integration tests, and that record serves as the compatibility matrix. This makes it possible to introduce a breaking change in a release even if that release can then be deployed by only one of the pillars.
| 78 | + |
| 79 | +### Helm Chart Distribution |
| 80 | + |
| 81 | +**Current state**: ROSA consumes HyperFleet charts via Argo CD ApplicationSets that point directly to GitHub repos with a pinned `targetRevision` Git tag (e.g., `targetRevision: v0.1.1` on `https://github.com/openshift-hyperfleet/hyperfleet-adapter`). A freshly configured Argo CD instance does not support Git-sourced Helm charts without a plugin, which ROSA has not installed. This creates a tight coupling between HyperFleet Git tags and ROSA's deployment cadence. |
| 82 | + |
| 83 | +**Agreed path**: |
| 84 | + |
| 85 | +1. **Short-term (Q2 2026)**: ROSA team sets up a temporary OCI registry to publish HyperFleet Helm charts. This unblocks integration testing immediately. |
| 86 | +2. **Q2 target**: HyperFleet team publishes charts to an OCI-compliant registry via Conflux as part of the release pipeline, eliminating the temporary workaround and the Git coupling. |
| 87 | + |
| 88 | + |
| 89 | +### Notification SLA |
| 90 | + |
| 91 | +HyperFleet notifies consumer teams of release events as follows:
| 92 | + |
| 93 | +| Event | Channel | Timeline | Recipients | |
| 94 | +|-------|---------|----------|------------| |
| 95 | +| Release candidate available | `#hyperfleet-releases` Slack | RC cut day | GCP team, ROSA team | |
| 96 | +| GA release published | `#hyperfleet-releases` Slack | GA day | GCP team, ROSA team | |
| 97 | +| Breaking change in next release | `#hyperfleet-releases` Slack | ≥ 1 sprint before GA | GCP team, ROSA team | |
| 98 | +| Hotfix / patch release | `#hyperfleet-releases` Slack | Within 2 hours of GA tag | GCP team, ROSA team | |
| 99 | + |
| 100 | + |
| 101 | +As of April 2026, breaking changes do not block HyperFleet releases, because the ROSA and GCP teams do not yet have to maintain long-running clusters or migrate data.
| 102 | + |
| 103 | + |
| 104 | +### Rollback / Recovery |
| 105 | + |
| 106 | +HyperFleet uses a **roll-forward** strategy for MVP: issues are fixed via patch releases rather than rollback. See [Release Process — Release Recovery Strategy](hyperfleet-release-process.md#55-release-recovery-strategy). |
| 107 | + |
| 108 | +HyperFleet commits to the following (time targets are provisional; exact values TBD):
| 109 | + |
| 110 | +- Producing a patch release within **48 hours** for Blocker/Critical regressions |
| 111 | +- Producing a patch release within **1 week** for Major regressions |
| 112 | +- Maintaining N-1 backward compatibility so consumer teams can remain pinned to the previous validated release while a fix is in flight |
| 113 | + |
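The patch-release targets above reduce to simple date arithmetic. The sketch below is illustrative only: the targets are still marked TBD, the document does not say when the clock starts (this example assumes it starts when the regression is reported), and the names `PATCH_SLA` and `patch_deadline` are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Provisional targets taken from the list above; exact times are still TBD.
PATCH_SLA = {
    "blocker": timedelta(hours=48),
    "critical": timedelta(hours=48),
    "major": timedelta(weeks=1),
}


def patch_deadline(severity: str, reported_at: datetime) -> datetime:
    """Return the latest time a patch release is due for a regression.

    Assumption: the SLA clock starts at `reported_at`.
    """
    try:
        return reported_at + PATCH_SLA[severity.lower()]
    except KeyError:
        raise ValueError(f"no patch SLA defined for severity {severity!r}") from None


reported = datetime(2026, 4, 6, 9, 0, tzinfo=timezone.utc)
blocker_due = patch_deadline("Blocker", reported)  # 48 hours later
major_due = patch_deadline("Major", reported)      # 1 week later
```

Severities without a stated target raise an error here rather than silently receiving a default, since the contract defines no commitment for them.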
| 114 | +--- |
| 115 | + |
| 116 | +## Integration Testing Strategy |
| 117 | + |
| 118 | +### Decision: Nightly Runs with OCI Chart Injection |
| 119 | + |
| 120 | +**Agreed approach** (as of the March 31, 2026 meeting):
| 121 | +
| 122 | +- Start with **nightly runs** against the HyperFleet `main` branch, not presubmit jobs
| 123 | +- Test against the **latest known-good stable version** of the ROSA regional platform (production Maestro version), replacing only the HyperFleet component under test |
| 124 | +- The ROSA team will **temporarily enable OCI chart pushing** so the HyperFleet team can inject PR-built charts into the ROSA deployment pipeline |
| 125 | +- Evaluate **non-blocking presubmit** integration with the HyperFleet release repository as a follow-up |
| 126 | + |
| 127 | +**Rationale**: Running full ROSA environment provisioning (~40 minutes + E2E duration) as a presubmit would significantly impact development velocity without proportional benefit at the current team scale. Nightly runs provide meaningful feedback without blocking day-to-day development. |
| 128 | + |
| 129 | +**Note on ROSA's existing pre-merge capability**: The ROSA repo already has a working cross-component E2E pre-merge mechanism (triggered via Prow comment on PRs). The decision to start with nightly runs is about HyperFleet's readiness to onboard to that mechanism — not a limitation of the ROSA infrastructure. Per-PR testing remains the target once the OCI chart injection step is stable. |
| 130 | + |
| 131 | + |
| 132 | +### Team Test Ownership |
| 133 | + |
| 134 | +| Layer | Owner | Scope | Runs on | |
| 135 | +|-------|-------|-------|---------| |
| 136 | +| Unit tests | HyperFleet | Each component in isolation | Every PR (presubmit) | |
| 137 | +| Integration tests | HyperFleet | Cross-component API contracts | Every PR (presubmit) | |
| 138 | +| HyperFleet E2E | HyperFleet | HyperFleet stack end-to-end | Nightly (main branch) | |
| 139 | +| ROSA integration | ROSA Team | Full ROSA region + HyperFleet override | Nightly (HyperFleet main) | |
| 140 | +| GCP integration | GCP Team | GCP deployment + HyperFleet | TBD (ref: GCP-334) | |
| 141 | +| Release gate | HyperFleet | All of the above must pass | Before GA tag | |
| 142 | + |
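The release gate row can be read as a conjunction over the other layers in the table. The following is a minimal sketch of that rule, not the team's tooling: the function `release_gate` is hypothetical, and treating a layer with no recorded result (such as GCP integration while its schedule is TBD) as blocking is an assumption the document does not confirm.

```python
# Layer names copied from the ownership table above.
REQUIRED_LAYERS = (
    "Unit tests",
    "Integration tests",
    "HyperFleet E2E",
    "ROSA integration",
    "GCP integration",
)


def release_gate(results: dict[str, bool]) -> tuple[bool, list[str]]:
    """Return (gate_open, blocking_layers) for a candidate GA tag.

    Assumption: a layer with no recorded result counts as not passed.
    """
    blocking = [layer for layer in REQUIRED_LAYERS if not results.get(layer, False)]
    return (not blocking, blocking)


gate_open, blocking = release_gate({
    "Unit tests": True,
    "Integration tests": True,
    "HyperFleet E2E": True,
    "ROSA integration": True,
    # No result recorded for "GCP integration" in this example.
})
```

Returning the list of blocking layers alongside the boolean keeps the reason for a closed gate visible to whoever is cutting the tag.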
| 143 | +### Testing Gaps Identified |
| 144 | + |
| 145 | +| Gap | Owning Team | Mitigation | |
| 146 | +|-----|-------------|------------| |
| 147 | +| HyperFleet not yet onboarded to ROSA's pre-merge E2E mechanism | HyperFleet | Onboard to `openshift/release` Prow config + create `quay.io/rrp-dev-ci/` image repos (see the ROSA pre-merge onboarding guide under [Related Documents](#related-documents)) |
| 148 | +| Helm chart override (OCI) not yet wired into ROSA CI | ROSA + HyperFleet | Temporary OCI setup by ROSA team (Q2 2026, immediate action); replaced by Conflux Q2 target | |
| 149 | +| GCP integration tests not yet connected to HyperFleet CI | GCP + HyperFleet | Blocked on GCP-334 (CLM/Argo CD integration in progress) | |
| 150 | +| Multi-component PR testing (API + Adapter in same PR) | HyperFleet | Nightly tests use `main` for all other components; single-component override per nightly run is the starting point | |
| 151 | +| Presubmit integration gate for HyperFleet release repo | HyperFleet | Future action: non-blocking presubmit on `hyperfleet-release` repo | |
| 152 | + |
| 153 | + |
| 154 | +--- |
| 155 | + |
| 156 | +## Alternatives Considered |
| 157 | + |
| 158 | +### 1. Non-blocking Presubmit on HyperFleet Release Repository |
| 159 | + |
| 160 | +Run the full ROSA integration pipeline as an optional, non-blocking presubmit job triggered on the `hyperfleet-release` repo. |
| 161 | + |
| 162 | +**Rejected for now**: A non-blocking job that runs for 40+ minutes provides a weak signal; developers may ignore it, especially if failures are infrequent. Starting with nightly runs builds confidence in the pipeline before promoting it to presubmit. This remains a **future action**.
| 163 | + |
| 164 | +### 2. Consumer-Driven Contract Testing (Pact-style) |
| 165 | + |
| 166 | +Define formal API contracts using a consumer-driven contract testing tool (e.g., Pact). ROSA and GCP publish their expectations; HyperFleet CI verifies them on every PR. |
| 167 | + |
| 168 | +**Rejected for MVP**: The integration surface between HyperFleet and consumer teams is primarily at the Helm chart / deployment configuration level, not a REST API contract boundary. Consumer-driven contract testing tools are better suited to service-to-service REST contracts. Helm value schema validation is a lighter-weight alternative to investigate post-MVP. |
| 169 | + |
| 170 | +### 3. Automated Rollout to Integration Environments on GA |
| 171 | + |
| 172 | +Trigger automatic deployment of each HyperFleet GA release to ROSA and GCP integration environments via webhooks. |
| 173 | + |
| 174 | +**Rejected for MVP**: ROSA's pipeline takes ~40 minutes per run and requires environment-specific configuration overrides. Automating this safely requires tooling (OCI charts via Conflux, pipeline webhooks) not yet in place. Deferred to post-Q2 2026. |
| 175 | + |
| 176 | +--- |
| 177 | + |
| 178 | +## Related Documents |
| 179 | + |
| 180 | +- [HyperFleet Release Process](hyperfleet-release-process.md) — release cadence, branching, artifacts |
| 181 | +- [Versioning Trade-offs](versioning-trade-offs.md) — SDK versioning, rollback considerations |
| 182 | +- [E2E Testing Framework Spike Report](e2e-testing/e2e-testing-framework-spike-report.md) |
| 183 | +- [E2E Run Strategy Spike Report](e2e-testing/e2e-run-strategy-spike-report.md) |
| 184 | +- [ROSA — Adding a Component for Pre-merge E2E Testing](https://github.com/openshift-online/rosa-regional-platform/blob/main/docs/adding-component-pre-merge.md) — onboarding guide for the `/test rosa-regionality-compatibility-e2e` Prow trigger |
| 185 | +- [ROSA — Testing Strategy Design](https://github.com/openshift-online/rosa-regional-platform/blob/main/docs/design/testing-strategy.md) — three CI workflows (pre-merge, nightly integration, nightly ephemeral) |
| 186 | +- GCP-334 — CLM Components Deployment (linked Jira epic) |
| 187 | +- HYPERFLEET-633 — Define release contract and integration testing strategy |