Commit 7538ca9

HYPERFLEET-633 - docs: add HyperFleet release contract and integration testing strategy
1 parent 8137eea commit 7538ca9

1 file changed

Lines changed: 187 additions & 0 deletions

@@ -0,0 +1,187 @@
1+
---
2+
Status: Draft
3+
Owner: HyperFleet Team
4+
Last Updated: 2026-04-03
5+
---
6+
7+
# HyperFleet Release Contract
8+
9+
## What & Why
10+
11+
### What
12+
13+
This document defines the formal release contract between the HyperFleet team and its consumer teams (GCP Offering Team and ROSA Regional Platform Team). It covers:
14+
15+
- **Release handoff contract**: what artifacts are produced, how consumers are notified, and what SLAs apply
16+
- **Integration testing strategy**: how each team gates HyperFleet changes and how tests are coordinated across teams
17+
- **Test ownership map**: which team owns which layer of testing to eliminate redundancy and close coverage gaps
18+
19+
### Why
20+
21+
- Without a defined contract, release handoffs are ad-hoc and require manual coordination between teams, slowing delivery and increasing error risk
22+
- Unclear test ownership creates either coverage gaps (bugs reaching production) or overlapping test suites (longer pipelines without benefit)
23+
- Consumer teams deploying HyperFleet via Argo CD and Terraform need predictable, machine-consumable artifacts (OCI Helm charts) to automate rollout
24+
- A shared testing strategy is a prerequisite for trusting the continuous delivery pipelines end-to-end
25+
26+
### Out of scope
27+
28+
- No automated deployment of HyperFleet releases to consumer integration environments
29+
30+
---
31+
32+
## Consumer Teams
33+
34+
| Team | Platform | Deployment Method |
|------|----------|-------------------|
| GCP Offering Team | GCP | TBD (ref: GCP-334) |
| ROSA Regional Platform | AWS ROSA | Argo CD + Terraform + AWS CodePipelines |
38+
39+
### ROSA Platform Architecture
40+
41+
The ROSA regional platform consumes HyperFleet as part of a GitOps deployment pipeline. Each deployment initiates three pipelines:
42+
43+
```mermaid
flowchart TD
    PR[HyperFleet PR / Release] --> OCI[OCI Helm Chart Registry]
    OCI --> BOM[Bill of Materials\nenvironment default config file]
    BOM --> P1[Pipeline 1\nEntry Point]
    P1 --> P2[Pipeline 2\nRegional Cluster Provisioning\nTerraform]
    P1 --> P3[Pipeline 3\nManagement Cluster Provisioning\nArgo CD]
    P2 --> ENV[Full Environment]
    P3 --> ENV
```
53+
54+
Environment configuration is centralized in a `default` file that acts as the bill of materials for Argo CD reconciliation. Component versions, Git revisions, and domain names are defined there and can be overridden per environment.
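
The default-plus-override behaviour described here can be sketched as a simple merge. This is only an illustration in Python, not the ROSA platform's actual implementation: the key names (`hyperfleet_chart_version`, `git_revision`, `domain`), the sample values, and the `resolve_environment` helper are all hypothetical.

```python
from copy import deepcopy

# Hypothetical bill of materials; the real key names in the ROSA `default` file may differ.
DEFAULTS = {
    "hyperfleet_chart_version": "0.1.1",
    "git_revision": "main",
    "domain": "example.test",
}


def resolve_environment(defaults: dict, overrides: dict) -> dict:
    """Return the effective config: defaults with per-environment overrides applied."""
    unknown = set(overrides) - set(defaults)
    if unknown:
        # Reject keys the defaults file does not define, to catch typos early.
        raise KeyError(f"unknown override keys: {sorted(unknown)}")
    effective = deepcopy(defaults)
    effective.update(overrides)
    return effective


staging = resolve_environment(DEFAULTS, {"hyperfleet_chart_version": "0.2.0-rc1"})
print(staging["hyperfleet_chart_version"])  # overridden value
print(staging["domain"])                    # inherited from the defaults
```

Rejecting unknown keys is a design choice of this sketch: it keeps the defaults file as the single list of what can be configured, which is what makes it usable as a bill of materials.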
55+
56+
### GCP Platform Architecture
57+
58+
TBD
59+
60+
---
61+
62+
## Release Handoff Contract
63+
64+
### Release Artifacts
65+
66+
For each HyperFleet release, the following artifacts are produced and made available to consumer teams:
67+
68+
| Artifact | Location | Format | Notes |
|----------|----------|--------|-------|
| Container images | `quay.io/openshift-hyperfleet/hyperfleet-{component}:{version}` | OCI image | Built automatically by Prow on GA tag |
| Helm charts | OCI registry (see [Helm Chart Distribution](#helm-chart-distribution)) | OCI artifact | Required for ROSA/Argo CD consumption |
| Release notes | `hyperfleet-release` repo, `releases/release-X.Y/` | Markdown | Compatibility matrix, breaking changes, upgrade guide |
| Compatibility matrix | `hyperfleet-release` repo | Markdown table | Maps validated component version combinations |
| Git tags | Per-component repos + `hyperfleet-release` | `vX.Y.Z` / `release-X.Y` | See [Release Process](hyperfleet-release-process.md) |
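
As a rough illustration of how a consumer pipeline might handle the naming conventions in this table, the Python sketch below builds an image reference from the template and checks the two tag formats. It rests on assumptions the table does not settle: that `X`, `Y`, and `Z` are plain decimal numbers with no pre-release suffix, and that the image `{version}` is passed through verbatim (whether it carries a `v` prefix is not stated). The helper names are hypothetical.

```python
import re

IMAGE_TEMPLATE = "quay.io/openshift-hyperfleet/hyperfleet-{component}:{version}"

# Assumed interpretation of the tag formats listed in the table.
COMPONENT_TAG = re.compile(r"v\d+\.\d+\.\d+")   # vX.Y.Z
RELEASE_TAG = re.compile(r"release-\d+\.\d+")   # release-X.Y


def image_ref(component: str, version: str) -> str:
    """Fill the image template; the version string is used exactly as given."""
    return IMAGE_TEMPLATE.format(component=component, version=version)


def is_component_tag(tag: str) -> bool:
    return COMPONENT_TAG.fullmatch(tag) is not None


def is_release_tag(tag: str) -> bool:
    return RELEASE_TAG.fullmatch(tag) is not None


print(image_ref("adapter", "v0.1.1"))
print(is_component_tag("v0.1.1"))    # True
print(is_release_tag("release-0.1"))  # True
print(is_component_tag("0.1.1"))     # False: missing the leading "v"
```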
75+
76+
77+
When a GA release is published, it will list which ROSA/GCP versions passed the integration tests; that list serves as the compatibility matrix. This makes it possible to introduce a breaking change in a release that only one of the pillars may be able to deploy.
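
One way to picture how a consumer might read the compatibility matrix is the Python sketch below. It is a simplification under stated assumptions: the matrix is reduced to a pass/fail flag per pillar (the real matrix records consumer versions), and the release numbers and results are invented for illustration only.

```python
# Hypothetical matrix entries; versions and results are illustrative, not real data.
COMPATIBILITY = {
    "v0.2.0": {"ROSA": True, "GCP": True},
    "v0.3.0": {"ROSA": True, "GCP": False},  # a release only one pillar can deploy
}

PILLARS = ("ROSA", "GCP")


def deployable_by(release: str) -> list:
    """Pillars whose integration tests passed for the given release."""
    results = COMPATIBILITY.get(release, {})
    return [pillar for pillar in PILLARS if results.get(pillar, False)]


def latest_validated(pillar: str):
    """Newest release (by insertion order) validated for a pillar, or None."""
    validated = [rel for rel, res in COMPATIBILITY.items() if res.get(pillar, False)]
    return validated[-1] if validated else None


print(deployable_by("v0.3.0"))   # ['ROSA']
print(latest_validated("GCP"))   # v0.2.0
```

In this sketch a pillar that cannot take the newest release simply stays on its latest validated one.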
78+
79+
### Helm Chart Distribution
80+
81+
**Current state**: ROSA consumes HyperFleet charts via Argo CD ApplicationSets that point directly to GitHub repos with a pinned `targetRevision` Git tag (e.g., `targetRevision: v0.1.1` on `https://github.com/openshift-hyperfleet/hyperfleet-adapter`). A freshly configured Argo CD instance does not support Git-sourced Helm charts without a plugin, which ROSA has not installed. This creates a tight coupling between HyperFleet Git tags and ROSA's deployment cadence.
82+
83+
**Agreed path**:
84+
85+
1. **Short-term (Q2 2026)**: ROSA team sets up a temporary OCI registry to publish HyperFleet Helm charts. This unblocks integration testing immediately.
86+
2. **Q2 target**: HyperFleet team publishes charts to an OCI-compliant registry via Conflux as part of the release pipeline, eliminating the temporary workaround and the Git coupling.
87+
88+
89+
### Notification SLA
90+
91+
HyperFleet release events are announced as follows:
92+
93+
| Event | Channel | Timeline | Recipients |
|-------|---------|----------|------------|
| Release candidate available | `#hyperfleet-releases` Slack | RC cut day | GCP team, ROSA team |
| GA release published | `#hyperfleet-releases` Slack | GA day | GCP team, ROSA team |
| Breaking change in next release | `#hyperfleet-releases` Slack | ≥ 1 sprint before GA | GCP team, ROSA team |
| Hotfix / patch release | `#hyperfleet-releases` Slack | Within 2 hours of GA tag | GCP team, ROSA team |
99+
100+
101+
As of April 2026, breaking changes do not block HyperFleet releases, because the ROSA and GCP teams do not have to maintain long-running clusters or migrate data.
102+
103+
104+
### Rollback / Recovery
105+
106+
HyperFleet uses a **roll-forward** strategy for MVP: issues are fixed via patch releases rather than rollback. See [Release Process — Release Recovery Strategy](hyperfleet-release-process.md#55-release-recovery-strategy).
107+
108+
HyperFleet commits to (exact times TBD):
109+
110+
- Producing a patch release within **48 hours** for Blocker/Critical regressions
111+
- Producing a patch release within **1 week** for Major regressions
112+
- Maintaining N-1 backward compatibility so consumer teams can remain pinned to the previous validated release while a fix is in flight
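
The patch-release targets above can be expressed as a small lookup. The Python sketch below is illustrative only: the source marks the exact times as TBD and does not say when the clock starts, so starting it at the moment the regression is confirmed (`confirmed_at`) is an assumption, as is the `patch_due_by` helper itself.

```python
from datetime import datetime, timedelta, timezone

# Targets from the list above; the source marks the exact times as TBD.
PATCH_TARGETS = {
    "Blocker": timedelta(hours=48),
    "Critical": timedelta(hours=48),
    "Major": timedelta(weeks=1),
}


def patch_due_by(severity: str, confirmed_at: datetime):
    """Return the latest time a patch release should ship, or None if no target applies."""
    target = PATCH_TARGETS.get(severity)
    if target is None:
        return None
    return confirmed_at + target


confirmed = datetime(2026, 4, 6, 9, 0, tzinfo=timezone.utc)
print(patch_due_by("Critical", confirmed))  # 2026-04-08 09:00:00+00:00
print(patch_due_by("Major", confirmed))     # 2026-04-13 09:00:00+00:00
print(patch_due_by("Minor", confirmed))     # None
```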
113+
114+
---
115+
116+
## Integration Testing Strategy
117+
118+
### Decision: Nightly Runs with OCI Chart Injection
119+
120+
**Agreed approach** (as of the March 31, 2026 meeting):
121+
122+
- Start with **nightly runs** against HyperFleet `main` branch, not presubmit jobs
123+
- Test against the **latest known-good stable version** of the ROSA regional platform (production Maestro version), replacing only the HyperFleet component under test
124+
- The ROSA team will **temporarily enable OCI chart pushing** so the HyperFleet team can inject PR-built charts into the ROSA deployment pipeline
125+
- Evaluate **non-blocking presubmit** integration with the HyperFleet release repository as a follow-up
126+
127+
**Rationale**: Running full ROSA environment provisioning (~40 minutes + E2E duration) as a presubmit would significantly impact development velocity without proportional benefit at the current team scale. Nightly runs provide meaningful feedback without blocking day-to-day development.
128+
129+
**Note on ROSA's existing pre-merge capability**: The ROSA repo already has a working cross-component E2E pre-merge mechanism (triggered via Prow comment on PRs). The decision to start with nightly runs is about HyperFleet's readiness to onboard to that mechanism — not a limitation of the ROSA infrastructure. Per-PR testing remains the target once the OCI chart injection step is stable.
130+
131+
132+
### Team Test Ownership
133+
134+
| Layer | Owner | Scope | Runs on |
|-------|-------|-------|---------|
| Unit tests | HyperFleet | Each component in isolation | Every PR (presubmit) |
| Integration tests | HyperFleet | Cross-component API contracts | Every PR (presubmit) |
| HyperFleet E2E | HyperFleet | HyperFleet stack end-to-end | Nightly (main branch) |
| ROSA integration | ROSA Team | Full ROSA region + HyperFleet override | Nightly (HyperFleet main) |
| GCP integration | GCP Team | GCP deployment + HyperFleet | TBD (ref: GCP-334) |
| Release gate | HyperFleet | All of the above must pass | Before GA tag |
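
The release gate row can be read as an all-must-pass check over the other layers. The Python sketch below is one possible reading, not the team's tooling: the `release_gate` helper is hypothetical, and treating a layer with no reported result as failing is an assumption. The source lists the GCP integration schedule as TBD, so how that layer is handled in practice is still open.

```python
# Layer names taken from the ownership table; the gate logic itself is an assumption.
REQUIRED_LAYERS = (
    "Unit tests",
    "Integration tests",
    "HyperFleet E2E",
    "ROSA integration",
    "GCP integration",
)


def release_gate(results: dict) -> tuple:
    """Return (passed, blocking_layers) for a mapping of layer name to pass/fail.

    A layer with no reported result is treated as failing.
    """
    blocking = [layer for layer in REQUIRED_LAYERS if not results.get(layer, False)]
    return (not blocking, blocking)


nightly = {
    "Unit tests": True,
    "Integration tests": True,
    "HyperFleet E2E": True,
    "ROSA integration": False,
}
passed, blocking = release_gate(nightly)
print(passed)    # False
print(blocking)  # ['ROSA integration', 'GCP integration']
```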
142+
143+
### Testing Gaps Identified
144+
145+
| Gap | Owning Team | Mitigation |
|-----|-------------|------------|
| HyperFleet not yet onboarded to ROSA's pre-merge E2E mechanism | HyperFleet | Onboard to `openshift/release` Prow config + create `quay.io/rrp-dev-ci/` image repos (see the ROSA onboarding guide in [Related Documents](#related-documents)) |
| Helm chart override (OCI) not yet wired into ROSA CI | ROSA + HyperFleet | Temporary OCI setup by ROSA team (Q2 2026, immediate action); replaced by Conflux Q2 target |
| GCP integration tests not yet connected to HyperFleet CI | GCP + HyperFleet | Blocked on GCP-334 (CLM/Argo CD integration in progress) |
| Multi-component PR testing (API + Adapter in same PR) | HyperFleet | Nightly tests use `main` for all other components; single-component override per nightly run is the starting point |
| Presubmit integration gate for HyperFleet release repo | HyperFleet | Future action: non-blocking presubmit on `hyperfleet-release` repo |
152+
153+
154+
---
155+
156+
## Alternatives Considered
157+
158+
### 1. Non-blocking Presubmit on HyperFleet Release Repository
159+
160+
Run the full ROSA integration pipeline as an optional, non-blocking presubmit job triggered on the `hyperfleet-release` repo.
161+
162+
**Rejected for now**: A ~40-minute+ non-blocking job provides weak signal — developers may ignore it, especially if failures are infrequent. Starting with nightly runs builds confidence in the pipeline before promoting it to presubmit. This remains a **future action**.
163+
164+
### 2. Consumer-Driven Contract Testing (Pact-style)
165+
166+
Define formal API contracts using a consumer-driven contract testing tool (e.g., Pact). ROSA and GCP publish their expectations; HyperFleet CI verifies them on every PR.
167+
168+
**Rejected for MVP**: The integration surface between HyperFleet and consumer teams is primarily at the Helm chart / deployment configuration level, not a REST API contract boundary. Consumer-driven contract testing tools are better suited to service-to-service REST contracts. Helm value schema validation is a lighter-weight alternative to investigate post-MVP.
169+
170+
### 3. Automated Rollout to Integration Environments on GA
171+
172+
Trigger automatic deployment of each HyperFleet GA release to ROSA and GCP integration environments via webhooks.
173+
174+
**Rejected for MVP**: ROSA's pipeline takes ~40 minutes per run and requires environment-specific configuration overrides. Automating this safely requires tooling (OCI charts via Conflux, pipeline webhooks) not yet in place. Deferred to post-Q2 2026.
175+
176+
---
177+
178+
## Related Documents
179+
180+
- [HyperFleet Release Process](hyperfleet-release-process.md) — release cadence, branching, artifacts
181+
- [Versioning Trade-offs](versioning-trade-offs.md) — SDK versioning, rollback considerations
182+
- [E2E Testing Framework Spike Report](e2e-testing/e2e-testing-framework-spike-report.md)
183+
- [E2E Run Strategy Spike Report](e2e-testing/e2e-run-strategy-spike-report.md)
184+
- [ROSA — Adding a Component for Pre-merge E2E Testing](https://github.com/openshift-online/rosa-regional-platform/blob/main/docs/adding-component-pre-merge.md) — onboarding guide for the `/test rosa-regionality-compatibility-e2e` Prow trigger
185+
- [ROSA — Testing Strategy Design](https://github.com/openshift-online/rosa-regional-platform/blob/main/docs/design/testing-strategy.md) — three CI workflows (pre-merge, nightly integration, nightly ephemeral)
186+
- GCP-334 — CLM Components Deployment (linked Jira epic)
187+
- HYPERFLEET-633 — Define release contract and integration testing strategy
