Konflux integration#145

Open
AdamSaleh wants to merge 21 commits into main from konflux-integration

@AdamSaleh

Collaborator

This should be ready for review.

The code itself was produced mostly by Claude, but I reviewed and tested it extensively.

Intent of the pipeline:

- The pipeline uses a dedicated test image split into two parts: a base that should be rebuilt infrequently, and a layer on top of it that contains all of the scripts used in the pipeline.
- The scripts are configured through environment variables, which makes them easy to use from pipelines or when testing locally.
- The pipeline provisions its own Konflux cluster, based on arm64 HyperShift, version 4.14.
- It installs the appropriate catalog and then the latest operator.
- It runs the parallel test suite from https://github.com/rh-gitops-release-qa/gitops-operator, which hosts the fork of gitops-operator, to allow fast test updates.
- After the run finishes, it pushes installation logs, test logs, and assorted debug information as a Quay artifact.
- It sends a message to the gitops-test-notification channel.
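
To illustrate the env-var driven configuration mentioned above, here is a minimal Python sketch (the real scripts in this PR are shell scripts). The variable names `TEST_IMAGE_URL`, `CATALOG_URL`, and `CATALOG_REVISION` appear in the commits of this PR; the `ScriptConfig` type, the `load_config` helper, and all default values are placeholders I made up for illustration.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class ScriptConfig:
    """Settings a test script needs; the field set is illustrative only."""
    test_image_url: str
    catalog_url: str
    catalog_revision: str


def load_config(env: Mapping[str, str] = os.environ) -> ScriptConfig:
    """Read settings from environment variables.

    The defaults are hypothetical placeholders, not the values used by
    the actual pipeline.
    """
    return ScriptConfig(
        test_image_url=env.get("TEST_IMAGE_URL", "quay.io/example/test-image:latest"),
        catalog_url=env.get("CATALOG_URL", "https://example.com/catalog.git"),
        catalog_revision=env.get("CATALOG_REVISION", "main"),
    )
```

Because the environment is passed in as a mapping, the same function works from a pipeline step (real environment) and from a local test (a plain dict).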

Outstanding questions:

- What OpenShift version should we test against?
- Should there be a difference when running against master?
- Does any part of the pipeline look wonky?

@AdamSaleh

Collaborator Author

/retest

2 similar comments
@AdamSaleh

Collaborator Author

/retest

@AdamSaleh

Collaborator Author

/retest

AdamSaleh and others added 7 commits June 4, 2026 13:13
There are currently four test-suites being run:
- gitops-operator's e2e ginkgo test-suite, sharded into 3 scripts
- the rollouts e2e tests
- gitops operator's ui test verifying login (more tests to come)
- the argocd tests in a separate pipeline

There is a simple parameterized pipeline where you can choose:
- the openshift version
- size of cluster nodes
- the channel to be used in the catalog
- the test-script to run

A second, separate pipeline installs standalone argocd and runs the
e2e tests.

All tests run from a precompiled Docker image. At startup the pipeline
checks whether the images changed and rebuilds them if so. The test
and utility scripts are always copied.

The logs are uploaded to Quay.
At the end of the pipeline, it sends a message to the
gitops-test-notification channel on Slack.

The code is mostly authored by prompting claude and tested
against the v1.20 branch of the catalog repo.

Assisted-by: Claude <usersafety@anthropic.com>
Signed-off-by: Adam Saleh <adam@asaleh.net>
- Add GATE_LABEL param to both pipelines: when set, push events always
  run but pull_request events require the specified label on the PR
  (checked via GitHub API). Default empty = no gating.
- Add BUILD_TEST_IMAGE param: when "true", builds test image from source
  via build-ginkgo-test-image task; otherwise uses pre-built TEST_IMAGE_URL.
- Add resolve-test-image task to select built vs pre-built image URL.
- Add run-sanity-tests.sh: release sanity script covering CSV validation,
  operator health, toolchain version check (with Confluence Component
  Matrix lookup), ArgoCD login test, and app sync smoke test.
- Update test-operator task with confluence-credentials volume mount.
- Remove non-UI test scenario files (moved to cluster-only).
- Update default TEST_IMAGE_URL to konflux_v1.21.0 tag.
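
The GATE_LABEL rule described above can be sketched as a small decision function. This is a Python illustration of the logic only, not the task's actual script; the name `should_run` is hypothetical, and treating event types other than `pull_request` as ungated is my assumption (the commit message only specifies push and pull_request).

```python
from typing import Sequence


def should_run(event_type: str, pr_labels: Sequence[str], gate_label: str = "") -> bool:
    """Decide whether the pipeline should proceed.

    - Empty gate_label: no gating, always run.
    - push events: always run.
    - pull_request events: run only if the PR carries gate_label.
    """
    if not gate_label:
        return True
    if event_type != "pull_request":
        # Assumption: anything that is not a pull_request is treated like push.
        return True
    return gate_label in pr_labels
```

For example, `should_run("pull_request", ["bug"], "release-candidate")` is false, while the same call for a `push` event is true.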

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
AdamSaleh force-pushed the konflux-integration branch from 2e84399 to acd01fe, June 5, 2026 12:56
Four scenarios covering latest channel (default, fips, upgrade, upgrade-fips)
using run-sanity-tests.sh with GATE_LABEL=release-candidate.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@AdamSaleh

Collaborator Author

/retest

The parse-metadata task reads event type and PR number from pod labels
(pac.test.appstudio.openshift.io/*), but these labels may not be
propagated to integration test PipelineRun pods. This adds a fallback
that reads the PipelineRun labels directly via the Kubernetes API,
plus diagnostic logging to help debug label propagation issues.
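
A rough Python sketch of the fallback order described above: prefer the pod label, fall back to the PipelineRun label, and log where each value came from. The two label dictionaries are assumed to be fetched already (the Kubernetes API call is left out), and the key suffixes `event-type` and `pull-request` are assumptions, since the commit message only gives the `pac.test.appstudio.openshift.io/*` prefix.

```python
import sys
from typing import Dict, Mapping, Optional, Tuple

PREFIX = "pac.test.appstudio.openshift.io/"
EVENT_KEY = PREFIX + "event-type"    # assumed suffix
PR_KEY = PREFIX + "pull-request"     # assumed suffix


def pick_label(
    key: str,
    pod_labels: Mapping[str, str],
    pipelinerun_labels: Mapping[str, str],
) -> Tuple[Optional[str], str]:
    """Return (value, source), preferring the pod label over the PipelineRun label."""
    if pod_labels.get(key):
        return pod_labels[key], "pod"
    if pipelinerun_labels.get(key):
        return pipelinerun_labels[key], "pipelinerun"
    return None, "missing"


def parse_metadata(
    pod_labels: Mapping[str, str],
    pipelinerun_labels: Mapping[str, str],
) -> Dict[str, Optional[str]]:
    """Resolve event type and PR number, logging the source of each value."""
    metadata: Dict[str, Optional[str]] = {}
    for name, key in (("event_type", EVENT_KEY), ("pr_number", PR_KEY)):
        value, source = pick_label(key, pod_labels, pipelinerun_labels)
        # Diagnostic output goes to stderr so it does not mix with results.
        print(f"{name}: {value!r} (source: {source})", file=sys.stderr)
        metadata[name] = value
    return metadata
```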

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@AdamSaleh

Collaborator Author

/retest

1 similar comment
@AdamSaleh

Collaborator Author

/retest

AdamSaleh and others added 9 commits June 8, 2026 15:27
When build-test-image is skipped (BUILD_TEST_IMAGE=false), Tekton
cascade-skips any task that references its results. Adding default
values to the results means resolve-test-image receives "" instead
of being skipped, allowing it to fall through to the TEST_IMAGE_URL
fallback. This was causing all downstream tasks (install-operator,
test-operator) to be skipped after provision-cluster.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
… ref

Revert result defaults (unsupported by this Tekton version) and instead
remove the $(tasks.build-test-image.results.IMAGE_URL) reference from
resolve-test-image params. Pass "" so it always falls through to
TEST_IMAGE_URL. BUILD_TEST_IMAGE is not actively used; wiring can be
restored when needed.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace the inline resolve-test-image pass-through with a new
overlay-test-scripts task that builds a thin scripts layer on top of
the pre-built base image. This ensures new/changed scripts (like
run-sanity-tests.sh) are always available without full image rebuilds.

The task clones the catalog repo, hashes scripts/ and config/ dirs,
and skips the build on cache hit (skopeo inspect). On miss it builds
a single-layer overlay with buildah and pushes to quay.
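
The content-hash step can be sketched as follows. This is a Python illustration under my own assumptions, not the task's script: the helper names, the `scripts-` tag prefix, and the 12-character truncation are all hypothetical, and the registry lookup (skopeo inspect) and the buildah build are left out.

```python
import hashlib
from pathlib import Path
from typing import Iterable


def hash_dirs(root: Path, subdirs: Iterable[str] = ("scripts", "config")) -> str:
    """Return a SHA-256 hex digest over the files under the given subdirectories.

    Files are visited in sorted order and both the relative path and the
    content are hashed, so renames and edits both change the digest.
    """
    digest = hashlib.sha256()
    for subdir in subdirs:
        base = root / subdir
        if not base.is_dir():
            continue  # a missing directory contributes nothing
        for path in sorted(p for p in base.rglob("*") if p.is_file()):
            digest.update(path.relative_to(root).as_posix().encode("utf-8"))
            digest.update(b"\0")
            digest.update(path.read_bytes())
            digest.update(b"\0")
    return digest.hexdigest()


def overlay_tag(root: Path, length: int = 12) -> str:
    """Derive an image tag from the hash; an existing tag means a cache hit."""
    return "scripts-" + hash_dirs(root)[:length]
```

The tag is stable as long as nothing under the hashed directories changes, which is what lets the task skip the build when an image with that tag already exists.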

Both operator and argocd e2e pipelines now use this task.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The grep pattern for extracting PR labels assumed "name":"value"
(no space after colon), but GitHub returns "name": "value" with
a space, causing label detection to always fail.
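
To make the whitespace issue concrete, here is a small Python sketch (the actual fix is in a grep pattern in a shell script). It assumes the API response is a JSON array of label objects that each carry a "name" field; the function names are hypothetical. The regex tolerates optional whitespace around the colon, and the JSON-based variant avoids the formatting dependency altogether.

```python
import json
import re
from typing import List

# \s* on both sides of the colon accepts "name":"x" as well as "name": "x".
LABEL_NAME_RE = re.compile(r'"name"\s*:\s*"([^"]*)"')


def extract_label_names(labels_json_text: str) -> List[str]:
    """Extract label names with a whitespace-tolerant regex."""
    return LABEL_NAME_RE.findall(labels_json_text)


def extract_label_names_json(labels_json_text: str) -> List[str]:
    """Extract label names by parsing the JSON, independent of formatting."""
    return [item["name"] for item in json.loads(labels_json_text)]
```

Both forms return `["release-candidate"]` for `[{"id": 1, "name": "release-candidate"}]`, with or without the space after the colon.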

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Separate the image resolution and scripts overlay into distinct tasks:
- resolve-test-image: inline task that picks the base image (build
  output via K8s API when BUILD_TEST_IMAGE=true, or TEST_IMAGE_URL)
- overlay-test-scripts: builds scripts layer on top of resolved base

resolve-test-image now has runAfter: [build-test-image] so it waits
for the full build when active, then passes the build output to the
overlay task. When build is skipped (common case), resolve runs
immediately with the pre-built TEST_IMAGE_URL.
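
The selection rule in resolve-test-image boils down to something like the Python sketch below. The function name and parameter names are hypothetical, and falling back to TEST_IMAGE_URL when the build flag is set but no build output is found is my assumption, not something the commit states.

```python
def resolve_base_image(build_test_image: str, built_image_url: str, test_image_url: str) -> str:
    """Pick the base image for the scripts overlay.

    build_test_image mirrors the BUILD_TEST_IMAGE param (the string "true"
    enables the build); built_image_url is the build output, if any.
    """
    if build_test_image == "true" and built_image_url:
        return built_image_url
    # Common case: build skipped, use the pre-built image.
    # Assumption: also used when the build flag is set but the output is empty.
    return test_image_url
```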

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The guestbook app sync failed because the ArgoCD application
controller lacked permissions in the target namespace. Label the
namespace with argocd.argoproj.io/managed-by so the operator
automatically creates the required RoleBindings.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The guestbook deployment can take a while to pull its image on EaaS
clusters. The sanity test validates that ArgoCD can sync an app, not
that the container starts quickly. Accept Synced as the primary
success condition — Progressing health is noted but not a failure.
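
The relaxed pass condition could be expressed like this Python sketch. It is not the code from run-sanity-tests.sh; `evaluate_app` is a hypothetical name, and treating health states other than Healthy or Progressing as a failure is my assumption, since the commit only addresses Progressing.

```python
from typing import Tuple


def evaluate_app(sync_status: str, health_status: str) -> Tuple[bool, str]:
    """Return (passed, note) for an Argo CD application status."""
    if sync_status != "Synced":
        return False, f"sync status is {sync_status!r}, expected 'Synced'"
    if health_status == "Healthy":
        return True, "Synced and Healthy"
    if health_status == "Progressing":
        # Slow image pulls leave the app Progressing; note it, do not fail.
        return True, "Synced; health is still Progressing (noted, not a failure)"
    # Assumption: any other health state (for example Degraded) fails the check.
    return False, f"Synced but health is {health_status!r}"
```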

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The guestbook app pulls gcr.io/google-samples/gb-frontend:v5 which
is slow on EaaS clusters, causing the health check to time out at
Progressing. Replace with a ConfigMap-only app from the catalog repo
itself — no image pull, instant Synced+Healthy, still validates the
full ArgoCD sync path.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Point the ArgoCD smoke test at a ConfigMap in the catalog repo itself
(.tekton/test-image/config/smoke-app/) instead of the guestbook app.
Uses CATALOG_URL and CATALOG_REVISION env vars so ArgoCD syncs from
the same branch the pipeline is running from — the smoke-app path
exists on that branch.

No image pull needed, instant Synced+Healthy.
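
As an illustration, the Application that the smoke test points at the catalog repo might be built roughly like this. It is a Python sketch that returns the manifest as a dict; the application name, both namespaces, and the helper name are hypothetical, while `CATALOG_URL`, `CATALOG_REVISION`, and the smoke-app path come from the commit message.

```python
import os
from typing import Any, Dict, Mapping

SMOKE_APP_PATH = ".tekton/test-image/config/smoke-app"


def build_smoke_app(
    env: Mapping[str, str] = os.environ,
    name: str = "smoke-app",                      # hypothetical app name
    argocd_namespace: str = "openshift-gitops",   # assumed Argo CD namespace
    dest_namespace: str = "smoke-test",           # hypothetical target namespace
) -> Dict[str, Any]:
    """Build an Argo CD Application manifest that syncs the smoke app.

    Raises KeyError if CATALOG_URL or CATALOG_REVISION is not set.
    """
    return {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "Application",
        "metadata": {"name": name, "namespace": argocd_namespace},
        "spec": {
            "project": "default",
            "source": {
                "repoURL": env["CATALOG_URL"],
                # Same branch the pipeline runs from, so the path exists there.
                "targetRevision": env["CATALOG_REVISION"],
                "path": SMOKE_APP_PATH,
            },
            "destination": {
                "server": "https://kubernetes.default.svc",
                "namespace": dest_namespace,
            },
        },
    }
```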

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@AdamSaleh

Collaborator Author

/retest

2 similar comments
@AdamSaleh

Collaborator Author

/retest

@AdamSaleh

Collaborator Author

/retest

Add IntegrationTestScenario definitions for all test types:

- rc-operator-check: parallel, parallel-fips, sequential-s1, sequential-s2,
  rollouts, parallel-upgrade, sequential-s1-upgrade (7 scenarios)
- rc-argocd-check: argocd-e2e, argocd-e2e-fips (2 scenarios)
- rc-ui-check: ui-e2e (1 scenario)

All scenarios are optional and gated on PR labels. Only operator
and sanity test groups include upgrade testing variants.

Also adds catalogUrl/catalogRevision params to test-operator task
for smoke test app source resolution.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@AdamSaleh

Collaborator Author

/retest

…switch

- Add fallback to pre-compiled argocd binary from test image when all
  extraction methods fail (IDMS mirror + arch mismatch on EaaS clusters)
- Add wait_for_argocd_reconciliation() to ensure ArgoCD workloads are
  updated with new images before tests run after an operator upgrade
- Switch test suite repo from rh-gitops-release-qa to redhat-developer
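
The waiting behaviour of wait_for_argocd_reconciliation() follows the usual poll-until-deadline pattern, sketched here in Python. This is not the function from the PR: the check itself (whether workloads already run the new images) is passed in as a callable, and the timeout and interval defaults are placeholders.

```python
import time
from typing import Callable


def wait_for_reconciliation(
    is_reconciled: Callable[[], bool],
    timeout_s: float = 600.0,     # placeholder default
    interval_s: float = 10.0,     # placeholder default
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll is_reconciled until it returns True or the timeout expires.

    Returns True on success and False on timeout. The check always runs
    at least once, even with a zero timeout.
    """
    deadline = clock() + timeout_s
    while True:
        if is_reconciled():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

Injecting `sleep` and `clock` keeps the loop testable without real waiting.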

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>