Merged
Changes from 4 commits
@@ -0,0 +1,345 @@
---
apiVersion: tekton.dev/v1beta1
kind: Pipeline
metadata:
name: lightspeed-stack-integration-tests-pipeline
spec:
description: |
This pipeline automates the process of running end-to-end tests for Lightspeed Stack
using a ROSA (Red Hat OpenShift Service on AWS) cluster. The pipeline provisions
the ROSA cluster, installs the Lightspeed Stack, runs the tests, collects artifacts,
and finally deprovisions the ROSA cluster.
params:
- name: SNAPSHOT
description: 'The JSON string representing the snapshot of the application under test (includes lightspeed-stack image).'
default: '{"components": [{"name":"lightspeed-stack", "containerImage": "quay.io/example/lightspeed-stack:latest"}]}'
type: string
- name: llama-stack-image
description: 'Llama Stack runs from source on UBI (init container clones repo and installs deps). Kept for logging/backwards compatibility.'
default: 'run-from-source (UBI)'
type: string
- name: test-name
description: 'The name of the test corresponding to a defined Konflux integration test.'
default: 'lightspeed-stack-e2e-tests'
- name: namespace
description: 'Namespace to run tests in'
default: 'lightspeed-stack'
tasks:
- name: eaas-provision-space
taskRef:
resolver: git
params:
- name: url
value: https://github.com/konflux-ci/build-definitions.git
- name: revision
value: main
- name: pathInRepo
value: task/eaas-provision-space/0.1/eaas-provision-space.yaml
params:
- name: ownerKind
value: PipelineRun
- name: ownerName
value: $(context.pipelineRun.name)
- name: ownerUid
value: $(context.pipelineRun.uid)
- name: provision-cluster
runAfter:
- eaas-provision-space
taskSpec:
results:
- name: clusterName
value: "$(steps.create-cluster.results.clusterName)"
steps:
- name: pick-version
ref:
resolver: git
params:
- name: url
value: https://github.com/konflux-ci/build-definitions.git
- name: revision
value: main
- name: pathInRepo
value: stepactions/eaas-get-latest-openshift-version-by-prefix/0.1/eaas-get-latest-openshift-version-by-prefix.yaml
params:
- name: prefix
value: "4.19."
- name: create-cluster
ref:
resolver: git
params:
- name: url
value: https://github.com/konflux-ci/build-definitions.git
- name: revision
value: main
- name: pathInRepo
value: stepactions/eaas-create-ephemeral-cluster-hypershift-aws/0.1/eaas-create-ephemeral-cluster-hypershift-aws.yaml
params:
- name: eaasSpaceSecretRef
value: $(tasks.eaas-provision-space.results.secretRef)
- name: version
value: "$(steps.pick-version.results.version)"
- name: instanceType
value: "m5.large"
- name: get-stack-images
description: Extract lightspeed-stack image and commit from SNAPSHOT (Llama Stack runs from source in-pod)
runAfter:
- provision-cluster
params:
- name: SNAPSHOT
value: $(params.SNAPSHOT)
- name: namespace
value: "$(params.namespace)"
taskSpec:
results:
- name: lightspeed-stack-image
value: "$(steps.get-stack-images.results.lightspeed-stack-image)"
- name: commit
value: "$(steps.get-stack-images.results.commit)"
params:
- name: SNAPSHOT
- name: namespace
type: string
volumes:
- name: credentials
emptyDir: {}
steps:
- name: get-stack-images
image: registry.redhat.io/openshift4/ose-cli:latest
env:
- name: SNAPSHOT
value: $(params.SNAPSHOT)
results:
- name: lightspeed-stack-image
type: string
description: "lightspeed-stack container image from snapshot"
- name: commit
type: string
description: "commit sha to be used to store artifacts"
script: |
dnf -y install jq
echo -n "$(jq -r --arg n "lightspeed-stack" '.components[] | select(.name == $n) | .containerImage // ""' <<< "$SNAPSHOT")" > $(step.results.lightspeed-stack-image.path)
⚠️ Potential issue | 🟠 Major

Fail fast if lightspeed-stack is missing from SNAPSHOT.

Line 120 currently writes "" when the component lookup misses. tests/e2e-prow/rhoai/pipeline-konflux.sh:14-26 then falls back to quay.io/lightspeed-core/lightspeed-stack:dev-latest, so this task can silently test the wrong image instead of the snapshot under test.

Suggested change
-              echo -n "$(jq -r --arg n "lightspeed-stack" '.components[] | select(.name == $n) | .containerImage // ""' <<< "$SNAPSHOT")" > $(step.results.lightspeed-stack-image.path)
+              image="$(jq -er --arg n "lightspeed-stack" '.components[] | select(.name == $n) | .containerImage' <<< "$SNAPSHOT")" || {
+                echo "lightspeed-stack image missing from SNAPSHOT" >&2
+                exit 1
+              }
+              printf '%s' "$image" > "$(step.results.lightspeed-stack-image.path)"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.tekton/integration-tests/pipeline/lightspeed-stack-integration-test.yaml at
line 120, The task currently writes an empty string when jq fails to find the
"lightspeed-stack" component in SNAPSHOT which allows downstream tests to
silently use a fallback image; update the step to fail-fast: after computing the
image value with the existing jq expression (the part using --arg n
"lightspeed-stack" '.components[] | select(.name == $n) | .containerImage //
""'), check if the result is empty and if so print a clear error and exit
non-zero, otherwise write the image into
step.results.lightspeed-stack-image.path so the pipeline aborts when the
component is missing instead of running with the wrong image.
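
One gap in the suggested change: `jq -e` exits non-zero when the component is absent (no output) or `containerImage` is null, but an empty-string `containerImage` still exits 0. A minimal sketch of an extra guard for that case follows. The helper name `write_required_result` is hypothetical and not part of this PR.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not in the PR): write a value to a result file,
# refusing empty or literal "null" values so the step can fail fast.
write_required_result() {
  local name="$1" value="$2" path="$3"
  if [ -z "$value" ] || [ "$value" = "null" ]; then
    echo "$name missing from SNAPSHOT" >&2
    return 1
  fi
  # printf avoids the trailing newline that plain echo would add.
  printf '%s' "$value" > "$path"
}
```

In the step script this could be called as `write_required_result lightspeed-stack-image "$image" "$(step.results.lightspeed-stack-image.path)" || exit 1`, after `image` has been extracted with jq.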

echo -n "$(jq -r --arg n "lightspeed-stack" '.components[] | select(.name == $n) | .source.git.revision // "latest"' <<< "$SNAPSHOT")" > $(step.results.commit.path)
- name: echo-integration-params
description: Echo all params passed to lightspeed-stack-integration-tests for verification before the test runs.
runAfter:
- get-stack-images
params:
- name: SNAPSHOT
value: $(params.SNAPSHOT)
- name: lightspeedstackimage
value: $(tasks.get-stack-images.results.lightspeed-stack-image)
- name: llamastackimage
value: $(params.llama-stack-image)
- name: commit
value: $(tasks.get-stack-images.results.commit)
- name: namespace
value: "$(params.namespace)"
taskSpec:
params:
- name: SNAPSHOT
- name: lightspeedstackimage
- name: llamastackimage
- name: commit
- name: namespace
type: string
steps:
- name: echo-params
image: registry.access.redhat.com/ubi9/ubi-minimal
env:
- name: LIGHTSPEED_STACK_IMAGE
value: $(params.lightspeedstackimage)
- name: LLAMA_STACK_IMAGE
value: $(params.llamastackimage)
- name: NAMESPACE
value: $(params.namespace)
- name: COMMIT
value: $(params.commit)
- name: SNAPSHOT
value: $(params.SNAPSHOT)
script: |
echo "========== Integration test parameters (before lightspeed-stack-integration-tests) =========="
echo "LIGHTSPEED_STACK_IMAGE=$LIGHTSPEED_STACK_IMAGE"
echo "LLAMA_STACK_IMAGE=$LLAMA_STACK_IMAGE"
echo "NAMESPACE=$NAMESPACE"
echo "COMMIT=$COMMIT"
echo "SNAPSHOT length: $(echo -n "$SNAPSHOT" | wc -c) chars"
echo "SNAPSHOT (first 500 chars): $(echo -n "$SNAPSHOT" | head -c 500)"
echo "========== End parameters =========="
- name: lightspeed-stack-integration-tests
description: Task to run integration tests from lightspeed-stack repository
params:
- name: SNAPSHOT
value: $(params.SNAPSHOT)
- name: lightspeedstackimage
value: $(tasks.get-stack-images.results.lightspeed-stack-image)
- name: llamastackimage
value: $(params.llama-stack-image)
- name: commit
value: $(tasks.get-stack-images.results.commit)
- name: namespace
value: "$(params.namespace)"
- name: spaceRequestSecretName
value: $(tasks.eaas-provision-space.results.secretRef)
- name: clusterName
value: $(tasks.provision-cluster.results.clusterName)
runAfter:
- echo-integration-params
taskSpec:
params:
- name: SNAPSHOT
- name: lightspeedstackimage
- name: llamastackimage
- name: commit
- name: namespace
type: string
- name: spaceRequestSecretName
type: string
- name: clusterName
type: string
results:
- name: TEST_OUTPUT
description: Standardized JSON output for Enterprise Contract
Comment on lines +199 to +201
⚠️ Potential issue | 🟡 Minor

TEST_OUTPUT result is declared but never populated.

The TEST_OUTPUT result is defined for "Standardized JSON output for Enterprise Contract" but no step writes to it. This will cause the result to be empty, potentially breaking downstream consumers.

Either populate the result in the run-e2e-tests step or remove the declaration if it's not needed yet.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.tekton/integration-tests/pipeline/lightspeed-stack-integration-test.yaml
around lines 201 - 203, The TEST_OUTPUT task result is declared but never
written; update the run-e2e-tests step (or the step that produces Enterprise
Contract output) to write the standardized JSON to the task result named
TEST_OUTPUT (e.g., echo the JSON to $(results.TEST_OUTPUT.path)) so downstream
consumers receive the output, or remove the TEST_OUTPUT result declaration if no
step will produce that JSON; locate references to TEST_OUTPUT and the
run-e2e-tests step in the task spec and add a command that writes the JSON to
the results file path.
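
A minimal sketch of how the step could populate the result. The field set (`result`, `timestamp`, `successes`, `failures`, `warnings`, `note`) is an assumption about the Konflux-style TEST_OUTPUT schema, and the helper name `make_test_output` is hypothetical. Neither is defined by this PR.

```shell
#!/usr/bin/env bash
# Hypothetical helper: map a test exit code to a TEST_OUTPUT-style JSON line.
# The field names are an assumption about the expected schema.
make_test_output() {
  local exit_code="$1" result successes failures
  if [ "$exit_code" -eq 0 ]; then
    result="SUCCESS"; successes=1; failures=0
  else
    result="FAILURE"; successes=0; failures=1
  fi
  # Emit a single-line JSON object with a UTC timestamp.
  printf '{"result":"%s","timestamp":"%s","successes":%d,"failures":%d,"warnings":0,"note":"pipeline-konflux.sh exit code %d"}' \
    "$result" "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$successes" "$failures" "$exit_code"
}
```

Inside run-e2e-tests this would sit right after `PIPELINE_EXIT=$?`, for example `make_test_output "$PIPELINE_EXIT" > "$(results.TEST_OUTPUT.path)"`.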

volumes:
# Secrets for pipeline-konflux.sh (e2e-prow): OpenAI API key (secret key must be "openai-api-key"), Quay pull for images
- name: openai-api-key
secret:
secretName: openai-api-key
- name: quay-aipcc-name
secret:
secretName: quay-aipcc-name
- name: quay-aipcc-password
secret:
secretName: quay-aipcc-password
- name: credentials
emptyDir: {}
steps:
- name: get-kubeconfig
ref:
resolver: git
params:
- name: url
value: https://github.com/konflux-ci/build-definitions.git
- name: revision
value: main
- name: pathInRepo
value: stepactions/eaas-get-ephemeral-cluster-credentials/0.1/eaas-get-ephemeral-cluster-credentials.yaml
params:
- name: eaasSpaceSecretRef
value: $(params.spaceRequestSecretName)
- name: clusterName
value: "$(params.clusterName)"
- name: credentials
value: credentials
- name: run-e2e-tests
onError: continue
Comment on lines +233 to +234
⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🌐 Web query:

In Tekton Pipelines, if a step uses onError: continue and exits non-zero, does the enclosing TaskRun/PipelineRun still succeed when there are no later failing steps?

💡 Result:

Yes, if a step in a Tekton Task uses onError: continue and exits with a non-zero code, but all subsequent steps in the Task succeed, the enclosing TaskRun succeeds (SUCCEEDED condition True with reason Succeeded). The PipelineRun will also succeed if all its TaskRuns succeed and there are no later failing steps or PipelineTasks with onError: stopAndFail.



🏁 Script executed:

cat -n .tekton/integration-tests/pipeline/lightspeed-stack-integration-test.yaml | sed -n '240,260p'

Repository: lightspeed-core/lightspeed-stack

Length of output: 1159


🏁 Script executed:

wc -l .tekton/integration-tests/pipeline/lightspeed-stack-integration-test.yaml

Repository: lightspeed-core/lightspeed-stack

Length of output: 153


🏁 Script executed:

rg 'PIPELINE_EXIT' .tekton/integration-tests/pipeline/lightspeed-stack-integration-test.yaml -B 5 -A 5

Repository: lightspeed-core/lightspeed-stack

Length of output: 767


🏁 Script executed:

# Check if there are any steps defined after the run-e2e-tests step
cat -n .tekton/integration-tests/pipeline/lightspeed-stack-integration-test.yaml | sed -n '245,280p'

Repository: lightspeed-core/lightspeed-stack

Length of output: 1790


🏁 Script executed:

cat -n .tekton/integration-tests/pipeline/lightspeed-stack-integration-test.yaml | sed -n '280,359p'

Repository: lightspeed-core/lightspeed-stack

Length of output: 4753


onError: continue makes failing E2E runs non-blocking and allows the task to succeed despite test failures.

This is the last active step in the task; all subsequent steps are commented out. With onError: continue, even if pipeline-konflux.sh exits with a non-zero code, the TaskRun will still report success, so broken E2E tests will not fail CI. Remove onError: continue or uncomment the fail-if-any-step-failed step before merge.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.tekton/integration-tests/pipeline/lightspeed-stack-integration-test.yaml
around lines 248 - 249, The pipeline step named "run-e2e-tests" currently has
onError: continue which masks failures; remove the onError: continue from the
run-e2e-tests step (or alternatively re-enable the follow-up
"fail-if-any-step-failed" step) so that a non-zero exit from pipeline-konflux.sh
causes the TaskRun to fail; update the task YAML around the run-e2e-tests step
and ensure pipeline-konflux.sh remains the sole active test step or that
fail-if-any-step-failed is uncommented to enforce failure propagation.
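
If `onError: continue` is kept so that later artifact steps can still run, a final step has to re-raise the failure. Tekton records the exit code of a continued step in a file, exposed to later steps as `$(steps.step-run-e2e-tests.exitCode.path)`. A minimal sketch of such a check follows. The helper name `fail_if_step_failed` is hypothetical, and this is not the catalog's `fail-if-any-step-failed` implementation. Whether the file exists for a step that succeeded is not verified here, so the sketch treats a missing file as an error to stay fail-safe.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read a step's recorded exit code from a file and
# return it, so the calling step fails when the earlier step failed.
fail_if_step_failed() {
  local exit_code_file="$1" code
  if [ ! -r "$exit_code_file" ]; then
    echo "exit code file not found: $exit_code_file" >&2
    return 1
  fi
  # Strip whitespace so a trailing newline does not break the comparison.
  code="$(tr -d '[:space:]' < "$exit_code_file")"
  case "$code" in
    ''|*[!0-9]*) echo "unexpected exit code content: '$code'" >&2; return 1 ;;
  esac
  if [ "$code" -ne 0 ]; then
    echo "earlier step exited with code $code" >&2
    return "$code"
  fi
}
```

A final step could then run `fail_if_step_failed "$(steps.step-run-e2e-tests.exitCode.path)"` as its last command.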

resources:
requests:
cpu: '1'
memory: 1Gi
limits:
memory: 10Gi
volumeMounts:
# Mount paths expected by pipeline-konflux.sh (tests/e2e-prow/rhoai/pipeline-konflux.sh)
- name: openai-api-key
mountPath: /var/run/openai
- name: quay-aipcc-name
mountPath: /var/run/quay-aipcc-name
- name: quay-aipcc-password
mountPath: /var/run/quay-aipcc-password
- name: credentials
mountPath: /credentials
env:
- name: KUBECONFIG
value: "/credentials/$(steps.get-kubeconfig.results.kubeconfig)"
- name: ARTIFACT_DIR
value: "/workspace/artifacts"
- name: SUITE_ID
value: "nosuite"
- name: KONFLUX_BOOL
value: "true"
- name: LIGHTSPEED_STACK_IMAGE
value: "$(params.lightspeedstackimage)"
- name: LLAMA_STACK_IMAGE
value: "$(params.llamastackimage)"
- name: NAMESPACE
value: "$(params.namespace)"
- name: SNAPSHOT
value: $(params.SNAPSHOT)
image: registry.access.redhat.com/ubi9/ubi-minimal
script: |
set +e
echo "[e2e] 1/8 Starting run-e2e-tests step"
echo "[e2e] 2/8 Installing deps (git, tar, jq, curl-minimal, python3, gettext for envsubst)..."
microdnf -y install git tar jq curl-minimal python3 gettext
echo "[e2e] 3/8 Downloading oc client..."
curl -sL -o oc.tar.gz https://mirror.openshift.com/pub/openshift-v4/x86_64/clients/ocp/latest-4.19/openshift-client-linux-amd64-rhel9.tar.gz
tar -xzf oc.tar.gz && chmod +x kubectl oc && mv oc kubectl /usr/local/bin/
echo "[e2e] 4/8 SNAPSHOT (length ${#SNAPSHOT} chars; clone URL fixed — SNAPSHOT is main/upstream, not fork)..."
# Fixed fork + branch: Konflux SNAPSHOT points at main repo/rev, not the PR/fork under test.
REPO_URL=$(jq -r '.components[] | select(.name == "lightspeed-stack") | .source.git.url // "https://github.com/lightspeed-core/lightspeed-stack.git"' <<< "$SNAPSHOT")
REPO_REV=$(jq -r '.components[] | select(.name == "lightspeed-stack") | .source.git.revision // "main"' <<< "$SNAPSHOT")
echo "[e2e] 5/8 Clone $REPO_URL @ $REPO_REV"
git clone -q "$REPO_URL" /workspace/lightspeed-stack
cd /workspace/lightspeed-stack && git fetch origin "$REPO_REV" && git checkout -q "$REPO_REV"
echo "[e2e] 6/8 Entering tests/e2e-prow/rhoai"
cd tests/e2e-prow/rhoai && chmod +x pipeline-konflux.sh && ls -la pipeline-konflux.sh
echo "[e2e] 7/8 Running pipeline-konflux.sh (this should take 10+ min)..."
./pipeline-konflux.sh
PIPELINE_EXIT=$?
echo "[e2e] 8/8 pipeline-konflux.sh exited with code $PIPELINE_EXIT"
exit $PIPELINE_EXIT
# - name: gather-cluster-resources
# onError: continue
# ref:
# resolver: git
# params:
# - name: url
# value: https://github.com/konflux-ci/tekton-integration-catalog
# - name: revision
# value: main
# - name: pathInRepo
# value: stepactions/gather-cluster-resources/0.1/gather-cluster-resources.yaml
# params:
# - name: credentials
# value: "credentials"
# - name: kubeconfig
# value: "$(steps.get-kubeconfig.results.kubeconfig)"
# - name: artifact-dir
# value: "/workspace/konflux-artifacts"
# # validate that the cluster resources are available in another tekton step
# - name: list-artifacts
# onError: continue
# image: quay.io/konflux-qe-incubator/konflux-qe-tools:latest
# workingDir: "/workspace"
# script: |
# #!/bin/bash
# ls -la /workspace
# - name: push-artifacts
# (requires volume ols-konflux-artifacts-bot-creds + Secret ols-konflux-artifacts-bot)
# onError: continue
# ref:
# resolver: git
# params:
# - name: url
# value: https://github.com/konflux-ci/tekton-integration-catalog.git
# - name: revision
# value: main
# - name: pathInRepo
# value: stepactions/secure-push-oci/0.1/secure-push-oci.yaml
# params:
# - name: workdir-path
# value: /workspace
# - name: oci-ref
# value: "quay.io/openshift-lightspeed/ols-service-artifacts:$(params.commit)"
# - name: credentials-volume-name
# value: ols-konflux-artifacts-bot-creds
# - name: fail-if-any-step-failed
# ref:
# resolver: git
# params:
# - name: url
# value: https://github.com/konflux-ci/tekton-integration-catalog.git
# - name: revision
# value: main
# - name: pathInRepo
# value: stepactions/fail-if-any-step-failed/0.1/fail-if-any-step-failed.yaml
10 changes: 6 additions & 4 deletions docs/e2e_testing.md
@@ -69,9 +69,11 @@ tests/e2e/
tests/e2e-prow/
└── rhoai/ # RHOAI / OpenShift E2E
├── run-tests.sh # Entry to run E2E in Prow
- ├── pipeline.sh # Main pipeline definition
- ├── pipeline-services.sh # Services pipeline
- ├── pipeline-vllm.sh # vLLM pipeline
+ ├── pipeline.sh # Prow: full vLLM + LCS + behave (main branch workflow)
+ ├── pipeline-konflux.sh # Konflux: OpenAI Llama run-from-source + run-ci.yaml + behave
+ ├── pipeline-services.sh # Services for Prow (vLLM llama-stack image + LCS)
+ ├── pipeline-services-konflux.sh # Services for Konflux (llama-stack-openai + templated LCS)
+ ├── pipeline-vllm.sh # vLLM cluster setup (called from pipeline.sh)
├── pipeline-test-pod.sh # Test pod pipeline
├── configs/ # Lightspeed-stack configs for Prow (used by environment.py when is_prow)
│ ├── lightspeed-stack.yaml
@@ -80,7 +82,7 @@ tests/e2e-prow/
│ ├── lightspeed-stack-auth-rh-identity.yaml
│ ├── lightspeed-stack-no-cache.yaml
│ ├── lightspeed-stack-invalid-feedback-storage.yaml
- │ └── run.yaml # Llama Stack run config for Prow
+ │ └── run.yaml # vLLM Llama Stack config (used by pipeline.sh); Konflux uses tests/e2e/configs/run-ci.yaml via pipeline-konflux.sh
├── scripts/
│ ├── e2e-ops.sh # E2E ops (e.g. disrupt/restore llama-stack) — called from prow_utils
│ ├── bootstrap.sh
@@ -0,0 +1,25 @@
name: Lightspeed Core Service (LCS)
service:
host: 0.0.0.0
port: 8080
auth_enabled: false
workers: 1
color_log: true
access_log: true
llama_stack:
use_as_library_client: false
url: http://${env.E2E_LLAMA_HOSTNAME}:8321
api_key: xyzzy
user_data_collection:
feedback_enabled: true
feedback_storage: "/tmp/data/feedback"
transcripts_enabled: true
transcripts_storage: "/tmp/data/transcripts"
authentication:
module: "noop"
mcp_servers:
- name: "mcp-file"
provider_id: "model-context-protocol"
url: "http://mock-mcp:3001"
authorization_headers:
Authorization: "/tmp/invalid-mcp-token"