
Commit bcd6cd7

Merge pull request #2259 from tkatila/e2e-add-ocp-operator-support
e2e: rewrite operator tests with OCP support
2 parents 554410b + 81364bc commit bcd6cd7

18 files changed

Lines changed: 1466 additions & 44 deletions

DEVEL.md

Lines changed: 94 additions & 0 deletions
@@ -10,6 +10,7 @@ Table of Contents
 * [Work with Intel Device Plugins Operator Modifications](#work-with-intel-device-plugins-operator-modifications)
 * [Publish a New Version of the Intel Device Plugins Operator to operatorhub.io](#publish-a-new-version-of-the-intel-device-plugins-operator-to-operatorhubio)
 * [Run E2E Tests](#run-e2e-tests)
+* [Run Operator E2E Tests](#run-operator-e2e-tests)
 * [Run Controller Tests with a Local Control Plane](#run-controller-tests-with-a-local-control-plane)
 * [How to Develop Simple Device Plugins](#how-to-develop-simple-device-plugins)
 * [Logging](#logging)
@@ -268,6 +269,99 @@ without a pre-configured Kubernetes cluster. Just make sure you have
 make test-with-kind
 ```
 
+### Run Operator E2E Tests
+
+The operator E2E tests (`test/e2e/operator/`) verify that the Intel Device Plugins Operator
+correctly deploys each plugin DaemonSet and exposes device resources as allocatable on nodes.
+The tests cover QAT (cy, dc), SGX, DSA (idxd, vfio), IAA, and GPU plugins.
+
+The following environment variables must be set before running:
+
+| Variable | Required | Description |
+|---|---|---|
+| `PROJECT_NAMESPACE` | always | Kubernetes namespace where the operator is deployed |
+| `IMAGE_PATH` | always | Image path under the registry (e.g. `myproject`) |
+| `PLUGIN_VERSION` | always | Plugin image tag (e.g. `0.35.0`) |
+| `IMAGE_REGISTRY` | K8s only | Container registry address (e.g. `registry.example.com:5000`); omit on OCP |
+
+#### Vanilla Kubernetes
+
+**Prerequisites:**
+
+- `cert-manager` must already be installed in the cluster (the `cert-manager` namespace must exist).
+  The tests will fail early if it is absent. See the [cert-manager install guide](https://cert-manager.io/docs/installation/).
+- Node Feature Discovery (NFD) is deployed automatically by the test suite if no pods are found in the `node-feature-discovery` namespace.
+- The `PROJECT_NAMESPACE` namespace is created automatically if it does not yet exist.
+- The operator is deployed from `deployments/operator/default/kustomization.yaml`.
+
+```bash
+export PLUGIN_VERSION=0.35.0
+export PROJECT_NAMESPACE=inteldeviceplugin-operator
+export IMAGE_PATH=myproject
+export IMAGE_REGISTRY=registry.example.com:5000
+KUBECONFIG=/path/to/kubeconfig make e2e-operator
+```
+
+Running operator tests with existing container images (0.35.0):
+```bash
+export PLUGIN_VERSION=0.35.0
+export PROJECT_NAMESPACE=inteldeviceplugin-operator
+export IMAGE_PATH=intel
+export IMAGE_REGISTRY=docker.io
+KUBECONFIG=/path/to/kubeconfig make e2e-operator
+```
+
+#### OpenShift (OCP)
+
+**Prerequisites:**
+
+- The OCP service-CA operator is used for TLS; `cert-manager` is **not** required and must not
+  interfere.
+- NFD is expected to already be running, either via the OpenShift NFD Operator
+  (`openshift-nfd` namespace) or standard upstream NFD (`node-feature-discovery` namespace).
+  The test suite deploys NFD automatically only if neither namespace has running pods.
+- The `PROJECT_NAMESPACE` namespace **must exist before running**; it is not created automatically
+  on OCP. For OCP internal registry access, `PROJECT_NAMESPACE` and `IMAGE_PATH` typically refer
+  to the same OpenShift project.
+- Do **not** set `IMAGE_REGISTRY`; the tests default to the OCP internal registry
+  (`image-registry.openshift-image-registry.svc:5000`).
+- The operator is deployed from `deployments/operator/overlays/ocp/kustomization.yaml`.
+- Plugin images must be mirrored to the OCP internal registry before running:
+
+```bash
+# TAG must be the same as or greater than the current release semver
+TAG=0.35.1 make set-version dockerfiles mirror-images-ocp
+```
+
+Then run the tests:
+
+```bash
+export PROJECT_NAMESPACE=inteldeviceplugin-operator
+export IMAGE_PATH=inteldeviceplugin-operator
+export PLUGIN_VERSION=0.35.1
+KUBECONFIG=/path/to/kubeconfig make e2e-operator
+```
+
+#### Filtering Operator Tests
+
+The `make e2e-operator` target uses the label filter `operator && !(gpu || iaa)` by default.
+Use `--ginkgo.label-filter` directly when you need a different subset:
+
+```bash
+# Run only the QAT operator tests
+KUBECONFIG=/path/to/kubeconfig go test -v ./test/e2e/... \
+    -ginkgo.v --ginkgo.label-filter "operator && qat" \
+    -delete-namespace-on-failure=false
+
+# Run all operator tests including GPU and IAA
+KUBECONFIG=/path/to/kubeconfig go test -v ./test/e2e/... \
+    -ginkgo.v --ginkgo.label-filter "operator" \
+    -delete-namespace-on-failure=false
+```
+
+Available per-plugin labels: `qat`, `sgx`, `dsa`, `iaa`, `gpu`.
+Focus labels within operator tests: `cy`, `dc` (QAT), `idxd`, `vfio` (DSA), `i915` (GPU).
+
 ### Run Controller Tests with a Local Control Plane
 
 The controller-runtime library provides a package for integration testing by
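
The variable requirements in the DEVEL.md table above could be checked before calling `make e2e-operator`. The following is a minimal sketch, not part of this commit: the function name `check_e2e_operator_env` and its `k8s`/`ocp` argument are hypothetical, and only the four variable names come from the documentation.

```bash
#!/usr/bin/env bash
# Hypothetical preflight helper (not in the repository). It mirrors the
# DEVEL.md table: three variables are always required, IMAGE_REGISTRY is
# required on vanilla Kubernetes and must stay unset on OCP.
# Usage: check_e2e_operator_env k8s|ocp
check_e2e_operator_env() {
    local platform="${1:-}" rc=0

    # Required on both platforms.
    [ -n "${PROJECT_NAMESPACE:-}" ] || { echo "PROJECT_NAMESPACE is not set" >&2; rc=1; }
    [ -n "${IMAGE_PATH:-}" ] || { echo "IMAGE_PATH is not set" >&2; rc=1; }
    [ -n "${PLUGIN_VERSION:-}" ] || { echo "PLUGIN_VERSION is not set" >&2; rc=1; }

    case "${platform}" in
        k8s)
            # Vanilla Kubernetes needs an explicit registry address.
            [ -n "${IMAGE_REGISTRY:-}" ] || { echo "IMAGE_REGISTRY is not set" >&2; rc=1; }
            ;;
        ocp)
            # On OCP the tests default to the internal registry.
            [ -z "${IMAGE_REGISTRY:-}" ] || { echo "IMAGE_REGISTRY must not be set on OCP" >&2; rc=1; }
            ;;
        *)
            echo "usage: check_e2e_operator_env k8s|ocp" >&2
            rc=1
            ;;
    esac

    return "${rc}"
}
```

A possible invocation would be `check_e2e_operator_env ocp && KUBECONFIG=/path/to/kubeconfig make e2e-operator`, so that a missing or stray variable is reported before the test binary starts.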

Makefile

Lines changed: 19 additions & 2 deletions
@@ -134,6 +134,11 @@ bundle-build:
 clean:
 	@for cmd in $(cmds) ; do pwd=$(shell pwd) ; cd cmd/$$cmd ; $(GO) clean ; cd $$pwd ; done
 
+.PHONY: mirror-images-ocp
+mirror-images-ocp:
+	@bash scripts/build-images-for-ocp.sh
+	@bash scripts/mirror-images-to-ocp.sh inteldeviceplugin-operator
+
 ORG?=intel
 REG?=$(ORG)/
 TAG?=devel
@@ -161,14 +166,26 @@ e2e-dsa:
 e2e-iaa:
 	@$(GO) test -v ./test/e2e/... -ginkgo.v -ginkgo.show-node-events --ginkgo.label-filter "iaa && !operator" $(E2E_DRYRUN) -delete-namespace-on-failure=false
 
-# This is a CI specific target to run all tests that are possible in the SPR host
+# This is a CI specific target to run all tests that are possible in the SPR host.
+# Plugin tests run first, operator tests run second to avoid resource conflicts.
 e2e-spr:
-	@$(GO) test -v ./test/e2e/... -ginkgo.v -ginkgo.show-node-events --ginkgo.label-filter "(qat && !compress-perf && dpdk && (cy || dc) || sgx) && !operator " $(E2E_DRYRUN) -delete-namespace-on-failure=false -e2e-verify-service-account=false
+	@$(GO) test -v ./test/e2e/... -ginkgo.v -ginkgo.show-node-events \
+		--ginkgo.label-filter "(qat && !compress-perf && !nft && !operator) || (sgx && !operator)" $(E2E_DRYRUN) \
+		-delete-namespace-on-failure=false -e2e-verify-service-account=false
+	@$(GO) test -v ./test/e2e/... -ginkgo.v -ginkgo.show-node-events \
+		--ginkgo.label-filter "operator && (qat || sgx)" $(E2E_DRYRUN) \
+		-delete-namespace-on-failure=false -e2e-verify-service-account=false
 
 # Target to run a subset of tests depending on the given filter arg. By default, runs all tests.
 e2e:
 	@$(GO) test -v ./test/e2e/... -ginkgo.v -ginkgo.show-node-events --ginkgo.label-filter "$(E2E_FILTER)" $(E2E_DRYRUN) -delete-namespace-on-failure=false
 
+e2e-operator:
+	@echo "NOTE: These tests need the images produced by the 'mirror-images-ocp' target. It is not a prerequisite of this target, so tests can be rerun without building and mirroring every time."
+	@echo "For OCP: Make sure PROJECT_NAMESPACE, IMAGE_PATH and PLUGIN_VERSION env variables are set, and IMAGE_REGISTRY _not_ set."
+	@echo "For K8s: Make sure PROJECT_NAMESPACE, IMAGE_PATH, PLUGIN_VERSION and IMAGE_REGISTRY env variables are set"
+	@$(GO) test -v ./test/e2e/... -ginkgo.v -ginkgo.show-node-events --ginkgo.label-filter "operator && !(gpu || iaa)" $(E2E_DRYRUN) -delete-namespace-on-failure=false
+
 pre-pull:
 ifeq ($(TAG),devel)
 	@$(BUILDER) pull golang:1.25-trixie
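
The operator targets above all pass a Ginkgo label filter that starts with `operator` and narrows it to a set of plugin labels. As a rough sketch of how such filter strings could be composed for ad hoc runs, the hypothetical helper below (the name `operator_label_filter` is an assumption, not something in the Makefile) reproduces the `operator && qat` and `operator && (qat || sgx)` forms seen in this commit.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not in the repository): builds a Ginkgo label filter
# that selects operator tests, optionally restricted to the given labels.
operator_label_filter() {
    # No labels: every test labelled "operator".
    if [ "$#" -eq 0 ]; then
        echo "operator"
        return 0
    fi

    # One label: no grouping needed.
    if [ "$#" -eq 1 ]; then
        echo "operator && $1"
        return 0
    fi

    # Several labels: OR them together inside parentheses.
    local expr="$1" label
    shift
    for label in "$@"; do
        expr="${expr} || ${label}"
    done
    echo "operator && (${expr})"
}
```

It could then be used with the flags already shown in this commit, for example `go test -v ./test/e2e/... -ginkgo.v --ginkgo.label-filter "$(operator_label_filter qat sgx)" -delete-namespace-on-failure=false`.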
Lines changed: 47 additions & 0 deletions
@@ -0,0 +1,47 @@
+# OCP-specific operator overlay.
+# Replaces cert-manager TLS with the OCP service-CA operator. The binding that
+# grants the privileged SCC to the default service account (scc-binding.yaml)
+# is currently left out of the resources below.
+namePrefix: intel-deviceplugins-
+
+resources:
+- ../../crd
+- ../../rbac
+- ../../manager
+- ../../webhook
+# SCC privileged mode doesn't seem to be needed, so it's left out for now.
+# If we need it in the future, we can add it back in.
+#- scc-binding.yaml
+
+patches:
+- path: manager_metrics_patch.yaml
+  target:
+    kind: Deployment
+- path: manager_webhook_patch.yaml
+  target:
+    kind: Deployment
+    name: controller-manager
+- path: webhook-service-patch.yaml
+  target:
+    kind: Service
+    name: webhook-service
+- path: webhook-mutating-ca-patch.yaml
+  target:
+    kind: MutatingWebhookConfiguration
+- path: webhook-validating-ca-patch.yaml
+  target:
+    kind: ValidatingWebhookConfiguration
+# Remove the Namespace object, as we don't want to create (and delete) it in the OCP overlay.
+# In OCP, we must keep the namespace so the images within the ImageStream stay intact.
+- patch: |-
+    $patch: delete
+    apiVersion: v1
+    kind: Namespace
+    metadata:
+      labels:
+        control-plane: controller-manager
+        manager: intel-deviceplugin-operator
+        pod-security.kubernetes.io/enforce: privileged
+        pod-security.kubernetes.io/audit: privileged
+        pod-security.kubernetes.io/warn: privileged
+      name: system
Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
+# This patch adds the args to allow exposing the metrics endpoint using HTTPS
+- op: add
+  path: /spec/template/spec/containers/0/args/0
+  value: "--metrics-bind-address=:8443"
+- op: add
+  path: /spec/template/spec/containers/0/args/0
+  value: "--metrics-secure"
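
Both operations in this patch add at index 0 of the `args` array. Under JSON Patch semantics (RFC 6902), `add` on an array index inserts before the element currently at that index, so the value added second ends up first. The sketch below only simulates that ordering with a space-separated string; `--existing-arg` is a hypothetical placeholder for whatever arguments the base Deployment already defines.

```bash
#!/usr/bin/env bash
# Simulates a JSON Patch "add" at array index 0 on a space-separated list.
# $1: value to insert, $2: current list (may be empty).
json_patch_add_front() {
    if [ -n "$2" ]; then
        printf '%s %s\n' "$1" "$2"
    else
        printf '%s\n' "$1"
    fi
}

# Hypothetical starting point; the real base args are not shown in this diff.
args="--existing-arg"

# Apply the two operations in the order they appear in the patch file.
args="$(json_patch_add_front "--metrics-bind-address=:8443" "${args}")"
args="$(json_patch_add_front "--metrics-secure" "${args}")"

echo "${args}"
# prints: --metrics-secure --metrics-bind-address=:8443 --existing-arg
```

The relative order of the two flags should not matter to a flag parser, but it explains why the rendered Deployment lists `--metrics-secure` before `--metrics-bind-address=:8443`.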
Lines changed: 23 additions & 0 deletions
@@ -0,0 +1,23 @@
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+  name: controller-manager
+  namespace: system
+spec:
+  template:
+    spec:
+      containers:
+      - name: manager
+        ports:
+        - containerPort: 9443
+          name: webhook-server
+          protocol: TCP
+        volumeMounts:
+        - mountPath: /tmp/k8s-webhook-server/serving-certs
+          name: cert
+          readOnly: true
+      volumes:
+      - name: cert
+        secret:
+          defaultMode: 420
+          secretName: webhook-server-cert
Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
+# Grant the privileged SCC to the default service account in the operator
+# namespace so that device-plugin DaemonSet pods can access host devices and
+# run with the required capabilities.
+apiVersion: rbac.authorization.k8s.io/v1
+kind: ClusterRoleBinding
+metadata:
+  name: privileged-scc
+subjects:
+- kind: ServiceAccount
+  name: default
+  namespace: inteldeviceplugins-system
+roleRef:
+  kind: ClusterRole
+  name: system:openshift:scc:privileged
+  apiGroup: rbac.authorization.k8s.io
Lines changed: 9 additions & 0 deletions
@@ -0,0 +1,9 @@
+# Instruct the OCP service-CA operator to inject the cluster CA bundle into
+# the MutatingWebhookConfiguration so that the API server can validate the
+# webhook server's TLS certificate.
+apiVersion: admissionregistration.k8s.io/v1
+kind: MutatingWebhookConfiguration
+metadata:
+  name: mutating-webhook-configuration
+  annotations:
+    service.beta.openshift.io/inject-cabundle: "true"
Lines changed: 9 additions & 0 deletions
@@ -0,0 +1,9 @@
+# Instruct the OCP service-CA operator to issue a signed serving certificate
+# for the webhook service and store it in the named secret. The secret name
+# must match the volume mount defined in manager_webhook_patch.yaml.
+apiVersion: v1
+kind: Service
+metadata:
+  name: webhook-service
+  annotations:
+    service.beta.openshift.io/serving-cert-secret-name: webhook-server-cert
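
The comment in this patch notes that the secret named in the Service annotation must match the secret mounted by `manager_webhook_patch.yaml`. One way to guard that invariant is a small text check like the sketch below. The function name `secret_names_match` is hypothetical, and the check assumes unquoted values with one key per line, as in the two patch files of this commit; it is not a YAML parser.

```bash
#!/usr/bin/env bash
# Hypothetical consistency check (not in the repository).
# $1: Service patch carrying the serving-cert-secret-name annotation
# $2: Deployment patch carrying the secretName of the mounted volume
# Succeeds only when both files name the same, non-empty secret.
secret_names_match() {
    local service_patch="$1" deployment_patch="$2"
    local requested mounted

    # Secret the service-CA operator is asked to create.
    requested="$(sed -n 's/^[[:space:]]*service\.beta\.openshift\.io\/serving-cert-secret-name:[[:space:]]*//p' "${service_patch}" | head -n 1)"

    # Secret the manager Deployment mounts as its serving certificate.
    mounted="$(sed -n 's/^[[:space:]]*secretName:[[:space:]]*//p' "${deployment_patch}" | head -n 1)"

    [ -n "${requested}" ] && [ "${requested}" = "${mounted}" ]
}
```

Called as `secret_names_match webhook-service-patch.yaml manager_webhook_patch.yaml` from the overlay directory, it would return a non-zero status if one of the two names were changed without the other.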
Lines changed: 9 additions & 0 deletions
@@ -0,0 +1,9 @@
+# Instruct the OCP service-CA operator to inject the cluster CA bundle into
+# the ValidatingWebhookConfiguration so that the API server can validate the
+# webhook server's TLS certificate.
+apiVersion: admissionregistration.k8s.io/v1
+kind: ValidatingWebhookConfiguration
+metadata:
+  name: validating-webhook-configuration
+  annotations:
+    service.beta.openshift.io/inject-cabundle: "true"

scripts/build-images-for-ocp.sh

Lines changed: 63 additions & 0 deletions
@@ -0,0 +1,63 @@
+#!/usr/bin/env bash
+# Copyright 2026 Intel Corporation. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+# build-images-for-ocp.sh – builds all container images required for the OCP
+# e2e tests from the current project source tree.
+#
+# Usage:
+# ./scripts/build-images-for-ocp.sh
+#
+# Environment variables (all optional, match Makefile defaults):
+# BUILDER – container builder binary: docker (default), podman, or buildah
+# TAG – image tag (default: value from Makefile, currently 0.35.1)
+# UBI – set to 1 to build UBI-based images instead of distroless
+#
+# After a successful run the images are available locally as:
+# intel/<name>:<TAG>
+# Pass these to mirror-images-to-ocp.sh to push them to an OCP cluster.
+
+set -euo pipefail
+
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+REPO_ROOT="$(cd "${SCRIPT_DIR}/.." && pwd)"
+
+# To load OCP_IMAGES
+. "${SCRIPT_DIR}/common-ocp.sh"
+
+log() { echo "[build-images-for-ocp] $*"; }
+
+cd "${REPO_ROOT}"
+
+# Ensure Dockerfiles are up to date with the .Dockerfile.in templates before
+# building. This is a no-op if nothing has changed.
+log "Regenerating Dockerfiles from templates..."
+make dockerfiles
+
+log "Building images: ${OCP_IMAGES[*]}"
+for img in "${OCP_IMAGES[@]}"; do
+    log " building ${img}..."
+    make "${img}" ${BUILDER:+BUILDER="${BUILDER}"} ${TAG:+TAG="${TAG}"} ${UBI:+UBI="${UBI}"}
+done
+
+# Print a summary so the caller can verify and export OCP_IMAGE_REGISTRY later.
+TAG="${TAG:-$(grep '^TAG' Makefile | head -1 | awk -F'[?=]' '{print $NF}' | tr -d ' ')}"
+log ""
+log "Build complete. Images available locally:"
+for img in "${OCP_IMAGES[@]}"; do
+    log " intel/${img}:${TAG}"
+done
+log ""
+log "Next step: push to your OCP cluster:"
+log " ./scripts/mirror-images-to-ocp.sh <namespace>"
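
When `TAG` is unset, the script falls back to parsing the first `TAG` assignment out of the Makefile, and DEVEL.md asks for a tag that is the same as or greater than the current release. Both steps can be tried in isolation with the sketch below. The function names `makefile_default_tag` and `version_ge` are hypothetical; the pipeline is copied from the script but reads from stdin instead of `Makefile`, and the comparison assumes a `sort` that supports `-V` (GNU coreutils).

```bash
#!/usr/bin/env bash
# Same pipeline as the script's TAG fallback, reading Makefile text on stdin.
# Splits the first line starting with "TAG" on "?" or "=" and keeps the last
# field, so both "TAG?=devel" and "TAG ?= devel" yield "devel".
makefile_default_tag() {
    grep '^TAG' | head -1 | awk -F'[?=]' '{print $NF}' | tr -d ' '
}

# version_ge A B: succeeds when version A is the same as or newer than B.
# Hypothetical helper; relies on version-aware sorting via "sort -V".
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}
```

For example, `makefile_default_tag < Makefile` prints the default tag, and `version_ge "${TAG}" 0.35.0` could gate a mirroring step. Note that `grep '^TAG'` matches any line that begins with `TAG`, so the result depends on the plain `TAG` assignment appearing before any other variable with that prefix.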
