chore: 1.18 kubernetes-ci use self-hosted #390
📝 Walkthrough
CI and build workflows remove explicit KUBECONFIG exports and adjust CI runner/actions and Kind setup. Makefile drops KUBECONFIG and adds TARGETARCH build-arg. E2E tests add imagePullPolicy, readiness assertions, and provider-aware TLS logic; test template adds Jaeger config under prometheus.
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
🚥 Pre-merge checks | ✅ 1 | ❌ 2
❌ Failed checks (2 warnings)
✅ Passed checks (1 passed)
Actionable comments posted: 7
🧹 Nitpick comments (1)
.github/workflows/e2e-test-k8s.yml (1)
75-75: Add explicit timeout to node readiness wait.
Line 75 relies on implicit timeout behavior, which can cause flaky startup failures on slower runners.
Suggested fix
- kubectl wait --for=condition=Ready nodes --all
+ kubectl wait --for=condition=Ready nodes --all --timeout=180s
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.github/workflows/e2e-test-k8s.yml at line 75, The kubectl wait invocation "kubectl wait --for=condition=Ready nodes --all" lacks an explicit timeout causing flakiness; update that command to include a reasonable --timeout value (e.g., --timeout=5m or configurable via an environment variable) so the workflow fails deterministically if nodes don't become ready, e.g., replace the existing invocation with one that appends --timeout=<duration> (or uses a $NODE_WAIT_TIMEOUT variable) to enforce an explicit maximum wait.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.github/workflows/e2e-test-k8s.yml:
- Line 37: The job currently uses a generic runs-on: self-hosted which can
schedule secret-consuming jobs to any self-hosted runner; change the runs-on
selector to a restrictive label set (e.g., add a dedicated runner tag such as
'self-hosted' plus a specific label like 'k8s-e2e' or 'trusted') so only
intended machines run this job; update the workflow's job runs-on field
(referencing the runs-on key in the e2e-test-k8s job) to require those specific
labels and ensure this job that uses secrets (PRIVATE_DOCKER_USERNAME,
PRIVATE_DOCKER_PASSWORD, API7_EE_LICENSE) will only run on labeled/trusted
self-hosted runners.
- Around line 44-50: Replace the no-op "Setup Go Env" and "Install kind" steps
with deterministic install steps: use actions/setup-go@v4 (step name "Setup Go"
or keep "Setup Go Env") with an explicit go-version input to install and verify
Go, and add a dedicated "Setup kind" step that installs kind from a known action
(e.g., engineerd/setup-kind@vX or another pinned action) with an explicit
version input and then run `kind version` to verify installation; update the
existing step names "Setup Go Env" and "Install kind" in the workflow to
reference these actions so the runtime no longer depends on preinstalled tools
on self-hosted runners.
- Line 40: Replace the checkout action reference that currently uses
actions/checkout@v3 with a supported newer major version (minimum
actions/checkout@v4, or preferably actions/checkout@v6) to avoid Node 16
dependency issues; update the action string in the workflow step that contains
"uses: actions/checkout@v3" to the chosen version and run/validate the workflow
to ensure compatibility with current GitHub Actions runners.
- Line 67: Replace the mutable KIND_NODE_IMAGE tag with an immutable
digest-pinned reference: locate the KIND_NODE_IMAGE environment variable
assignment (the line containing KIND_NODE_IMAGE: kindest/node:v1.18.15) and
update its value to the corresponding image@sha256 digest (e.g.,
kindest/node:v1.18.15@sha256:5c1b980c4d0e0e8e7eb9f36f7df525d079a96169c8a8f20d8bd108c0d0889cc4)
so the workflow uses a pinned image for reproducible builds.
In `@test/e2e/crds/v2/route.go`:
- Around line 1372-1374: The manifest uses image: jmalloc/echo-server:latest
together with imagePullPolicy: IfNotPresent which can cause CI nondeterminism;
update the Kubernetes spec where image and imagePullPolicy are set (look for the
image: jmalloc/echo-server:latest and the imagePullPolicy field) to either pin
the image to an immutable tag or digest, or change imagePullPolicy from
IfNotPresent to Always so the registry is checked on each run; ensure the change
is applied to the same container spec that declares ports so the correct pod
uses the updated policy.
In `@test/e2e/crds/v2/tls.go`:
- Line 335: The assertion in assert.Equal(GinkgoT(), int64(10),
*tls[0].Client.Depth, "client depth should be 1") has a mismatched message;
update the assertion message to reflect the expected value (e.g., change the
message to "client depth should be 10") so it matches the expected int64(10) and
the tls[0].Client.Depth check in test/e2e/crds/v2/tls.go.
- Around line 238-242: The test dereferences tls[0].Client.Depth without
checking for nil, causing panics; update the check in the TLS assertion block to
first assert that tls[0].Client and tls[0].Client.Depth are non-nil (e.g., using
assert.NotNil or equivalent) before dereferencing, then compare the value to the
expected depth computed from s.Deployer.Name() and framework.ProviderTypeAPI7EE
with assert.Equal; ensure the failure messages mention which field is missing to
aid debugging.
---
Nitpick comments:
In @.github/workflows/e2e-test-k8s.yml:
- Line 75: The kubectl wait invocation "kubectl wait --for=condition=Ready nodes
--all" lacks an explicit timeout causing flakiness; update that command to
include a reasonable --timeout value (e.g., --timeout=5m or configurable via an
environment variable) so the workflow fails deterministically if nodes don't
become ready, e.g., replace the existing invocation with one that appends
--timeout=<duration> (or uses a $NODE_WAIT_TIMEOUT variable) to enforce an
explicit maximum wait.
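The two test/e2e/crds/v2/tls.go findings above (the unchecked dereference of tls[0].Client.Depth and the assertion message that disagrees with the expected value) can be sketched together in Go. This is a minimal illustration under assumptions, not the project's code: ClientTLS, TLSEntry, clientDepth, and checkDepth are hypothetical names, and the real test would use its own types and testify assertions.

```go
package main

import (
	"errors"
	"fmt"
)

// ClientTLS and TLSEntry are hypothetical stand-ins for the objects the
// test reads back; the real types live in the project's own packages.
type ClientTLS struct {
	Depth *int64
}

type TLSEntry struct {
	Client *ClientTLS
}

// clientDepth returns the client verification depth of the first entry,
// or an error naming the first missing field instead of panicking.
func clientDepth(tls []TLSEntry) (int64, error) {
	if len(tls) == 0 {
		return 0, errors.New("tls list is empty")
	}
	if tls[0].Client == nil {
		return 0, errors.New("tls[0].Client is nil")
	}
	if tls[0].Client.Depth == nil {
		return 0, errors.New("tls[0].Client.Depth is nil")
	}
	return *tls[0].Client.Depth, nil
}

// checkDepth compares the depth with the expected value and builds the
// failure message from that value, so message and expectation cannot drift.
func checkDepth(tls []TLSEntry, want int64) error {
	got, err := clientDepth(tls)
	if err != nil {
		return err
	}
	if got != want {
		return fmt.Errorf("client depth should be %d, got %d", want, got)
	}
	return nil
}

func main() {
	depth := int64(10)
	tls := []TLSEntry{{Client: &ClientTLS{Depth: &depth}}}
	fmt.Println(checkDepth(tls, 10))            // prints "<nil>"
	fmt.Println(checkDepth([]TLSEntry{{}}, 10)) // prints "tls[0].Client is nil"
}
```

Deriving the message from the expected value is one way to avoid the "should be 1" versus int64(10) mismatch; a provider-dependent expectation would simply pass a different want.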
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 09fb5cfb-b925-4fae-a5f4-f137c55282fc
📒 Files selected for processing (7)
- .github/workflows/apisix-conformance-test.yml
- .github/workflows/conformance-test.yml
- .github/workflows/e2e-test-k8s.yml
- Makefile
- test/e2e/crds/v2/route.go
- test/e2e/crds/v2/tls.go
- test/e2e/framework/api7_dashboard.go
💤 Files with no reviewable changes (2)
- .github/workflows/conformance-test.yml
- .github/workflows/apisix-conformance-test.yml
  - name: Setup Go Env
-   uses: actions/setup-go@v4
-   with:
-     go-version: "1.24"
+   run: |
+     go env

  - name: Install kind
    run: |
-     go install sigs.k8s.io/kind@v0.13.0
+     kind version
🧩 Analysis chain
🏁 Script executed:
# First, find the workflow file
find . -name "e2e-test-k8s.yml" -type f
Repository: api7/api7-ingress-controller
Length of output: 108
🏁 Script executed:
# Read the full workflow file to understand the context
cat -n .github/workflows/e2e-test-k8s.yml
Repository: api7/api7-ingress-controller
Length of output: 4439
🏁 Script executed:
# Check if there are other workflow files that might show setup patterns
ls -la .github/workflows/
Repository: api7/api7-ingress-controller
Length of output: 1199
🏁 Script executed:
# Search for any existing setup-go usage in other workflows
rg "setup-go|actions/setup" .github/workflows/ -A 3
Repository: api7/api7-ingress-controller
Length of output: 2715
Replace non-functional setup steps with proper tool installation to ensure reproducible builds on self-hosted runners.
Lines 45-50 only print go env and kind version; they do not install or verify versions. Running on self-hosted runners means these steps rely entirely on pre-installed tools, causing reproducibility drift. Other workflows in this repo consistently use actions/setup-go@v4 with explicit versions.
Suggested fix
-      - name: Setup Go Env
-        run: |
-          go env
+      - name: Setup Go
+        uses: actions/setup-go@v4
+        with:
+          go-version-file: go.mod
+          cache: true
       - name: Install kind
         run: |
-          kind version
+          if ! command -v kind >/dev/null 2>&1; then
+            go install sigs.k8s.io/kind@v0.22.0
+          fi
+          kind version
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In @.github/workflows/e2e-test-k8s.yml around lines 44 - 50, Replace the no-op
"Setup Go Env" and "Install kind" steps with deterministic install steps: use
actions/setup-go@v4 (step name "Setup Go" or keep "Setup Go Env") with an
explicit go-version input to install and verify Go, and add a dedicated "Setup
kind" step that installs kind from a known action (e.g., engineerd/setup-kind@vX
or another pinned action) with an explicit version input and then run `kind
version` to verify installation; update the existing step names "Setup Go Env"
and "Install kind" in the workflow to reference these actions so the runtime no
longer depends on preinstalled tools on self-hosted runners.
  - name: Launch Kind Cluster
    env:
-     KIND_NODE_IMAGE: kindest/node:v1.18.20@sha256:38a8726ece5d7867fb0ede63d718d27ce2d41af519ce68be5ae7fcca563537ed
+     KIND_NODE_IMAGE: kindest/node:v1.18.15
🧩 Analysis chain
🏁 Script executed:
cd "$(git rev-parse --show-toplevel)" && head -80 .github/workflows/e2e-test-k8s.yml | tail -20
Repository: api7/api7-ingress-controller
Length of output: 817
🌐 Web query:
official sha256 digest kindest/node v1.18.15
💡 Result:
kindest/node:v1.18.15 digest (use with @sha256:):
kindest/node:v1.18.15@sha256:5c1b980c4d0e0e8e7eb9f36f7df525d079a96169c8a8f20d8bd108c0d0889cc4 [1] [2] [3]
Sources: [1] [2] [3]
🏁 Script executed:
grep -n "image:" .github/workflows/e2e-test-k8s.yml | grep -v "@sha256"
Repository: api7/api7-ingress-controller
Length of output: 54
🏁 Script executed:
grep -n "kindest/node\|docker pull\|_IMAGE:" .github/workflows/e2e-test-k8s.yml
Repository: api7/api7-ingress-controller
Length of output: 123
Pin KIND node image by digest.
Line 67 uses a mutable tag (kindest/node:v1.18.15) without digest pinning, which weakens reproducibility and supply-chain integrity. Update to:
KIND_NODE_IMAGE: kindest/node:v1.18.15@sha256:5c1b980c4d0e0e8e7eb9f36f7df525d079a96169c8a8f20d8bd108c0d0889cc4
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In @.github/workflows/e2e-test-k8s.yml at line 67, Replace the mutable
KIND_NODE_IMAGE tag with an immutable digest-pinned reference: locate the
KIND_NODE_IMAGE environment variable assignment (the line containing
KIND_NODE_IMAGE: kindest/node:v1.18.15) and update its value to the
corresponding image@sha256 digest (e.g.,
kindest/node:v1.18.15@sha256:5c1b980c4d0e0e8e7eb9f36f7df525d079a96169c8a8f20d8bd108c0d0889cc4)
so the workflow uses a pinned image for reproducible builds.
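The distinction between a mutable tag and a digest-pinned reference, which also underlies the jmalloc/echo-server:latest finding, can be checked mechanically. The following Go sketch is an illustration only: isDigestPinned and usesLatestTag are hypothetical helper names, and the check looks only at the shape of the reference string; it does not verify that a digest is the correct one for the image.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// digestRef matches a reference ending in "@sha256:" plus 64 hex digits.
var digestRef = regexp.MustCompile(`@sha256:[0-9a-f]{64}$`)

// isDigestPinned reports whether an image reference carries a sha256 digest.
func isDigestPinned(ref string) bool {
	return digestRef.MatchString(ref)
}

// usesLatestTag reports whether a reference without a digest resolves to
// the "latest" tag, either explicitly or because no tag is given.
func usesLatestTag(ref string) bool {
	if isDigestPinned(ref) {
		return false
	}
	// Look only at the last path segment so a registry port
	// (for example "localhost:5000/echo") is not mistaken for a tag.
	name := ref[strings.LastIndex(ref, "/")+1:]
	i := strings.LastIndex(name, ":")
	return i < 0 || name[i+1:] == "latest"
}

func main() {
	refs := []string{
		"kindest/node:v1.18.15",
		"kindest/node:v1.18.15@sha256:5c1b980c4d0e0e8e7eb9f36f7df525d079a96169c8a8f20d8bd108c0d0889cc4",
		"jmalloc/echo-server:latest",
	}
	for _, ref := range refs {
		fmt.Printf("pinned=%t latest=%t %s\n", isDigestPinned(ref), usesLatestTag(ref), ref)
	}
}
```

A check of this kind could run as a lint step over workflow files and test manifests, flagging tag-only references before they reach CI.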
conformance test report - apisix-standalone mode
apiVersion: gateway.networking.k8s.io/v1
date: "2026-03-05T09:28:42Z"
gatewayAPIChannel: experimental
gatewayAPIVersion: v1.3.0
implementation:
  contact: null
  organization: APISIX
  project: apisix-ingress-controller
  url: https://github.com/apache/apisix-ingress-controller.git
  version: v2.0.0
kind: ConformanceReport
mode: default
profiles:
- core:
    result: success
    statistics:
      Failed: 0
      Passed: 12
      Skipped: 0
  name: GATEWAY-GRPC
  summary: Core tests succeeded.
- core:
    result: partial
    skippedTests:
    - HTTPRouteHTTPSListener
    statistics:
      Failed: 0
      Passed: 32
      Skipped: 1
  extended:
    result: partial
    skippedTests:
    - HTTPRouteRedirectPortAndScheme
    statistics:
      Failed: 0
      Passed: 11
      Skipped: 1
    supportedFeatures:
    - GatewayAddressEmpty
    - GatewayPort8080
    - HTTPRouteBackendProtocolWebSocket
    - HTTPRouteDestinationPortMatching
    - HTTPRouteHostRewrite
    - HTTPRouteMethodMatching
    - HTTPRoutePathRewrite
    - HTTPRoutePortRedirect
    - HTTPRouteQueryParamMatching
    - HTTPRouteRequestMirror
    - HTTPRouteResponseHeaderModification
    - HTTPRouteSchemeRedirect
    unsupportedFeatures:
    - GatewayHTTPListenerIsolation
    - GatewayInfrastructurePropagation
    - GatewayStaticAddresses
    - HTTPRouteBackendProtocolH2C
    - HTTPRouteBackendRequestHeaderModification
    - HTTPRouteBackendTimeout
    - HTTPRouteParentRefPort
    - HTTPRoutePathRedirect
    - HTTPRouteRequestMultipleMirrors
    - HTTPRouteRequestPercentageMirror
    - HTTPRouteRequestTimeout
  name: GATEWAY-HTTP
  summary: Core tests partially succeeded with 1 test skips. Extended tests partially
    succeeded with 1 test skips.
- core:
    result: partial
    skippedTests:
    - TLSRouteSimpleSameNamespace
    statistics:
      Failed: 0
      Passed: 10
      Skipped: 1
  name: GATEWAY-TLS
  summary: Core tests partially succeeded with 1 test skips.
conformance test report - apisix mode
apiVersion: gateway.networking.k8s.io/v1
date: "2026-03-05T09:29:49Z"
gatewayAPIChannel: experimental
gatewayAPIVersion: v1.3.0
implementation:
  contact: null
  organization: APISIX
  project: apisix-ingress-controller
  url: https://github.com/apache/apisix-ingress-controller.git
  version: v2.0.0
kind: ConformanceReport
mode: default
profiles:
- core:
    result: success
    statistics:
      Failed: 0
      Passed: 12
      Skipped: 0
  name: GATEWAY-GRPC
  summary: Core tests succeeded.
- core:
    failedTests:
    - HTTPRouteInvalidBackendRefUnknownKind
    result: failure
    skippedTests:
    - HTTPRouteHTTPSListener
    statistics:
      Failed: 1
      Passed: 31
      Skipped: 1
  extended:
    result: partial
    skippedTests:
    - HTTPRouteRedirectPortAndScheme
    statistics:
      Failed: 0
      Passed: 11
      Skipped: 1
    supportedFeatures:
    - GatewayAddressEmpty
    - GatewayPort8080
    - HTTPRouteBackendProtocolWebSocket
    - HTTPRouteDestinationPortMatching
    - HTTPRouteHostRewrite
    - HTTPRouteMethodMatching
    - HTTPRoutePathRewrite
    - HTTPRoutePortRedirect
    - HTTPRouteQueryParamMatching
    - HTTPRouteRequestMirror
    - HTTPRouteResponseHeaderModification
    - HTTPRouteSchemeRedirect
    unsupportedFeatures:
    - GatewayHTTPListenerIsolation
    - GatewayInfrastructurePropagation
    - GatewayStaticAddresses
    - HTTPRouteBackendProtocolH2C
    - HTTPRouteBackendRequestHeaderModification
    - HTTPRouteBackendTimeout
    - HTTPRouteParentRefPort
    - HTTPRoutePathRedirect
    - HTTPRouteRequestMultipleMirrors
    - HTTPRouteRequestPercentageMirror
    - HTTPRouteRequestTimeout
  name: GATEWAY-HTTP
  summary: Core tests failed with 1 test failures. Extended tests partially succeeded
    with 1 test skips.
- core:
    result: partial
    skippedTests:
    - TLSRouteSimpleSameNamespace
    statistics:
      Failed: 0
      Passed: 10
      Skipped: 1
  name: GATEWAY-TLS
  summary: Core tests partially succeeded with 1 test skips.
conformance test report
apiVersion: gateway.networking.k8s.io/v1
date: "2026-03-05T09:51:00Z"
gatewayAPIChannel: experimental
gatewayAPIVersion: v1.3.0
implementation:
  contact: null
  organization: APISIX
  project: apisix-ingress-controller
  url: https://github.com/apache/apisix-ingress-controller.git
  version: v2.0.0
kind: ConformanceReport
mode: default
profiles:
- core:
    failedTests:
    - GatewayModifyListeners
    - TLSRouteSimpleSameNamespace
    result: failure
    statistics:
      Failed: 2
      Passed: 9
      Skipped: 0
  name: GATEWAY-TLS
  summary: Core tests failed with 2 test failures.
- core:
    failedTests:
    - GRPCExactMethodMatching
    - GRPCRouteHeaderMatching
    - GRPCRouteListenerHostnameMatching
    - GatewayModifyListeners
    result: failure
    statistics:
      Failed: 4
      Passed: 8
      Skipped: 0
  name: GATEWAY-GRPC
  summary: Core tests failed with 4 test failures.
- core:
    failedTests:
    - GatewayModifyListeners
    result: failure
    skippedTests:
    - HTTPRouteHTTPSListener
    statistics:
      Failed: 1
      Passed: 31
      Skipped: 1
  extended:
    failedTests:
    - HTTPRouteBackendProtocolWebSocket
    result: failure
    skippedTests:
    - HTTPRouteRedirectPortAndScheme
    statistics:
      Failed: 1
      Passed: 10
      Skipped: 1
    supportedFeatures:
    - GatewayAddressEmpty
    - GatewayPort8080
    - HTTPRouteBackendProtocolWebSocket
    - HTTPRouteDestinationPortMatching
    - HTTPRouteHostRewrite
    - HTTPRouteMethodMatching
    - HTTPRoutePathRewrite
    - HTTPRoutePortRedirect
    - HTTPRouteQueryParamMatching
    - HTTPRouteRequestMirror
    - HTTPRouteResponseHeaderModification
    - HTTPRouteSchemeRedirect
    unsupportedFeatures:
    - GatewayHTTPListenerIsolation
    - GatewayInfrastructurePropagation
    - GatewayStaticAddresses
    - HTTPRouteBackendProtocolH2C
    - HTTPRouteBackendRequestHeaderModification
    - HTTPRouteBackendTimeout
    - HTTPRouteParentRefPort
    - HTTPRoutePathRedirect
    - HTTPRouteRequestMultipleMirrors
    - HTTPRouteRequestPercentageMirror
    - HTTPRouteRequestTimeout
  name: GATEWAY-HTTP
  summary: Core tests failed with 1 test failures. Extended tests failed with 1 test
    failures.
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.github/workflows/e2e-test-k8s.yml:
- Around line 70-75: The kubectl wait command (kubectl wait
--for=condition=Ready nodes --all) sets no explicit timeout and falls back to
kubectl's default of 30s, which may be too short on slower runners; update the
invocation to include a sensible --timeout value (e.g., --timeout=5m or another
CI-appropriate duration) so the maximum wait is explicit and the workflow fails
deterministically if nodes don't become Ready, and ensure any callers or
downstream steps handle the non-zero exit on timeout.
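The purpose of an explicit --timeout is a bounded wait: poll a condition, and fail with a clear error once a deadline passes. For illustration only, the same idea can be sketched in Go, for instance for readiness checks inside e2e tests. The names waitUntil and waitWithTimeout are hypothetical, and the never-true condition below merely stands in for nodes that never become Ready; nothing here calls kubectl.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// waitUntil polls cond every interval until it returns true or ctx expires.
func waitUntil(ctx context.Context, interval time.Duration, cond func() bool) error {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		if cond() {
			return nil
		}
		select {
		case <-ctx.Done():
			// The deadline passed (or the context was cancelled) first.
			return ctx.Err()
		case <-ticker.C:
			// Next polling round.
		}
	}
}

// waitWithTimeout wraps waitUntil with an explicit maximum wait.
func waitWithTimeout(timeout, interval time.Duration, cond func() bool) error {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()
	return waitUntil(ctx, interval, cond)
}

func main() {
	// A condition that never holds: the wait must end at the deadline.
	err := waitWithTimeout(50*time.Millisecond, 5*time.Millisecond, func() bool { return false })
	fmt.Println(errors.Is(err, context.DeadlineExceeded)) // prints "true"
}
```

Returning the context error lets the caller distinguish a timeout from a successful wait, which mirrors the non-zero exit the workflow step should surface.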
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 6a9ed94a-27ab-44d7-b192-5fb55c929446
📒 Files selected for processing (2)
- .github/workflows/e2e-test-k8s.yml
- test/e2e/crds/v2/tls.go
🚧 Files skipped from review as they are similar to previous changes (1)
- test/e2e/crds/v2/tls.go
Type of change:
What this PR does / why we need it:
BuildJet is about to go offline (announcement: https://buildjet.com/for-github-actions/blog/we-are-shutting-down), so the Kubernetes 1.18 CI job moves to self-hosted runners.
Pre-submission checklist:
Summary by CodeRabbit