OCPBUGS-83580: Load fuse kernel module before testing /dev/fuse in CRI-O #31044
Chandan9112 wants to merge 1 commit into openshift:main
@Chandan9112: This pull request references Jira Issue OCPBUGS-83580, which is valid. The bug has been moved to the POST state. 3 validation(s) were run on this bug
No GitHub users were found matching the public email listed for the QA contact in Jira (cmaurya@redhat.com), skipping review request. The bug has been updated to refer to the pull request using the external bug tracker.
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by: Chandan9112
No actionable comments were generated in the recent review.
Walkthrough
The dev-fuse e2e test now lists all worker nodes via a label selector, asserts at least one worker exists, runs
@Chandan9112: This pull request references Jira Issue OCPBUGS-83580, which is valid. 3 validation(s) were run on this bug
No GitHub users were found matching the public email listed for the QA contact in Jira (cmaurya@redhat.com), skipping review request.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@test/extended/node/node_e2e/node.go`:
- Around line 115-120: The current logic only loads the fuse module on the first
worker fetched (jsonpath {.items[0]}) which fails when the test pod schedules to
another worker; change the node discovery to fetch all worker names (e.g.,
jsonpath {.items[*].metadata.name}), split the output into individual node
names, then loop over each node and call nodeutils.ExecOnNodeWithChroot(oc,
strings.TrimSpace(nodeName), "modprobe", "fuse"), checking
o.Expect(err).NotTo(o.HaveOccurred()) for each invocation so every schedulable
worker has /dev/fuse loaded before the pod runs.
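The loop described in this comment could look roughly like the sketch below. It is not the PR's code: `nodeExec` is a hypothetical stand-in for a helper such as `nodeutils.ExecOnNodeWithChroot`, `loadModuleOnWorkers` is an illustrative name, and the node names are made up. The input string is assumed to be the whitespace-separated output of a `{.items[*].metadata.name}` jsonpath query.

```go
package main

import (
	"fmt"
	"strings"
)

// nodeExec is a hypothetical stand-in for a helper such as
// nodeutils.ExecOnNodeWithChroot: it runs a command on the named node.
type nodeExec func(node string, cmd ...string) (string, error)

// loadModuleOnWorkers runs "modprobe <module>" on every node named in a
// whitespace-separated list and stops at the first failure.
func loadModuleOnWorkers(nodeNames, module string, exec nodeExec) error {
	// strings.Fields splits on any whitespace and drops empty entries,
	// so no separate TrimSpace per name is needed.
	workers := strings.Fields(nodeNames)
	if len(workers) == 0 {
		return fmt.Errorf("no worker nodes found")
	}
	for _, worker := range workers {
		if _, err := exec(worker, "modprobe", module); err != nil {
			return fmt.Errorf("modprobe %s on %s: %w", module, worker, err)
		}
	}
	return nil
}

func main() {
	var visited []string
	// Fake executor that only records which nodes were contacted.
	fake := func(node string, cmd ...string) (string, error) {
		visited = append(visited, node)
		return "", nil
	}
	if err := loadModuleOnWorkers("worker-0 worker-1  worker-2\n", "fuse", fake); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(visited) // prints [worker-0 worker-1 worker-2]
}
```

Passing the executor as a function keeps the sketch runnable without a cluster; in the real test the call and the `o.Expect(err).NotTo(o.HaveOccurred())` assertion would sit inside the loop.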
The OCP-70987 test was failing across multiple CI platforms (AWS, GCP, vSphere, Metal+IPv6) because /dev/fuse was not available on host nodes where the fuse kernel module was not loaded. CRI-O silently ignores the io.kubernetes.cri-o.Devices annotation when the device does not exist on the host, causing the pod to start without /dev/fuse mounted.

This adds a modprobe fuse step before creating the test pod to ensure the fuse kernel module is loaded. This is idempotent (no-op if already loaded) and follows the same pattern used in tap.go for modprobe tun.

Made-with: Cursor
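For readers unfamiliar with the annotation, the fragment below sketches where it sits in a pod manifest. Only the annotation key and the `/dev/fuse` value come from this PR; the struct types, the `newFusePod` helper, and the pod name are illustrative assumptions, and the pod spec (containers, image) is deliberately omitted.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// devicesAnnotation is the CRI-O annotation key named in the commit message.
const devicesAnnotation = "io.kubernetes.cri-o.Devices"

// metadata and podManifest model only the fields needed for this sketch;
// they are not the Kubernetes API types.
type metadata struct {
	Name        string            `json:"name"`
	Annotations map[string]string `json:"annotations"`
}

type podManifest struct {
	APIVersion string   `json:"apiVersion"`
	Kind       string   `json:"kind"`
	Metadata   metadata `json:"metadata"`
}

// newFusePod returns a manifest fragment requesting /dev/fuse via the
// annotation. The request only takes effect if the device exists on the host.
func newFusePod(name string) podManifest {
	return podManifest{
		APIVersion: "v1",
		Kind:       "Pod",
		Metadata: metadata{
			Name:        name,
			Annotations: map[string]string{devicesAnnotation: "/dev/fuse"},
		},
	}
}

func main() {
	out, err := json.MarshalIndent(newFusePod("fuse-demo"), "", "  ")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(string(out))
}
```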
Force-pushed from aefb892 to 28fc03c
/retest
/test images
/test images
Scheduling required tests:
/retest
@Chandan9112: The following tests failed, say
/payload 4.22 nightly blocking
@BhargaviGudi: trigger 13 job(s) of type blocking for the nightly release of OCP 4.22
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/38210840-3e0a-11f1-8a06-a9e051c3da43-0
/retest
Job Failure Risk Analysis for sha: 28fc03c
    workers := strings.Fields(nodeNames)
    o.Expect(workers).NotTo(o.BeEmpty(), "No worker nodes found")
    for _, worker := range workers {
        _, err = nodeutils.ExecOnNodeWithChroot(oc, worker, "modprobe", "fuse")
This line permanently changes the state of the node by loading the fuse module, so every test that runs afterwards is affected.
We either need to unload fuse after the test case (if it was loaded here), or run this test on a new node and destroy it afterwards.
You could try `modprobe -r fuse` in a defer, but this could fail if something is using the module.
Bug
OCPBUGS-83580
Follow-up PR
#PR
Problem
The OCP-70987 test (Allow dev fuse by default in CRI-O) was failing across multiple CI platforms (AWS, GCP, vSphere, Metal+IPv6) with the error:
stat: cannot statx '/dev/fuse': No such file or directory
The test relies on the CRI-O annotation `io.kubernetes.cri-o.Devices: "/dev/fuse"` to mount `/dev/fuse` into the pod. However, on CI nodes where the fuse kernel module is not loaded, `/dev/fuse` does not exist on the host. CRI-O silently ignores the annotation when the device is missing, so the pod starts successfully but without `/dev/fuse` mounted.
Root Cause
The fuse kernel module is not loaded on some CI cluster nodes. Without it, `/dev/fuse` doesn't exist on the host, and CRI-O cannot bind-mount it into the container.
Fix
Added a `modprobe fuse` step on a worker node before creating the test pod. This ensures the fuse kernel module is loaded so `/dev/fuse` is available on the host for CRI-O to mount into the container. `modprobe fuse` is idempotent: if the module is already loaded, it's a no-op.
Testing
Tested on a fresh OCP 4.22 GCP cluster:
Summary by CodeRabbit