Add support for MutableCSINodeAllocatableCount#1066

Open
nschad wants to merge 8 commits into main from dynamic-volume-limits

@nschad

@nschad nschad commented May 7, 2026

Contributor

How to categorize this PR?

/kind enhancement

What this PR does / why we need it:

This PR updates the CSI driver to dynamically calculate the maximum allocatable volume count when the MutableCSINodeAllocatableCount feature gate is enabled.

Instead of relying on a static limit, the driver now scans the system for free PCIe root ports and adds that count to the number of volumes already mounted through this driver. Hardware that also occupies root ports, such as additional network interface cards (NICs), is therefore accounted for, which prevents volume attachment failures caused by exhausted root ports.

This will also lead to better scheduling decisions by the kube-scheduler.
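
The calculation described above can be sketched as follows. This is a minimal illustration, not the code from this PR: the function name `allocatableVolumes`, its parameters, and the numbers in `main` are hypothetical, and it assumes that every attached volume occupies exactly one root port.

```go
package main

import "fmt"

// allocatableVolumes is a hypothetical helper, not the PR's implementation.
//
// freeRootPorts:   PCIe root ports that currently have no device behind them.
// attachedVolumes: volumes already attached through this driver. They sit on
//                  root ports that are no longer free, so they are added back
//                  to obtain the total the node can hold.
func allocatableVolumes(freeRootPorts, attachedVolumes int) int {
	// Guard against negative inputs from a failed or partial scan.
	if freeRootPorts < 0 {
		freeRootPorts = 0
	}
	if attachedVolumes < 0 {
		attachedVolumes = 0
	}
	return freeRootPorts + attachedVolumes
}

func main() {
	// Illustrative numbers only: 20 free ports, 4 volumes attached.
	fmt.Println(allocatableVolumes(20, 4)) // 24

	// Plugging in one extra NIC takes a root port away, so the
	// reported limit drops by one without any configuration change.
	fmt.Println(allocatableVolumes(19, 4)) // 23
}
```

The point of the sum is that the reported limit stays stable while volumes are attached and detached, and only changes when other hardware claims or releases a root port.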

Which issue(s) this PR fixes:
Fixes #

Special notes for your reviewer:

Breaking changes:

@ske-prow

ske-prow Bot commented May 7, 2026


Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@ske-prow ske-prow Bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label May 7, 2026
@ske-prow

ske-prow Bot commented May 7, 2026


[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign dergeberl for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ske-prow ske-prow Bot added do-not-merge/needs-kind Indicates a PR lacks a `kind/foo` label and requires one. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels May 7, 2026
@nschad nschad force-pushed the dynamic-volume-limits branch from c9ac70a to 127f8d7 Compare May 8, 2026 10:52
@nschad nschad changed the title WIP: Query PCIe ports for correct amount of devices that can be attac… Add support for MutableCSINodeAllocatableCount May 8, 2026
@nschad nschad changed the title Add support for MutableCSINodeAllocatableCount WIP: Add support for MutableCSINodeAllocatableCount May 8, 2026
@nschad

nschad commented May 8, 2026

Contributor Author

/kind enhancement

/hold waiting for IaaS to clarify some API endpoints

@ske-prow ske-prow Bot added kind/enhancement Enhancement, improvement, extension do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. and removed do-not-merge/needs-kind Indicates a PR lacks a `kind/foo` label and requires one. labels May 8, 2026
@nschad nschad requested a review from a team May 8, 2026 10:57
@nschad nschad changed the title WIP: Add support for MutableCSINodeAllocatableCount Add support for MutableCSINodeAllocatableCount May 15, 2026
@nschad nschad marked this pull request as ready for review May 15, 2026 12:21
@ske-prow ske-prow Bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label May 15, 2026
@nschad nschad force-pushed the dynamic-volume-limits branch from b5f9b20 to 5fe6f7a Compare May 15, 2026 12:36
@nschad nschad marked this pull request as draft May 15, 2026 12:58
@ske-prow ske-prow Bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label May 15, 2026

@breuerfelix breuerfelix left a comment

Member

This is a really complex PR, few lines of code, but you really have to understand why and what happens here.
Would it make sense to simulate this setup in an integration test with multiple different flavors?

Comment thread pkg/csi/blockstorage/nodeserver.go Outdated
Comment thread pkg/csi/util/mount/mount_linux.go Outdated
}
}
} else {
klog.V(4).Infof("skipping class %s: path: %s", class, devPath)

Member

Invert the check above and continue if it does not have the prefix.
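
The suggested shape is a guard clause: handle the non-matching case first and `continue`, so the main logic is no longer nested inside an `if`/`else`. The sketch below is an assumption about what that could look like. The `pciDevice` type, the function name, and the idea that the prefix is checked against the device class are all hypothetical; only the log format is taken from the snippet under review, and `fmt.Printf` stands in for `klog` to keep the example self-contained.

```go
package main

import (
	"fmt"
	"strings"
)

// pciDevice is a hypothetical stand-in for whatever the loop iterates over.
type pciDevice struct {
	class   string // contents of the sysfs "class" file, e.g. "0x060400"
	devPath string
}

// countWithClassPrefix counts devices whose class starts with prefix.
func countWithClassPrefix(devices []pciDevice, prefix string) int {
	count := 0
	for _, d := range devices {
		// Inverted check: bail out early for devices we do not care about.
		if !strings.HasPrefix(d.class, prefix) {
			fmt.Printf("skipping class %s: path: %s\n", d.class, d.devPath)
			continue
		}
		// The interesting work now sits at the top nesting level.
		count++
	}
	return count
}

func main() {
	devices := []pciDevice{
		{class: "0x060400", devPath: "/sys/bus/pci/devices/0000:00:02.0"}, // PCI-to-PCI bridge
		{class: "0x020000", devPath: "/sys/bus/pci/devices/0000:01:00.0"}, // Ethernet controller
	}
	fmt.Println(countWithClassPrefix(devices, "0x0604"))
}
```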

@hown3d

hown3d commented Jun 2, 2026

Member

> This is a really complex PR, few lines of code, but you really have to understand why and what happens here.

I agree, maybe it's worth using the existing implementation in procfs from Prometheus to read PCI devices and parse them. That would let us write less code and make it easier to understand:
https://github.com/prometheus/procfs/blob/1082e3d8d4c73ed8e6360357001cbd4a82e40635/sysfs/pci_device.go#L114

@breuerfelix

Member

> > This is a really complex PR, few lines of code, but you really have to understand why and what happens here.
>
> I agree, maybe its worth using the existing implementation of procfs from prometheus to read PCIDevices and parse them. Would allow us to have less code and write it more understandable: https://github.com/prometheus/procfs/blob/1082e3d8d4c73ed8e6360357001cbd4a82e40635/sysfs/pci_device.go#L114

We have now tested the implementation with this library, and it doesn't remove complexity. You can neither iterate over free PCI slots nor check whether a device is a bridge and has children.
Determining free slots with this library (or others) does not improve readability and sometimes even adds complexity on top, but thanks for the idea :)
We do have some other ideas, though, and will definitely add more comments and tests.

nschad added 6 commits June 10, 2026 10:17
The CSI driver lists all PCIe devices that are not of
type VIRTIO_BLOCK_DEVICE and subtracts them from
the theoretical maximum, so Kubernetes can report
a correct dynamic maximum count of volumes that can be
attached to each node.

Signed-off-by: Niclas Schad <niclas.schad@stackit.cloud>
Signed-off-by: Niclas Schad <niclas.schad@stackit.cloud>
Signed-off-by: Niclas Schad <niclas.schad@stackit.cloud>
Signed-off-by: Niclas Schad <niclas.schad@stackit.cloud>
Signed-off-by: Niclas Schad <niclas.schad@stackit.cloud>
Signed-off-by: Niclas Schad <niclas.schad@stackit.cloud>
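
The subtraction described in the first commit message above could look roughly like the sketch below. It is an illustration under assumptions, not the committed code: the `pciID` type, the function names, and the value 28 are hypothetical, and it assumes that VIRTIO_BLOCK_DEVICE corresponds to the virtio vendor ID `0x1af4` with device ID `0x1001` (transitional) or `0x1042` (modern).

```go
package main

import "fmt"

// pciID is a hypothetical pair of vendor and device IDs read from sysfs.
type pciID struct{ vendor, device uint16 }

const virtioVendor = 0x1af4

// isVirtioBlock reports whether id is a virtio block device
// (transitional ID 0x1001 or modern ID 0x1042).
func isVirtioBlock(id pciID) bool {
	return id.vendor == virtioVendor && (id.device == 0x1001 || id.device == 0x1042)
}

// maxVolumes subtracts every device that is not a virtio block device
// from the theoretical maximum and never returns a negative number.
func maxVolumes(theoreticalMax int, devices []pciID) int {
	used := 0
	for _, d := range devices {
		if !isVirtioBlock(d) {
			used++
		}
	}
	if used > theoreticalMax {
		return 0
	}
	return theoreticalMax - used
}

func main() {
	devices := []pciID{
		{0x1af4, 0x1041}, // virtio network device: takes a slot away
		{0x1af4, 0x1042}, // virtio block device: not subtracted
		{0x1af4, 0x1042}, // virtio block device: not subtracted
	}
	// 28 is an illustrative theoretical maximum, not a value from the PR.
	fmt.Println(maxVolumes(28, devices)) // 27
}
```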
@nschad nschad force-pushed the dynamic-volume-limits branch from 72f45cd to 16bfec0 Compare June 10, 2026 08:17
@nschad nschad marked this pull request as ready for review June 10, 2026 08:18
@ske-prow ske-prow Bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jun 10, 2026
@nschad

nschad commented Jun 10, 2026

Contributor Author

/unhold

@ske-prow ske-prow Bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jun 10, 2026
nschad and others added 2 commits June 10, 2026 10:23
Signed-off-by: Niclas Schad <niclas.schad@stackit.cloud>
Signed-off-by: Felix Breuer <f.breuer94@gmail.com>
@ske-prow

ske-prow Bot commented Jun 10, 2026


@nschad: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| pull-cloud-provider-stackit-verify | e104889 | link | true | `/test pull-cloud-provider-stackit-verify` |

Full PR test history. Your PR dashboard. Command help for this repository.
Please help us cut down on flakes by linking this test failure to an open flake report or filing a new flake report if you can't find an existing one. Also see the gardener testing guideline for how to avoid and hunt flakes.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Labels

kind/enhancement Enhancement, improvement, extension size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

3 participants