fix(prepare): enable containerd device_ownership_from_security_context for CDI block imports (#48)
* fix(prepare): enable containerd device_ownership_from_security_context for CDI block imports
KubeVirt's CDI importer writes VM disk images into raw block volumes
from a non-root pod. containerd only chowns the block device to the
pod's SecurityContext when device_ownership_from_security_context is
enabled on the CRI plugin, and k3s ships it disabled. Without it the
importer fails with 'cannot open /dev/cdi-block-volume: Permission
denied', the DataVolume hangs in ImportInProgress, and VMs referencing
the disk stay Pending.
Add a k3s containerd drop-in (config-v3.toml.d/10-cozystack-cri.toml)
to all three prepare playbooks, gated behind cozystack_enable_kubevirt
and overridable via cozystack_k3s_containerd_dropin_dir.
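A minimal sketch of what such a gated drop-in task could look like, assuming the file content described in this commit (`version = 3` and the `io.containerd.cri.v1.runtime` table). The task names and the `restart k3s` notify topic are illustrative assumptions, not the playbooks' actual identifiers:

```yaml
# Sketch only: task names and the "restart k3s" topic are hypothetical.
- name: Ensure the containerd drop-in directory exists
  ansible.builtin.file:
    path: "{{ cozystack_k3s_containerd_dropin_dir }}"
    state: directory
    mode: "0755"
  when: cozystack_enable_kubevirt | default(true) | bool

- name: Write the containerd CRI drop-in for CDI block imports
  ansible.builtin.copy:
    dest: "{{ cozystack_k3s_containerd_dropin_dir }}/10-cozystack-cri.toml"
    mode: "0644"
    # containerd 2.x (config version 3) syntax, as described above.
    content: |
      version = 3

      [plugins.'io.containerd.cri.v1.runtime']
        device_ownership_from_security_context = true
  when: cozystack_enable_kubevirt | default(true) | bool
  notify: restart k3s
```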
Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Andrei Kvapil <andrei.kvapil@aenix.io>
* test(examples): cover containerd device-ownership drop-in across distros
Pin the device_ownership_from_security_context drop-in, its KubeVirt gate, the containerd v3 CRI runtime table, and the k3s restart handler across the Ubuntu/RHEL/SUSE prepare playbooks, so the mechanism cannot silently regress or drift between distros.
Assisted-By: Claude <noreply@anthropic.com>
Signed-off-by: Aleksei Sviridkin <f@lex.la>
* chore(release): keep collection version inheritance, defer bump to upstream
The collection version tracks the upstream Cozystack chart release; a bugfix on its own does not force a version bump. Keep galaxy.yml at the inherited version and record the change under an Unreleased CHANGELOG section, to be renamed when upstream bumps.
Assisted-By: Claude <noreply@anthropic.com>
Signed-off-by: Aleksei Sviridkin <f@lex.la>
* docs(readme): unwrap hard-wrapped prose into single lines
Reflow the remaining hard-wrapped prose paragraphs so each paragraph is one continuous line, letting the renderer wrap to viewer width. Formatting only; tables, code blocks and YAML are left untouched.
Assisted-By: Claude <noreply@anthropic.com>
Signed-off-by: Aleksei Sviridkin <f@lex.la>
* docs(readme): document native --nonroot-devices flag and live-restart caveat
Explain why the collection uses a containerd config drop-in rather than k3s's native --nonroot-devices flag (uniform server+agent coverage without wiring agent args, and it applies to an already-running cluster), and warn that the restart handler bounces k3s when the drop-in is first added or changes on a live re-run. Add a doc-drift guard test pinning both points to the drop-in README section.
Assisted-By: Claude <noreply@anthropic.com>
Signed-off-by: Aleksei Sviridkin <f@lex.la>
* docs(claude): list containerd device-ownership in silent-failure traps
Add the containerd device_ownership_from_security_context failure (CDI block import Permission denied -> DataVolume ImportInProgress -> VM Pending) to the project's canonical Critical silent-failure traps list, alongside the multipath/vhost_net/br_netfilter entries, with a guard test pinning it.
Assisted-By: Claude <noreply@anthropic.com>
Signed-off-by: Aleksei Sviridkin <f@lex.la>
* docs(readme): correct the containerd 1.x drop-in guidance
The drop-in content (version = 3, io.containerd.cri.v1.runtime) is hardcoded for containerd 2.x; cozystack_k3s_containerd_dropin_dir only relocates the file, it does not rewrite the content. State plainly that containerd 1.x is not handled as-is and operators must write their own v2 drop-in, rather than implying a one-variable override that the code cannot honor. Guard the corrected wording with a test.
Assisted-By: Claude <noreply@anthropic.com>
Signed-off-by: Aleksei Sviridkin <f@lex.la>
* docs(readme): list containerd drop-in dir var and update kubevirt toggle scope
Add cozystack_k3s_containerd_dropin_dir to the Example playbook variables table (the canonical override reference, where every other examples/* tunable is listed), and extend the cozystack_enable_kubevirt row to note it now also gates the containerd device-ownership drop-in for CDI block imports — so disabling KubeVirt prep is understood to disable that fix too. Also attribute the k3s version pin to the example inventories rather than the role. Guard both table facts with a test.
Assisted-By: Claude <noreply@anthropic.com>
Signed-off-by: Aleksei Sviridkin <f@lex.la>
* fix(prepare): harden containerd CDI drop-in toggle and k3s restart
The drop-in toggle was one-way: cozystack_enable_kubevirt: false only
skipped writing 10-cozystack-cri.toml, so a host that ran with KubeVirt
enabled kept device_ownership_from_security_context on and the host
state no longer matched the toggle. Add a symmetric cleanup task that
removes the drop-in when the toggle is off and notifies the restart
handler, mirroring the existing DRBD drop-in cleanup pattern.
The restart handler used failed_when: false, which masked every
failure, not just the intended missing-unit case (only one of
k3s/k3s-agent exists on a node, and on a full-pipeline run neither
exists yet when prepare runs). A malformed drop-in or a k3s that failed
to come back was silently reported as success. Refresh service facts on
the same notify topic and restart only units present in
ansible_facts.services, so a genuine restart failure fails the play
while a missing unit is still skipped.
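The two changes above can be sketched as a play fragment. This is an illustration under assumptions, not the playbooks' actual code: the play name, task names, and the `restart k3s` listen topic are hypothetical, while `cozystack_enable_kubevirt`, `cozystack_k3s_containerd_dropin_dir`, and the `k3s`/`k3s-agent` unit names come from this commit.

```yaml
- name: Prepare nodes (fragment)
  hosts: cluster
  become: true
  tasks:
    # Symmetric cleanup: toggle off means the drop-in must not remain on disk.
    - name: Remove the containerd CRI drop-in when KubeVirt prep is disabled
      ansible.builtin.file:
        path: "{{ cozystack_k3s_containerd_dropin_dir }}/10-cozystack-cri.toml"
        state: absent
      when: not (cozystack_enable_kubevirt | default(true) | bool)
      notify: restart k3s

  handlers:
    # Handlers run in definition order, so facts are fresh before the restart.
    - name: Refresh service facts
      ansible.builtin.service_facts:
      listen: restart k3s

    # Only units that exist are restarted; a real restart failure fails the play.
    - name: Restart whichever k3s unit is present on this node
      ansible.builtin.systemd:
        name: "{{ item }}"
        state: restarted
      loop:
        - k3s
        - k3s-agent
      when: (item ~ '.service') in ansible_facts.services
      listen: restart k3s
```

Filtering on `ansible_facts.services` replaces the blanket `failed_when: false`: a missing unit is skipped by the `when` condition, and any error from a unit that does exist is reported.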
Assisted-By: Claude <noreply@anthropic.com>
Signed-off-by: Aleksei Sviridkin <f@lex.la>
* docs(prepare): correct containerd 1.x drop-in caveat in comments and changelog
The playbook comments and the changelog entry still claimed
cozystack_k3s_containerd_dropin_dir could target a containerd 1.x
cluster, but the drop-in content is hardcoded for containerd 2.x
(config version 3, [plugins.'io.containerd.cri.v1.runtime']). Pointing
the variable at a 1.x config-dir does not produce a working config:
containerd 1.x needs config version 2 and
[plugins.'io.containerd.grpc.v1.cri']. The override only relocates the
file (e.g. a non-default k3s data-dir); it does not rewrite the
content. The README already states this — align the playbook comments,
the changelog, and a stale test message with it.
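For comparison, a hand-written containerd 1.x equivalent might look like the sketch below. The destination path is an assumption (the default k3s data-dir with `config.toml.d`, as the README suggests), and this collection does not ship such a task; verify that the k3s release in use actually reads that directory before relying on it.

```yaml
# Sketch only: not part of the collection; the path below is an assumption.
- name: Write a containerd 1.x (config version 2) CRI drop-in by hand
  ansible.builtin.copy:
    dest: /var/lib/rancher/k3s/agent/etc/containerd/config.toml.d/10-cozystack-cri.toml
    mode: "0644"
    # containerd 1.x needs config version 2 and the grpc.v1.cri plugin table.
    content: |
      version = 2

      [plugins.'io.containerd.grpc.v1.cri']
        device_ownership_from_security_context = true
```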
Assisted-By: Claude <noreply@anthropic.com>
Signed-off-by: Aleksei Sviridkin <f@lex.la>
---------
Signed-off-by: Andrei Kvapil <andrei.kvapil@aenix.io>
Signed-off-by: Aleksei Sviridkin <f@lex.la>
Co-authored-by: Andrei Kvapil <andrei.kvapil@aenix.io>
Co-authored-by: Claude <noreply@anthropic.com>
with "No such file or directory". Load the module before applying the
sysctl.
+- **containerd `device_ownership_from_security_context` disabled**: k3s ships it off; without the `config-v3.toml.d/10-cozystack-cri.toml` drop-in, KubeVirt's non-root CDI importer cannot open a raw block volume (`blockdev: cannot open /dev/cdi-block-volume: Permission denied`), the DataVolume hangs in `ImportInProgress`, and VMs that reference the disk stay Pending. Apply when KubeVirt is enabled (gated on `cozystack_enable_kubevirt`).
README.md: 32 additions & 47 deletions
@@ -12,9 +12,7 @@ Supported targets:
Cloud-image users **must** set `cozystack_flush_iptables: true` for multi-master k3s to bootstrap — Ubuntu cloud images ship with `REJECT icmp-host-prohibited` in INPUT that blocks etcd peer port 2380 between nodes. See **Node Prerequisites → Known limitations** below.

-Deploys the Cozystack operator and Platform Package using the
-`kubernetes.core.helm` module with automatic Helm and helm-diff
-installation.
+Deploys the Cozystack operator and Platform Package using the `kubernetes.core.helm` module with automatic Helm and helm-diff installation.
on the control-plane node. No manual Helm installation is needed.
+The role automatically installs Helm and the [helm-diff](https://github.com/databus23/helm-diff) plugin on the control-plane node. No manual Helm installation is needed.

### Node Prerequisites
@@ -168,11 +164,25 @@ tun
kvm_intel # or kvm_amd depending on the CPU
```

+#### Enabled by default: containerd device ownership for CDI block imports
+
+When KubeVirt is enabled, the prepare playbook drops a containerd CRI config that sets `device_ownership_from_security_context = true`. KubeVirt's CDI (Containerized Data Importer) writes VM disk images into raw **block** volumes from a non-root importer pod; containerd only chowns the block device to the pod's `SecurityContext` UID/GID when this option is on, and k3s ships it disabled. Without it the importer fails with `blockdev: cannot open /dev/cdi-block-volume: Permission denied`, the `DataVolume` is stuck in `ImportInProgress`, and every VM that references the disk stays `Pending` — one of the silent "VMs stuck in Pending" failure modes called out above.
+
+Written as a drop-in that containerd merges on top of k3s's generated `config.toml`:
+`config-v3.toml.d` and the `io.containerd.cri.v1.runtime` plugin table are the containerd 2.x (config version 3) paths shipped by current k3s (the example inventories pin `k3s_version: v1.36.1+k3s1`), and the drop-in content is hardcoded for that — `version = 3` and the v3 table. `cozystack_k3s_containerd_dropin_dir` only relocates the file; it does not rewrite the content. So on a containerd 1.x cluster (older k3s) this drop-in does not apply as-is — write your own under `config.toml.d/` with `version = 2` and the `io.containerd.grpc.v1.cri` table. The drop-in is read at first k3s start in the full pipeline; on a re-run against a running cluster a handler restarts k3s so the change takes effect.
+
+k3s also exposes a native `--nonroot-devices` flag (valid on both server and agent) that sets the same containerd option. This collection uses the config drop-in instead because it applies uniformly to every node in the `cluster` group — including agent/worker nodes, for which the example playbooks do not wire `extra_agent_args` — and because it can be applied to an already-running cluster, which an install-time k3s flag cannot.
+
+The restart handler only fires when the drop-in is first created or its content changes; idempotent re-runs leave k3s untouched. When it does fire, `systemctl restart k3s` (or `k3s-agent`) briefly disrupts the control plane and the node's workloads on that host, so apply such a change in a maintenance window rather than casually mid-day.
+
#### Known limitations

-ZFS support depends on the OS ecosystem and kernel flavor. The prepare
-playbooks skip ZFS automation gracefully in these cases and emit an
-informational notice:
+ZFS support depends on the OS ecosystem and kernel flavor. The prepare playbooks skip ZFS automation gracefully in these cases and emit an informational notice:

| OS / kernel | ZFS automation | Reason |
| --- | --- | --- |
@@ -213,9 +223,7 @@ Enable and start:
#### iptables (cloud providers)

-Cloud providers (OCI, AWS, GCP) may ship images with restrictive iptables
-etcd 2379-2380) even when security groups allow it.
+Cloud providers (OCI, AWS, GCP) may ship images with restrictive iptables INPUT rules that block inter-node Kubernetes traffic (API 6443, kubelet 10250, etcd 2379-2380) even when security groups allow it.

Fix: flush the INPUT chain and set policy to ACCEPT before deploying k3s.
@@ -249,11 +257,7 @@ cluster-cidr: 10.42.0.0/16
service-cidr: 10.43.0.0/16
```

-These CIDRs are the k3s defaults. The example prepare playbooks
-(e.g., `examples/ubuntu/prepare-ubuntu.yml`) set them via the
-`server_config_yaml` variable used by `k3s.orchestration`. The role
-variables `cozystack_pod_cidr` and `cozystack_svc_cidr` must match —
-they default to the same values.
+These CIDRs are the k3s defaults. The example prepare playbooks (e.g., `examples/ubuntu/prepare-ubuntu.yml`) set them via the `server_config_yaml` variable used by `k3s.orchestration`. The role variables `cozystack_pod_cidr` and `cozystack_svc_cidr` must match — they default to the same values.

## Installation
@@ -273,8 +277,7 @@ collections:
## Quick start

-1. Create your environment (pick your distro — see `examples/ubuntu/`,
-`examples/rhel/`, or `examples/suse/`):
+1. Create your environment (pick your distro — see `examples/ubuntu/`, `examples/rhel/`, or `examples/suse/`):

```text
my-env/
@@ -314,9 +317,7 @@ Both stages are handled automatically by the `cozystack` role.
## Role: cozystack.installer.cozystack

-Installs Cozystack via the official `cozy-installer` Helm chart using
-the `kubernetes.core.helm` module with automatic Helm and helm-diff
-installation.
+Installs Cozystack via the official `cozy-installer` Helm chart using the `kubernetes.core.helm` module with automatic Helm and helm-diff installation.

Runs on `server[0]` only.
@@ -353,14 +354,13 @@ Runs on `server[0]` only.
### Example playbook variables

-These variables are consumed only by the example prepare playbooks in
-`examples/*/`, not by the role itself. Set them as inventory host/group
-vars to opt out of the corresponding prepare step:
+These variables are consumed only by the example prepare playbooks in `examples/*/`, not by the role itself. Set them as inventory host/group vars to opt out of the corresponding prepare step:

| Variable | Default | Description |
| --- | --- | --- |
| `cozystack_enable_zfs` | `true` | Example playbooks: install ZFS userspace and load the module. Set `false` to skip. |
-| `cozystack_enable_kubevirt` | `true` | Example playbooks: load KubeVirt kernel modules. Set `false` to skip. |
+| `cozystack_enable_kubevirt` | `true` | Example playbooks: load KubeVirt kernel modules **and** install the containerd `device_ownership_from_security_context` drop-in for CDI block imports. Set `false` to skip both. |
+| `cozystack_k3s_containerd_dropin_dir` | `/var/lib/rancher/k3s/agent/etc/containerd/config-v3.toml.d` | Example playbooks: directory for the containerd CRI drop-in (gated on `cozystack_enable_kubevirt`). Only relocates the file — the drop-in content is hardcoded for containerd 2.x (config v3); a containerd 1.x cluster needs a hand-written `config.toml.d` drop-in instead. |
| `cozystack_flush_iptables` | `false` | Example playbooks: flush the iptables INPUT chain before k3s installs. Set `true` on Ubuntu/Debian cloud images (OCI/AWS/GCP) where the default INPUT chain ends with `REJECT icmp-host-prohibited` and blocks k3s inter-node ports 2380/6443. |
| `cozystack_zfs_release_rpm_extra` | `{}` | `examples/rhel/` only: merged on top of the built-in `cozystack_zfs_release_rpm_by_major` dict, so you can add (or override) a single EL-major → OpenZFS release RPM entry from inventory without wiping the base dict. Example: `{"10": "https://zfsonlinux.org/epel/zfs-release-X-Y.el10.noarch.rpm"}` once upstream ships one. |
| `cozystack_enable_drbd_dkms` | `true` | `examples/ubuntu/` only: install `drbd-dkms` from the LINBIT PPA on Ubuntu LTS 22.04 / 24.04 hosts so DRBD's kernel module is signed via dkms+shim under Secure Boot. Set `false` on Talos hosts (Talos ships pre-signed DRBD modules in extensions) or where Secure Boot is disabled and the in-cluster compile path is preferred. The toggle stops *future* installs but does NOT undo a prior install — manually `apt purge drbd-dkms` and remove the LINBIT entry from `/etc/apt/sources.list.d/` if you flipped to `false` after a successful run. |
@@ -371,8 +371,7 @@ vars to opt out of the corresponding prepare step:
This collection is designed to work alongside [k3s.orchestration](https://github.com/k3s-io/k3s-ansible). The inventory structure (groups: `cluster`, `server`, `agent`) is fully compatible.

-Example full pipeline (`site.yml`) — see `examples/ubuntu/`, `examples/rhel/`,
-or `examples/suse/`:
+Example full pipeline (`site.yml`) — see `examples/ubuntu/`, `examples/rhel/`, or `examples/suse/`:

```yaml
- name: Prepare nodes
@@ -393,12 +392,9 @@ On cloud providers with NAT (OCI, AWS, GCP), nodes have internal IPs different f
### Multi-master setup (kube-ovn RAFT)

-Kube-ovn requires `MASTER_NODES` — a comma-separated list of all
-control-plane node IPs for OVN RAFT consensus. By default, the role
-auto-detects these IPs from the `server` inventory group host keys.
+Kube-ovn requires `MASTER_NODES` — a comma-separated list of all control-plane node IPs for OVN RAFT consensus. By default, the role auto-detects these IPs from the `server` inventory group host keys.

-This works when host keys are internal IPs (the recommended inventory
-pattern):
+This works when host keys are internal IPs (the recommended inventory pattern):

```yaml
server:
@@ -409,30 +405,19 @@ server:
ansible_host: 203.0.113.11
```

-If your inventory uses hostnames or non-IP host keys, set
-`cozystack_master_nodes` explicitly:
+If your inventory uses hostnames or non-IP host keys, set `cozystack_master_nodes` explicitly:
-[helm-diff](https://github.com/databus23/helm-diff) plugin on the
-target node automatically. The `helm-diff` plugin enables true
-idempotency — repeated runs report no changes when the release is
-already up to date.
+The role installs Helm and the [helm-diff](https://github.com/databus23/helm-diff) plugin on the target node automatically. The `helm-diff` plugin enables true idempotency — repeated runs report no changes when the release is already up to date.

### Customizing variables

-The example prepare playbooks define internal variables (like
-`cozystack_k3s_server_args`) in the play `vars` section. User-facing
-variables such as `cozystack_k3s_extra_args` and
-`cozystack_flush_iptables` should be set **in the inventory**, not in
-the playbook. Ansible play `vars` take precedence over inventory
-variables, so defining them in both places causes the inventory values
-to be silently ignored.
+The example prepare playbooks define internal variables (like `cozystack_k3s_server_args`) in the play `vars` section. User-facing variables such as `cozystack_k3s_extra_args` and `cozystack_flush_iptables` should be set **in the inventory**, not in the playbook. Ansible play `vars` take precedence over inventory variables, so defining them in both places causes the inventory values to be silently ignored.
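
As an illustration of the inventory-side placement described in the added paragraph, a sketch of a YAML inventory follows. The file name, host keys, and addresses are hypothetical; the group names (`cluster`, `server`, `agent`) and the variable names are the ones this README uses.

```yaml
# inventory/hosts.yml (hypothetical file name and hosts)
cluster:
  children:
    server:
      hosts:
        10.0.0.10:                      # host key is the internal IP
          ansible_host: 203.0.113.10
    agent:
      hosts:
        10.0.0.20:
          ansible_host: 203.0.113.20
  vars:
    # User-facing toggles live here, not in the play's vars section,
    # because play vars would override them.
    cozystack_flush_iptables: true
    cozystack_enable_kubevirt: true
```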