diff --git a/tests/e2e/cli-matrix/README.md b/tests/e2e/cli-matrix/README.md
index a7d8c3e5..687e407e 100644
--- a/tests/e2e/cli-matrix/README.md
+++ b/tests/e2e/cli-matrix/README.md
@@ -41,6 +41,8 @@ A cell counts as **FAIL** if either leg times out, the `linstor` CLI exits non-z
 | `r-td-diskless-reaps-tiebreaker.sh` | parity | Sibling of `r-d-collapses-tiebreaker` for the toggle path: on a 2-diskful + TIE_BREAKER RD, `linstor r td --diskless ` drops diskful to 1 and the auto-witness is reaped within 30s, settling on exactly 2 rows (1 diskful UpToDate + 1 user-diskless) with no TIE_BREAKER. Pins the upstream-parity contract that no witness is managed below 2 diskful (quorum=off at 1 diskful). |
 | `r-l-conns-shapes.sh` | 331 | Conns/State column contract: parses `linstor r l` JSON across (Healthy, Disconnected peer, Diskless, TieBreaker) shapes and pins observer's events2 translation. |
 | `snap-restore-snapshotless-node-rejected.sh` | 397 | P0 DATA INTEGRITY. `snapshot resource restore` onto a node NOT holding the snapshot is rejected (no silent empty replica, no orphan RD); restoring onto the snapshot's own nodes converges UpToDate AND every replica holds the real snapshot bytes (marker read per-replica), never a silently-empty UpToDate copy. |
+| `rd-clone-vd-data-plane.sh` | 020 | `linstor rd clone ` on a VD-bearing source (plain CLI body, no `use_zfs_clone`) AND the raw-REST `use_zfs_clone=true` body linstor-csi sends both materialise a real clone: 2 replicas UpToDate, marker bytes from the source present on EVERY clone replica (promote each in turn), clone status COMPLETE, internal `clone-` snapshot visible on the source. Pre-fix: 400 on `use_zfs_clone`, 501 on VD-bearing sources (Bug 114 gate). |
+| `encryption-passphrase-luks-rd.sh` | 023 | Secret-only LUKS flow: `linstor encryption create-passphrase` alone (legacy `DrbdOptions/EncryptPassphrase` controller prop asserted ABSENT throughout) unlocks `rd c -l drbd,luks,storage` + autoplace to UpToDate, and the Secret-backed passphrase actually opens the LUKS header on each replica's backing device. Requires the Bug-023 fix (PR #143); pre-fix the rd-create is rejected with "LUKS layer requires DrbdOptions/EncryptPassphrase to be set first". |
 
 ## Running
 
diff --git a/tests/e2e/cli-matrix/encryption-passphrase-luks-rd.sh b/tests/e2e/cli-matrix/encryption-passphrase-luks-rd.sh
new file mode 100755
index 00000000..284d310a
--- /dev/null
+++ b/tests/e2e/cli-matrix/encryption-passphrase-luks-rd.sh
@@ -0,0 +1,196 @@
+#!/usr/bin/env bash
+#
+# usage: encryption-passphrase-luks-rd.sh WORK_DIR
+#
+# L6 cli-matrix cell — Bug 023 (fix: encryption create-passphrase
+# unlocks LUKS provisioning).
+#
+# Audit gap: `linstor encryption create-passphrase` stored the cluster
+# master passphrase in the blockstor-cluster-passphrase Secret, but
+# nothing downstream read it:
+#   - the LUKS RD-create gate only consulted the legacy
+#     DrbdOptions/EncryptPassphrase controller property, so the
+#     upstream-standard flow (create-passphrase → rd c -l
+#     drbd,luks,storage) was rejected with "LUKS layer requires
+#     DrbdOptions/EncryptPassphrase to be set first" — and the hint
+#     told operators to store a PLAINTEXT passphrase in a controller
+#     prop;
+#   - the satellite lifted the LUKS key onto the LuksPassphrase wire
+#     prop only from controller-scope props, so a Secret-only cluster
+#     looped on "LUKS in layer stack but Props.LuksPassphrase empty"
+#     at apply time.
+#
+# Post-fix contract (pinned here): the Secret set by `encryption
+# create-passphrase` is the PRIMARY, upstream-parity key source — the
+# whole LUKS lifecycle must work WITHOUT the legacy controller prop
+# ever being set. The sibling cells (luks-rd-create-encrypted.sh,
+# luks-clone-encrypted.sh, replay/luks-encrypted-rd.yaml) still set
+# the legacy prop and keep covering the deprecated path.
+#
+# Flow + assertions:
+#   1. cleanup_encryption_state → known-clean baseline (no Secret, no
+#      legacy prop).
+#   2. linstor encryption create-passphrase --passphrase → exit 0.
+#   3. legacy prop ABSENT on `controller list-properties` (and stays
+#      absent through the whole cell — provisioning must not depend on
+#      anything writing it behind our back).
+#   4. rd c -l drbd,luks,storage → exit 0 (pre-fix: rejected).
+#   5. vd c + r c --auto-place=2 → both diskful replicas UpToDate.
+#   6. kernel-level proof on EACH replica: backing LV/zvol carries a
+#      real LUKS header AND the cluster passphrase opens it
+#      (cryptsetup --test-passphrase) — the Secret value travelled the
+#      satellite channel to luksFormat, not just past the REST gate.
+#
+# Unit pins: pkg/rest/luks_gate_bug023_test.go,
+# pkg/satellite/controllers/luks_passphrase_internal_test.go. This
+# cell is the stand-side companion: real python-linstor → apiserver →
+# satellite → cryptsetup.
+
+set -euo pipefail
+
+WORK_DIR=${1:?work_dir required}
+export KUBECONFIG="$WORK_DIR/kubeconfig"
+
+SCRIPT_DIR=$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)
+# shellcheck source=lib.sh
+source "$SCRIPT_DIR/lib.sh"
+
+require_workers 2
+
+linstor_cli_setup
+
+RD=cli-matrix-023-pp-luks
+POOL=${POOL:-lvm-thin}
+PASSPHRASE='cli-matrix-023-secret-pp!'
+
+cleanup() {
+    delete_rd "$RD"
+    assert_no_orphans "$RD"
+    cleanup_encryption_state
+    linstor_cli_teardown
+}
+trap cleanup EXIT
+
+# assert_legacy_prop_absent — the Bug 023 core invariant: the
+# deprecated DrbdOptions/EncryptPassphrase controller property must
+# never appear during the Secret-only flow. Checked via the same
+# machine-readable list-properties surface the python CLI renders.
+assert_legacy_prop_absent() {
+    local phase=$1
+    local present
+    present=$("${LCTL[@]}" --machine-readable controller list-properties 2>/dev/null \
+        | jq -r '[.. | objects | select(.key == "DrbdOptions/EncryptPassphrase")] | length' \
+        2>/dev/null || echo 0)
+    if [[ "$present" != "0" ]]; then
+        echo "FAIL (Bug 023): legacy DrbdOptions/EncryptPassphrase controller prop present ($phase)" >&2
+        echo " the Secret-only flow must not set or require it" >&2
+        exit 1
+    fi
+}
+
+echo ">> [Bug 023] pre-flight: 2 healthy $POOL SPs"
+sp_json=$("${LCTL[@]}" --machine-readable storage-pool list --storage-pools "$POOL" 2>/dev/null || echo "[]")
+ok_nodes=$(jq -r '[.[]? | .[]? | select(.provider_kind != null) | .node_name] | unique | length' <<<"$sp_json" 2>/dev/null || echo 0)
+if (( ok_nodes < 2 )); then
+    echo "SKIP: $POOL SP not on >=2 nodes (got $ok_nodes) — encrypted-RD autoplace fixture unavailable"
+    exit 0
+fi
+
+# Known-clean baseline: no passphrase Secret, no legacy controller
+# prop. Without this the create-passphrase below answers "already
+# set" and the cell would silently test the wrong (modify) path.
+cleanup_encryption_state
+
+echo ">> [Bug 023] linstor encryption create-passphrase (Secret-only flow)"
+err_file=$(mktemp)
+if "${LCTL[@]}" encryption create-passphrase --passphrase "$PASSPHRASE" 2>"$err_file"; then :; else
+    rc=$?  # else-branch: $? is the CLI's own status (after `if ! cmd` it is always 0)
+    echo "FAIL (Bug 023): create-passphrase exited $rc" >&2
+    cat "$err_file" >&2
+    rm -f "$err_file"
+    exit 1
+fi
+rm -f "$err_file"
+
+echo ">> [Bug 023] legacy DrbdOptions/EncryptPassphrase prop is ABSENT"
+assert_legacy_prop_absent "after create-passphrase"
+
+echo ">> [Bug 023] linstor rd c $RD -l drbd,luks,storage (no legacy prop set)"
+err_file=$(mktemp)
+if "${LCTL[@]}" resource-definition create "$RD" \
+    --layer-list drbd,luks,storage 2>"$err_file"; then :; else
+    rc=$?
+    echo "FAIL (Bug 023): rd create rejected (exit $rc) — Secret-backed passphrase not accepted by the LUKS gate?" >&2
+    cat "$err_file" >&2
+    rm -f "$err_file"
+    exit 1
+fi
+rm -f "$err_file"
+
+echo ">> [Bug 023] linstor vd c $RD 128M"
+"${LCTL[@]}" volume-definition create "$RD" 128M >/dev/null
+
+echo ">> [Bug 023] linstor r c $RD --auto-place=2 -s $POOL"
+err_file=$(mktemp)
+if "${LCTL[@]}" resource create --auto-place=2 --storage-pool="$POOL" "$RD" 2>"$err_file"; then :; else
+    rc=$?
+    echo "FAIL (Bug 023): encrypted auto-place=2 exited $rc" >&2
+    cat "$err_file" >&2
+    rm -f "$err_file"
+    exit 1
+fi
+rm -f "$err_file"
+
+echo ">> [Bug 023] wait for 2 diskful Resource CRDs to land"
+# auto-place=2 may add a DISKLESS TIE_BREAKER witness on top of the 2
+# diskful replicas — count diskful only (same convention as the other
+# autoplace cells) so the luksDump checks never target a backing-less
+# witness.
+deadline=$(( $(date +%s) + 60 ))
+placed_nodes=()
+while (( $(date +%s) < deadline )); do
+    mapfile -t placed_nodes < <(linstor_diskful_nodes "$RD")
+    if (( ${#placed_nodes[@]} == 2 )); then
+        break
+    fi
+    sleep 2
+done
+if (( ${#placed_nodes[@]} != 2 )); then
+    echo "FAIL (Bug 023): autoplace did not stage 2 diskful Resource CRDs within 60s (got ${#placed_nodes[@]})" >&2
+    echo " all replicas: $(linstor_replica_count "$RD"), tiebreaker: $(linstor_tiebreaker_node "$RD")" >&2
+    exit 1
+fi
+echo " placed (diskful) on: ${placed_nodes[*]}"
+
+N1="${placed_nodes[0]}"
+N2="${placed_nodes[1]}"
+
+echo ">> [Bug 023] wait both replicas UpToDate (Secret-fed luksFormat ran)"
+# Pre-fix failure mode for a gate-only patch: rd-create passes but the
+# satellite loops on "LUKS in layer stack but Props.LuksPassphrase
+# empty" and the replicas never converge. UpToDate within the bound is
+# the proof the Secret reached the satellite channel.
+wait_uptodate "$RD" "$N1" "$N2"
+
+echo ">> [Bug 023] legacy prop STILL absent after provisioning"
+assert_legacy_prop_absent "after provisioning"
+
+echo ">> [Bug 023] LUKS header present + Secret passphrase opens it on EACH replica"
+for node in "$N1" "$N2"; do
+    backing=$(luks_backing_device "$RD" "$node" 0)
+    if [[ -z "$backing" ]]; then
+        echo "FAIL (Bug 023): could not resolve backing device for $RD on $node" >&2
+        exit 1
+    fi
+    echo " $node: backing=$backing"
+    if ! wait_luks_header_present "$node" "$backing" 60; then
+        echo "FAIL (Bug 023): no LUKS header on $node:$backing" >&2
+        exit 1
+    fi
+    if ! assert_luks_passphrase_opens "$node" "$backing" "$PASSPHRASE"; then
+        echo "FAIL (Bug 023): Secret-backed passphrase does not unlock $node:$backing" >&2
+        exit 1
+    fi
+done
+
+echo ">> encryption-passphrase-luks-rd OK (Bug 023: Secret-only passphrase provisions LUKS end-to-end, no legacy prop)"
diff --git a/tests/e2e/cli-matrix/rd-clone-vd-data-plane.sh b/tests/e2e/cli-matrix/rd-clone-vd-data-plane.sh
new file mode 100755
index 00000000..f3ba9904
--- /dev/null
+++ b/tests/e2e/cli-matrix/rd-clone-vd-data-plane.sh
@@ -0,0 +1,236 @@
+#!/usr/bin/env bash
+#
+# usage: rd-clone-vd-data-plane.sh WORK_DIR
+#
+# L6 cli-matrix cell — Bug 020 (fix: accept use_zfs_clone and
+# materialise VD-bearing RD clones).
+#
+# Audit gap: before the fix, POST /v1/resource-definitions/{rd}/clone
+# rejected golinstor v0.58+'s `use_zfs_clone` field with 400
+# (DisallowUnknownFields), breaking linstor-csi clone-from-source, and
+# a VD-bearing source answered an explicit 501 (Bug 114 gate) instead
+# of producing a clone. Post-fix the handler routes VD-bearing sources
+# through the snapshot-restore machinery: internal snapshot
+# `clone-` of the source + restore-marker materialisation, so
+# the target RD comes back with hydrated VDs, replicas on the
+# snapshot-holding nodes, and the REAL source bytes (delta row 82 in
+# docs/cli-parity-known-deltas.md).
+#
+# This cell pins the DATA PLANE, not just the envelope: a clone that
+# converges UpToDate but reads back zeros is the Bug 114 silent-empty-
+# shell failure mode resurfacing. Both wire variants are driven:
+#
+#   A. plain python CLI — `linstor resource-definition clone `.
+#      linstor-client 1.27.1 declares `--use-zfs-clone` with
+#      action=store_true, default=None, and python-linstor only
+#      serialises non-None kwargs, so the bare verb POSTs a body
+#      WITHOUT `use_zfs_clone` (the `use_zfs_clone=false/absent`
+#      branch of delta row 82). The CLI then polls
+#      GET /v1/resource-definitions/{src}/clone/{dst} until COMPLETE.
+#   B. raw REST with `use_zfs_clone: true` — the exact body
+#      linstor-csi sends on CSI clone-from-source (golinstor v0.58+
+#      sets UseZfsClone on every CreateVolume with a volume content
+#      source). Driven via curl because the CLI flag's presence varies
+#      across client builds while the wire shape is the contract.
+#
+# Contract per variant:
+#   1. clone verb/POST accepted (CLI exit 0 / HTTP 201, CloneStarted
+#      envelope — never 400 on use_zfs_clone, never 501).
+#   2. clone status answers COMPLETE.
+#   3. target RD materialises 2 diskful replicas that converge
+#      UpToDate (observer-stamped Status).
+#   4. EVERY diskful replica of the clone holds the deterministic
+#      marker seeded on the source (promote each in turn — a silently
+#      empty replica reports UpToDate but reads back zeros).
+#   5. the internal snapshot `clone-` is visible on the source
+#      (it must outlive the clone — `zfs clone` targets stay dependent
+#      on their origin snapshot; delta row 82).
+#
+# Pool: `stand` (FILE_THIN) — snapshot-capable, present on every stand
+# worker; same pool the Bug 397 snapshot-restore data-integrity cell
+# uses for its byte-level asserts. Override with POOL=zfs-thin to
+# exercise the literal `zfs clone` data plane.
+#
+# Unit pin: pkg/rest/clone_use_zfs_clone_bug020_test.go. This cell is
+# the stand-side companion (real python-linstor + satellite + kernel).
+
+set -euo pipefail
+
+WORK_DIR=${1:?work_dir required}
+export KUBECONFIG="$WORK_DIR/kubeconfig"
+
+SCRIPT_DIR=$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)
+# shellcheck source=lib.sh
+source "$SCRIPT_DIR/lib.sh"
+
+require_workers 2
+
+linstor_cli_setup
+
+SRC=cli-matrix-020-clsrc
+DST_CLI=cli-matrix-020-cla
+DST_ZFS=cli-matrix-020-clb
+POOL=${POOL:-stand}
+MARKER='BLOCKSTOR-BUG020-CLONE-MARKER'
+
+N1=$WORKER_1
+N2=$WORKER_2
+
+cleanup() {
+    # Clones first (their backing storage may depend on the source's
+    # internal snapshot), then the source. delete_rd reaps the
+    # clone- Snapshot CRDs together with the source RD.
+    delete_rd "$DST_CLI"
+    delete_rd "$DST_ZFS"
+    delete_rd "$SRC"
+    assert_no_orphans "$DST_CLI"
+    assert_no_orphans "$DST_ZFS"
+    assert_no_orphans "$SRC"
+    linstor_cli_teardown
+}
+trap cleanup EXIT
+
+echo ">> [Bug 020] source RD: 2 diskful replicas on $POOL"
+"${LCTL[@]}" resource-definition create "$SRC" >/dev/null
+"${LCTL[@]}" volume-definition create "$SRC" 64M >/dev/null
+_out=$("${LCTL[@]}" resource create "$N1" "$SRC" --storage-pool="$POOL" 2>&1) \
+    || { echo "FAIL: r c $N1 $SRC: $_out" >&2; exit 1; }
+_out=$("${LCTL[@]}" resource create "$N2" "$SRC" --storage-pool="$POOL" 2>&1) \
+    || { echo "FAIL: r c $N2 $SRC: $_out" >&2; exit 1; }
+wait_uptodate "$SRC" "$N1" "$N2"
+
+echo ">> [Bug 020] seed deterministic marker on $N1 $SRC"
+on_node "$N1" drbdadm primary --force "$SRC" 2>/dev/null || true
+dev=$(resolve_drbd_device "$N1" "$SRC" 0) || {
+    echo "ABORT: could not resolve /dev/drbd for $SRC on $N1" >&2
+    exit 2
+}
+on_node "$N1" bash -c \
+    "printf '$MARKER' | dd of='$dev' bs=1 count=${#MARKER} conv=fsync status=none"
+on_node "$N1" drbdadm secondary "$SRC" 2>/dev/null || true
+wait_uptodate "$SRC" "$N1" "$N2"
+
+# wait_clone_replicas [timeout] — the clone materialises its
+# replicas asynchronously on the snapshot-holding nodes; poll until 2
+# diskful Resource CRDs exist. Echoes the node list one per line.
+wait_clone_replicas() {
+    local rd=$1 timeout=${2:-120}
+    local deadline=$(( $(date +%s) + timeout ))
+    local nodes=()
+    while (( $(date +%s) < deadline )); do
+        mapfile -t nodes < <(linstor_diskful_nodes "$rd")
+        if (( ${#nodes[@]} == 2 )); then
+            printf '%s\n' "${nodes[@]}"
+            return 0
+        fi
+        sleep 2
+    done
+    echo "wait_clone_replicas: $rd never materialised 2 diskful replicas (got ${#nodes[@]})" >&2
+    return 1
+}
+
+# assert_clone_marker — promote each diskful replica of
+# the clone in turn and read the marker region back. Catches the Bug
+# 114 silent-empty-shell mode: UpToDate by DRBD, zeros on disk.
+assert_clone_marker() {
+    local rd=$1
+    shift
+    local nodes=("$@")
+    local node other dev marker_read
+    for node in "${nodes[@]}"; do
+        for other in "${nodes[@]}"; do
+            [[ "$other" == "$node" ]] && continue
+            on_node "$other" drbdadm secondary "$rd" 2>/dev/null || true
+        done
+        dev=$(resolve_drbd_device "$node" "$rd" 0 2>/dev/null) || dev=""
+        marker_read=$(on_node "$node" bash -c "
+            drbdadm primary --force $rd 2>/dev/null || true
+            if [ -n '$dev' ]; then
+                head -c ${#MARKER} '$dev' 2>/dev/null
+            fi
+        " 2>/dev/null || echo "")
+        on_node "$node" drbdadm secondary "$rd" 2>/dev/null || true
+        if [[ "$marker_read" != "$MARKER" ]]; then
+            echo "FAIL (Bug 020): replica $node of $rd does NOT hold the source bytes" >&2
+            echo " expected marker '$MARKER', read back '$marker_read'" >&2
+            return 1
+        fi
+        echo " $node: marker present"
+    done
+}
+
+# assert_internal_snapshot — delta row 82: the clone's internal
+# snapshot `clone-` lives on the SOURCE and must outlive the
+# clone (zfs targets stay dependent on their origin snapshot).
+assert_internal_snapshot() {
+    local dst=$1
+    if ! kubectl get "snapshots.blockstor.cozystack.io/${SRC}.clone-${dst}" \
+        >/dev/null 2>&1; then
+        echo "FAIL (Bug 020): internal snapshot ${SRC}.clone-${dst} not found" >&2
+        kubectl get snapshots.blockstor.cozystack.io --no-headers 2>/dev/null >&2 || true
+        return 1
+    fi
+}
+
+# ---- variant A: plain CLI clone (no use_zfs_clone on the wire) ------------
+
+echo ">> [Bug 020 / A] linstor resource-definition clone $SRC $DST_CLI"
+err_file=$(mktemp)
+if "${LCTL[@]}" resource-definition clone "$SRC" "$DST_CLI" 2>"$err_file"; then :; else
+    rc=$?  # else-branch: $? is the CLI's own status (after `if ! cmd` it is always 0)
+    echo "FAIL (Bug 020): rd clone exited $rc (pre-fix: 501 on VD-bearing source)" >&2
+    cat "$err_file" >&2
+    rm -f "$err_file"
+    exit 1
+fi
+rm -f "$err_file"
+
+echo ">> [Bug 020 / A] clone replicas materialise + converge UpToDate"
+mapfile -t cli_nodes < <(wait_clone_replicas "$DST_CLI" 120)
+wait_uptodate "$DST_CLI" "${cli_nodes[0]}" "${cli_nodes[1]}"
+
+echo ">> [Bug 020 / A] marker bytes present on EVERY clone replica"
+assert_clone_marker "$DST_CLI" "${cli_nodes[@]}"
+assert_internal_snapshot "$DST_CLI"
+
+# ---- variant B: raw REST with use_zfs_clone=true (linstor-csi shape) ------
+
+echo ">> [Bug 020 / B] POST /v1/resource-definitions/$SRC/clone use_zfs_clone=true"
+http_code=$(curl -sS -m 30 -o /tmp/cli-matrix-020-clone.json -w '%{http_code}' \
+    -X POST -H 'Content-Type: application/json' \
+    -d "{\"name\":\"${DST_ZFS}\",\"use_zfs_clone\":true}" \
+    "http://127.0.0.1:${LCTL_PORT}/v1/resource-definitions/${SRC}/clone" \
+    2>/dev/null || echo "000")
+if [[ "$http_code" != "201" ]]; then
+    echo "FAIL (Bug 020): use_zfs_clone=true POST answered HTTP $http_code, want 201" >&2
+    echo " (pre-fix: 400 DisallowUnknownFields on use_zfs_clone)" >&2
+    cat /tmp/cli-matrix-020-clone.json >&2 2>/dev/null || true
+    exit 1
+fi
+
+echo ">> [Bug 020 / B] GET clone status reaches COMPLETE"
+deadline=$(( $(date +%s) + 60 ))
+clone_status=""
+while (( $(date +%s) < deadline )); do
+    clone_status=$(curl -fsS -m 5 \
+        "http://127.0.0.1:${LCTL_PORT}/v1/resource-definitions/${SRC}/clone/${DST_ZFS}" \
+        2>/dev/null | jq -r '.status // empty' 2>/dev/null || echo "")
+    if [[ "$clone_status" == "COMPLETE" ]]; then
+        break
+    fi
+    sleep 2
+done
+if [[ "$clone_status" != "COMPLETE" ]]; then
+    echo "FAIL (Bug 020): clone status for $DST_ZFS never reached COMPLETE (last='$clone_status')" >&2
+    exit 1
+fi
+
+echo ">> [Bug 020 / B] clone replicas materialise + converge UpToDate"
+mapfile -t zfs_nodes < <(wait_clone_replicas "$DST_ZFS" 120)
+wait_uptodate "$DST_ZFS" "${zfs_nodes[0]}" "${zfs_nodes[1]}"
+
+echo ">> [Bug 020 / B] marker bytes present on EVERY clone replica"
+assert_clone_marker "$DST_ZFS" "${zfs_nodes[@]}"
+assert_internal_snapshot "$DST_ZFS"
+
+echo ">> rd-clone-vd-data-plane OK (Bug 020: both wire variants materialise a real data-plane clone)"
diff --git a/tests/operator-harness/replay/encryption-passphrase-luks-rd.yaml b/tests/operator-harness/replay/encryption-passphrase-luks-rd.yaml
new file mode 100644
index 00000000..445a08da
--- /dev/null
+++ b/tests/operator-harness/replay/encryption-passphrase-luks-rd.yaml
@@ -0,0 +1,95 @@
+name: encryption-passphrase-luks-rd
+description: |
+  Bug-023 catcher (fix: encryption create-passphrase unlocks LUKS
+  provisioning). Pins the upstream-parity Secret-only flow:
+    1. clear the legacy DrbdOptions/EncryptPassphrase controller prop
+       (empty value = delete) and assert it is ABSENT
+    2. encryption create-passphrase (master passphrase → Secret)
+    3. rd create -l drbd,luks,storage -- pre-fix this step is
+       rejected with "LUKS layer requires DrbdOptions/EncryptPassphrase
+       to be set first" (exit 10); post-fix the Secret satisfies the
+       LUKS gate
+    4. vd create + r create --auto-place=2 -- every replica converges
+       UpToDate, proving the satellite folded the Secret value into
+       the LuksPassphrase wire prop (a gate-only patch would pass step
+       3 but loop on "LUKS in layer stack but Props.LuksPassphrase
+       empty" and never converge)
+    5. legacy prop STILL absent after provisioning
+
+  Contrast with replay/luks-encrypted-rd.yaml, which sets the legacy
+  controller prop and keeps covering the deprecated path; this
+  workflow proves the prop is no longer required.
+
+  NOTE: the runner only executes `linstor` argv, so the kernel-level
+  asserts (cryptsetup luksDump / --test-passphrase on each replica's
+  backing device) live in the L6 cell
+  tests/e2e/cli-matrix/encryption-passphrase-luks-rd.sh.
+
+prerequisites:
+  min_nodes: 2
+  storage_pool: stand
+
+vars:
+  rd: replay-pp023
+  sp: stand
+
+steps:
+  - name: clear-legacy-prop
+    # Empty value = delete (same idiom the auto-diskful workflows use
+    # in teardown; "delete a property that wasn't set" is a no-op, not
+    # an error). Establishes the Secret-only baseline on a shared
+    # stand where a sibling LUKS workflow may have left the prop.
+    cmd: ["controller", "set-property", "DrbdOptions/EncryptPassphrase", ""]
+    expect_exit: 0
+    await:
+      kind: prop_value
+      obj: controller
+      key: DrbdOptions/EncryptPassphrase
+      timeout_s: 30
+  - name: create-passphrase
+    # Passphrase goes via the -p flag (positional would prompt
+    # interactively). On a shared stand a cluster passphrase may
+    # already exist -- the controller answers "already exists" (exit
+    # 10), which is benign here: ANY Secret-backed passphrase
+    # satisfies the post-fix LUKS gate.
+    cmd: ["encryption", "create-passphrase", "-p", "replay-pp023-do-not-rely-on"]
+    expect_exit: [0, 10]
+  - name: create-rd-luks-no-legacy-prop
+    # THE Bug-023 step: pre-fix exit 10 ("LUKS layer requires
+    # DrbdOptions/EncryptPassphrase to be set first"), post-fix 0.
+    cmd: ["resource-definition", "create", "{{rd}}", "-l", "drbd,luks,storage"]
+    expect_exit: 0
+  - name: create-vd
+    cmd: ["volume-definition", "create", "{{rd}}", "32M"]
+    expect_exit: 0
+  - name: auto-place-2-encrypted
+    cmd: ["resource", "create", "--auto-place", "2", "--storage-pool={{sp}}", "{{rd}}"]
+    expect_exit: 0
+    await:
+      kind: replica_count
+      rd: "{{rd}}"
+      min: 2
+      timeout_s: 120
+  - name: wait-uptodate
+    cmd: ["resource", "list", "--resources", "{{rd}}"]
+    expect_exit: 0
+    await:
+      kind: all_uptodate
+      rd: "{{rd}}"
+      timeout_s: 240
+  - name: legacy-prop-still-absent
+    # `expected` omitted ⇒ absence assertion: nothing may have
+    # back-filled the deprecated prop to make provisioning work.
+    cmd: ["controller", "list-properties"]
+    expect_exit: 0
+    await:
+      kind: prop_value
+      obj: controller
+      key: DrbdOptions/EncryptPassphrase
+      timeout_s: 10
+
+teardown:
+  - cmd: ["resource-definition", "delete", "{{rd}}"]
+
+invariants:
+  - no_orphans
diff --git a/tests/operator-harness/replay/rd-clone-vd-data-plane.yaml b/tests/operator-harness/replay/rd-clone-vd-data-plane.yaml
new file mode 100644
index 00000000..afc26a91
--- /dev/null
+++ b/tests/operator-harness/replay/rd-clone-vd-data-plane.yaml
@@ -0,0 +1,89 @@
+name: rd-clone-vd-data-plane
+description: |
+  Bug-020 catcher (fix: accept use_zfs_clone and materialise VD-bearing
+  RD clones). Exercises the operator clone flow end-to-end:
+    1. rd create + vd create + r create --auto-place=2
+    2. wait the source UpToDate
+    3. linstor resource-definition clone
+       (linstor-client 1.27.1: bare verb POSTs WITHOUT use_zfs_clone --
+       --use-zfs-clone is store_true/default=None and python-linstor
+       only serialises non-None kwargs; the CLI then polls
+       GET /v1/resource-definitions/{src}/clone/{dst} until COMPLETE)
+    4. the clone RD materialises 2 replicas on the snapshot-holding
+       nodes and every diskful replica converges UpToDate
+
+  Pre-fix failure modes pinned: 501 refusal on a VD-bearing source
+  (Bug 114 gate) and an empty target shell that never grows replicas.
+
+  NOTE: the runner only executes `linstor` argv (no dd/curl steps), so
+  the byte-level data-plane assert (marker seeded on the source read
+  back per-replica on the clone) lives in the L6 cell
+  tests/e2e/cli-matrix/rd-clone-vd-data-plane.sh. This replay pins the
+  verb + convergence shape the runner CAN express; awaits on the clone
+  RD use the literal "{{rd}}-cl" name (await rd: fields pass through
+  the same substitution as cmd argv).
+
+  Teardown order matters: the clone depends on the internal source
+  snapshot `clone-` (delta row 82 -- it must outlive the
+  clone), and `rd delete` on the source is blocked while snapshots
+  exist, so: delete the clone RD, then the internal snapshot, then the
+  source RD.
+
+prerequisites:
+  min_nodes: 2
+  storage_pool: stand
+
+vars:
+  # Short fixed name: the clone suffix and the internal snapshot name
+  # clone-{{rd}}-cl must all stay under LINSTOR's 48-char ceiling.
+  rd: replay-cl020
+  sp: stand
+
+steps:
+  - name: create-rd
+    cmd: ["resource-definition", "create", "{{rd}}"]
+    expect_exit: 0
+  - name: create-vd
+    cmd: ["volume-definition", "create", "{{rd}}", "64M"]
+    expect_exit: 0
+  - name: auto-place-2
+    cmd: ["resource", "create", "--auto-place", "2", "--storage-pool={{sp}}", "{{rd}}"]
+    expect_exit: 0
+    await:
+      kind: replica_count
+      rd: "{{rd}}"
+      min: 2
+      timeout_s: 120
+  - name: wait-source-uptodate
+    cmd: ["resource", "list", "--resources", "{{rd}}"]
+    expect_exit: 0
+    await:
+      kind: all_uptodate
+      rd: "{{rd}}"
+      timeout_s: 240
+  - name: clone
+    # Pre-fix: exit non-zero (501 CloneStarted refusal on a VD-bearing
+    # source). Post-fix: the CLI waits for clone status COMPLETE.
+    cmd: ["resource-definition", "clone", "{{rd}}", "{{rd}}-cl"]
+    expect_exit: 0
+    await:
+      kind: replica_count
+      rd: "{{rd}}-cl"
+      min: 2
+      timeout_s: 120
+  - name: wait-clone-uptodate
+    cmd: ["resource", "list", "--resources", "{{rd}}-cl"]
+    expect_exit: 0
+    await:
+      kind: all_uptodate
+      rd: "{{rd}}-cl"
+      timeout_s: 240
+
+teardown:
+  - cmd: ["resource-definition", "delete", "{{rd}}-cl"]
+  - cmd: ["snapshot", "delete", "{{rd}}", "clone-{{rd}}-cl"]
+  - cmd: ["resource-definition", "delete", "{{rd}}"]
+
+invariants:
+  # Prefix-match on vars.rd also covers "{{rd}}-cl" leftovers.
+  - no_orphans
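
Reviewer sketch (not part of the patch): the per-replica marker contract in `rd-clone-vd-data-plane.sh` (seed a deterministic marker at offset 0 of the source, then read the same region back on every clone replica) can be tried locally against a plain file standing in for the DRBD device. The helper names `seed_marker`, `read_marker` and `holds_marker` are hypothetical and do not exist in `lib.sh`; the sketch assumes GNU coreutils `dd` (`status=none`, `conv=fsync`). `conv=notrunc` is added here only because the target is a regular file, which `dd` would otherwise truncate; the cell writes to a block device, where that is not a concern.

```shell
#!/usr/bin/env bash
set -euo pipefail

MARKER='BLOCKSTOR-BUG020-CLONE-MARKER'

# seed_marker TARGET: write the marker at offset 0, leaving the rest of
# TARGET untouched (hypothetical helper, mirrors the cell's printf | dd).
seed_marker() {
    printf '%s' "$MARKER" \
        | dd of="$1" bs=1 count="${#MARKER}" conv=notrunc,fsync status=none
}

# read_marker TARGET: read back exactly the marker-sized region. NUL bytes
# are dropped so an all-zero ("silently empty") target yields an empty string.
read_marker() {
    head -c "${#MARKER}" "$1" | tr -d '\0'
}

# holds_marker TARGET: succeeds only if the region equals the marker.
holds_marker() {
    [[ "$(read_marker "$1")" == "$MARKER" ]]
}

img=$(mktemp)
trap 'rm -f "$img"' EXIT

# A zero-filled image plays the replica that is UpToDate but holds no data.
dd if=/dev/zero of="$img" bs=1024 count=64 status=none
if holds_marker "$img"; then
    echo "unexpected: marker found on an empty image" >&2
    exit 1
fi
echo "empty image: marker absent"

seed_marker "$img"
if holds_marker "$img"; then
    echo "seeded image: marker present"
fi
```

Comparing a fixed-size region byte-for-byte is what distinguishes a real clone from one that merely converged: DRBD reports UpToDate for a zero-filled replica too, so only a read-back exposes it.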