Commit edb4c11

bogdando authored and openshift-merge-bot[bot] committed
[multiple] BM SNO follow-up fixes, reuse support

- Skip controller reboot and wait_for_connection in deploy-edpm-reuse when
  cifmw_bm_sno is true (no virtual controller assumption).
- Skip syncing local repos to the Ansible controller in push_code when
  cifmw_bm_sno is true.
- In reuse_main, skip CRC/OCP layout detection for BM SNO; set
  _use_crc/_use_ocp false and _has_openshift true.
- In deploy_architecture, fall back to play host facts when inventory has
  no controller-* host; derive controller address from default IPv4 or
  inventory_hostname when ansible_host is unset.
- Run OCP cluster-size reduction in architecture only when the ocps group
  exists.
- Add cifmw_bm_agent_disabled_ifaces and agent-config networkConfig so
  extra NICs can stay link-up without IPv4/IPv6 (overlap validation);
  install nmstate when that list is non-empty.
- Document bm_sno Zuul autohold workflow, reproducer scenarios vs
  baremetal, and spellcheck terms (NICs, autoheld, tty).
- Update BM SNO logic to match the existing reuse_ocp flow where OCP (and
  SNO) deployment becomes skipped

Generated-by: claude-4.6-opus-high
Signed-off-by: Bohdan Dobrelia <bdobreli@redhat.com>

1 parent 84f9add commit edb4c11

10 files changed

Lines changed: 163 additions & 7 deletions

deploy-edpm-reuse.yaml

Lines changed: 2 additions & 0 deletions
@@ -7,11 +7,13 @@
   gather_facts: false
   tasks:
     - name: Reboot controller-0
+      when: not (cifmw_bm_sno | default(false) | bool)
       ansible.builtin.reboot:
         reboot_timeout: 600
       become: true
 
     - name: Wait for controller-0 to come back online
+      when: not (cifmw_bm_sno | default(false) | bool)
       ansible.builtin.wait_for_connection:
         timeout: 600
         delay: 10
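
As a hedged illustration of the guard added above (not part of the commit), the following standalone play sketches how `not (cifmw_bm_sno | default(false) | bool)` behaves. The play name, the debug messages, and the hard-coded `cifmw_bm_sno` value are assumptions made for this example.

```yaml
# Hypothetical standalone check, not part of the commit.
- name: Illustrate the cifmw_bm_sno guard
  hosts: localhost
  gather_facts: false
  vars:
    # Values passed as "-e key=value" arrive as strings, hence the bool filter.
    cifmw_bm_sno: "true"
  tasks:
    - name: Runs only for virtual layouts
      when: not (cifmw_bm_sno | default(false) | bool)
      ansible.builtin.debug:
        msg: "controller-0 would be rebooted here"

    - name: Runs only for bare metal SNO
      when: cifmw_bm_sno | default(false) | bool
      ansible.builtin.debug:
        msg: "reboot and wait_for_connection are skipped"
```

Removing the `vars` entry leaves `cifmw_bm_sno` undefined, so `default(false)` applies and the first task runs instead of the second.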

docs/dictionary/en-custom.txt

Lines changed: 3 additions & 0 deletions
@@ -12,6 +12,7 @@ Idempotency
 LDAP
 LLM
 MachineConfig
+NICs
 NodeHealthCheck
 RHCOS
 SNO
@@ -41,6 +42,7 @@ arxcruz
 auth
 authfile
 autoconfiguration
+autoheld
 autohold
 autoholds
 autologin
@@ -600,6 +602,7 @@ topolvm
 traceback
 tripleo
 ttl
+tty
 txt
 uefi
 uefisecureboot

roles/bm_sno/README.md

Lines changed: 90 additions & 1 deletion
@@ -61,6 +61,7 @@ provision IP via `/etc/hosts` entries managed by the role.
 | `cifmw_bm_agent_vmedia_uefi_path` | str | auto-discovered | UEFI device path for the Virtual Optical Drive; auto-discovered from UEFI boot options if omitted |
 | `cifmw_bm_agent_core_password` | str || Set a `core` user password post-install via MachineConfig |
 | `cifmw_bm_agent_live_debug` | bool | `false` | Patch the agent ISO with password, autologin, and systemd debug shell on `tty6` for discovery-phase console access (requires `cifmw_bm_agent_core_password`) |
+| `cifmw_bm_agent_disabled_ifaces` | list | `[]` | Extra NIC names to disable IPv4/IPv6 on during agent-based install. Prevents overlapping-subnet validation failures when multiple NICs share a native VLAN (e.g. `[eno2]`). The interfaces stay link-up but get no IP address; post-install NNCP configures them. |
 
 ## Secrets management
 
@@ -95,7 +96,7 @@ The agent-based deployment is composed of reusable task files under
 | `bm_ensure_usb_boot.yml` | Wraps `bm_check_usb_boot.yml`; if disabled and `cifmw_bm_agent_enable_usb_boot` is true, sets the BIOS attribute, creates a config job, and power-cycles to apply |
 | `bm_eject_vmedia.yml` | Ejects VirtualMedia from the iDRAC Virtual Optical Drive |
 | `bm_discover_vmedia_target.yml` | Discovers or validates the UEFI device path for VirtualMedia, clears pending iDRAC config jobs, and sets a one-time boot override |
-| `bm_patch_agent_iso.yml` | Patches the agent ISO ignition with core password, autologin, and debug shell (used when `cifmw_bm_agent_live_debug` is true) |
+| `bm_patch_agent_iso.yml` | Patches the agent ISO ignition with core password, autologin, and debug shell on tty6 (used when `cifmw_bm_agent_live_debug` is true) |
 | `bm_core_password_machineconfig.yml` | Generates a MachineConfig manifest to set the core user password hash post-install |
 
 ## openshift-install acquisition
@@ -178,6 +179,94 @@ cifmw_bm_nodes:
     root_device: /dev/sda
 ```
 
+## Local debugging on an autoheld Zuul node
+
+When a Zuul job is held (`autohold`), you can SSH into the Zuul controller
+and iterate on the deployment without re-provisioning SNO from scratch.
+
+### 1. Prepare the environment
+
+Edit `~/configs/zuul_vars.yaml` to skip SNO re-provisioning and OpenStack
+cleanup (there is nothing to clean up before the first RHOSO deployment):
+
+```yaml
+cifmw_cleanup_architecture: false
+reuse_ocp: true
+run_cleanup: false
+```
+
+### 2. Run the playbook
+
+From the `ci-framework-jobs` checkout on the Zuul controller:
+
+```bash
+cd ~/src/gitlab.cee.redhat.com/ci-framework/ci-framework-jobs
+
+ansible-playbook playbooks/baremetal/run-sno-bm.yaml \
+  --flush-cache \
+  -e@/home/zuul/configs/default-vars.yaml \
+  -e@/home/zuul/src/gitlab.cee.redhat.com/ci-framework/ci-framework-jobs/scenarios/test/test-tool-versions.yaml \
+  -e@/home/zuul/src/gitlab.cee.redhat.com/ci-framework/ci-framework-jobs/scenarios/uni/default-vars.yaml \
+  -e@/home/zuul/src/gitlab.cee.redhat.com/ci-framework/ci-framework-jobs/scenarios/baremetal/vaf/rhel-vars.yaml \
+  -e@/home/zuul/configs/networking_defintion.yaml \
+  -e@/home/zuul/configs/nmstate_config.yaml \
+  -e@/home/zuul/configs/scenario-vars.yaml \
+  -e@/home/zuul/configs/secrets.yaml \
+  -e@/home/zuul/configs/vars.yaml \
+  -e@/home/zuul/configs/zuul_vars.yaml
+```
+
+With `reuse_ocp: true`, `run-sno-bm.yaml` will:
+
+1. Copy the SNO kubeconfig from `dev-scripts/ocp/<cluster>/auth/` to
+   `~/.kube/config` and `oc login` as `kubeadmin` with
+   `--insecure-skip-tls-verify` (agent-based installer uses self-signed certs)
+2. Generate `openshift-login-params.yml` via the `openshift_login` role
+3. Write a static inventory mapping `controller-0` to `localhost`
+4. Run `deploy-edpm-reuse.yaml` instead of `reproducer.yml`, which skips
+   OCP provisioning and goes straight to architecture deployment
+
+### 3. Subsequent iterations
+
+Once the first EDPM deployment succeeds, set `cifmw_cleanup_architecture`
+back to `true` so that `cleanup-architecture.sh` tears down the previous
+OpenStack deployment before re-applying:
+
+```yaml
+cifmw_cleanup_architecture: true
+reuse_ocp: true
+run_cleanup: false
+```
+
+### 4. Quick OCP and agent/SNO SSH access
+
+The SNO kubeconfig and kubeadmin password live in the dev-scripts auth
+directory:
+
+```bash
+export KUBECONFIG=~/src/github.com/openshift-metal3/dev-scripts/ocp/<cluster>/auth/kubeconfig
+oc login -u kubeadmin \
+  -p "$(cat ~/src/github.com/openshift-metal3/dev-scripts/ocp/<cluster>/auth/kubeadmin-password)" \
+  --insecure-skip-tls-verify=true
+oc get nodes
+```
+
+For SSH access to the SNO host:
+```bash
+ssh -i ~/ci-framework-data/artifacts/agent-install/agent_ssh_key \
+  core@<cluster>.<cifmw_bm_agent_base_domain>
+```
+
+Replace `<cluster>` with the value of `cifmw_bm_agent_cluster_name` (e.g.
+`sno`).
+
+For SSH access to the agent-install appliance, use `-i ci-framework-data/artifacts/cifmw_ocp_access_key`.
+You can also get autologin and a debug shell on tty6 of the agent with:
+```yaml
+cifmw_bm_agent_core_password: changeme
+cifmw_bm_agent_live_debug: true
+```
+
 ## References
 
 * [ci-framework reproducer documentation](https://ci-framework.readthedocs.io/en/latest/roles/reproducer.html)
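
Step 3 of the reuse flow in the README text above mentions a static inventory that maps `controller-0` to `localhost`. The commit does not show that file, so the following is only a guess at its shape, assuming a YAML inventory and a local connection; the file actually written by `run-sno-bm.yaml` may differ.

```yaml
# Hypothetical inventory: controller-0 resolves to the machine running Ansible.
# Both connection variables are assumptions, not taken from the commit.
all:
  hosts:
    controller-0:
      ansible_host: localhost
      ansible_connection: local
```

With an entry like this, plays that target `controller-0` would run on the Zuul controller itself rather than on a separate virtual machine.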

roles/bm_sno/defaults/main.yml

Lines changed: 1 addition & 0 deletions
@@ -6,3 +6,4 @@ cifmw_bm_agent_core_password: "redhat"
 cifmw_bm_agent_live_debug: false
 cifmw_bm_agent_vmedia_uefi_path: ""
 cifmw_bm_agent_enable_usb_boot: true
+cifmw_bm_agent_disabled_ifaces: []
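
A minimal sketch of overriding this default in a scenario vars file, assuming a host whose extra NICs share a native VLAN with the install NIC. `eno2` is the name used in the README example; `eno3` is a made-up name for illustration.

```yaml
# Hypothetical scenario override; interface names depend on the hardware.
cifmw_bm_agent_disabled_ifaces:
  - eno2
  - eno3
```

Leaving the list empty keeps the previous behaviour for the extra NICs and also skips the `nmstate` package installation added in `roles/bm_sno/tasks/main.yml`.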

roles/bm_sno/tasks/main.yml

Lines changed: 7 additions & 0 deletions
@@ -219,6 +219,13 @@
     regexp: '^pullSecret:'
     line: "pullSecret: '<REDACTED>'"
 
+- name: Ensure nmstatectl is available for agent-config networkConfig validation
+  become: true
+  ansible.builtin.package:
+    name: nmstate
+    state: present
+  when: cifmw_bm_agent_disabled_ifaces | default([]) | length > 0
+
 - name: Generate agent ISO
   ansible.builtin.command:
     cmd: "{{ _work_dir }}/openshift-install agent create image --dir {{ _work_dir }}"

roles/bm_sno/templates/agent_config.yaml.j2

Lines changed: 19 additions & 0 deletions
@@ -8,3 +8,22 @@ hosts:
     interfaces:
       - name: {{ _node_iface }}
         macAddress: {{ _node_mac }}
+    networkConfig:
+      interfaces:
+        - name: {{ _node_iface }}
+          type: ethernet
+          state: up
+          ipv4:
+            enabled: true
+            dhcp: true
+          ipv6:
+            enabled: false
+{% for iface in cifmw_bm_agent_disabled_ifaces | default([]) %}
+        - name: {{ iface }}
+          type: ethernet
+          state: up
+          ipv4:
+            enabled: false
+          ipv6:
+            enabled: false
+{% endfor %}
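
For orientation, this is roughly what the added `networkConfig` section would render to, assuming `_node_iface` is `eno1` and `cifmw_bm_agent_disabled_ifaces` is `[eno2]`. Both values are illustrative, and the indentation follows the usual agent-config layout rather than anything confirmed by the commit.

```yaml
# Illustrative rendering only; eno1 and eno2 are assumed interface names.
networkConfig:
  interfaces:
    - name: eno1          # install NIC: DHCP over IPv4, IPv6 off
      type: ethernet
      state: up
      ipv4:
        enabled: true
        dhcp: true
      ipv6:
        enabled: false
    - name: eno2          # extra NIC: link stays up, no addressing
      type: ethernet
      state: up
      ipv4:
        enabled: false
      ipv6:
        enabled: false
```

Each name in the list adds one more entry of the second kind, which is how the extra NICs avoid the overlapping-subnet validation described in the role README.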

roles/cifmw_setup/tasks/deploy_architecture.yml

Lines changed: 11 additions & 2 deletions
@@ -54,18 +54,26 @@
     # - controller (in Zuul)
     # - controller-0.foo.com (FQDN)
     # - controller-0 (no FQDN) - compatibility match
+    # Falls back to the play host (e.g. localhost) when no controller
+    # host exists in inventory, which is the case for bare metal SNO.
     _ctl_data: >-
       {{
         hostvars | dict2items |
         selectattr('key', 'match', '^(controller-0.*|controller)') |
-        map(attribute='value') | first
+        map(attribute='value') | first |
+        default(hostvars[inventory_hostname])
       }}
     _ifaces_vars: >-
       {{
         _ctl_data.ansible_interfaces |
         map('regex_replace', '^(.*)$', 'ansible_\1')
       }}
-    _controller_host: "{{ _ctl_data.ansible_host }}"
+    _controller_host: >-
+      {{
+        _ctl_data.ansible_host |
+        default(_ctl_data.ansible_default_ipv4.address |
+                default(inventory_hostname))
+      }}
   block:
     - name: Generate needed facts out of local files
       vars:
@@ -177,6 +185,7 @@
 
 - name: Reduce OCP cluster size in architecture
   when:
+    - "'ocps' in groups"
     - groups['ocps'] | length == 1
   ansible.builtin.import_role:
     name: kustomize_deploy
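
The fallback chain in the first hunk can be tried in isolation with a play like the one below. This is a sketch under stated assumptions, not code from the commit: the play and task names are invented, and it runs against `localhost` with an inventory that has no `controller-*` host, so that the `default(...)` branches are the ones exercised.

```yaml
# Hypothetical standalone play, not part of the commit.
- name: Show which controller address the fallback chain selects
  hosts: localhost
  gather_facts: true   # needed so ansible_default_ipv4 can be populated
  tasks:
    - name: Resolve controller data and address
      vars:
        # Same expressions as in deploy_architecture.yml after this change.
        _ctl_data: >-
          {{
            hostvars | dict2items |
            selectattr('key', 'match', '^(controller-0.*|controller)') |
            map(attribute='value') | first |
            default(hostvars[inventory_hostname])
          }}
        _controller_host: >-
          {{
            _ctl_data.ansible_host |
            default(_ctl_data.ansible_default_ipv4.address |
                    default(inventory_hostname))
          }}
      ansible.builtin.debug:
        msg: "controller address: {{ _controller_host }}"
```

The order of preference is `ansible_host`, then the default IPv4 address from gathered facts, then `inventory_hostname`. In the second hunk, listing `'ocps' in groups` first means the `groups['ocps']` lookup is only evaluated when the group exists.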

roles/reproducer/tasks/push_code.yml

Lines changed: 1 addition & 0 deletions
@@ -138,6 +138,7 @@
 - name: Sync local repositories to ansible controller
   delegate_to: localhost
   when:
+    - not (cifmw_bm_sno | default(false) | bool)
     - item.src is abs or item.src is not match('.*:.*')
   ansible.posix.synchronize:
     src: "{{ item.src }}"

roles/reproducer/tasks/reuse_main.yaml

Lines changed: 10 additions & 0 deletions
@@ -54,6 +54,7 @@
       }}
 
 - name: Set _use_crc based on actual layout
+  when: not (cifmw_bm_sno | default(false) | bool)
   tags:
     - always
   vars:
@@ -76,6 +77,15 @@
     _use_ocp: "{{ _use_ocp }}"
     _has_openshift: "{{ _use_ocp or _use_crc }}"
 
+- name: Set _has_openshift for bare metal SNO
+  when: cifmw_bm_sno | default(false) | bool
+  tags:
+    - always
+  ansible.builtin.set_fact:
+    _use_crc: false
+    _use_ocp: false
+    _has_openshift: true
+
 - name: Ensure directories are present
   tags:
     - always

scenarios/reproducers/README.md

Lines changed: 19 additions & 4 deletions
@@ -1,5 +1,20 @@
-# Reproducer scenarios usage
+# Reproducer scenarios
 
-These environment files should be used with the "reproducer.yml" playbook.
-Please refer to [the doc](https://ci-framework.readthedocs.io/en/latest/roles/reproducer.html)
-about its usage.
+These environment files define validated architecture scenarios for the
+virtual reproducer (reproducer.yml playbook). Each file sets
+`cifmw_libvirt_manager_configuration` (libvirt networks and VMs),
+`cifmw_devscripts_config_overrides`, and other deployment parameters for
+a specific architecture variant (e.g. HCI, SNO, multi-site).
+
+The `cifmw_architecture_scenario` variable selects which file to load.
+For example, `cifmw_architecture_scenario: va-hci-minimal-sno` loads
+`va-hci-minimal-sno.yml`.
+
+Baremetal deployments (e.g. ci-framework-jobs BM SNO) do not use the
+`cifmw_libvirt_manager_configuration` from these files. They provide
+their own networking, kustomization, and scenario variables via Zuul
+`variable_files` / `variable_files_dirs`, which take precedence as
+Ansible extra vars.
+
+Please refer to [the reproducer documentation](https://ci-framework.readthedocs.io/en/latest/roles/reproducer.html)
+for usage details.
