This guide shows how to deploy a multi-node OpenShift cluster with three masters and two workers using the Agent-based Installer and how to run OpenShift's conformance test suite against it. It uses libvirt domains (QEMU/KVM based virtual machines) running in a Podman container to simulate bare-metal servers and auxiliary resources.
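The host needs KVM nested virtualization and must not already be connected to the IP networks 192.168.157.0/24 and 192.168.158.0/24 that this setup uses. A minimal pre-flight sketch could look like the following. The helper names nested_virt_enabled and conflicting_routes are hypothetical, and the /sys paths in the usage comments assume the kvm_intel or kvm_amd kernel module is loaded.

```shell
# Sketch of host pre-flight checks; both function names are hypothetical helpers.

# Succeeds if any of the given KVM module parameter files reports nested
# virtualization as enabled (the kernel prints "Y" or "1" there).
nested_virt_enabled() {
    for f in "$@"; do
        [ -r "$f" ] || continue
        case "$(cat "$f")" in
            Y|y|1) return 0 ;;
        esac
    done
    return 1
}

# Reads routing table lines (for example from `ip -4 route`) on stdin and
# prints those which mention the two networks reserved for this guide.
conflicting_routes() {
    grep -E '(^|[[:space:]])192\.168\.15[78]\.' || true
}

# Possible usage on the bare-metal host:
#   nested_virt_enabled /sys/module/kvm_intel/parameters/nested \
#                       /sys/module/kvm_amd/parameters/nested \
#       || echo "nested virtualization seems to be disabled" >&2
#   ip -4 route | conflicting_routes
```

An empty result from conflicting_routes suggests that neither network is in use on the host. The check only looks at the routing table text and does not detect every possible overlap.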
First, install git and Podman on a bare-metal system with Debian 11 (Bullseye), CentOS Stream 8,
Fedora Linux 33, Ubuntu 22.04 LTS (Jammy Jellyfish) or newer. Ensure the system has KVM nested virtualization enabled,
has enough storage for the disk images of the virtual machines, and is not connected
to IP networks 192.168.157.0/24 and 192.168.158.0/24. Then run:
git clone https://github.com/JM1/ansible-collection-jm1-cloudy.git
cd ansible-collection-jm1-cloudy/
cp -i ansible.cfg.example ansible.cfg

OpenShift requires pull secrets to authenticate with container registries Quay.io and
registry.redhat.io, which serve the container images for OpenShift Container Platform components. Download pull
secrets from Red Hat Cloud Console and store them in file pull-secret.txt
in repository directory ansible-collection-jm1-cloudy.
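A quick sanity check of the downloaded file can catch a missing registry before the installation starts. The sketch below assumes the pull secret is a JSON document whose "auths" object is keyed by registry host name. The helper check_pull_secret is hypothetical. It only searches for the quoted registry names and validates neither the JSON syntax nor the credentials.

```shell
# check_pull_secret is a hypothetical helper, not part of this repository.
# It reports every registry whose quoted name does not appear in the file.
check_pull_secret() {
    file=$1
    shift
    missing=0
    for registry in "$@"; do
        if ! grep -qF "\"$registry\"" "$file"; then
            echo "missing entry for $registry in $file" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Possible usage from the repository directory:
#   check_pull_secret pull-secret.txt quay.io registry.redhat.io
```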
If you want to deploy an OpenShift release image built by OpenShift CI, you also need a pull
secret for registry.ci.openshift.org (guide): Ensure your GitHub.com user is a member of the
OpenShift organization, otherwise request access here. Then request an API token, which looks like
sha256~abcdefghijklmnopqrstuvwxyz01234567890abcdef. Use this token to log in
and store it in pull-secret.txt with (replace $GITHUB_USER and $API_TOKEN):
podman login --authfile pull-secret.txt -u $GITHUB_USER -p $API_TOKEN registry.ci.openshift.org

NOTE: Tokens for registry.ci.openshift.org become invalid quickly, so expect to request new tokens monthly.
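The login above passes the token with -p, which leaves it in the shell history and in the process list. As an alternative sketch, podman login can read the password from standard input with --password-stdin. The wrapper function ci_registry_login is a hypothetical name used only for illustration.

```shell
# ci_registry_login is a hypothetical wrapper around the login command above.
# The token is piped to podman instead of being passed as a command line argument.
ci_registry_login() {
    github_user=$1
    api_token=$2
    printf '%s' "$api_token" | podman login --authfile pull-secret.txt \
        -u "$github_user" --password-stdin registry.ci.openshift.org
}

# Possible usage, with both variables set beforehand:
#   ci_registry_login "$GITHUB_USER" "$API_TOKEN"
```

Because tokens have to be renewed regularly, a small wrapper like this makes it easier to repeat the login whenever a new token is requested.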
Grant the unprivileged user in the Podman container read access to the pull secrets:
chmod a+r pull-secret.txt

Next, change host_vars of Ansible host lvrt-lcl-session-srv-530-okd-abi-ha-provisioner to read pull secrets from file
pull-secret.txt. Open file inventory/host_vars/lvrt-lcl-session-srv-530-okd-abi-ha-provisioner.yml and change variable
openshift_abi_pullsecret to:
openshift_abi_pullsecret: "{{ lookup('ansible.builtin.file', '/home/cloudy/project/pull-secret.txt') }}"

In the same file inventory/host_vars/lvrt-lcl-session-srv-530-okd-abi-ha-provisioner.yml, set openshift_abi_release_image to the
OpenShift release you want to deploy:
openshift_abi_release_image: "{{ lookup('ansible.builtin.pipe', openshift_abi_release_image_query) }}"
openshift_abi_release_image_query: |
  curl -s https://mirror.openshift.com/pub/openshift-v4/amd64/clients/ocp/stable-4.14/release.txt \
    | grep 'Pull From: quay.io' \
    | awk -F ' ' '{print $3}'

Or:
openshift_abi_release_image: 'registry.ci.openshift.org/ocp/release:4.14'

If your corporate network blocks access to public NTP servers, edit Ansible variable chrony_config for host
lvrt-lcl-session-srv-500-okd-abi-ha-router in file
inventory/host_vars/lvrt-lcl-session-srv-500-okd-abi-ha-router.yml. For example, if your internal NTP servers are
grouped in a pool clock.company.com, change chrony_config to:
chrony_config:
- ansible.builtin.copy:
    content: |
      allow 192.168.158.0/24
      # Corporate network blocks all NTP traffic except to internal NTP servers.
      pool clock.company.com iburst
    dest: /etc/chrony/conf.d/home.arpa.conf
    mode: u=rw,g=r,o=
    group: root
    owner: root

Create Podman networks, volumes and containers, and attach to a container named cloudy with:
cd containers/
sudo DEBUG=yes DEBUG_SHELL=yes ./podman-compose.sh up

Inside this container, a Bash shell is spawned for user cloudy. This user runs the libvirt
domains (QEMU/KVM based virtual machines).
Launch the first set of virtual machines with the following commands run from cloudy's Bash shell:
ansible-playbook playbooks/site.yml --limit \
lvrt-lcl-session-srv-500-okd-abi-ha-router,\
lvrt-lcl-session-srv-501-okd-abi-ha-bmc,\
lvrt-lcl-session-srv-510-okd-abi-ha-cp0,\
lvrt-lcl-session-srv-511-okd-abi-ha-cp1,\
lvrt-lcl-session-srv-512-okd-abi-ha-cp2,\
lvrt-lcl-session-srv-520-okd-abi-ha-w0,\
lvrt-lcl-session-srv-521-okd-abi-ha-w1

This sets up a router which provides DHCP, DNS and NTP services and internet access. It also starts sushy-emulator to provide a virtual Redfish BMC which power cycles servers and mounts virtual media for hardware inspection and provisioning. Finally, it creates the virtual machines for OpenShift's master and worker nodes, but without an operating system and in a stopped (shut off) state.
NOTE: When Ansible execution fails, try to run the Ansible playbook again.
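Rerunning the playbook by hand can be replaced with a small retry loop. The following is only a sketch: retry is a hypothetical helper, and the number of attempts is an arbitrary choice.

```shell
# retry is a hypothetical helper: run a command up to N times until it succeeds.
retry() {
    attempts=$1
    shift
    n=1
    while ! "$@"; do
        if [ "$n" -ge "$attempts" ]; then
            echo "giving up after $n attempts: $*" >&2
            return 1
        fi
        n=$((n + 1))
        echo "attempt $n of $attempts: $*" >&2
    done
}

# Possible usage from cloudy's Bash shell (host list shortened here):
#   retry 3 ansible-playbook playbooks/site.yml --limit lvrt-lcl-session-srv-500-okd-abi-ha-router
```

A failure that persists across all attempts usually needs a look at the Ansible output rather than more retries.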
Launch another virtual machine to run OpenShift ABI and deploy the OpenShift cluster:
ansible-playbook playbooks/site.yml \
--limit lvrt-lcl-session-srv-530-okd-abi-ha-provisioner \
--skip-tags jm1.cloudy.openshift_tests

To access the cluster when Ansible is done, connect to the virtual machine which initiated the cluster installation
(Ansible host lvrt-lcl-session-srv-530-okd-abi-ha-provisioner):
ssh ansible@192.168.158.48

The cluster uses internal DHCP and DNS services which are not accessible from the container host. To connect to the virtual machine from another shell on the container host (the bare-metal system), run:
sudo podman exec -ti -u cloudy cloudy ssh ansible@192.168.158.48

From ansible's Bash shell at lvrt-lcl-session-srv-530-okd-abi-ha-provisioner the cluster can be accessed with:
export KUBECONFIG=/home/ansible/clusterconfigs/auth/kubeconfig
oc get nodes
oc debug node/cp0

Back at cloudy's Bash shell inside the container, run OpenShift's conformance test suite with:
ansible-playbook playbooks/site.yml \
--limit lvrt-lcl-session-srv-530-okd-abi-ha-provisioner \
--tags jm1.cloudy.openshift_tests

Remove all virtual machines with:
# Note the .home.arpa suffix
for vm in \
lvrt-lcl-session-srv-500-okd-abi-ha-router.home.arpa \
lvrt-lcl-session-srv-501-okd-abi-ha-bmc.home.arpa \
lvrt-lcl-session-srv-510-okd-abi-ha-cp0.home.arpa \
lvrt-lcl-session-srv-511-okd-abi-ha-cp1.home.arpa \
lvrt-lcl-session-srv-512-okd-abi-ha-cp2.home.arpa \
lvrt-lcl-session-srv-520-okd-abi-ha-w0.home.arpa \
lvrt-lcl-session-srv-521-okd-abi-ha-w1.home.arpa \
lvrt-lcl-session-srv-530-okd-abi-ha-provisioner.home.arpa
do
    virsh destroy "$vm"
    virsh undefine --remove-all-storage --nvram "$vm"
done

The virtual machines can be removed in any order.
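To confirm that the cleanup worked, the remaining libvirt domains can be listed from cloudy's Bash shell. The sketch below uses a hypothetical helper, remaining_vms, and assumes that every domain of this scenario carries -okd-abi-ha- in its name, as the names in the loop above do.

```shell
# remaining_vms is a hypothetical helper: it prints the names of all libvirt
# domains (running or shut off) which still belong to this scenario.
remaining_vms() {
    virsh list --all --name | grep -- '-okd-abi-ha-' || true
}

# Possible usage:
#   remaining_vms
```

No output means that no domain of this deployment is left.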
Exit cloudy's Bash shell to stop the container.
NOTE: Any virtual machines still running inside the container will be killed!
Finally, remove all Podman containers, networks and volumes with:
sudo DEBUG=yes ./podman-compose.sh down