ci: fix missing libvirt default network in test-coreos job#2158

Closed
Copilot wants to merge 2 commits into main from copilot/update-test-coreos-workflow

Copilot AI commented Apr 23, 2026

The test-coreos job was failing with "Network not found: no network with matching name 'default'". As a result, the provisioned VM never received a DHCP lease, and TMT's SSH wait exhausted all 60 attempts.

Changes

  • .github/workflows/ci.yml — Replace the step that only started the network if it already existed with one that defines it from a minimal NAT XML template when absent, then starts and autostarts it:
if ! virsh net-info default >/dev/null 2>&1; then
  cat > /tmp/default-net.xml <<'EOF'
<network>
  <name>default</name>
  <forward mode='nat'/>
  <bridge name='virbr0' stp='on' delay='0'/>
  <ip address='192.168.122.1' netmask='255.255.255.0'>
    <dhcp>
      <range start='192.168.122.2' end='192.168.122.254'/>
    </dhcp>
  </ip>
</network>
EOF
  sudo virsh net-define /tmp/default-net.xml
fi
sudo virsh net-start default || true
sudo virsh net-autostart default || true
  • crates/xtask/src/tmt.rs — Add dump_libvirt_diagnostics() called on SSH verification failure, emitting virsh list/net-list/net-dhcp-leases/dominfo/domiflist/dumpxml to stderr so future failures produce actionable output instead of a bare timeout.
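
For readers reproducing the failure by hand, the diagnostics listed above can be approximated from a shell. This is only a sketch: the function names `run_diag` and `dump_libvirt_diagnostics` below are illustrative shell stand-ins, the domain name argument is an assumption, and the actual Rust helper in crates/xtask/src/tmt.rs may collect or format its output differently.

```shell
# Hypothetical shell counterpart to the Rust diagnostics helper; the real
# dump_libvirt_diagnostics() in crates/xtask/src/tmt.rs may differ.

# Print a header, run one command, and report (but do not abort on) failure.
run_diag() {
  echo "== $* =="
  "$@" || echo "(exit status $?: $*)"
}

# Emit libvirt state for one domain to stderr.
# $1: libvirt domain name (assumption: the caller knows it).
dump_libvirt_diagnostics() {
  domain="$1"
  {
    run_diag virsh list --all
    run_diag virsh net-list --all
    run_diag virsh net-dhcp-leases default
    run_diag virsh dominfo "$domain"
    run_diag virsh domiflist "$domain"
    run_diag virsh dumpxml "$domain"
  } >&2
}
```

Calling `dump_libvirt_diagnostics <domain>` after a failed SSH wait prints each command under its own header, so a missing network or an empty lease table is visible next to the timeout.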
Original prompt

Update existing PR #2157 in repository bootc-dev/bootc by adjusting the workflow change in .github/workflows/ci.yml for the test-coreos job.

Context:

Required change:

  • In .github/workflows/ci.yml, in the test-coreos job, replace the existing step named similar to Ensure libvirt default network is active with a step that:
    1. Lists networks with virsh net-list --all || true
    2. Checks if ! virsh net-info default >/dev/null 2>&1; then ... fi
    3. Writes a minimal libvirt default NAT network XML to /tmp/default-net.xml with:
      • network name default
      • forward mode nat
      • bridge virbr0
      • subnet 192.168.122.0/24
      • DHCP range 192.168.122.2 to 192.168.122.254
    4. Runs sudo virsh net-define /tmp/default-net.xml when missing
    5. Runs sudo virsh net-start default || true
    6. Runs sudo virsh net-autostart default || true
    7. Prints virsh net-list --all and virsh net-dhcp-leases default || true
  • Keep the existing diagnostics changes in the PR.
  • Do not make unrelated workflow simplifications or replace the whole file.
  • Commit the change onto the branch backing PR #2157 (ci: add libvirt diagnostics and network activation for CoreOS TMT SSH failures), if possible; otherwise create an updated PR from the appropriate branch.

Suggested replacement snippet:

- name: Ensure libvirt default network is present and active
  run: |
    set -euxo pipefail

    virsh net-list --all || true

    if ! virsh net-info default >/dev/null 2>&1; then
      cat > /tmp/default-net.xml <<'EOF'
    <network>
      <name>default</name>
      <forward mode='nat'/>
      <bridge name='virbr0' stp='on' delay='0'/>
      <ip address='192.168.122.1' netmask='255.255.255.0'>
        <dhcp>
          <range start='192.168.122.2' end='192.168.122.254'/>
        </dhcp>
      </ip>
    </network>
    EOF
      sudo virsh net-define /tmp/default-net.xml
    fi

    sudo virsh net-start default || true
    sudo virsh net-autostart default || true

    virsh net-list --all
    virsh net-dhcp-leases default || true
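
Because `net-start` and `net-autostart` are each followed by `|| true`, the step above can finish successfully even if the network never comes up. One possible follow-up check is sketched below. It is not part of the PR: `network_state` is a hypothetical helper, and it assumes `virsh net-info` prints an `Active:` field with `yes` or `no` when run under the C locale.

```shell
# Report the state of a libvirt network as "missing", "inactive", or "active".
# $1: network name (defaults to "default").
# Note: any net-info failure (including a connection error) is reported as
# "missing"; this sketch does not distinguish the two.
network_state() {
  net="${1:-default}"
  # LC_ALL=C keeps the "Active:" field name untranslated.
  if ! info="$(LC_ALL=C virsh net-info "$net" 2>/dev/null)"; then
    echo missing
    return 0
  fi
  # Succeed only if an "Active: yes" line is present.
  if printf '%s\n' "$info" | awk '$1 == "Active:" && $2 == "yes" { ok = 1 } END { exit !ok }'; then
    echo active
  else
    echo inactive
  fi
}
```

With `set -e` in effect, a closing line such as `[ "$(network_state default)" = active ]` would make the step fail early when the network is still missing or inactive, rather than leaving the failure to surface later as an SSH timeout.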

The following is the prior conversation context from the user's chat exploration (may be truncated):

User: Please find a solution for this failing job. Use the logs, job definition, and any referenced files where the failure occurred. Keep your response focused on the solution and include code suggestions when appropriate.

Assistant: The job is failing because the CoreOS TMT plan can’t SSH into the provisioned VM. The key clue is in the libvirt diagnostics:

  • virsh net-dhcp-leases default: Network not found: no network with matching name 'default'
  • Then TMT times out: “SSH connectivity check failed after 60 attempts”

So the VM never gets connectivity because the libvirt “default” NAT network is missing/not defined in the runner environment being used for this job.

Fix: ensure the libvirt default network exists (define + start it)

Your workflow already tries to start the network, but it only does so if it already exists:

if virsh net-info default >/dev/null 2>&1; then
  virsh net-start default
  ...
fi

That doesn’t help when the network is absent (which is what the logs show). Update the “Ensure libvirt default network is active” step to define it when missing, then start/autostart it.

Suggested workflow patch (in .github/workflows/ci.yml, test-coreos job)

Replace the existing “Ensure libvirt default network is active” step with:

- name: Ensure libvirt default network is present and active
  run: |
    set -euxo pipefail

    # Show what libvirt sees
    virsh net-list --all || true

    if ! virsh net-info default >/dev/null 2>&1; then
      cat > /tmp/default-net.xml <<'EOF'
    <network>
      <name>default</name>
      <forward mode='nat'/>
      <bridge name='virbr0' stp='on' delay='0'/>
      <ip address='192.168.122.1' netmask='255.255.255.0'>
        <dhcp>
          <range start='192.168.122.2' end='192.168.122.254'/>
        </dhcp>
      </ip>
    </network>
    EOF
      sudo virsh net-define /tmp/default-net.xml
    fi

    sudo virsh net-start default || true
    sudo virsh net-autostart default || true

    virsh net-list --all
    virsh net-dhcp-leases default || true

Notes:

  • This addresses the actual failure mode (“network not found”), not just the “network ina...

This pull request was created from Copilot chat.

Copilot AI changed the title from "[WIP] Update workflow in ci.yml to define default libvirt network" to "ci: fix missing libvirt default network in test-coreos job" Apr 23, 2026
Copilot AI requested a review from jmarrero April 23, 2026 01:58
@jmarrero
Contributor

Closing; @jeckersb is investigating this.

@jmarrero jmarrero closed this Apr 23, 2026