
net, tests, stuntime: Wait for VMI affinity reconciliation before migration #5347

Open
Anatw wants to merge 2 commits into RedHatQE:main from Anatw:stuntime_fix_affinity_issue

Conversation

@Anatw

Anatw commented Jun 23, 2026

Contributor
What this PR does / why we need it:

When a VM template's affinity is updated and a migration is triggered immediately afterward, the VM controller and the VMIM controller race: the VMIM controller may read the VMI before the VM controller has reconciled the template change, and it then creates the migration target pod with stale affinity rules. The VM ends up on the wrong node.

This PR adds wait_for_vmi_affinity() to BaseVirtualMachine, which polls the VMI until its affinity matches the VM template — ensuring the VM controller has reconciled before migration is triggered. All stuntime tests (L2 bridge + localnet) now call it after set_template_affinity.
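
A minimal sketch of the polling idea, not the PR's actual implementation: the helper below compares two affinity values repeatedly until they match or a timeout expires. The names `wait_for_matching_affinity`, `get_template_affinity`, and `get_vmi_affinity`, as well as the timeout and interval defaults, are assumptions made for illustration; the real `wait_for_vmi_affinity()` is a method on `BaseVirtualMachine` and reads the cluster objects itself.

```python
import time
from typing import Any, Callable


def wait_for_matching_affinity(
    get_template_affinity: Callable[[], Any],
    get_vmi_affinity: Callable[[], Any],
    timeout: float = 60.0,
    interval: float = 1.0,
) -> None:
    """Block until the VMI affinity equals the VM template affinity.

    Both getters are hypothetical stand-ins for reading the VM template
    affinity and the VMI spec.affinity from the cluster.
    """
    deadline = time.monotonic() + timeout
    while True:
        expected = get_template_affinity()
        actual = get_vmi_affinity()
        if actual == expected:
            return
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"VMI affinity {actual!r} did not converge to {expected!r} "
                f"within {timeout}s"
            )
        time.sleep(interval)


# Usage with in-memory stand-ins: the VMI "reconciles" on the third read.
template = {"podAntiAffinity": {"rule": "avoid-client-node"}}
vmi_reads = iter([None, None, template])
wait_for_matching_affinity(
    lambda: template, lambda: next(vmi_reads), timeout=5, interval=0.01
)
```

Re-reading the expected value on every iteration keeps the comparison valid even if the template changes again while the loop is waiting.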

Additionally, temporary post-migration affinity assertions are added (gated behind is_jira_open("CNV-90576")) to detect the race if it still occurs despite the wait. These assertions will be removed once CNV-90576 is resolved.
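
The gating described above can be pictured roughly as follows. This is a hedged sketch only: `assert_node_placement` and its parameters are hypothetical, and the `jira_open` flag stands in for the result of `is_jira_open("CNV-90576")`, whose real signature is not shown in this PR page.

```python
from typing import Optional


def assert_node_placement(
    migrated_vm_node: str,
    peer_vm_node: str,
    expect_same_node: bool,
    jira_open: bool,
) -> Optional[bool]:
    """Check post-migration placement only while the tracked bug is open.

    Returns None when the check is skipped and True when it ran and passed.
    """
    if not jira_open:
        # Bug resolved: the temporary assertion is no longer needed.
        return None
    on_same_node = migrated_vm_node == peer_vm_node
    assert on_same_node == expect_same_node, (
        f"VM landed on {migrated_vm_node}, peer is on {peer_vm_node}, "
        f"expected same node: {expect_same_node}"
    )
    return True


# Anti-affinity case: the migrated VM must not share a node with its peer.
assert_node_placement("node-b", "node-a", expect_same_node=False, jira_open=True)
```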

The first commit removes a redundant set_template_affinity call from test_server_migrates_between_non_client_nodes — the preceding test already sets the same affinity and @pytest.mark.incremental preserves it.

Which issue(s) this PR fixes:
Special notes for reviewer:
jira-ticket:

NONE

Summary by CodeRabbit

  • Tests
    • Enhanced migration stuntime tests to explicitly validate VM node placement based on affinity and anti-affinity rules before measuring performance metrics
    • Strengthened test assertions to verify VMs are scheduled on expected nodes according to their placement policies
    • Updates applied to both Linux bridge and OVN localnet migration test scenarios

@coderabbitai

coderabbitai Bot commented Jun 23, 2026

Contributor

Review Change Stack

Warning

Review limit reached

@Anatw, you've reached your PR review limit, so we couldn't start this review.

Next review available in: 51 minutes

Enable usage-based reviews in Billing to review now. Otherwise, wait until the next included review is available.
You're only billed for reviews past your plan's rate limits ($0.25/file).

How can I continue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

To avoid repeated limits, reduce automatic review volume by pausing incremental auto-reviews earlier, using label-based review opt-in, excluding WIP or generated PR titles, or requesting reviews manually when the PR is ready. If your team needs uninterrupted high-volume reviews, an organization admin can enable usage-based reviews.

How do review limits work?

CodeRabbit enforces per-developer PR review limits for each organization. Most developers receive the normal plan review availability.

For paid Pro and Pro+ PR reviews, CodeRabbit uses adaptive limits for sustained high-volume activity. When a developer's recent PR review activity reaches the 95th percentile or higher among CodeRabbit users, additional reviews become available more gradually as earlier reviews age out of the rolling window.

Please refer to the docs for additional details.

Review details
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 72fb3a9e-76a7-4275-b7cd-225c690c1f07

📥 Commits

Reviewing files that changed from the base of the PR and between d503c68 and 56a1e16.

📒 Files selected for processing (4)
  • libs/vm/vm.py
  • tests/network/l2_bridge/migration_stuntime/test_migration_stuntime.py
  • tests/network/libs/stuntime.py
  • tests/network/localnet/migration_stuntime/test_migration_stuntime.py
📝 Walkthrough

Walkthrough

This PR adds a VM helper that waits for VMI affinity to match the VM template, then updates L2 bridge and localnet migration stuntime tests to wait for that state, verify post-migration node placement, and measure stuntime afterward.

Changes

Affinity-aware migration stuntime checks

Layer (File): Summary

  • VMI affinity wait helper (libs/vm/vm.py): Adds BaseVirtualMachine.wait_for_vmi_affinity() with logging and timeout-based polling until the VMI spec.affinity matches the VM template affinity.
  • L2 bridge migration assertions (tests/network/l2_bridge/migration_stuntime/test_migration_stuntime.py): Updates migration scenarios to wait for affinity reconciliation, migrate, assert expected same-node or different-node placement, then measure stuntime; one test also adds the stuntime_server_vm fixture.
  • Localnet migration assertions (tests/network/localnet/migration_stuntime/test_migration_stuntime.py): Applies the same wait-then-migrate-then-placement-check sequence across localnet scenarios before stuntime checks, with some test signatures reformatted.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related issues

Possibly related PRs

Suggested reviewers

  • EdDev
  • orelmisan
  • frenzyfriday
  • nirdothan
  • azhivovk
  • yossisegev
  • dshchedr
  • rnetser

Caution

Pre-merge checks failed

Please resolve all errors before merging. Addressing warnings is optional.

  • Ignore

❌ Failed checks (1 error)

  • Stp Link Required: ❌ Error
    Explanation: New test file tests/storage/disk_preallocation/test_disk_preallocation.py has Jira reference but missing required # marker on that line (HIGH severity).
    Resolution: Add # comment to line 9: Jira: http://redhat.atlassian.net/browse/CNV-6008 #

✅ Passed checks (4 passed)

  • Linked Issues check: ✅ Passed. Check skipped because no linked issues were found for this pull request.
  • Out of Scope Changes check: ✅ Passed. Check skipped because no linked issues were found for this pull request.
  • Title check: ✅ Passed. The title is under 120 characters and accurately summarizes the main change: waiting for VMI affinity reconciliation before migration.
  • Description check: ✅ Passed. The PR description follows the required template and clearly explains the change, with only optional issue and reviewer-note sections left empty.
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Warning

Review ran into problems

🔥 Problems

Linked repositories: Your configuration references 1 linked repositories, but your current plan allows 0. Analyzed ``, skipped RedHatQE/openshift-virtualization-tests-design-docs.



Comment @coderabbitai help to get the list of available commands.

@openshift-virtualization-qe-bot-6


Report bugs in Issues

Welcome! 🎉

This pull request will be automatically processed with the following features:

🔄 Automatic Actions

  • Reviewer Assignment: Reviewers are automatically assigned based on the OWNERS file in the repository root
  • Size Labeling: PR size labels (XS, S, M, L, XL, XXL) are automatically applied based on changes
  • Issue Creation: A tracking issue is created for this PR and will be closed when the PR is merged or closed
  • Branch Labeling: Branch-specific labels are applied to track the target branch
  • Auto-verification: Auto-verified users have their PRs automatically marked as verified
  • Labels: Enabled categories: branch, can-be-merged, cherry-pick, has-conflicts, hold, needs-rebase, size, verified, wip

📋 Available Commands

PR Status Management

  • /wip - Mark PR as work in progress (adds WIP: prefix to title)
  • /wip cancel - Remove work in progress status
  • /hold - Block PR merging (approvers only)
  • /hold cancel - Unblock PR merging
  • /verified - Mark PR as verified
  • /verified cancel - Remove verification status
  • /reprocess - Trigger complete PR workflow reprocessing (useful if webhook failed or configuration changed)
  • /regenerate-welcome - Regenerate this welcome message
  • /security-override - Set security check runs to pass (maintainers only)
  • /security-override cancel - Re-run security checks

Review & Approval

  • /lgtm - Approve changes (looks good to me)
  • /approve - Approve PR (approvers only)
  • /assign-reviewers - Assign reviewers based on OWNERS file
  • /assign-reviewer @username - Assign specific reviewer
  • /check-can-merge - Check if PR meets merge requirements

Testing & Validation

  • /retest tox - Run Python test suite with tox
  • /retest build-container - Rebuild and test container image
  • /retest verify-bugs-are-open - verify-bugs-are-open
  • /retest all - Run all available tests

Container Operations

  • /build-and-push-container - Build and push container image (tagged with PR number)
    • Supports additional build arguments: /build-and-push-container --build-arg KEY=value

Cherry-pick Operations

  • /cherry-pick <branch> - Schedule cherry-pick to target branch when PR is merged
    • Multiple branches: /cherry-pick branch1 branch2 branch3
  • /cherry-pick-retry <branch> - Retry a failed cherry-pick (merged PRs only)

Branch Management

  • /rebase - Rebase this PR branch onto its base branch

Label Management

  • /<label-name> - Add a label to the PR
  • /<label-name> cancel - Remove a label from the PR

✅ Merge Requirements

This PR will be automatically approved when the following conditions are met:

  1. Approval: /approve from at least one approver
  2. LGTM Count: Minimum 2 /lgtm from reviewers
  3. Status Checks: All required status checks must pass
  4. No Blockers: No wip, hold, has-conflicts labels and PR must be mergeable (no conflicts)
  5. Verified: PR must be marked as verified

📊 Review Process

Approvers and Reviewers

Approvers:

  • EdDev
  • dshchedr
  • myakove
  • rnetser
  • vsibirsk

Reviewers:

  • Anatw
  • EdDev
  • RoniKishner
  • azhivovk
  • dshchedr
  • frenzyfriday
  • nirdothan
  • orelmisan
  • rnetser
  • servolkov
  • vsibirsk
  • yossisegev
Available Labels
  • hold
  • verified
  • wip
  • lgtm
  • approve
AI Features
  • Cherry-Pick Conflict Resolution: Enabled (claude/claude-opus-4-6-1m)
Security Checks
  • Suspicious Path Detection: Monitors paths: .claude/, .vscode/, .cursor/, .devcontainer/, .pi/, .github/workflows/, .github/actions/
  • Committer Identity Check: Verifies last committer matches PR author
  • Mandatory: Security checks block merge (use /security-override to bypass — maintainers only)

💡 Tips

  • WIP Status: Use /wip when your PR is not ready for review
  • Verification: The verified label is removed on new commits unless the push is detected as a clean rebase
  • Cherry-picking: Cherry-pick labels are processed when the PR is merged
  • Container Builds: Container images are automatically tagged with the PR number
  • Permission Levels: Some commands require approver permissions
  • Auto-verified Users: Certain users have automatic verification and merge privileges

For more information, please refer to the project documentation or contact the maintainers.

@Anatw

Anatw commented Jun 23, 2026

Contributor Author

/wip

@openshift-virtualization-qe-bot-4 changed the title from "net, tests, stuntime: Wait for VMI affinity reconciliation before migration" to "WIP: net, tests, stuntime: Wait for VMI affinity reconciliation before migration" on Jun 23, 2026
@Anatw

Anatw commented Jun 23, 2026

Contributor Author

/build-and-push-container

@openshift-virtualization-qe-bot-4


New container for quay.io/openshift-cnv/openshift-virtualization-tests:pr-5347 published

@Anatw

Anatw commented Jun 28, 2026

Contributor Author

/wip cancel

@openshift-virtualization-qe-bot-6 changed the title from "WIP: net, tests, stuntime: Wait for VMI affinity reconciliation before migration" to "net, tests, stuntime: Wait for VMI affinity reconciliation before migration" on Jun 28, 2026
@Anatw

Anatw commented Jun 28, 2026

Contributor Author

/build-and-push-container

@openshift-virtualization-qe-bot-4


New container for quay.io/openshift-cnv/openshift-virtualization-tests:pr-5347 published

@Anatw

Anatw commented Jun 29, 2026

Contributor Author

/verified

Tested on bm03-tlv2
OpenShift version: 4.22.0-rc.2
CNV version: 4.22.0
HCO image: brew.registry.redhat.io/rh-osbs/iib:1158778

openshift-virtualization-tests-runner/5829/
PYTEST_PARAMS: -s -o log_cli=true -m tier3 --jira --storage-class-matrix=ocs-storagecluster-ceph-rbd-virtualization tests/network/localnet/migration_stuntime/test_migration_stuntime.py tests/network/l2_bridge/migration_stuntime/test_migration_stuntime.py
==== 24 passed, 1 deselected, 185 warnings in 3289.27s (0:54:49) ====

@openshift-virtualization-qe-bot-3

Contributor

@coderabbitai

Test execution plan request details

CRITICAL: You MUST post an inline review comment on the first changed line of the first file.
The inline comment should contain the full Test Execution Plan (smoke decision, gating decision, and specific affected tests).
Do NOT submit a blocking review event (REQUEST_CHANGES/APPROVE).
Post a single inline PR comment on Files Changed (non-blocking COMMENT flow).

As an expert software testing engineer, analyze all modified files in this PR and create a targeted test execution plan.
You will post an inline review comment with the test execution plan on the first changed file.
If you fail to run or post a comment, retry.

Analysis Requirements:

  1. Examine code changes in each modified file

  2. Identify affected code paths, functions, and classes

  3. Analyze pytest-specific elements: fixtures (scope, dependencies), parametrization, markers, conftest changes

  4. Trace test dependencies through imports, shared utilities, fixture inheritance, fixture teardown, and yield from cleanup in conftest

  5. Detect new tests introduced in the PR

  6. Utilities and libs impact (when utilities/ or libs/ changes):
    You MUST use shell scripts (rg, git diff) to trace the full impact.
    Follow these sub-steps in order:

    6a. Identify modified symbols: For each changed file under utilities/ or libs/,
    list every modified function or method.
    Example: git diff HEAD~1 --unified=0 -- utilities/hco.py | grep '^[+-]def '

    6b. Find direct callers: Search tests and conftest for each symbol from 6a.
    Example: rg -l 'get_hco_version' tests/

    6c. Trace fixture teardown and cleanup: Find fixtures that reach
    the modified symbol through yield from or context-manager wrappers.
    Example: rg -l 'yield from.*enable_common_boot|def.*enable_common_boot' tests/

    6d. Trace same-file callers: In each changed file, find other functions
    whose body calls a modified symbol (including code after yield
    in @contextmanager helpers).
    Example: rg 'get_hco_version|enable_common_boot' utilities/hco.py

    6e. Expand transitively: If function A calls modified B, then
    tests/fixtures that call A are affected — even when the test body
    never imports B directly.

    Do NOT limit impact to tests that import the modified symbol only.

  7. Smoke test impact: Intersect the affected set from step 6 with smoke-marked tests.
    Run: rg -l '@pytest.mark.smoke' tests/
    VERIFY the above command returned actual file paths before concluding False.
    Set True if either condition is met:

    • a smoke-marked file appears in the affected set from 6b-6e, OR
    • any conftest.py in the smoke test's parent-directory hierarchy (up to repo root)
      imports or calls a modified utilities/libs symbol — including autouse fixtures
      that depend on modified functions. ALL tests in that directory and below are affected.
      Example check: for each smoke_file, scan dirname(smoke_file)/conftest.py,
      dirname(dirname(smoke_file))/conftest.py, etc. for modified symbol imports
      and autouse fixtures that depend on modified symbols.
  8. Gating test impact: Intersect the affected set from step 6 with gating-marked tests.
    Run: rg -l '@pytest.mark.gating' tests/
    Set True if a gating-marked file also appears in the affected set from 6b-6e.
    Utilities/libs changes often affect gating tests without affecting smoke tests.
    Do NOT stop analysis after concluding Run smoke tests: False.
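
The transitive expansion and marker intersection requested in the steps above can be sketched as a small graph walk. This is an illustration under stated assumptions, not the tooling the bot uses: the call graph, the symbol names, and the `affected_callers` helper are all hypothetical, and in practice the graph would have to be built from `rg` and `git diff` output.

```python
from collections import deque
from typing import Dict, Iterable, Set


def affected_callers(called_by: Dict[str, Set[str]], modified: Iterable[str]) -> Set[str]:
    """Return every symbol that reaches a modified symbol through any call chain.

    `called_by` maps a symbol to the set of symbols that call it directly.
    """
    affected: Set[str] = set()
    queue = deque(modified)
    while queue:
        symbol = queue.popleft()
        for caller in called_by.get(symbol, set()):
            if caller not in affected:
                affected.add(caller)
                queue.append(caller)
    return affected


# Hypothetical call graph: test_a -> fixture_vm -> wrapper -> modified_helper
called_by = {
    "modified_helper": {"wrapper"},
    "wrapper": {"fixture_vm"},
    "fixture_vm": {"test_a"},
    "other_helper": {"test_b"},
}
impacted = affected_callers(called_by, ["modified_helper"])

# Intersect the affected set with each marker set separately.
smoke_tests = {"test_b"}
gating_tests = {"test_a"}
run_smoke = bool(impacted & smoke_tests)
run_gating = bool(impacted & gating_tests)
print(run_smoke, run_gating)  # prints: False True
```

Here `test_a` never references `modified_helper` directly, yet it is still flagged because its fixture reaches the helper through `wrapper`, which is the case the transitive step is meant to catch.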

Output rules:
Do NOT include analysis step numbers (1-8) in your visible output.

Your deliverable:
Your inline informational comment will be based on the following requirements:

Test Execution Plan

  • Run smoke tests: True / False — If True, state the dependency path (test → fixture → changed symbol). True ONLY with a verified path.
  • Run gating tests: True / False — If True, state the dependency path. True if any gating-marked test is in the affected set.
  • Affected tests to run (required when utilities/, libs/, or shared conftest changes — list concrete paths even when smoke is False)

Use these formats:

  • path/to/test_file.py - When the entire test file needs verification
  • path/to/test_file.py::TestClass::test_method - When specific test(s) needed
  • path/to/test_file.py::test_function - When specific test(s) needed
  • -m marker - When a marker covers multiple affected tests (e.g. -m gating only if ALL gating tests in scope need run)
  • Tag each listed test or group with its marker when not obvious, e.g. (gating) or (smoke)

Real test commands (MANDATORY when changes affect session/runtime code):

When the affected code runs at session/collection time (conftest fixtures, pytest plugins,
config hooks, session-scoped setup) or modifies runtime behavior that unit tests mock away,
you MUST include concrete pytest commands the PR author must run on a real cluster
to verify the change works end-to-end. Include:

  • A command for the error/fix path (the scenario the PR fixes)
  • A command for the happy path (regression: the normal case still works)
  • Use lightweight tests (e.g., --collect-only for startup failures,
    a single small test for runtime behavior)
    If the PR only changes test logic (not utilities/libs/conftest), the affected test
    paths themselves serve as the real test commands — no separate section needed.

Example output for a session-startup fix:

**Real tests (cluster required)**
Error path (the fix):
`pytest tests/storage/.../test_foo.py --storage-class-matrix=nonexistent-sc --collect-only`
Expected: ValueError with clear message, not IndexError

Happy path (regression):
`pytest tests/storage/.../test_foo.py --storage-class-matrix=<valid-sc> -k test_bar`
Expected: session starts normally

Guidelines:

  • Include tests affected directly OR via fixture setup/teardown, yield from cleanup, or transitive utility call chains (caller calls modified helper)
  • Use a full file path only if ALL tests in that file require verification
  • Use file path + test name when only specific tests use an affected fixture or utility wrapper (preferred for partial file impact)
  • If a test marker can cover multiple files/tests, provide the marker
  • Balance coverage vs over-testing - Keep descriptions minimal
  • Example: if leaf helper foo() changes, include tests whose fixture teardown calls wrapper bar() where bar() calls foo(), even when the test body only imports an unrelated symbol from the same utilities module

Hardware-Related Checks (SR-IOV, GPU, DPDK):

When PR modifies fixtures for hardware-specific resources:

  • Collection Safety: Fixtures MUST have existence checks (return None when hardware unavailable)
  • Test Plan: MUST verify both WITH and WITHOUT hardware:
    • Run affected tests on cluster WITH hardware
    • Verify collection succeeds on cluster WITHOUT hardware

CRITICAL WORKFLOW COMPLETION RULES:

When responding to this test execution plan request, you MUST follow these rules EXACTLY:

  1. YOUR ONLY DELIVERABLE: Post one non-blocking inline comment containing the test execution plan on the first changed line
  2. THEN STOP IMMEDIATELY - Do NOT generate any additional response
  3. FALLBACK ONLY: If inline comment API calls fail after retrying, post as a regular PR comment
  4. SILENCE = SUCCESS: After successfully submitting the review, your task is complete. No confirmation needed.

ABSOLUTE PROHIBITIONS (violating these creates empty/meaningless reviews):

  • ❌ Do NOT post acknowledgment messages like "Test execution plan posted", "Review posted successfully", "I've successfully posted"
  • ❌ Do NOT mention review IDs, URLs, or confirmation of posting in the PR thread
  • ❌ Do NOT add any follow-up comments after submitting the review
  • ❌ Do NOT reply to confirm task completion
  • ❌ Do NOT explain what you did - just do it and stop

Remember: The pull request review is visible to users. Additional comments are redundant noise.

CRITICAL — Verification rules for smoke and gating decisions:
- For each True decision, you must have traced a concrete path: test file → fixture → utility function → modified symbol.
- Do NOT assume impact based on scope, file proximity, or infrastructure-sounding names.
Only set True when you trace a concrete path through steps 6a-6e, then 7 or 8.
WRONG: "utilities/hco.py changed → run smoke" (no traced dependency path)
WRONG: "Run smoke tests: False" with no affected-test list when gating tests use the changed utility
RIGHT: "Run smoke tests: False, Run gating tests: True" plus tests/.../test_foo.py::TestBar::test_baz (gating)
RIGHT: "Smoke test X uses fixture Y → wrapper Z → changed function W in utilities/hco.py"

@coderabbitai

coderabbitai Bot commented Jun 29, 2026

Contributor

Tip

For best results, initiate chat on the files or code changes.

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

repo="RedHatQE/openshift-virtualization-tests"
pr="5347"

echo "== PR files =="
gh pr view "$pr" --repo "$repo" --json files --jq '.files[].path'

echo
echo "== First changed file and patch head =="
first_file=$(gh pr view "$pr" --repo "$repo" --json files --jq '.files[0].path')
echo "$first_file"
gh pr diff "$pr" --repo "$repo" --patch | awk -v f="$first_file" '
  $0 ~ ("^diff --git a/" f " b/" f "$") {infile=1; print; next}
  infile && /^diff --git / {exit}
  infile {print}
' | sed -n '1,120p'

echo
echo "== Modified symbols in libs/utilities =="
git diff --unified=0 HEAD~1 -- libs utilities || true

echo
echo "== Outline of changed library file =="
ast-grep outline libs/vm/vm.py --view expanded | sed -n '1,220p'

echo
echo "== Direct callers / imports of wait_for_vmi_affinity in tests and conftest =="
rg -n -C2 'wait_for_vmi_affinity|set_template_affinity' tests

echo
echo "== Smoke-marked tests =="
rg -n '@pytest\.mark\.smoke' tests || true

echo
echo "== Gating-marked tests =="
rg -n '@pytest\.mark\.gating' tests || true

echo
echo "== Relevant changed test files with markers and test definitions =="
for f in tests/network/l2_bridge/migration_stuntime/test_migration_stuntime.py tests/network/localnet/migration_stuntime/test_migration_stuntime.py; do
  echo "--- $f ---"
  sed -n '1,260p' "$f" | nl -ba | sed -n '1,260p'
done

Length of output: 3439


🏁 Script executed:

#!/bin/bash
set -euo pipefail

repo="RedHatQE/openshift-virtualization-tests"
pr="5347"

echo "== PR head sha =="
gh pr view "$pr" --repo "$repo" --json headRefOid --jq '.headRefOid'

echo
echo "== Changed files and additions/deletions =="
gh pr view "$pr" --repo "$repo" --json files --jq '.files[] | [.path, .additions, .deletions] | @tsv'

echo
echo "== Current line numbers around first changed hunk in libs/vm/vm.py =="
nl -ba libs/vm/vm.py | sed -n '1,190p'

echo
echo "== Diff for libs/vm/vm.py =="
git diff --unified=3 HEAD~1 -- libs/vm/vm.py | sed -n '1,220p'

echo
echo "== Diff for tests/network/libs/stuntime.py =="
git diff --unified=3 HEAD~1 -- tests/network/libs/stuntime.py | sed -n '1,220p'

echo
echo "== Usages of wait_for_vmi_affinity =="
rg -n -C2 '\bwait_for_vmi_affinity\s*\(' tests libs

echo
echo "== Usages of measure_stuntime and migrate_vm_and_verify =="
rg -n -C2 '\bmeasure_stuntime\s*\(|\bmigrate_vm_and_verify\s*\(' tests/network

echo
echo "== Markers in changed test files =="
for f in tests/network/l2_bridge/migration_stuntime/test_migration_stuntime.py tests/network/localnet/migration_stuntime/test_migration_stuntime.py; do
  echo "--- $f markers ---"
  rg -n '@pytest\.mark\.(smoke|gating|tier[0-9]+|polarion|parametrize)' "$f" || true
done

echo
echo "== Callers of tests/network/libs/stuntime.py helpers =="
rg -n -C2 'from tests\.network\.libs\.stuntime import|import tests\.network\.libs\.stuntime|measure_stuntime|verify_ping_stability|continuous_ping' tests | sed -n '1,260p'

echo
echo "== Smoke-marked file paths =="
rg -l '@pytest\.mark\.smoke' tests | sed -n '1,120p'

echo
echo "== Gating-marked file paths =="
rg -l '@pytest\.mark\.gating' tests | sed -n '1,120p'

Length of output: 2441


🏁 Script executed:

#!/bin/bash
set -euo pipefail

repo="RedHatQE/openshift-virtualization-tests"
pr="5347"

echo "== libs/vm/vm.py around first hunk =="
cat -n libs/vm/vm.py | sed -n '1,180p'

echo
echo "== Diff libs/vm/vm.py =="
git diff --unified=3 HEAD~1 -- libs/vm/vm.py | sed -n '1,220p'

echo
echo "== Diff tests/network/libs/stuntime.py =="
git diff --unified=3 HEAD~1 -- tests/network/libs/stuntime.py | sed -n '1,220p'

echo
echo "== Diff changed test files =="
for f in tests/network/l2_bridge/migration_stuntime/test_migration_stuntime.py tests/network/localnet/migration_stuntime/test_migration_stuntime.py; do
  echo "--- $f ---"
  git diff --unified=3 HEAD~1 -- "$f" | sed -n '1,260p'
done

echo
echo "== wait_for_vmi_affinity callers =="
rg -n -C2 '\bwait_for_vmi_affinity\s*\(' tests libs || true

echo
echo "== measure_stuntime callers =="
rg -n -C2 '\bmeasure_stuntime\s*\(' tests || true

echo
echo "== migration_stuntime test definitions and markers =="
for f in tests/network/l2_bridge/migration_stuntime/test_migration_stuntime.py tests/network/localnet/migration_stuntime/test_migration_stuntime.py; do
  echo "--- $f markers ---"
  rg -n '@pytest\.mark\.(smoke|gating|tier[0-9]+|polarion|parametrize)' "$f" || true
  echo "--- $f test defs ---"
  rg -n '^\s*def test_' "$f" || true
done

echo
echo "== smoke-marked files =="
rg -l '@pytest\.mark\.smoke' tests || true

echo
echo "== gating-marked files =="
rg -l '@pytest\.mark\.gating' tests || true

echo
echo "== parent conftest references to modified symbol =="
rg -n -C2 -g 'conftest.py' 'wait_for_vmi_affinity|set_template_affinity' tests || true

echo
echo "== First changed line in first file from PR patch =="
gh api "repos/$repo/pulls/$pr/files?per_page=100" --jq '.[] | select(.filename=="libs/vm/vm.py") | .patch' \
| python - <<'PY'
import sys,re
patch=sys.stdin.read().splitlines()
new_line=None
cur=None
for line in patch:
    m=re.match(r'^@@ -\d+(?:,\d+)? \+(\d+)(?:,\d+)? @@', line)
    if m:
        cur=int(m.group(1))
        continue
    if cur is None:
        continue
    if line.startswith('+') and not line.startswith('+++'):
        new_line=cur
        print(new_line)
        break
    if line.startswith('-') and not line.startswith('---'):
        continue
    cur+=1
PY

Length of output: 42844
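
The inline Python in the script above locates the first added line of a patch so that the review comment can be anchored to it. A standalone restatement as a testable function might look like the sketch below; the function name `first_added_line` and the sample patch are assumptions for illustration, and the logic mirrors the heredoc rather than any official GitHub tooling.

```python
import re
from typing import Optional

# Matches a unified-diff hunk header and captures the new-file start line.
HUNK_HEADER = re.compile(r"^@@ -\d+(?:,\d+)? \+(\d+)(?:,\d+)? @@")


def first_added_line(patch: str) -> Optional[int]:
    """Return the new-file line number of the first added line, or None."""
    current = None
    for line in patch.splitlines():
        match = HUNK_HEADER.match(line)
        if match:
            current = int(match.group(1))
            continue
        if current is None:
            continue  # still before the first hunk
        if line.startswith("+") and not line.startswith("+++"):
            return current
        if line.startswith("-") and not line.startswith("---"):
            continue  # removed lines do not exist in the new file
        current += 1
    return None


# Hypothetical patch: one context line, one removal, then the first addition.
sample_patch = "@@ -10,3 +10,4 @@ class X:\n context\n-old\n+new\n context"
print(first_added_line(sample_patch))  # prints: 11
```

Removed lines are skipped without advancing the counter because they occupy no line in the new file, which is why the addition in the sample lands on line 11 rather than 12.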


🏁 Script executed:

#!/bin/bash
set -euo pipefail

repo="RedHatQE/openshift-virtualization-tests"
pr="5347"
commit_id="c61964437a716526e0309d0d8e6d8481a50a28cb"
path="libs/vm/vm.py"
line="13"

body="$(cat <<'EOF'
**Test Execution Plan**

- **Run smoke tests: False** — no verified dependency path from any `@pytest.mark.smoke` test or parent `conftest.py` to `libs/vm/vm.py::BaseVirtualMachine.wait_for_vmi_affinity()` or `tests/network/libs/stuntime.py::assert_affinity_after_migration()`.
- **Run gating tests: False** — no verified dependency path from any `@pytest.mark.gating` test to the changed symbols.

- **Affected tests to run**
  - `tests/network/l2_bridge/migration_stuntime/test_migration_stuntime.py::TestMigrationStuntime::test_client_migrates_off_server_node` (tier3)
  - `tests/network/l2_bridge/migration_stuntime/test_migration_stuntime.py::TestMigrationStuntime::test_client_migrates_to_server_node` (tier3)
  - `tests/network/l2_bridge/migration_stuntime/test_migration_stuntime.py::TestMigrationStuntime::test_server_migrates_off_client_node` (tier3)
  - `tests/network/l2_bridge/migration_stuntime/test_migration_stuntime.py::TestMigrationStuntime::test_server_migrates_to_client_node` (tier3)
  - `tests/network/localnet/migration_stuntime/test_migration_stuntime.py::TestMigrationStuntime::test_client_migrates_off_server_node` (tier3)
  - `tests/network/localnet/migration_stuntime/test_migration_stuntime.py::TestMigrationStuntime::test_client_migrates_to_server_node` (tier3)
  - `tests/network/localnet/migration_stuntime/test_migration_stuntime.py::TestMigrationStuntime::test_server_migrates_off_client_node` (tier3)
  - `tests/network/localnet/migration_stuntime/test_migration_stuntime.py::TestMigrationStuntime::test_server_migrates_to_client_node` (tier3)

These are the concrete affected paths because the changed tests now call `BaseVirtualMachine.wait_for_vmi_affinity()` directly after `set_template_affinity()`, and they conditionally call `tests/network/libs/stuntime.py::assert_affinity_after_migration()` after `migrate_vm_and_verify()`.

**Real tests (cluster required)**

Error path (the fix):
`pytest -s -o log_cli=true -m tier3 --jira tests/network/localnet/migration_stuntime/test_migration_stuntime.py -k 'test_client_migrates_off_server_node or test_server_migrates_off_client_node'`
Expected: after `set_template_affinity()`, the VMI affinity converges before migration, the migrated VM lands with matching anti-affinity rules, and stuntime stays within `STUNTIME_THRESHOLD_SECONDS`.

Happy path (regression):
`pytest -s -o log_cli=true -m tier3 --jira tests/network/l2_bridge/migration_stuntime/test_migration_stuntime.py -k 'test_client_migrates_to_server_node or test_server_migrates_to_client_node'`
Expected: affinity-targeted migrations still succeed, the target pod reflects the expected affinity type when `CNV-90576` is open, and stuntime stays within `STUNTIME_THRESHOLD_SECONDS`.
EOF
)"

post_inline() {
  gh api \
    --method POST \
    "repos/$repo/pulls/$pr/comments" \
    -f body="$body" \
    -f commit_id="$commit_id" \
    -f path="$path" \
    -F line="$line" \
    -f side="RIGHT" >/dev/null
}

post_fallback() {
  gh pr comment "$pr" --repo "$repo" --body "$body" >/dev/null
}

for attempt in 1 2 3; do
  if post_inline; then
    exit 0
  fi
  sleep 2
done

post_fallback

Length of output: 3857


🏁 Script executed:

#!/bin/bash
set -euo pipefail

repo="RedHatQE/openshift-virtualization-tests"
pr="5347"
commit_id="c61964437a716526e0309d0d8e6d8481a50a28cb"
path="libs/vm/vm.py"
line="13"

body="$(cat <<'EOF'
**Test Execution Plan**

- **Run smoke tests: False** — no verified dependency path from any `@pytest.mark.smoke` test or parent `conftest.py` to `libs/vm/vm.py::BaseVirtualMachine.wait_for_vmi_affinity()` or `tests/network/libs/stuntime.py::assert_affinity_after_migration()`.
- **Run gating tests: False** — no verified dependency path from any `@pytest.mark.gating` test to the changed symbols.

- **Affected tests to run**
  - `tests/network/l2_bridge/migration_stuntime/test_migration_stuntime.py::TestMigrationStuntime::test_client_migrates_off_server_node` (tier3)
  - `tests/network/l2_bridge/migration_stuntime/test_migration_stuntime.py::TestMigrationStuntime::test_client_migrates_to_server_node` (tier3)
  - `tests/network/l2_bridge/migration_stuntime/test_migration_stuntime.py::TestMigrationStuntime::test_server_migrates_off_client_node` (tier3)
  - `tests/network/l2_bridge/migration_stuntime/test_migration_stuntime.py::TestMigrationStuntime::test_server_migrates_to_client_node` (tier3)
  - `tests/network/localnet/migration_stuntime/test_migration_stuntime.py::TestMigrationStuntime::test_client_migrates_off_server_node` (tier3)
  - `tests/network/localnet/migration_stuntime/test_migration_stuntime.py::TestMigrationStuntime::test_client_migrates_to_server_node` (tier3)
  - `tests/network/localnet/migration_stuntime/test_migration_stuntime.py::TestMigrationStuntime::test_server_migrates_off_client_node` (tier3)
  - `tests/network/localnet/migration_stuntime/test_migration_stuntime.py::TestMigrationStuntime::test_server_migrates_to_client_node` (tier3)

These are the concrete affected paths because the changed tests now call `BaseVirtualMachine.wait_for_vmi_affinity()` directly after `set_template_affinity()`, and they conditionally call `tests/network/libs/stuntime.py::assert_affinity_after_migration()` after `migrate_vm_and_verify()`.

**Real tests (cluster required)**

Error path (the fix):
`pytest -s -o log_cli=true -m tier3 --jira tests/network/localnet/migration_stuntime/test_migration_stuntime.py -k 'test_client_migrates_off_server_node or test_server_migrates_off_client_node'`
Expected: after `set_template_affinity()`, the VMI affinity converges before migration, the migrated VM lands with matching anti-affinity rules, and stuntime stays within `STUNTIME_THRESHOLD_SECONDS`.

Happy path (regression):
`pytest -s -o log_cli=true -m tier3 --jira tests/network/l2_bridge/migration_stuntime/test_migration_stuntime.py -k 'test_client_migrates_to_server_node or test_server_migrates_to_client_node'`
Expected: affinity-targeted migrations still succeed, the target pod reflects the expected affinity type when `CNV-90576` is open, and stuntime stays within `STUNTIME_THRESHOLD_SECONDS`.
EOF
)"

echo "== Authenticated viewer =="
gh api user --jq '.login'

echo
echo "== Repo permissions =="
gh api "repos/$repo" --jq '{permissions: .permissions, private: .private, full_name: .full_name}'

echo
echo "== Attempt inline review comment via REST =="
set +e
gh api \
  --method POST \
  "repos/$repo/pulls/$pr/comments" \
  -f body="$body" \
  -f commit_id="$commit_id" \
  -f path="$path" \
  -F line="$line" \
  -f side="RIGHT"
rc_inline=$?
set -e
echo "inline_rc=$rc_inline"

echo
echo "== Attempt fallback PR issue comment via REST =="
set +e
gh api \
  --method POST \
  "repos/$repo/issues/$pr/comments" \
  -f body="$body"
rc_issue=$?
set -e
echo "issue_comment_rc=$rc_issue"

exit 0

Length of output: 4168


@openshift-virtualization-qe-bot-3

CRITICAL: I could not complete the required inline PR comment workflow because the GitHub integration has no comment-write access on this PR. All attempts failed with HTTP 403: Resource not accessible by integration, including:

  • inline review comment API
  • fallback PR/issue comment API
  • authenticated user probe

Because the blocker is permissions, not analysis quality, here is the exact comment body to post on the first changed line of libs/vm/vm.py:

Test Execution Plan

  • Run smoke tests: False — no verified dependency path from any @pytest.mark.smoke test or parent conftest.py to libs/vm/vm.py::BaseVirtualMachine.wait_for_vmi_affinity() or tests/network/libs/stuntime.py::assert_affinity_after_migration().

  • Run gating tests: False — no verified dependency path from any @pytest.mark.gating test to the changed symbols.

  • Affected tests to run

    • tests/network/l2_bridge/migration_stuntime/test_migration_stuntime.py::TestMigrationStuntime::test_client_migrates_off_server_node (tier3)
    • tests/network/l2_bridge/migration_stuntime/test_migration_stuntime.py::TestMigrationStuntime::test_client_migrates_to_server_node (tier3)
    • tests/network/l2_bridge/migration_stuntime/test_migration_stuntime.py::TestMigrationStuntime::test_server_migrates_off_client_node (tier3)
    • tests/network/l2_bridge/migration_stuntime/test_migration_stuntime.py::TestMigrationStuntime::test_server_migrates_to_client_node (tier3)
    • tests/network/localnet/migration_stuntime/test_migration_stuntime.py::TestMigrationStuntime::test_client_migrates_off_server_node (tier3)
    • tests/network/localnet/migration_stuntime/test_migration_stuntime.py::TestMigrationStuntime::test_client_migrates_to_server_node (tier3)
    • tests/network/localnet/migration_stuntime/test_migration_stuntime.py::TestMigrationStuntime::test_server_migrates_off_client_node (tier3)
    • tests/network/localnet/migration_stuntime/test_migration_stuntime.py::TestMigrationStuntime::test_server_migrates_to_client_node (tier3)

These are the concrete affected paths because the changed tests now call BaseVirtualMachine.wait_for_vmi_affinity() directly after set_template_affinity(), and they conditionally call tests/network/libs/stuntime.py::assert_affinity_after_migration() after migrate_vm_and_verify().

Real tests (cluster required)

Error path (the fix):
pytest -s -o log_cli=true -m tier3 --jira tests/network/localnet/migration_stuntime/test_migration_stuntime.py -k 'test_client_migrates_off_server_node or test_server_migrates_off_client_node'
Expected: after set_template_affinity(), the VMI affinity converges before migration, the migrated VM lands with matching anti-affinity rules, and stuntime stays within STUNTIME_THRESHOLD_SECONDS.

Happy path (regression):
pytest -s -o log_cli=true -m tier3 --jira tests/network/l2_bridge/migration_stuntime/test_migration_stuntime.py -k 'test_client_migrates_to_server_node or test_server_migrates_to_client_node'
Expected: affinity-targeted migrations still succeed, the target pod reflects the expected affinity type when CNV-90576 is open, and stuntime stays within STUNTIME_THRESHOLD_SECONDS.

Comment thread tests/network/libs/stuntime.py Outdated
Comment on lines +17 to +18
_POD_AFFINITY_TYPE: Final[str] = "podAffinity"
_POD_ANTI_AFFINITY_TYPE: Final[str] = "podAntiAffinity"

Contributor

These 2 are imported outside this module, please remove the underscore prefix.

Contributor Author

Ack.

virt-controller and VMIM controller reconcile independently — if
migration is triggered before the VM controller propagates the template
affinity to the VMI, the target pod is created with stale scheduling
rules.

Wait for the VMI to reflect the updated affinity before migrating, and
assert post-migration node placement matches the expected affinity.

Assisted-by: Claude <noreply@anthropic.com>
Signed-off-by: Anat Wax <awax@redhat.com>
@Anatw Anatw force-pushed the stuntime_fix_affinity_issue branch from c619644 to 56a1e16 Compare June 29, 2026 07:23
@Anatw

Anatw commented Jun 29, 2026

Contributor Author

Change: Remove the leading underscore from the constants.

@yossisegev yossisegev left a comment

Contributor

Thank you Anat

Comment thread libs/vm/vm.py
"""
template_affinity = self._spec.template.spec.affinity
expected_affinity = (
    asdict(obj=template_affinity, dict_factory=self._filter_out_none_values) if template_affinity else None

Contributor

Do we expect affinity=None in some VMs? Don't we always use a default template?

Contributor Author

Yes, not all VMs have a default affinity in their template. For example, the stuntime server VM is created without any affinity; it is only added during the tests.
set_template_affinity accepts Affinity | None (None clears affinity), so wait_for_vmi_affinity handles both cases to stay consistent with that signature.
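
The None handling described above can be illustrated with a minimal sketch. The `Affinity` and `PodAffinityTerm` dataclasses and both function names are hypothetical stand-ins, not the repository's actual types; only the `asdict(..., dict_factory=...)` pattern is taken from the snippet under review.

```python
from __future__ import annotations

from dataclasses import asdict, dataclass
from typing import Any, Optional


@dataclass
class PodAffinityTerm:  # hypothetical stand-in for the real spec type
    topologyKey: str
    namespaces: Optional[list] = None


@dataclass
class Affinity:  # hypothetical stand-in for the real spec type
    podAffinity: Optional[PodAffinityTerm] = None
    podAntiAffinity: Optional[PodAffinityTerm] = None


def filter_out_none_values(pairs: list) -> dict:
    """dict_factory for asdict(): drop every field whose value is None."""
    return {key: value for key, value in pairs if value is not None}


def expected_affinity(template_affinity: Optional[Affinity]) -> Optional[dict]:
    """Return the affinity as a plain dict, or None when the template has none."""
    if not template_affinity:
        return None
    return asdict(template_affinity, dict_factory=filter_out_none_values)
```

With this shape, a template without affinity yields `None`, so a caller can compare it directly against a VMI that has no affinity set.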

Comment thread libs/vm/vm.py
expected_affinity = (
    asdict(obj=template_affinity, dict_factory=self._filter_out_none_values) if template_affinity else None
)
for sample in TimeoutSampler(

Contributor

Consider using a helper with @retry
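
One possible reading of this suggestion, as a rough sketch only: the `retry` decorator below is a hypothetical stdlib stand-in written for illustration, not the helper used in the repository, and `affinity_matches` and `read_vmi_affinity` are assumed names.

```python
import functools
import time
from typing import Any, Callable, Optional


def retry(wait_timeout: float, sleep: float) -> Callable:
    """Hypothetical stand-in: call the function again until it returns a truthy value."""

    def decorator(func: Callable) -> Callable:
        @functools.wraps(func)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            deadline = time.monotonic() + wait_timeout
            while True:
                result = func(*args, **kwargs)
                if result:
                    return result
                if time.monotonic() >= deadline:
                    raise TimeoutError(f"{func.__name__} did not succeed within {wait_timeout}s")
                time.sleep(sleep)

        return wrapper

    return decorator


@retry(wait_timeout=2.0, sleep=0.01)
def affinity_matches(read_vmi_affinity: Callable[[], Optional[dict]], expected: Optional[dict]) -> bool:
    """Single check; the decorator owns the polling loop and the timeout."""
    return read_vmi_affinity() == expected
```

The helper returns a bool rather than the affinity itself, so the `expected is None` case still counts as success once the VMI also reports no affinity.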

Comment on lines +164 to +168
has_stuntime_rules = any(
    expr.get("key", "").startswith("stuntime.")
    for rule in pod_affinity[stale_type].get("requiredDuringSchedulingIgnoredDuringExecution", [])
    for expr in rule.get("labelSelector", {}).get("matchExpressions", [])
)

Contributor

The rules are a bit hard to read.
Consider splitting the expression into intermediate variables.
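
A sketch of what such a split might look like, assuming the same dict layout as the snippet under review. The function wrapper and the two constant names are hypothetical; the behavior is intended to match the original `any(...)` expression.

```python
from typing import Any

# Hypothetical constant names introduced for readability.
REQUIRED_RULES_KEY = "requiredDuringSchedulingIgnoredDuringExecution"
STUNTIME_LABEL_PREFIX = "stuntime."


def has_stuntime_rules(pod_affinity: dict, stale_type: str) -> bool:
    """Return True if any required rule of the given type selects a stuntime label."""
    required_rules: list = pod_affinity[stale_type].get(REQUIRED_RULES_KEY, [])
    match_expressions: list = [
        expression
        for rule in required_rules
        for expression in rule.get("labelSelector", {}).get("matchExpressions", [])
    ]
    label_keys: list = [expression.get("key", "") for expression in match_expressions]
    return any(key.startswith(STUNTIME_LABEL_PREFIX) for key in label_keys)
```

Each intermediate name states one step (rules, then expressions, then keys), which keeps the final check to a single line.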

@openshift-virtualization-qe-bot-3

Contributor

/retest all

Auto-triggered: Files in this PR were modified by merged PR #5285.

Overlapping files

libs/vm/vm.py

9 participants