Add webhook health precheck to cluster sanity checks #3

Closed
rnetser wants to merge 702 commits into cnv-4.18 from webhook-sanity-cnv-4.20
Conversation

rnetser (Owner) commented Feb 8, 2026

Summary

  • Add webhook health check functions to verify all webhook services in the HCO namespace have available endpoints
  • Add dry-run VM creation test to validate API and webhook functionality
  • Add --cluster-sanity-skip-webhook-check pytest option to skip webhook checks when needed
  • Add test_webhook_endpoints_health and test_vm_creation_capability tests to cluster_health_check suite
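The endpoint check in the first bullet can be pictured roughly as follows. This is a minimal sketch, not the code added to utilities/infra.py: it assumes webhook services are found through the `clientConfig.service` references of webhook configurations, and it takes plain dicts shaped like Kubernetes `Endpoints` objects instead of calling a cluster. The function names `discover_webhook_services` and `services_without_ready_endpoints` are hypothetical.

```python
from __future__ import annotations

from typing import Iterable


def discover_webhook_services(webhook_configs: Iterable[dict], namespace: str) -> set[str]:
    """Collect names of services referenced by webhooks in the given namespace."""
    services = set()
    for config in webhook_configs:
        for webhook in config.get("webhooks", []):
            service = webhook.get("clientConfig", {}).get("service")
            # URL-based webhooks carry no service reference and are skipped.
            if service and service.get("namespace") == namespace:
                services.add(service["name"])
    return services


def services_without_ready_endpoints(
    services: Iterable[str], endpoints_by_name: dict[str, dict]
) -> list[str]:
    """Return the services that have no ready endpoint address."""
    unhealthy = []
    for name in sorted(services):
        subsets = endpoints_by_name.get(name, {}).get("subsets") or []
        # Only "addresses" holds ready endpoints; "notReadyAddresses" does not count.
        if not any(subset.get("addresses") for subset in subsets):
            unhealthy.append(name)
    return unhealthy
```

A sanity check built on this would fail when the returned list is non-empty, and report the service names so the broken webhook is easy to locate.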

Changes

  • utilities/infra.py: Added webhook health check functions (check_webhook_endpoints_health, check_vm_creation_capability, _discover_webhook_services)
  • conftest.py: Added --cluster-sanity-skip-webhook-check option
  • tests/after_cluster_deploy_sanity/test_after_cluster_deploy_sanity.py: Added webhook health tests
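For readers unfamiliar with how such a flag is wired, a conftest.py hook for the option could look like the sketch below. Only the option name comes from this PR; the `store_true` action, the help text, and the `webhook_check_enabled` helper are assumptions made for illustration, and the real conftest.py may register the option differently (for example inside an option group).

```python
SKIP_WEBHOOK_CHECK_OPTION = "--cluster-sanity-skip-webhook-check"


def pytest_addoption(parser):
    """Register the skip flag; pytest calls this hook when it loads conftest.py."""
    parser.addoption(
        SKIP_WEBHOOK_CHECK_OPTION,
        action="store_true",
        default=False,
        help="Skip webhook health checks during cluster sanity",  # assumed wording
    )


def webhook_check_enabled(config) -> bool:
    """Return True unless the skip flag was passed on the command line."""
    return not config.getoption(SKIP_WEBHOOK_CHECK_OPTION)
```

With this in place, `pytest tests/after_cluster_deploy_sanity/ -m cluster_health_check --cluster-sanity-skip-webhook-check` would run the suite while the helper reports the webhook checks as disabled.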

Test plan

  • Run pytest tests/after_cluster_deploy_sanity/ -m cluster_health_check to verify webhook tests

Backport of RedHatQE#3573 and RedHatQE#3690

dshchedr and others added 30 commits August 25, 2025 09:38
Signed-off-by: Sibo Wang <siwang@redhat.com>
New runbooks for new alerts have not been merged downstream yet, which makes the lanes unstable. This test is quarantined until those PRs are merged downstream.
For Closed Loop Automation, we need to confirm that the guest agent is online after the VM reboots.
Signed-off-by: Sibo Wang <siwang@redhat.com>
Removed alerts that were tested on T2 and
should be on T1

Co-authored-by: vsibirsk <57763370+vsibirsk@users.noreply.github.com>
https://issues.redhat.com/browse/CNV-64433 bug fixed.
Removing the skip.

Signed-off-by: Harel Meir <hmeir@redhat.com>
…ttp server. (RedHatQE#1847)

Updated `QCOW2_IMG` for s390x from
`Fedora-Cloud-Base-Generic-41-1.4.s390x.qcow2` to `Fedora-qcow2.img`
to match the format used on x86 and the image name in the internal
HTTP server. This ensures consistency and allows test cases to pass.

Reference (x86 implementation):
https://github.com/RedHatQE/openshift-virtualization-tests/blob/8abf57e3d9d153c1d64ffcfca989e60bccff2f7c/utilities/constants.py#L101

Signed-off-by: Nekkunti Anand

Co-authored-by: Nekkunti Anand <anand@Nekkuntis-MacBook-Pro.local>
…edHatQE#1854)

* Storage: remove virt-launchers cleanup from storage migration tests

* remove unused code
This module includes two scenarios that are already covered elsewhere:
1. test_report_masquerade_ip — covered by CNV-2155
(test_automatic_mac_from_pool). Additionally, CNV-6733
(test_connectivity_after_migration) adds a connectivity check before
migration, which requires an IP on the masquerade interface on a fresh
VM (this is added in a new PR - RedHatQE#1843).

2. test_report_masquerade_ip_after_migration — covered by CNV-6733
(test_connectivity_after_migration), which verifies connectivity
across migration over the masquerade interface and thus that the
interface has an IP after migration.

Keeping a dedicated module for these scenarios adds duplication without
unique coverage.

Signed-off-by: Anat Wax <awax@redhat.com>
TestVMICountMetric used an old function that duplicated
the function that validates the metric value.
It also did not wait long enough for the metric to update, which
caused the test to fail in the last run.
* fix: improve cherry-pick workflow for cnv-4.99 branch

This commit fixes the GitHub Actions workflow that handles cherry-picking
commits from main to cnv-4.99 branch. Key improvements include:

- Added git merge-base --is-ancestor pre-check to detect when commits
  actually need cherry-picking, preventing unnecessary workflow runs
- Removed --keep-redundant-commits flag that was causing false "no changes"
  detection during cherry-pick operations
- Improved commit SHA handling using github.event.before and
  github.event.after for more accurate change detection
- Fixed peter-evans/create-pull-request action configuration to properly
  create PRs when cherry-pick conflicts occur

The workflow now properly detects when commits need to be cherry-picked
and creates PRs accordingly, including proper handling of conflicts.
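The ancestor pre-check described above can be sketched in Python as follows. The workflow itself runs in GitHub Actions, so this is an illustration of the logic and not the workflow's code. It relies on the documented exit codes of `git merge-base --is-ancestor` (0 when the commit is an ancestor, 1 when it is not, anything else on error). The function names and the injectable `run` parameter are assumptions added to make the sketch testable.

```python
import subprocess


def is_ancestor(commit: str, branch: str, run=subprocess.run) -> bool:
    """Return True if `commit` is already reachable from `branch`."""
    result = run(["git", "merge-base", "--is-ancestor", commit, branch], check=False)
    if result.returncode == 0:
        return True  # commit is an ancestor of branch
    if result.returncode == 1:
        return False  # commit is not an ancestor of branch
    # Any other status signals a git error, for example an unknown revision.
    raise RuntimeError(f"git merge-base failed with status {result.returncode}")


def needs_cherry_pick(commit: str, target_branch: str, run=subprocess.run) -> bool:
    """A commit needs cherry-picking only if the target branch does not contain it."""
    return not is_ancestor(commit, target_branch, run=run)
```

Treating statuses other than 0 and 1 as errors matters here: collapsing them into "not an ancestor" would trigger a cherry-pick for a revision that git could not even resolve.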

Assisted-by: Claude <noreply@anthropic.com>

* draft and label conflict pr
* Storage: remove disk location check

* Remove the test per review comments; this is not a T2 check

---------

Co-authored-by: Jenia Peimer <86722603+jpeimer@users.noreply.github.com>
updates:
- [github.com/astral-sh/ruff-pre-commit: v0.12.8 → v0.12.10](astral-sh/ruff-pre-commit@v0.12.8...v0.12.10)

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Generated through:

uv run pytest tests/install_upgrade_operators/csv/csv_permissions_audit/test_csv_permissions_audit.py::test_compare_csv_permissions -v --update-csv -x
Signed-off-by: Roni Kishner <rkishner@redhat.com>
* add sig labels based on the changed files

* use glob patterns
…#1723)

* add storage support list

* add storage-class support for conformance tests

* address comments

* Add production-ready breaking changes detector for GitHub Actions

This commit introduces a comprehensive breaking changes detection system designed
for GitHub Actions workflows. The detector analyzes Python code changes to identify
potentially breaking modifications and generates detailed reports.

Key features:
- AST-based code analysis for accurate change detection
- Git integration for comparing branches and commits
- GitHub Actions optimized with dedicated entry points
- UV-managed Python project with proper dependency management
- Security hardened with input validation and safe error handling
- Comprehensive test coverage and code quality verification
- Modular architecture with clean separation of concerns

Components added:
- breaking_changes_detector.py: Core detection engine
- ast_analyzer.py: AST-based code analysis
- git_analyzer.py: Git repository analysis
- report_generator.py: Detailed HTML/markdown report generation
- config_manager.py: Configuration and settings management
- github_actions_utils.py: GitHub Actions integration utilities
- error_handler.py: Centralized error handling and logging
- action.yml: GitHub Actions workflow definition
- pyproject.toml: UV project configuration with dependencies

The detector identifies various types of breaking changes including:
- Function/method signature modifications
- Class interface changes
- Import structure modifications
- API endpoint alterations
- Configuration schema changes

Ready for immediate use in CI/CD pipelines to catch breaking changes
before they impact downstream consumers.
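As an illustration of what AST-based detection of signature changes can mean, the sketch below compares top-level function parameters between two versions of a module using Python's standard `ast` module. It is not taken from breaking_changes_detector.py or ast_analyzer.py; the function names and the reporting format are hypothetical, and the check is deliberately narrow.

```python
from __future__ import annotations

import ast


def function_signatures(source: str) -> dict[str, list[str]]:
    """Map each top-level function name to its positional parameter names."""
    tree = ast.parse(source)
    return {
        node.name: [arg.arg for arg in node.args.posonlyargs + node.args.args]
        for node in tree.body
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    }


def breaking_changes(old_source: str, new_source: str) -> list[str]:
    """Report removed functions and functions whose existing parameters changed."""
    old = function_signatures(old_source)
    new = function_signatures(new_source)
    findings = []
    for name, params in sorted(old.items()):
        if name not in new:
            findings.append(f"removed function: {name}")
        # Existing parameters must survive as a prefix; appended ones are tolerated.
        # Defaults are not inspected, so a new required parameter is not caught.
        elif new[name][: len(params)] != params:
            findings.append(f"changed parameters: {name}")
    return findings
```

For example, comparing `def a(x, y): ...` against `def a(x): ...` yields `["changed parameters: a"]`, while appending an optional parameter yields no finding.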

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

* remove code that should not be here

* add info to cmdline

* remove redundant spaces

* fix typo

* allow passing only conformance sc name

* fix sc assignment

* add default sc

* add log and fix upper

---------

Co-authored-by: Claude <noreply@anthropic.com>
* fix cherry-pick

* update cherry pick flow

* fix: address CodeRabbit review comments for cherry-pick workflow

- Remove unnecessary owner existence curl check that added external dependency
- Add support for squash/rebase merge detection via commit message pattern
- Add issues: write permission for proper PR labeling
- Add no changes scenario guard for peter-evans action to handle edge cases

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

* feat: enhance cherry-pick workflow robustness and reliability

Improves the cherry-pick workflow with several robustness enhancements:

- Enhanced duplicate detection: Added -x trailer check to detect prior cherry-picks before ancestor check
- Improved curl robustness: Added connect-timeout and reorganized flags for consistency
- Fixed detached HEAD: Use -B flag to create local tracking branch instead of detached HEAD
- Added cherry-pick label: Successful PRs now get "cherry-pick" label for better organization
- Added concurrency control: Added concurrency group to prevent race conditions during parallel merges

These improvements make the cherry-pick automation more reliable and prevent common failure scenarios.
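The duplicate detection mentioned in the first bullet relies on the line that `git cherry-pick -x` appends to the commit message, `(cherry picked from commit <sha>)`. A minimal Python sketch of that check is shown below. It assumes full 40-character SHA-1 hashes and receives the target branch's commit messages as plain strings; the function names are hypothetical and the workflow's actual shell logic may differ.

```python
import re

# Line appended to the commit message by `git cherry-pick -x`.
CHERRY_PICK_TRAILER = re.compile(r"\(cherry picked from commit ([0-9a-f]{40})\)")


def picked_sources(commit_messages):
    """Return the set of source SHAs recorded in the given commit messages."""
    sources = set()
    for message in commit_messages:
        sources.update(CHERRY_PICK_TRAILER.findall(message))
    return sources


def already_picked(sha: str, target_branch_messages) -> bool:
    """True if a commit on the target branch records `sha` as its source."""
    return sha in picked_sources(target_branch_messages)
```

Running this check before the ancestor check catches commits that were cherry-picked earlier: a cherry-pick creates a new commit with a different SHA, so the original is never an ancestor of the target branch.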

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

* update label name

---------

Co-authored-by: Claude <noreply@anthropic.com>
…edHatQE#1917)

* fix: stage conflicts for peter-evans action in cherry-pick workflow

Added `git add .` in conflict handling path to stage all changes including
conflicts. This resolves the "you need to resolve your current index first"
error that was preventing peter-evans from creating branches and commits
when cherry-pick conflicts occurred.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

* address cr comment

---------

Co-authored-by: Claude <noreply@anthropic.com>
Remove tests for recording rules whose expressions use existing
or non-CNV metrics, to avoid unnecessary test execution
and reduce the execution time of the
observability lane.
updates:
- [github.com/astral-sh/ruff-pre-commit: v0.12.10 → v0.12.11](astral-sh/ruff-pre-commit@v0.12.10...v0.12.11)

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
…default deployments. (RedHatQE#1028)

* Add x86_vanilla test marker

* rename marker and update readme

* update marker name and readme and typos

* fix coderabbit comments

* resolve merge conflicts

* remove marker from tests which alter resources

* rename marker to common_provisioned_cluster

* rename to conformance

* update pytest command and remove marker from TestCommonTemplatesFedora

* update readme

* remove sriov

* remove sriov from readme

* remove tests that require info from bitwarden

* remove tests that require info from bitwarden

* address review comments

* re-add tests after password update

* do not check multi-nic for conformance tests

* remove conformance marker from test_ipv4_linux_bridge

* remove tests that are not supported

* add requirements.txt for packaging

* remove tests that are not needed for conformance

* remove tests that need nmstate

* remove test_vm_with_limits_overrides_global_vlaues

* add code from storage conformance for tests

* update to new changes in conformance-sc pr

* fix sc assignment

* add default sc

* update readme with new storage configs

* add missing bash in readme

* update wrong param name in readme
* move cache_admin_client to be private

* fix unittest failure
…edHatQE#1900)

* net, dhcpd: move helpers to a dedicated module, add type annotation

Improve code organization and re-usability with separate dhcpd module
Signed-off-by: Asia Khromov <azhivovk@redhat.com>

* net, dhcpd: move dhcp ip related constants to dhcp module

Signed-off-by: Asia Khromov <azhivovk@redhat.com>

* net, dhcpd: edit DHCP_IP_SUBNET in dhcpd constants
Signed-off-by: Asia Khromov <azhivovk@redhat.com>

---------

Signed-off-by: Asia Khromov <azhivovk@redhat.com>
dshchedr and others added 22 commits January 18, 2026 15:56
…RT VMs (RedHatQE#3477)

Add dedicated namespace for optin test (no node selectors or special
resources)

These tests constantly fail on a specific cluster, which is used for all
SR-IOV tests (in all branches).
Once CNV-75730 is resolved, either by replacing the cluster or fixing it
in that cluster, these tests will be enabled.

##### jira-ticket:
https://issues.redhat.com/browse/CNV-31351
##### Short description:
Bug fixed

Signed-off-by: Harel Meir <hmeir@redhat.com>
…HatQE#3508)

##### Short description:
To allow open bugs check to be a GH check run, integrated with
myk-org/github-webhook-server#961 - tox was updated with an explicit
command
Manual cherrypick of
RedHatQE#3048

Cherry-pick -
RedHatQE#2582

##### Short description:
Add a wait for boot sources to be re-imported after changing the default storage
class in the cluster

##### More details:
After the default storage class changes, the data import crons
re-import the data sources using the new storage class. This causes
issues in subsequent tests if we do not wait for the sources to complete
the import.

##### What this PR does / why we need it:
Stabilise `test_data_import_cron_deletion_on_opt_out` by waiting for
data sources import
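The waiting step amounts to polling until every data source reports ready or a timeout expires. The sketch below shows that pattern with the standard library only. It is not the helper used in the repository; `wait_for`, `all_sources_ready`, and the `get_statuses` callable (returning a `{source_name: ready}` mapping) are hypothetical names chosen for illustration.

```python
import time


def wait_for(condition, timeout: float, interval: float = 5.0,
             clock=time.monotonic, sleep=time.sleep) -> None:
    """Poll `condition` until it returns True or `timeout` seconds elapse."""
    deadline = clock() + timeout
    while True:
        if condition():
            return
        if clock() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        sleep(interval)


def all_sources_ready(get_statuses) -> bool:
    """True when at least one source exists and every source reports ready."""
    statuses = get_statuses()  # hypothetical: {source_name: ready_flag}
    return bool(statuses) and all(statuses.values())
```

A fixture that changes the default storage class could then call `wait_for(lambda: all_sources_ready(get_statuses), timeout=600)` before yielding, so later tests never see half-imported sources. The timeout value here is only an example.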

Signed-off-by: Roni Kishner <rkishner@redhat.com>
…k uploader (RedHatQE#3521)

cherry-pick
RedHatQE#3465
into cnv-4.20

requested-by geetikakay

Signed-off-by: Geetika Kapoor <gkapoor@redhat.com>
Co-authored-by: Geetika Kapoor <13978799+geetikakay@users.noreply.github.com>
…HatQE#3485)

cherry-pick
RedHatQE#3410
into cnv-4.20

requested-by rnetser

---------

Co-authored-by: Ruth Netser <rnetser@redhat.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
…3539)

##### Short description:
Remove the virt recording rules tests. These recording rules depend on
Kubernetes metrics rather than CNV metrics, so the tests should be removed. Original
PR: RedHatQE#3493
##### jira-ticket:
https://issues.redhat.com/browse/CNV-77310
…e_body_on_error_upload_virtctl (RedHatQE#3070)

cherry-pick
RedHatQE#2950
into cnv-4.20

requested-by Ahmad-Hafe

---------

Co-authored-by: Ahmad Hafe <ahafe@redhat.com>
Co-authored-by: Jenia Peimer <86722603+jpeimer@users.noreply.github.com>
…vmi_non_evictable test (RedHatQE#3537)

##### Short description:
The test for kubevirt_vmi_non_evictable is flaky due to connectivity
issues with Artifactory. This PR modifies the test so that it no longer relies
on Artifactory, which avoids this kind of failure and stabilizes the
observability lanes.
Original PR:
RedHatQE#3137
##### jira-ticket:
https://issues.redhat.com/browse/CNV-77296
…3548)

Manual cherry-pick of RedHatQE#3423.
This test intermittently fails because of credential issues.
Checking the configmap and secrets is already covered in tier1,
so in practice this test does not check anything new from a CNV point of view.


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

* **Tests**
* Removed test module for RHEL RHSM yum update functionality, including
related test fixtures and test cases that validated package update
operations.


<!-- end of auto-generated comment: release notes by coderabbit.ai -->

…class_migration_a_to_b_running_vms (RedHatQE#3482)

cherry-pick
RedHatQE#3432
into cnv-4.20

requested-by Ahmad-Hafe

Co-authored-by: Kate Shvaika <kshvaika@redhat.com>
…ecks and update CNV-4033 to assert populator behavior (RedHatQE#3473)

##### Short description:
cherry-pick
RedHatQE#2401


…dHatQE#3207)

##### Short description:
manual backport of
RedHatQE#3062

Manual backport of RedHatQE#3314 and RedHatQE#3330 (not possible to use auto-backport
due to conflicts).

test_ovs_bridge_sanity is already covered by
test_ipv4_ovs_bridge["L2_bridge_network"] (CNV-11126), which exercises
the same OVS bridge connectivity scenario using NNCP-created bridges,
so test_ovs_bridge_sanity should be removed.

Remove all test scenarios marked with @pytest.mark.ovs_brcnv along with
their associated fixtures, helper functions, and configuration markers.
These tests require a specially-configured cluster that is not part of
any existing test lanes. No test execution collects the ovs_brcnv
marker, and they cannot be executed in our standard test infrastructure.

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

* **Tests**
* Removed multiple test suites and associated fixtures covering bridge,
bond and migration connectivity scenarios (BR-CNV related).
* **Chores**
* Cleaned up test infra by removing an obsolete test marker, related
constants and helper utilities.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Signed-off-by: Anat Wax <awax@redhat.com>
##### Short description:
Bug was fixed
…tions in upgrade tests (RedHatQE#3672)

cherry-pick
RedHatQE#3575
into cnv-4.20

requested-by dshchedr

Co-authored-by: Den Shchedrivyi <dshchedr@redhat.com>
…fig (RedHatQE#3631)

cherry-pick
RedHatQE#3399
into cnv-4.20

requested-by Ahmad-Hafe

Co-authored-by: Ahmad Hafe <ahafe@redhat.com>
…rics (RedHatQE#3438)

cherry-pick
RedHatQE#3436
into cnv-4.20

requested-by OhadRevah

Co-authored-by: Ohad Revah <orevah@redhat.com>
…ce (RedHatQE#3727)

cherry-pick
[3467](RedHatQE#3467)
into cnv-4.20


Signed-off-by: Samuel Albershtein <salbersh@redhat.com>
- Add webhook health check functions to utilities/infra.py
- Add _discover_webhook_services() to find webhook services in HCO namespace
- Add check_webhook_endpoints_health() to verify webhook services have endpoints
- Add check_vm_creation_capability() for dry-run VM creation test
- Add --cluster-sanity-skip-webhook-check pytest option to conftest.py
- Add webhook health tests to cluster_health_check test suite
- Modify cluster_sanity() to include webhook checks

Backport of PRs RedHatQE#3573 and RedHatQE#3690 for cnv-4.20 branch
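For context on the dry-run VM creation test, a server-side dry run is requested by adding the `dryRun=All` query parameter to a create call, which runs admission (including webhooks) without persisting the object. The sketch below only builds such a request and does not contact a cluster. The helper name, the VM name, and the minimal manifest are assumptions for illustration; the manifest used by check_vm_creation_capability() may look different.

```python
def build_dry_run_vm_request(namespace: str, name: str = "sanity-dry-run-vm"):
    """Return (path, query_params, body) for a server-side dry-run VM create."""
    body = {
        "apiVersion": "kubevirt.io/v1",
        "kind": "VirtualMachine",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "runStrategy": "Halted",  # the VM would not be started even if created
            "template": {
                "spec": {
                    "domain": {
                        "devices": {},
                        "memory": {"guest": "64Mi"},  # illustrative value
                    }
                }
            },
        },
    }
    path = f"/apis/kubevirt.io/v1/namespaces/{namespace}/virtualmachines"
    # dryRun=All asks the API server to process the request without storing it.
    return path, {"dryRun": "All"}, body
```

A successful response to such a POST suggests that the API server and the admission webhooks in the request path are reachable, which is what a pre-test sanity check wants to know.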