
Added WRED with affected Leaf/LC/FM model check #379

Open
Priyanka-Patil14 wants to merge 17 commits into datacenter:v4.2.0-dev from Priyanka-Patil14:bugfix/CSCwt50713

Conversation

Priyanka-Patil14 commented Apr 9, 2026

Summary

Adds a new pre-upgrade validation check to detect fabric nodes at risk due to CSCwt50713, where WRED-enabled QoS combined with specific Leaf/LC/FM hardware models can cause N9504 spine crashes after upgrading to affected ACI releases.

Detection Logic

Three gates must all be true to trigger a FAIL:

  1. Version Gate – Target version is in the affected range:

    • ACI 6.1(x) older than 6.1(6a)
    • ACI 6.2(x) older than 6.2(2e)
  2. Feature Gate – WRED is enabled (qosCong.algo = wred)

  3. Hardware Gate – Any of the following affected models are present:

    • FM: N9K-C9504-FM-E, N9K-C9508-FM-E, N9K-C9516-FM-E
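
The three gates above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the function names, the input shapes, and the choice of `NA` versus `PASS` for each unmet gate are assumptions. Only the FM models listed in this description are included, since no Leaf/LC model list appears here.

```python
import re

# Assumption: only the FM models named in the description are known here.
AFFECTED_MODELS = frozenset(
    ["N9K-C9504-FM-E", "N9K-C9508-FM-E", "N9K-C9516-FM-E"]
)

# First fixed release per train, from the version gate: 6.1(6a) and 6.2(2e).
FIRST_FIXED = {(6, 1): (6, "a"), (6, 2): (2, "e")}

_VERSION_RE = re.compile(r"^(\d+)\.(\d+)\((\d+)([a-z]*)\)$")


def parse_version(version):
    """Split an ACI version such as '6.1(6a)' into (6, 1, 6, 'a')."""
    match = _VERSION_RE.match(version)
    if match is None:
        raise ValueError("unrecognized ACI version: %r" % version)
    major, minor, maint, patch = match.groups()
    return int(major), int(minor), int(maint), patch


def version_gate(target_version):
    """True if the target is in an affected train and older than its fix."""
    major, minor, maint, patch = parse_version(target_version)
    fixed = FIRST_FIXED.get((major, minor))
    return fixed is not None and (maint, patch) < fixed


def wred_model_check(target_version, cong_algos, modules):
    """Hypothetical helper combining the version, feature and hardware gates.

    cong_algos: iterable of qosCong.algo values seen in the fabric.
    modules: iterable of (node_id, model) pairs.
    """
    if not version_gate(target_version):
        return "NA", []      # assumed result when the version gate is not met
    if "wred" not in cong_algos:
        return "PASS", []    # assumed result when WRED is not enabled
    hits = [(node, model) for node, model in modules
            if model in AFFECTED_MODELS]
    return ("FAIL_O", hits) if hits else ("PASS", [])
```

For example, `wred_model_check("6.1(5h)", ["wred"], [("201", "N9K-C9504-FM-E")])` reports `FAIL_O` with node 201 as the hit, while a target of `6.1(6a)` fails the version gate and short-circuits before the other two gates are evaluated.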

Testing

  • 5 unit test cases added under tests/checks/wred_affected_model_check/
  • All 5 passed
  • Validated on live fabric (fab3-apic1): confirmed FAIL_O with real hit on node 201 (FAB3-S1, N9K-C9504-FM-E)

@Priyanka-Patil14

Author

WredCheck_APIC_Output_logs.txt
WredCheck_Pytest_Logs.txt

Uploaded the test logs.

@Harinadh-Saladi Harinadh-Saladi left a comment

Contributor

Please address the review comments and add the bug details to the validations.md file; they are missing.
Please also run the script on Fab3 and share the PASS, FAIL and NA logs. I will review them.

Comment thread aci-preupgrade-validation-script.py Outdated
Comment thread aci-preupgrade-validation-script.py Outdated
Comment thread aci-preupgrade-validation-script.py
Comment thread aci-preupgrade-validation-script.py Outdated
Comment thread aci-preupgrade-validation-script.py Outdated
Comment thread aci-preupgrade-validation-script.py Outdated
Comment thread aci-preupgrade-validation-script.py Outdated
Comment thread aci-preupgrade-validation-script.py Outdated
@Priyanka-Patil14

Author

Please address the review comments and add the bug details to the validations.md file; they are missing. Please also run the script on Fab3 and share the PASS, FAIL and NA logs. I will review them.

WRED_PASS:FAIL:NA_APIC_Logs.txt

Please find the attached logs from the fab3 run, covering the PASS, FAIL and NA scenarios.

Comment thread docs/docs/validations.md Outdated
Comment thread aci-preupgrade-validation-script.py Outdated
Comment thread aci-preupgrade-validation-script.py Outdated
Comment thread aci-preupgrade-validation-script.py Outdated

@lovkeshsharma702 lovkeshsharma702 left a comment

Please address all comments.

Comment thread docs/docs/validations.md Outdated
Comment thread aci-preupgrade-validation-script.py Outdated
Comment thread aci-preupgrade-validation-script.py Outdated

@lovkeshsharma702 lovkeshsharma702 left a comment

Please address the comments.

Comment thread aci-preupgrade-validation-script.py Outdated

@Harinadh-Saladi Harinadh-Saladi left a comment

Contributor

Please address the comments. If your understanding of the test results or technical aspects differs from mine, we will discuss it with the team and address it after getting confirmation.

Comment thread docs/docs/validations.md Outdated
Comment thread aci-preupgrade-validation-script.py
Comment thread tests/checks/wred_affected_model_check/test_wred_affected_model_check.py Outdated
Comment thread aci-preupgrade-validation-script.py Outdated
Comment thread tests/checks/wred_affected_model_check/eqptLC_affected.json
Comment thread tests/checks/wred_affected_model_check/test_wred_affected_model_check.py Outdated
Comment thread tests/checks/wred_affected_model_check/test_wred_affected_model_check.py Outdated
Comment thread aci-preupgrade-validation-script.py Outdated
Comment thread aci-preupgrade-validation-script.py Outdated
Comment thread aci-preupgrade-validation-script.py Outdated
Comment thread tests/checks/wred_affected_model_check/eqptFC_mixed.json

@lovkeshsharma702 lovkeshsharma702 left a comment

Please resolve the remaining comments and we are good.

@Priyanka-Patil14

Author

Attached the pytest and full script run logs as requested.
FullRun_WRED_Logs.txt
WRED_FinalPytest_logs.txt

@Priyanka-Patil14 Priyanka-Patil14 changed the base branch from v4.1.0-dev to v4.2.0-dev May 25, 2026 04:53
Comment thread aci-preupgrade-validation-script.py Outdated
Comment thread aci-preupgrade-validation-script.py Outdated
Comment thread aci-preupgrade-validation-script.py Outdated
Comment thread aci-preupgrade-validation-script.py
Comment thread aci-preupgrade-validation-script.py Outdated
@Priyanka-Patil14

Priyanka-Patil14 commented May 26, 2026

Author

Attaching the full run logs:

CSCwt50713_Pytest_FullRun_Logs.txt
CSCwt50713_FullRun_Logs.txt

@lovkeshsharma702 lovkeshsharma702 left a comment

Integration testing completed. Pytest also completed.

@monrog2 monrog2 left a comment

Collaborator

We need to get this bug updated with fixed release versions:
CSCwt50713
regardless of its duplicate status, as it's a different set of conditions that triggers the same underlying crash (same RCA).

Also include a known fixed version in those details; we want the bug to be the source of truth on affected/fixed versions, not this script.

Comment thread docs/docs/validations.md Outdated
Comment thread aci-preupgrade-validation-script.py Outdated
Comment thread aci-preupgrade-validation-script.py Outdated
Comment thread aci-preupgrade-validation-script.py Outdated
Comment thread tests/checks/wred_affected_model_check/test_wred_affected_model_check.py Outdated
Comment thread tests/checks/wred_affected_model_check/test_wred_affected_model_check.py Outdated

@lovkeshsharma702 lovkeshsharma702 left a comment

Integration test passed.

[2026-06-12 10:10:45 UTC INFO root:81] (tac_bdsol_pod83) No errors.
[2026-06-12 10:10:47 UTC INFO root:182] (tme_sj_fab1) Completed. Checking errors...
[2026-06-12 10:10:47 UTC INFO root:81] (tme_sj_fab1) No errors.
[2026-06-12 10:10:47 UTC INFO root:182] (tme_sj_fab2) Completed. Checking errors...
[2026-06-12 10:10:47 UTC INFO root:81] (tme_sj_fab2) No errors.
[2026-06-12 10:10:49 UTC INFO root:182] (tac_rtp_pod1) Completed. Checking errors...
[2026-06-12 10:10:49 UTC INFO root:81] (tac_rtp_pod1) No errors.
[2026-06-12 10:10:58 UTC INFO root:182] (tac_rtp_calo_a) Completed. Checking errors...
[2026-06-12 10:10:58 UTC INFO root:81] (tac_rtp_calo_a) No errors.
[2026-06-12 10:11:03 UTC INFO root:182] (tac_bdsol_pod12) Completed. Checking errors...
[2026-06-12 10:11:03 UTC INFO root:81] (tac_bdsol_pod12) No errors.
[2026-06-12 10:11:18 UTC INFO root:182] (tac_bdsol_pod19) Completed. Checking errors...
[2026-06-12 10:11:18 UTC INFO root:81] (tac_bdsol_pod19) No errors.
[2026-06-12 10:12:11 UTC INFO root:182] (bu_qa_scale1) Completed. Checking errors...
[2026-06-12 10:12:11 UTC INFO root:81] (bu_qa_scale1) No errors.
[2026-06-12 10:12:12 UTC INFO root:182] (tac_bdsol_pod20) Completed. Checking errors...
[2026-06-12 10:12:12 UTC INFO root:83] (tac_bdsol_pod20) Found 1 errors.
[2026-06-12 10:12:43 UTC INFO root:182] (tac_rtp_calo_b) Completed. Checking errors...
[2026-06-12 10:12:43 UTC INFO root:83] (tac_rtp_calo_b) Found 2 errors.
[2026-06-12 10:12:43 UTC ERROR root:248] (MainThread) ####################################
[2026-06-12 10:12:43 UTC ERROR root:249] (MainThread)          tac_rtp_pod8
[2026-06-12 10:12:43 UTC ERROR root:250] (MainThread) ####################################
[2026-06-12 10:12:43 UTC ERROR root:251] (MainThread) Message: Uncaught exception found
[2026-06-12 10:12:43 UTC ERROR root:253] (MainThread) Output: PermissionError(13, 'Permission denied')
[2026-06-12 10:12:43 UTC ERROR root:248] (MainThread) ####################################
[2026-06-12 10:12:43 UTC INFO root:257] (MainThread) Fabrics with errors - 6 / 22 fabrics
[2026-06-12 10:12:43 UTC WARNING root:263] (MainThread) Fabrics below seem to be unreachable, not counting as a script error.
[2026-06-12 10:12:43 UTC WARNING root:267] (MainThread) bu_trunk10
[2026-06-12 10:12:43 UTC WARNING root:267] (MainThread) bu_qa_scale6
[2026-06-12 10:12:43 UTC WARNING root:267] (MainThread) bu_ifav30
Uploading artifacts for failed job
00:02
Uploading artifacts...
/builds/aci-pre-upgrade-validation/aci-pre-upgrade-validation-script/tests/*.log: found 17 matching artifact files and directories 
Uploading artifacts as "archive" to coordinator... 201 Created  id=448 responseStatus=201 Created token=glcbt-64
ERROR: Job failed: exit code 1

6 participants