* Add arg to define thread limit - to throttle concurrent API calls when required (#355)
* add `--max-threads` arg
* fix bad descriptor errs/race conditions
* update pytests
* 353 the script incorrectly detects vpc and port channel interfaces as cscwh68103 invalid fabricpathep targets (#357)
* specific testing for known failure conditions of cscwh68103 as to not catch valid scenarios
* Update aci-preupgrade-validation-script.py
mark version
* print cleanup
* fix: Add cversion check for post_upgrade_cb_check (#377)
* added validation for CFD CSCwp64296 (#307)
Added rogue ep/coop exception mac check for the CFD CSCwp64296
* Added pre-upgrade validation for N9K-C9408 with more than 6 N9K-X9400-16W LEM's for the bug CSCws82819 (#354)
* Added a new check for the bug 'CSCws82819N9K-C9408 boot loop on 16.1.2f and later with 6 or more LEMs'
* New Validation for APIC Storage Inode Usage (F4388, F4389, F4390 equipment-full) (#361)
* New Validation for APIC Storage Inode Usage (F4388, F4389, F4390 equipment-full)
* Add new exception handling of invalid query filter in `icurl` due to, for example, a non-supported fault code on older versions
* Add validation for multipod_modular_spine_bootscript_check - CSCwr66848 (#365)
* Add check for CFD - CSCwr66848
* Update pytest.yml to run on vX.Y.Z branches
* adding of cli parameters for user and password (#335)
* added cli parameter for user (-u) and password (-p) to be used in an eased way for fully automated execution of 'aci-preupgrade-validation-script.py'
* test: Add pytest
---------
Co-authored-by: Detlef Sass <detlef.sass@de.bosch.com>
* Added validation for CSCwd40071 (#332)
* Added validation for CSCwd40071
* Addressed the comments
* Added cversion for the check
* Removed empty spaces
* logic change. removed 0.0.0.0/0 and made pytest changes
* logic modified and validation.md file updated
* Added validation for CSCws84232 (#334)
* Feat:added svccoreCtrlr excessive entries check
* updated the formated json in svccore_negative.json
* Removed other PR checks
* Fix: svccoreNode mo check is updated
* Fix:update the threshold to 240
* refactor:moved to general check
* refactor:updated recommended action
* refactor:optimise the check for get the count alone
* refactor:added try catch exception for error handling
* update the threshold value for 240
* refactor:query-target was added
* refactor: removed unwanted spaces
* refactor:removed unwanted spaces in validation.md
* resolve conflict while rebasing
* resolve conflict while rebasing
* updated pytest script file
* updated validation.md
* addressed review comments
* updated validation.md
---------
Co-authored-by: takishida <38262981+takishida@users.noreply.github.com>
* set to v4.1.0
---------
Co-authored-by: tkishida <tkishida@cisco.com>
Co-authored-by: takishida <38262981+takishida@users.noreply.github.com>
Co-authored-by: psureshb <psureshb@cisco.com>
Co-authored-by: Harinadh-Saladi <hsaladi@cisco.com>
Co-authored-by: sanjanch <sanjanch@cisco.com>
Co-authored-by: asraf-khan <anazar@cisco.com>
Co-authored-by: Thatleft <detlef.sass@web.de>
Co-authored-by: Detlef Sass <detlef.sass@de.bosch.com>
Co-authored-by: veesenth_cisco <veesenth@cisco.com>
@@ -1551,6 +1562,56 @@ EPGs using the `pre-provision` resolution immediacy do not rely on the VMM inven
This check returns a `MANUAL` result as there are many reasons for a partial inventory sync to be reported. The goal is to ensure that the VMM inventory sync has fully completed before triggering the APIC upgrade to reduce any chance for unexpected inventory changes to occur.

### APIC Storage Inode Usage

If a Cisco APIC is running low on inode capacity for any reason, the Cisco APIC upgrade can fail. The Cisco APIC raises three different faults depending on inode utilization. If any of these faults are raised on the system, the issue should be resolved prior to performing the upgrade.

* **F4388**: A warning level fault for Cisco APIC storage inode utilization. This is raised when utilization is greater than 75%.
* **F4389**: A major level fault for Cisco APIC storage inode utilization. This is raised when utilization is between 85% and 90%.
* **F4390**: A critical level fault for Cisco APIC storage inode utilization. This is raised when utilization is greater than 90%.

Inode exhaustion can occur even when the filesystem has adequate free storage space. This happens when a large number of small files or directories have been created.

Recommended Action:

To recover from this fault, try the following actions:

1. Free up space on the affected disk partition.
2. TAC may be required to analyze and clean up certain directories due to filesystem permissions. Cleanup of `/` is one such example.
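
The fault thresholds above can be illustrated with a minimal Python sketch. It is not taken from the validation script. The function names are hypothetical, the handling of the exact 85% and 90% boundaries is an assumption, and `os.statvfs` is only one generic way to read inode counters on a Unix-like host.

```python
import os


def inode_fault_for(utilization_pct):
    """Return the fault code expected at a given inode utilization, or None.

    Thresholds follow the fault descriptions above; how the exact
    boundaries (85%, 90%) are treated is an assumption of this sketch.
    """
    if utilization_pct > 90:
        return "F4390"  # critical
    if utilization_pct >= 85:
        return "F4389"  # major
    if utilization_pct > 75:
        return "F4388"  # warning
    return None


def inode_utilization(path="/"):
    """Return inode utilization (percent) of the filesystem holding `path`."""
    st = os.statvfs(path)
    if st.f_files == 0:  # some filesystems do not report inode counts
        return 0.0
    return 100.0 * (st.f_files - st.f_ffree) / st.f_files
```

On a Unix-like host, `inode_fault_for(inode_utilization("/"))` would indicate which of the three faults, if any, corresponds to the current inode usage of the root filesystem. Because inode utilization is computed separately from block usage, a partition can appear to have free space while still matching one of these faults.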

!!! example "Fault Example (F4390: Critical fault for APIC Inode Utilisation)"
@@ -2648,6 +2709,7 @@ Due to [CSCwp95515][59], upgrading to an affected version while having any `conf
If any instances of `configpushShardCont` are flagged by this script, Cisco TAC must be contacted to identify and resolve the underlying issue before performing the upgrade.

### Auto Firmware Update on Switch Discovery

[Auto Firmware Update on Switch Discovery][63] automatically upgrades a new switch to the target firmware version before registering it to the ACI fabric. This feature activates in three scenarios:
@@ -2668,6 +2730,61 @@ To avoid this risk, consider disabling Auto Firmware Update before upgrading to
This issue occurs because older switch firmware versions are not compatible with switch images 6.0(3) or newer. The APIC version is not a factor.

### Rogue EP Exception List missing on switches

The Rogue/COOP Exception List feature, introduced in 5.2(3), allows exclusion of specific MAC addresses from Rogue Endpoint Control and COOP Dampening. Initially, each MAC address had to be configured individually in each bridge domain. In 6.0(3), this feature was enhanced to support fabric-wide exception lists with wildcard options per bridge domain and the ability to exclude MAC addresses in L3Outs.

However, due to [CSCwp64296][64], when upgrading spine switches to version 6.0(3) or later from an older version with Rogue/COOP Exception Lists configured, some exception lists may not be pushed to the spine switches. As a result, the feature may stop functioning after the upgrade.

The root cause is that internal objects called `presListener` for Rogue/COOP Exception List, which publish the configuration from APICs to switches, may be missing on the APICs after an upgrade.

Recommended action: Delete the affected exception list and create it again. If needed, contact Cisco TAC to help recover missing `presListener` objects on APICs.

### N9K-C9408 with more than 5 N9K-X9400-16W LEMs

Due to defect [CSCws82819][65], an N9K-C9408 switch with more than 5 N9K-X9400-16W LEMs installed will experience a boot loop with a `dt_helper` process crash if upgraded to versions 16.1(2f) through 16.1(5), or to 16.2(1g).

To avoid this issue, upgrade to a fixed version or use fewer than 6 N9K-X9400-16W LEMs in one chassis.
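
The LEM count condition above can be sketched as follows. This is a minimal illustration, not the script's implementation. The `(node_id, lem_model)` inventory format and the function name are hypothetical, and the target version comparison is left out.

```python
from collections import Counter

AFFECTED_LEM_MODEL = "N9K-X9400-16W"
MAX_SAFE_LEMS = 5  # more than 5 affected LEMs triggers the issue per the text


def chassis_over_lem_limit(lem_inventory):
    """Return {node_id: count} for chassis with more than 5 affected LEMs.

    `lem_inventory` is a hypothetical list of (node_id, lem_model) tuples.
    """
    counts = Counter(
        node for node, model in lem_inventory if model == AFFECTED_LEM_MODEL
    )
    return {node: n for node, n in counts.items() if n > MAX_SAFE_LEMS}
```

A chassis with exactly 5 of these LEMs is not reported, which matches the "fewer than 6" guidance above.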

### Multi-Pod Modular Spine Bootscript File

Due to [CSCwr66848][66], in Multi-Pod environments, upgrading a modular spine to 6.1(4h) may cause inter-pod traffic to stop working if the `/bootflash/bootscript` file is missing on the spine prior to the upgrade. The traffic interruption occurs because the spine incorrectly identifies the reason for its reload, leading to an unnecessary attempt to load the missing bootscript file.

This issue happens only when the target version is specifically 6.1(4h).

To avoid this issue, either change the target version to another version, or verify that the `bootscript` file exists in the bootflash of each modular spine switch prior to upgrading to 6.1(4h). If the file is missing, perform a clean reboot on the impacted spine to ensure that `/bootflash/bootscript` gets created again. If you have already upgraded the spine and are experiencing traffic impact due to this issue, a clean reboot of the spine will restore the traffic.

### Inband Management Policy Misconfiguration

Due to defect [CSCwh80837][67], starting from version 6.0(4c), the `mgmtRsInBStNode` policy is modified on leaf and spine switches during an APIC upgrade.

Impact:

When upgrading the APIC from a version prior to 6.0(4c) to version 6.0(4c) or later, if the inband management policies (`mgmtRsInBStNode`) are misconfigured with invalid values, the re-processing triggered by [CSCwh80837][67] will expose the underlying [CSCwd40071][68] defect. This results in continuous `policyelem` core dumps and switch reboots if the switches are running a version impacted by [CSCwd40071][68].

The invalid configuration occurs when `mgmtRsInBStNode` has a "0.0.0.0" value (with or without a mask) for either the `addr` or `gw` field.

Suggestion:

Contact Cisco TAC to remove any identified misconfigured objects before performing the upgrade to prevent `policyelem` crashes.
The [CSCwd40071][68] defect affects versions 5.2(5c) and later with a fix available in 6.0(1g). However, the issue will only be triggered during APIC upgrades crossing 6.0(4c) due to [CSCwh80837][67].
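
The invalid-value condition described above can be sketched in Python as follows. This follows the wording of this section only and is not the script's actual logic. The function names and the list-of-attribute-dicts input format are hypothetical.

```python
def is_unspecified(value):
    """Return True if `value` is "0.0.0.0", with or without a mask suffix."""
    return value.split("/", 1)[0].strip() == "0.0.0.0"


def misconfigured_inband_nodes(mos):
    """Return the DNs of entries whose `addr` or `gw` is 0.0.0.0.

    `mos` is a hypothetical list of attribute dicts with `dn`, `addr`
    and `gw` keys, such as might be extracted from `mgmtRsInBStNode` objects.
    """
    return [
        mo["dn"]
        for mo in mos
        if is_unspecified(mo.get("addr", "")) or is_unspecified(mo.get("gw", ""))
    ]
```

Splitting on `/` before comparing means that both a bare `0.0.0.0` and a masked form such as `0.0.0.0/24` are treated as invalid, in line with the "with or without a mask" wording.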

### Svccore Excessive Data Check

When there are excessive `svccoreCtrlr` or `svccoreNode` managed objects, the APIC GUI gets stuck loading multiple queries.

The `svccoreCtrlr` and `svccoreNode` objects represent core files from APIC processes and leaf/spine switch processes, respectively.

Due to [CSCws84232][67], the APIC GUI may become unresponsive after login, with dashboards stuck in a continuous “Loading…” state.
Administrators may be unable to access or operate the APIC GUI, potentially impacting day-to-day management or upgrades.

This check verifies the count of `svccoreCtrlr` managed objects and flags the bug if the count is more than 240. To resolve the issue, remove the content or objects of `svccoreCtrlr` or `svccoreNode`. Contact Cisco TAC or upgrade to a release containing the fix for CSCws84232 before proceeding with an upgrade.
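
The threshold comparison above can be sketched as follows. This is a minimal illustration under stated assumptions and not the script's code. It assumes the object count was fetched with a count-only class query (`rsp-subtree-include=count`) and that the response carries a single `moCount` object. The function name is hypothetical.

```python
SVCCORE_THRESHOLD = 240  # value stated in the text above


def svccore_count_exceeded(response):
    """Return True if the object count in `response` is more than 240.

    `response` is assumed to be the parsed JSON of a count-only query,
    i.e. {"imdata": [{"moCount": {"attributes": {"count": "<n>"}}}]}.
    """
    count = int(response["imdata"][0]["moCount"]["attributes"]["count"])
    return count > SVCCORE_THRESHOLD
```

Fetching only the count avoids pulling every `svccoreCtrlr` object from the APIC, which matters when excessive objects are the very problem being checked for.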