Changes from 19 commits
Commits
31 commits
04fef19
Add arg to define thread limit - to throttle concurrent API calls whe…
monrog2 Feb 27, 2026
233359a
353 the script incorrectly detects vpc and port channel interfaces as…
monrog2 Feb 27, 2026
6d7d4d4
Update aci-preupgrade-validation-script.py
monrog2 Feb 27, 2026
e4defbf
print cleanup
monrog2 Feb 27, 2026
6a2fe7f
Merge branch 'master' of github.com:datacenter/ACI-Pre-Upgrade-Valida…
takishida Mar 7, 2026
20b5eee
Check for F0467 bgpProt-policy-already-existing
dhaselva Mar 26, 2026
4e4a120
Incorporated the review comments
dhaselva Mar 27, 2026
b2dfa64
Addressed review comments
dhaselva Mar 27, 2026
d849e8e
fix: Add cversion check for post_upgrade_cb_check (#377)
takishida Apr 3, 2026
67dab4b
added validation for CFD CSCwp64296 (#307)
psureshb Apr 3, 2026
6ccd6d9
Added pre-upgrade validation for N9K-C9408 with more than 6 N9K-X9400…
Harinadh-Saladi Apr 3, 2026
ea1e0f8
New Validation for APIC Storage Inode Usage (F4388, F4389, F4390 equi…
sanjanch Apr 3, 2026
dc09913
Add validation for multipod_modular_spine_bootscript_check - CSCwr668…
asraf-khan Apr 4, 2026
d118464
Update pytest.yml to run on vX.Y.Z branches
takishida Apr 4, 2026
07ea2db
adding of cli parameters for user and password (#335)
Thatleft Apr 4, 2026
5815d26
Added version check instead of generic check
dhaselva Apr 6, 2026
b8ca9d5
Modified the target version check
dhaselva Apr 7, 2026
fdd9a28
pulled dev branch
dhaselva Apr 9, 2026
ea73e9b
Removed whitepsaces
dhaselva Apr 9, 2026
d41ad25
Modified the content in validation file
dhaselva Apr 10, 2026
6f07e2c
Updated the pytest files
dhaselva May 5, 2026
217dbcb
Added validation for CSCwd40071 (#332)
sanjanch May 8, 2026
69d6a4b
Added validation for CSCws84232 (#334)
veenaskumar-cisco May 19, 2026
a6d65c2
Merge branch 'v4.1.0-dev' into dhaselva/F0467
dhaselva May 21, 2026
1c4a1b5
modified the logic suggested by Gabe
dhaselva May 21, 2026
f6ce159
Merge branch 'dhaselva/F0467' of github.com:dhaselva/ACI-Pre-Upgrade-…
dhaselva May 21, 2026
3aaf31b
Modified the bgp proto content
dhaselva May 21, 2026
18f619e
Removed unwanted whitespace
dhaselva May 21, 2026
403e840
Removed the duplicate entry from line line 6555
dhaselva May 21, 2026
8c5487a
merged the changes from v4.2.0-dev
dhaselva May 22, 2026
714e996
Added the tversion check
dhaselva May 25, 2026
4 changes: 2 additions & 2 deletions .github/workflows/pytest.yml
@@ -2,9 +2,9 @@ name: Pytest

on:
push:
branches: [master]
branches: [master, 'v[0-9].[0-9]+.[0-9]+*']
pull_request:
branches: [master]
branches: [master, 'v[0-9].[0-9]+.[0-9]+*']

permissions:
contents: read
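
The new branch filter uses GitHub Actions filter-pattern syntax, not a regular expression. As a rough illustration that is not part of this PR, the sketch below approximates the pattern with a Python regex. It assumes the documented glob semantics: `[0-9]` matches one digit, `+` repeats the preceding character one or more times, `.` is literal, and `*` matches any run of characters except `/`. The helper name `matches_release_branch` is hypothetical.

```python
import re

# Approximate regex for the Actions glob 'v[0-9].[0-9]+.[0-9]+*'.
# Assumption: '.' is literal and '*' does not cross '/' in branch filters.
RELEASE_BRANCH = re.compile(r"v[0-9]\.[0-9]+\.[0-9]+[^/]*")


def matches_release_branch(branch):
    """Return True if the whole branch name fits the approximated pattern."""
    return RELEASE_BRANCH.fullmatch(branch) is not None


if __name__ == "__main__":
    for name in ("master", "v4.1.0-dev", "v4.2.0-dev", "v10.0.0", "v4.1"):
        print(name, matches_release_branch(name))
```

Under these assumptions, `v4.1.0-dev` and `v4.2.0-dev` (both named in the commit list) match, while a two-digit major version such as `v10.0.0` would not, because the first `[0-9]` carries no `+`.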
1 change: 1 addition & 0 deletions .gitignore
@@ -50,6 +50,7 @@ coverage.xml
.hypothesis/
.pytest_cache/
cover/
preupgrade_validator*.tgz

# Translations
*.mo
312 changes: 298 additions & 14 deletions aci-preupgrade-validation-script.py

Large diffs are not rendered by default.

127 changes: 125 additions & 2 deletions docs/docs/validations.md
@@ -82,7 +82,7 @@ Items | Faults | This Script
[Fabric Port Status][f19] | F1394: ethpm-if-port-down-fabric | :white_check_mark: | :no_entry_sign:
[Equipment Disk Limits][f20] | F1820: 80% -minor<br>F1821: -major<br>F1822: -critical | :white_check_mark: | :no_entry_sign:
[VMM Inventory Partially Synced][f21] | F0132: comp-ctrlr-operational-issues | :white_check_mark: | :no_entry_sign:

[APIC Storage Inode Usage][f22] | F4388: 75% - 85% -warning<br>F4389: 85% - 90% -major<br>F4390: 90% or more -critical | :white_check_mark: | :no_entry_sign:

[f1]: #apic-disk-space-usage
[f2]: #standby-apic-disk-space-usage
@@ -105,6 +105,8 @@ Items | Faults | This Script
[f19]: #fabric-port-status
[f20]: #equipment-disk-limits
[f21]: #vmm-inventory-partially-synced
[f22]: #apic-storage-inode-usage


### Configuration Checks

@@ -194,6 +196,10 @@ Items | Defect | This Script
[ISIS DTEPs Byte Size][d27] | CSCwp15375 | :white_check_mark: | :no_entry_sign:
[Policydist configpushShardCont Crash][d28] | CSCwp95515 | :white_check_mark: | :no_entry_sign:
[Auto Firmware Update on Switch Discovery][d29] | CSCwe83941 | :white_check_mark: | :no_entry_sign:
[Rogue EP Exception List missing on switches][d30] | CSCwp64296 | :white_check_mark: | :no_entry_sign:
[N9K-C9408 with more than 5 N9K-X9400-16W LEMs][d31] | CSCws82819 | :white_check_mark: | :no_entry_sign:
[Multi-Pod Modular Spine Bootscript File][d32] | CSCwr66848 | :white_check_mark: | :no_entry_sign:
[BgpProto timer policy already existing][d33] | CSCwt78235 | :white_check_mark: | :no_entry_sign:

[d1]: #ep-announce-compatibility
[d2]: #eventmgr-db-size-defect-susceptibility
@@ -224,6 +230,10 @@ Items | Defect | This Script
[d27]: #isis-dteps-byte-size
[d28]: #policydist-configpushshardcont-crash
[d29]: #auto-firmware-update-on-switch-discovery
[d30]: #rogue-ep-exception-list-missing-on-switches
[d31]: #n9k-c9408-with-more-than-5-n9k-x9400-16w-lems
[d32]: #multi-pod-modular-spine-bootscript-file
[d33]: #bgpproto-timer-policy-already-existing

## General Check Details

@@ -1551,6 +1561,56 @@ EPGs using the `pre-provision` resolution immediacy do not rely on the VMM inven

This check returns a `MANUAL` result as there are many reasons for a partial inventory sync to be reported. The goal is to ensure that the VMM inventory sync has fully completed before triggering the APIC upgrade to reduce any chance for unexpected inventory changes to occur.


### APIC Storage Inode Usage

If a Cisco APIC is running low on inode capacity for any reason, the Cisco APIC upgrade can fail. The Cisco APIC will raise three different faults depending on inode utilization. If any of these faults are raised on the system, the issue should be resolved prior to performing the upgrade.

* **F4388**: A warning level fault for Cisco APIC storage inode utilization. This is raised when utilization is between 75% and 85%.

* **F4389**: A major level fault for Cisco APIC storage inode utilization. This is raised when utilization is between 85% and 90%.

* **F4390**: A critical level fault for Cisco APIC storage inode utilization. This is raised when utilization is 90% or more.

Even when a filesystem has adequate free storage space, it can still run short of inodes. This typically happens when a large number of small files or directories has been created.
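
To see how block usage and inode usage can diverge, the sketch below reads both counters with Python's `os.statvfs` on a generic POSIX host. It is an illustration only, not how the APIC or this script computes utilization. The function names are hypothetical, and the threshold mapping assumes that each range's lower bound is inclusive, which the fault descriptions above do not state.

```python
import os


def usage_percent(path):
    """Return (block_pct, inode_pct) for the filesystem holding `path`.

    Either value is None when the filesystem reports a zero total.
    """
    st = os.statvfs(path)
    block_pct = 100.0 * (st.f_blocks - st.f_bfree) / st.f_blocks if st.f_blocks else None
    inode_pct = 100.0 * (st.f_files - st.f_ffree) / st.f_files if st.f_files else None
    return block_pct, inode_pct


def inode_fault_code(inode_pct):
    """Map an inode utilization percentage to the fault codes listed above.

    Assumption: lower bounds are inclusive (75 -> F4388, 85 -> F4389, 90 -> F4390).
    """
    if inode_pct >= 90:
        return "F4390"
    if inode_pct >= 85:
        return "F4389"
    if inode_pct >= 75:
        return "F4388"
    return None


if __name__ == "__main__":
    blocks, inodes = usage_percent("/")
    print("blocks used %:", blocks, "inodes used %:", inodes)
```

A partition holding many tiny files can show a low block percentage next to a high inode percentage, which is the situation these faults report.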

Recommended Action:

To recover from this fault, try the following actions:

1. Free up space on the affected disk partition.
2. Cisco TAC may be required to analyze and clean up certain directories due to filesystem permissions. Cleanup of `/` is one such example.

!!! example "Fault Example (F4390: Critical fault for APIC Inode Utilization)"
```
moquery -c faultInst -f 'fault.Inst.code=="F4390"'
Total Objects shown: 1

# faultInst
ack : yes
alert : no
cause : equipment-full
changeSet : available (Old: 19408344, New: 19407972), inodesFree (Old: 263915, New: 263842), inodesUsed (Old: 2357525, New: 2357598),
used (Old: 19436092, New: 19436464)
code : F4390
created : 2024-08-05T05:42:31.975+02:00
delegated : no
descr : Storage unit /scratch-writes on node 3 with hostname apic3 mounted at /scratch-writes is 90% full for Inodes
dn : topology/pod-2/node-3/sys/ch/p-[/scratch-writes]-f-[/dev/mapper/atx-scratch]/fault-F4390
domain : infra
highestSeverity : critical
lastTransition : 2024-08-05T09:41:18.152+02:00
lc : raised
occur : 2
origSeverity : critical
prevSeverity : cleared
rule : eqpt-storage-inode-critical
severity : critical
subject : equipment-full
type : operational
```
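
The `changeSet` in a fault like the one above carries the raw inode counters. As a hedged sketch (the parser and its function names are hypothetical, not taken from the script), the code below extracts the `New` values and recomputes utilization as used / (used + free), rounded up. Rounding up is an assumption; it happens to reproduce the 90% reported in the `descr` above.

```python
import re

# Matches entries such as "inodesFree (Old: 263915, New: 263842)".
CHANGE_RE = re.compile(r"(\w+) \(Old: (\d+), New: (\d+)\)")

# changeSet from the fault example above, joined onto one line.
SAMPLE_CHANGE_SET = (
    "available (Old: 19408344, New: 19407972), "
    "inodesFree (Old: 263915, New: 263842), "
    "inodesUsed (Old: 2357525, New: 2357598), "
    "used (Old: 19436092, New: 19436464)"
)


def parse_change_set(change_set):
    """Return {counter_name: new_value} parsed from a fault changeSet string."""
    return {name: int(new) for name, _old, new in CHANGE_RE.findall(change_set)}


def inode_utilization(change_set):
    """Recompute inode utilization in percent, rounded up to an integer."""
    counters = parse_change_set(change_set)
    used = counters["inodesUsed"]
    total = used + counters["inodesFree"]
    return -(-100 * used // total)  # integer ceiling division


if __name__ == "__main__":
    print(inode_utilization(SAMPLE_CHANGE_SET))  # 90
```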


## Configuration Check Details

### VPC-paired Leaf switches
@@ -2648,6 +2708,7 @@ Due to [CSCwp95515][59], upgrading to an affected version while having any `conf

If any instances of `configpushShardCont` are flagged by this script, Cisco TAC must be contacted to identify and resolve the underlying issue before performing the upgrade.


### Auto Firmware Update on Switch Discovery

[Auto Firmware Update on Switch Discovery][63] automatically upgrades a new switch to the target firmware version before registering it to the ACI fabric. This feature activates in three scenarios:
@@ -2668,6 +2729,64 @@ To avoid this risk, consider disabling Auto Firmware Update before upgrading to
This issue occurs because older switch firmware versions are not compatible with switch images 6.0(3) or newer. The APIC version is not a factor.


### Rogue EP Exception List missing on switches

The Rogue/COOP Exception List feature, introduced in 5.2(3), allows exclusion of specific MAC addresses from Rogue Endpoint Control and COOP Dampening. Initially, each MAC address had to be configured individually in each bridge domain. In 6.0(3), this feature was enhanced to support fabric-wide exception lists with wildcard options per bridge domain and the ability to exclude MAC addresses in L3Outs.

However, due to [CSCwp64296][64], when upgrading spine switches to version 6.0(3)+ from an older version with Rogue/COOP Exception Lists configured, some exception lists may not be pushed to the spine switches. As a result, the feature may stop functioning after the upgrade.

The root cause is that internal objects called `presListener` for Rogue/COOP Exception List, which publish the configuration from APICs to switches, may be missing on the APICs after an upgrade.

Recommended action: Delete the affected exception list and create it again. If needed, contact Cisco TAC to help recover missing `presListener` objects on APICs.


### N9K-C9408 with more than 5 N9K-X9400-16W LEMs

Due to defect [CSCws82819][65], an N9K-C9408 switch with more than 5 N9K-X9400-16W LEMs installed will enter a boot loop caused by a `dt_helper` process crash if it is upgraded to versions 16.1(2f) to 16.1(5) or 16.2(1g).

To avoid this issue, upgrade to a fixed version or install no more than 5 N9K-X9400-16W LEMs in one chassis.


### Multi-Pod Modular Spine Bootscript File

Due to [CSCwr66848][66], in Multi-Pod environments, upgrading a modular spine to 6.1(4h) may cause inter-pod traffic to stop working if the `/bootflash/bootscript` file is missing on the spine prior to the upgrade. The traffic interruption occurs because the spine incorrectly identifies the reason for its reload, leading to an unnecessary attempt to load the missing bootscript file.

This issue happens only when the target version is specifically 6.1(4h).

To avoid this issue, either change the target version to another version or verify that the `bootscript` file exists in the bootflash of each modular spine switch prior to upgrading to 6.1(4h). If the file is missing, perform a clean reboot of the impacted spine so that `/bootflash/bootscript` is created again. If you have already upgraded your spine and are experiencing traffic impact due to this issue, a clean reboot of the spine will restore the traffic.


### BgpProto Timer Policy Already Existing

This check, related to [CSCwt78235][67], looks for `F0467` faults whose `changeSet` contains `bgpProt-policy-already-existing`.
The fault indicates conflicting BGP protocol timer policies among L3Outs deployed in the same VRF on the same node.

Resolve these faults before the upgrade by reviewing the BGP protocol timer policy of each affected L3Out.

Example:

    # fault.Delegate
    affected : resPolCont/rtdOutCont/rtdOutDef-[uni/tn-common/out-L3outY]/nwissues
    code : F0467
    ack : no
    cause : configuration-failed
    changeSet : configQual:bgpProt-policy-already-existing, configSt:failed-to-apply, temporaryError:no
    childAction :
    created : 2026-03-25T11:31:16.724+00:00
    descr : Fault delegate: Configuration failed for uni/tn-common/out-L3outY due to A specific leaf node can hold only a single bgpProtP config; this fault is raised when inconsistent configuration is detected, debug message:
    dn : uni/tn-common/out-L3outY/fd-[resPolCont/rtdOutCont/rtdOutDef-[uni/tn-common/out-L3outY]/nwissues]-fault-F0467
    domain : tenant
    highestSeverity : critical
    lc : raised
    occur : 1
    origSeverity : critical
    prevSeverity : critical
    rn : fd-[resPolCont/rtdOutCont/rtdOutDef-[uni/tn-common/out-L3outY]/nwissues]-fault-F0467
    rule : fv-nw-issues-config-failed
    severity : critical
    subject : management
    type : config
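
The matching rule described in this section can be sketched as follows. This is an illustration under stated assumptions, not the script's actual implementation: fault records are taken to be plain dicts of attributes, and `find_bgp_timer_conflicts` and the DN regex are hypothetical helpers.

```python
import re

# Tenant and L3Out names as they appear at the start of the fault dn,
# e.g. "uni/tn-common/out-L3outY/fd-[...]-fault-F0467".
L3OUT_DN_RE = re.compile(r"^uni/tn-(?P<tenant>[^/]+)/out-(?P<l3out>[^/]+)/")


def find_bgp_timer_conflicts(faults):
    """Return sorted (tenant, l3out) pairs that carry the timer policy conflict."""
    hits = set()
    for attrs in faults:
        if attrs.get("code") != "F0467":
            continue
        if "bgpProt-policy-already-existing" not in attrs.get("changeSet", ""):
            continue
        match = L3OUT_DN_RE.match(attrs.get("dn", ""))
        if match:
            hits.add((match.group("tenant"), match.group("l3out")))
    return sorted(hits)


if __name__ == "__main__":
    sample = [{
        "code": "F0467",
        "changeSet": "configQual:bgpProt-policy-already-existing, "
                     "configSt:failed-to-apply, temporaryError:no",
        "dn": "uni/tn-common/out-L3outY/fd-[resPolCont/rtdOutCont/"
              "rtdOutDef-[uni/tn-common/out-L3outY]/nwissues]-fault-F0467",
    }]
    print(find_bgp_timer_conflicts(sample))  # [('common', 'L3outY')]
```

Returning tenant and L3Out pairs instead of raw fault DNs keeps the output short when several faults point at the same L3Out.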

[0]: https://github.com/datacenter/ACI-Pre-Upgrade-Validation-Script
[1]: https://www.cisco.com/c/dam/en/us/td/docs/Website/datacenter/apicmatrix/index.html
[2]: https://www.cisco.com/c/en/us/support/switches/nexus-9000-series-switches/products-release-notes-list.html
@@ -2731,4 +2850,8 @@ To avoid this risk, consider disabling Auto Firmware Update before upgrading to
[60]: https://www.cisco.com/c/en/us/solutions/collateral/data-center-virtualization/application-centric-infrastructure/white-paper-c11-743951.html#Inter
[61]: https://www.cisco.com/c/en/us/solutions/collateral/data-center-virtualization/application-centric-infrastructure/white-paper-c11-743951.html#EnablePolicyCompression
[62]: https://bst.cloudapps.cisco.com/bugsearch/bug/CSCwe83941
[63]: https://www.cisco.com/c/en/us/td/docs/dcn/aci/apic/all/apic-installation-aci-upgrade-downgrade/Cisco-APIC-Installation-ACI-Upgrade-Downgrade-Guide/m-auto-firmware-update.html
[63]: https://www.cisco.com/c/en/us/td/docs/dcn/aci/apic/all/apic-installation-aci-upgrade-downgrade/Cisco-APIC-Installation-ACI-Upgrade-Downgrade-Guide/m-auto-firmware-update.html
[64]: https://bst.cloudapps.cisco.com/bugsearch/bug/CSCwp64296
[65]: https://bst.cloudapps.cisco.com/bugsearch/bug/CSCws82819
[66]: https://bst.cloudapps.cisco.com/bugsearch/bug/CSCwr66848
[67]: https://bst.cloudapps.cisco.com/bugsearch/bug/CSCwt78235
80 changes: 80 additions & 0 deletions tests/checks/apic_storage_inode_full_check/Fault_combination.json
@@ -0,0 +1,80 @@
[
{
"faultInst": {
"attributes": {
"ack": "no",
"alert": "no",
"cause": "equipment-full",
"changeSet": "available (Old: 37868344, New: 37859228), inodesFree (Old: 810163, New: 479339), inodesUsed (Old: 1811277, New: 2142101), inodesUtilized (Old: 70, New: 82), used (Old: 976092, New: 985208)",
"code": "F4388",
"created": "2026-03-06T11:58:43.579+00:00",
"delegated": "no",
"descr": "Storage unit /data/admin/bin/avread on Node 1 mounted at /data/admin/bin/avread is 82% full for Inodes",
"dn": "topology/pod-1/node-1/sys/ch/p-[/data/admin/bin/avread]-f-[overlayfs]/fault-F4388",
"domain": "infra",
"highestSeverity": "warning",
"lastTransition": "2026-03-06T11:58:43.579+00:00",
"lc": "raised",
"occur": "1",
"origSeverity": "warning",
"prevSeverity": "warning",
"rule": "eqpt-storage-inode-warning",
"severity": "warning",
"subject": "equipment-full",
"type": "operational"
}
}
},
{
"faultInst": {
"attributes": {
"ack": "no",
"alert": "no",
"cause": "equipment-full",
"changeSet": "available (Old: 37868344, New: 37859228), inodesFree (Old: 810163, New: 479339), inodesUsed (Old: 1811277, New: 2142101), inodesUtilized (Old: 70, New: 82), used (Old: 976092, New: 985208)",
"code": "F4388",
"created": "2026-03-06T11:58:43.587+00:00",
"delegated": "no",
"descr": "Storage unit /etc/hosts on Node 1 mounted at /etc/hosts is 82% full for Inodes",
"dn": "topology/pod-1/node-1/sys/ch/p-[/etc/hosts]-f-[overlayfs]/fault-F4388",
"domain": "infra",
"highestSeverity": "warning",
"lastTransition": "2026-03-06T11:58:43.587+00:00",
"lc": "soaking",
"occur": "1",
"origSeverity": "warning",
"prevSeverity": "warning",
"rule": "eqpt-storage-inode-warning",
"severity": "warning",
"subject": "equipment-full",
"type": "operational"
}
}
},
{
"faultInst": {
"attributes": {
"ack": "no",
"alert": "no",
"cause": "equipment-full",
"changeSet": "available (Old: 37868344, New: 37859228), inodesFree (Old: 810163, New: 479339), inodesUsed (Old: 1811277, New: 2142101), inodesUtilized (Old: 70, New: 82), used (Old: 976092, New: 985208)",
"code": "F4388",
"created": "2026-03-06T11:58:43.595+00:00",
"delegated": "no",
"descr": "Storage unit /scratch-writes on Node 1 mounted at /scratch-writes is 82% full for Inodes",
"dn": "topology/pod-1/node-1/sys/ch/p-[/scratch-writes]-f-[/dev/mapper/atx-scratch]/fault-F4388",
"domain": "infra",
"highestSeverity": "warning",
"lastTransition": "2026-03-06T11:58:43.595+00:00",
"lc": "raised-clearing",
"occur": "1",
"origSeverity": "warning",
"prevSeverity": "warning",
"rule": "eqpt-storage-inode-warning",
"severity": "warning",
"subject": "equipment-full",
"type": "operational"
}
}
}
]
@@ -0,0 +1,54 @@
[
{
"faultInst": {
"attributes": {
"ack": "no",
"alert": "no",
"cause": "equipment-full",
"changeSet": "available (Old: 37868344, New: 37859228), inodesFree (Old: 810163, New: 479339), inodesUsed (Old: 1811277, New: 2142101), inodesUtilized (Old: 70, New: 82), used (Old: 976092, New: 985208)",
"code": "F4388",
"created": "2026-03-06T11:58:43.579+00:00",
"delegated": "no",
"descr": "Storage unit /data/admin/bin/avread on Node 1 mounted at /data/admin/bin/avread is 82% full for Inodes",
"dn": "topology/pod-1/node-1/sys/ch/p-[/data/admin/bin/avread]-f-[overlayfs]/fault-F4388",
"domain": "infra",
"highestSeverity": "warning",
"lastTransition": "2026-03-06T11:58:43.579+00:00",
"lc": "cleared",
"occur": "1",
"origSeverity": "warning",
"prevSeverity": "warning",
"rule": "eqpt-storage-inode-warning",
"severity": "warning",
"subject": "equipment-full",
"type": "operational"
}
}
},
{
"faultInst": {
"attributes": {
"ack": "no",
"alert": "no",
"cause": "equipment-full",
"changeSet": "available (Old: 37868344, New: 37859228), inodesFree (Old: 810163, New: 479339), inodesUsed (Old: 1811277, New: 2142101), inodesUtilized (Old: 70, New: 82), used (Old: 976092, New: 985208)",
"code": "F4388",
"created": "2026-03-06T11:58:43.587+00:00",
"delegated": "no",
"descr": "Storage unit /etc/hosts on Node 1 mounted at /etc/hosts is 82% full for Inodes",
"dn": "topology/pod-1/node-1/sys/ch/p-[/etc/hosts]-f-[overlayfs]/fault-F4388",
"domain": "infra",
"highestSeverity": "warning",
"lastTransition": "2026-03-06T11:58:43.587+00:00",
"lc": "retaining",
"occur": "1",
"origSeverity": "warning",
"prevSeverity": "warning",
"rule": "eqpt-storage-inode-warning",
"severity": "warning",
"subject": "equipment-full",
"type": "operational"
}
}
}
]
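
The fixtures above differ mainly in the `lc` (lifecycle) attribute. A small sketch of how such a file could be summarized is shown below. The helper name and the set of lifecycle states treated as inactive are assumptions made for illustration and may not match what the check under test actually does.

```python
import json

# Assumption: these lifecycle states are treated as no longer active.
INACTIVE_LC = frozenset({"cleared", "retaining"})


def active_inode_faults(fixture_text, inactive=INACTIVE_LC):
    """Return (code, lc, dn) for each faultInst whose lc is not in `inactive`."""
    result = []
    for item in json.loads(fixture_text):
        attrs = item["faultInst"]["attributes"]
        if attrs["lc"] not in inactive:
            result.append((attrs["code"], attrs["lc"], attrs["dn"]))
    return result


if __name__ == "__main__":
    # Abbreviated records shaped like the fixtures above.
    sample = json.dumps([
        {"faultInst": {"attributes": {
            "code": "F4388", "lc": "raised",
            "dn": "topology/pod-1/node-1/sys/ch/p-[/scratch-writes]-f-[/dev/mapper/atx-scratch]/fault-F4388"}}},
        {"faultInst": {"attributes": {
            "code": "F4388", "lc": "cleared",
            "dn": "topology/pod-1/node-1/sys/ch/p-[/etc/hosts]-f-[overlayfs]/fault-F4388"}}},
    ])
    for code, lc, dn in active_inode_faults(sample):
        print(code, lc, dn)
```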