diff --git a/doc/smart-switch/pmon/enhance-dpu-robustness.md b/doc/smart-switch/pmon/enhance-dpu-robustness.md
new file mode 100644
index 00000000000..8a5d7b91707
--- /dev/null
+++ b/doc/smart-switch/pmon/enhance-dpu-robustness.md
@@ -0,0 +1,557 @@
+# Smart Switch: PMON: Enhance DPU Robustness #
+
+## Table of Contents ##
+
+- [Revision](#revision)
+- [Scope](#scope)
+- [Definitions/Abbreviations](#definitionsabbreviations)
+- [Overview](#overview)
+- [Terminology](#terminology)
+- [Critical Processes for DPU Management](#critical-processes-for-dpu-management)
+- [Timers and Thresholds](#timers-and-thresholds)
+- [DPU Status DB Info](#dpu-status-db-info)
+  - [Existing DB entries](#existing-db-entries)
+  - [New DB entries](#new-db-entries)
+- [DPU Recovery State Machine](#dpu-recovery-state-machine)
+- [Unplanned Failures](#unplanned-failures)
+  - [DPU Failure](#dpu-failure)
+  - [NPU Ungraceful Reboot](#npu-ungraceful-reboot)
+- [Planned Operations](#planned-operations)
+  - [DPU Graceful Shutdown](#dpu-graceful-shutdown)
+  - [DPU Cold Reboot](#dpu-cold-reboot)
+  - [Full SmartSwitch Reboot](#full-smartswitch-reboot)
+- [Scenario DB State Summary](#scenario-db-state-summary)
+- [CLI](#cli)
+- [Testing](#testing)
+- [Repository Change Summary](#repository-change-summary)
+- [References](#references)
+
+---
+
+## Revision ##
+
+|  Rev  |        Author       | Change Description                     |
+| :---: |  :----------------: | -------------------------------------- |
+|  0.1  |  Vasundhara Volam   | Initial Version                        |
+
+---
+
+## Scope ##
+
+This document covers the High Level Design for DPU failure scenarios on a SmartSwitch from the PMON (Platform Monitor) perspective — specifically focused on detection, DB state management, and recovery actions performed by `chassisd` and other PMON sub-daemons.
+
+The scope includes:
+
+- DPU failures (any event causing control plane or midplane to go down — process crashes, hardware faults, power loss, PCIe failures, kernel panics)
+- NPU ungraceful reboot (kernel panic, unknown reboot cause — triggers power-cycle of all DPUs)
+- DB state tracking for DPU failure detection and recovery (new and existing DB entries)
+- DB state tracking for planned operations
+- PMON critical process definitions and criticality levels
+- Timers and thresholds used by PMON for failure detection and recovery
+
+---
+
+## Definitions/Abbreviations ##
+
+| Term | Meaning                                                 |
+| ---- | ------------------------------------------------------- |
+| API  | Application Programming Interface                       |
+| ASIC | Application-Specific Integrated Circuit                 |
+| CLI  | Command-Line Interface                                  |
+| DB   | Redis Database                                          |
+| DPU  | Data Processing Unit                                    |
+| gNOI | gRPC Network Operations Interface                       |
+| gRPC | gRPC Remote Procedure Call                             |
+| NPU  | Network Processing Unit                                 |
+| PCIe | PCI Express (Peripheral Component Interconnect Express) |
+| PMON | Platform Monitor                                        |
+| RPC  | Remote Procedure Call                                   |
+| SAI  | Switch Abstraction Interface                            |
+
+---
+
+## Overview ##
+
+SmartSwitch consists of one NPU (switch ASIC) and multiple DPUs. All front panel ports are connected to the NPU. DPUs are connected to the NPU via PCIe and back-panel ports.
+
+The PMON (Platform Monitor) daemon on the NPU is responsible for monitoring DPU health and managing DPU lifecycle operations. Its primary sub-daemon, `chassisd`, continuously polls DPU states (midplane, control plane, data plane), detects failures, performs recovery actions (power-cycle, PCIe rescan), and updates database entries to reflect DPU readiness.
+
+This document enumerates all failure scenarios that can occur on a DPU or its supporting infrastructure from the PMON perspective, describes detection mechanisms driven by `chassisd`, recovery paths, and the corresponding database state changes. It also covers planned operations (graceful shutdown, cold reboot, full SmartSwitch reboot) and the DB state changes introduced to support them.
+
+---
+
+## Terminology ##
+
+| Term | Explanation |
+| ---- | ----------- |
+| chassisd | Chassis daemon running inside `pmon` on the NPU; monitors DPU health states, manages DPU power-cycle and reset operations |
+| pmon | Platform Monitor daemon on NPU; hosts `chassisd` and other hardware monitoring sub-daemons |
+| syncd | Sync daemon; manages SAI API calls to DPU ASIC |
+| control plane state | Indicates whether DPU SONiC is booted up, all containers are up, and DPU is ready to accept configuration. Derived from SYSTEM_READY in STATE_DB. Values: `"up"`, `"down"`. |
+| midplane link state | Indicates whether the PCIe link between the NPU and DPU is operational. Monitored and updated by NPU pmon `chassisd` via the `is_midplane_reachable` platform API. Values: `"up"`, `"down"`. |
+| dataplane state | Indicates whether configuration is downloaded, pipeline stages are up, and DPU hardware (port/ASIC) is ready to take traffic. Values: `"up"`, `"down"`. |
+
+---
+
+## Critical Processes for DPU Management ##
+
+The following processes are critical for SmartSwitch DPU lifecycle management. A failure in any of these impacts the ability to monitor, recover, or manage DPUs.
+
+**PMON-managed processes (on NPU):**
+
+| Process |  Role | Failure Impact |
+| ------- |  ---- | -------------- |
+| `chassisd` | Monitors DPU health (midplane, control plane, data plane); manages power-cycle, reset, and DB state updates | All DPU failure detection and recovery stops; no DB updates |
+| `pcied` | Monitors PCIe link state between NPU and DPUs; updates `PCIE_DETACH_INFO` in STATE_DB | PCIe failures go undetected; `PCIE_DETACH_INFO` not updated |
+
+**Other critical NPU processes:**
+
+| Process | Container | Role | Failure Impact |
+| ------- | --------- | ---- | -------------- |
+| `gnoi_reboot_daemon.py` | `gnmi` | Sends gNOI Reboot RPCs to DPUs for graceful shutdown / reboot | Graceful shutdown and planned reboot operations fail; DPU cannot be halted cleanly before power-cycle |
+| `sysmgr` | Host | Routes DPU planned shutdown and reboot requests to host services for execution | Planned DPU reset operations cannot be carried out |
+
+
+---
+
+## Timers and Thresholds ##
+
+All timers and thresholds used by PMON for DPU failure detection and recovery are listed below. Values shown are defaults; some are configurable via `platform.json`.
+
+| Timer / Threshold | Default Value | Configurable | Used By | Description |
+| ----------------- | :-----------: | :----------: | ------- | ----------- |
+| `chassisd` health poll interval | 10 seconds | No | `chassisd` | Interval at which `chassisd` polls `dpu_control_plane_state`, `dpu_data_plane_state`, and `dpu_midplane_link_state`. |
+| `pcied` PCIe poll interval | 60 seconds | No | `pcied` | Interval at which `pcied` checks PCIe link status for all DPUs. A PCIe failure may go undetected for up to 60 seconds. |
+| `dpu_halt_services_timeout` | 60 seconds | Yes (`platform.json`) | `gnoi_reboot_daemon.py` | Maximum time to wait for DPU services to halt gracefully during reboot/shutdown |
+| `dpu_boot_timeout` | 600 seconds | Yes (`platform.json`) | `chassisd` | Maximum time to wait for a DPU to reach `Ready` state (all planes up) after a power-cycle. |
+| `dpu_self_recovery_timeout` | 300 seconds | Yes (`platform.json`) | `chassisd` | Maximum time to wait for a DPU to self-recover before `chassisd` initiates a power-cycle. This grace period allows the DPU to recover from transient failures without external intervention. |
+| `dpu_reset_limit` | 2 | Yes (`platform.json`) | `chassisd` | Maximum number of consecutive unplanned power-cycle attempts before marking DPU as unrecoverable |
+
+> **Note:** `chassisd` polls `dpu_data_plane_state` alongside `dpu_control_plane_state` and `dpu_midplane_link_state`, but `dpu_data_plane_state` alone does not trigger recovery actions. A data-plane-down with control-plane-up scenario indicates that the DPU SONiC stack is running but the data plane pipeline has not converged — this is expected during initial programming or after a configuration change. Recovery is triggered only when `dpu_control_plane_state` or `dpu_midplane_link_state` transitions to `down`. The `dpu_data_plane_state` is used by `chassisd` solely to determine full DPU readiness for setting `ready_status` to `true`.
+
+> **Auto-recovery trigger vs. planned operations:** `chassisd` auto-recovery is triggered **only** for unplanned failures. During planned operations (graceful shutdown via `config chassis module shutdown` or DPU reboot via `reboot -d`), the `state_transition_in_progress` field in `CHASSIS_MODULE_TABLE|DPU<dpu_index>` is set to `True` **before** the DPU control plane goes down. When `chassisd` observes `dpu_control_plane_state: down`, it checks `state_transition_in_progress`: if `True`, `chassisd` skips auto-recovery because the shutdown/reboot is intentional. Auto-recovery is only initiated when `dpu_control_plane_state` or `dpu_midplane_link_state` transitions to `down` **and** no planned transition is in progress (`state_transition_in_progress == False`). There is no additional timeout configured for this check — the distinction is purely flag-based.
+
+---
+
+## DPU Status DB Info ##
+
+### Existing DB entries ###
+
+The following DB entries track the DPU lifecycle state and are referenced during failure detection and recovery.
+
+**DPU State in CHASSIS_STATE_DB:**
+
+```
+DPU_STATE|DPU<dpu_index>:
+{
+  "dpu_control_plane_state": "up" | "down",
+  "dpu_control_plane_time":  "<UTC timestamp>",
+  "dpu_data_plane_state":    "up" | "down",
+  "dpu_data_plane_time":     "<UTC timestamp>",
+  "dpu_midplane_link_state": "up" | "down",
+  "dpu_midplane_link_time":  "<UTC timestamp>"
+}
+```
+
+**PCIe Detach Info in STATE_DB:**
+
+```
+PCIE_DETACH_INFO|DPU<dpu_index>:
+{
+  "dpu_id":    "<index>",
+  "dpu_state": "detaching" | "detached" | "reattached",
+  "bus_info":  "[DDDD:]BB:SS.F"
+}
+```
+
+**Graceful Shutdown / Reboot Tracking in STATE_DB:**
+
+```
+CHASSIS_MODULE_TABLE|DPU<dpu_index>:
+{
+  "oper_status":                  "Online" | "Offline",
+  "state_transition_in_progress": "True" | "False",
+  "transition_start_time":        "<UTC timestamp>",
+  "transition_type":              "shutdown" | "reboot" | "none"
+}
+```
+
+> **Note:** The `state_transition_in_progress`, `transition_start_time`, and `transition_type` fields are managed by the graceful-shutdown implementation in [sonic-gnmi](https://github.com/sonic-net/sonic-gnmi) and [sonic-utilities](https://github.com/sonic-net/sonic-utilities). These fields are not managed by sonic-platform-daemons.
+
+### New DB entries ###
+
+The following DB entries will now be newly created to track DPU failure states.
+
+**DPU additional Info in CHASSIS_STATE_DB on NPU**
+
+```
+DPU_STATE|DPU<dpu_index>:
+{
+  "ready_status":                      "true" | "false",
+  "recovery_status":                   "recoverable" | "unrecoverable",
+  "reset_count":                       "<integer>",
+  "last_down_time":                    "<UTC timestamp>",
+  "last_ready_time":                   "<UTC timestamp>"
+}
+```
+
+| Field | Description | Set by | Cleared by |
+| ----- | ----------- | ------ | ---------- |
+| `ready_status` | Set to `"true"` when the DPU is fully up and ready (midplane, control plane, data plane all up). Set to `"false"` when the DPU goes down or undergoes a reset. | `chassisd` | `chassisd` (set to `"false"` on failure/reset) |
+| `recovery_status` | Set to `"recoverable"` on initialization. Set to `"unrecoverable"` when `reset_count` reaches `dpu_reset_limit`. Reset back to `"recoverable"` (and `reset_count` to 0) on: (1) `chassisd` restart (pmon crash / NPU reboot), or (2) operator-initiated `config chassis module startup DPU<x>` on an unrecoverable DPU. | `chassisd` | `chassisd` (reset to `"recoverable"` on chassisd restart or operator module startup) |
+| `reset_count` | Number of unplanned DPU resets. Reset to 0 on `chassisd` reset on NPU (e.g., NPU reboot, `pmon` restart). | `chassisd` | `chassisd` |
+| `last_down_time` | UTC timestamp of the last time the DPU went down | `chassisd` | — |
+| `last_ready_time` | UTC timestamp of the last time the DPU became ready | `chassisd` | — |
+
+**DPU Auto-Recovery Feature in CONFIG_DB on NPU**
+
+```
+FEATURE|dpu-auto-recovery:
+{
+  "state": "enabled" | "disabled" | "always_disabled"
+}
+```
+
+| Field | Default | Description |
+| ----- | ------- | ----------- |
+| `state` | `enabled` | Enable or disable the DPU auto-recovery feature. When `disabled` or `always_disabled`, `chassisd` will not automatically power-cycle DPUs on failure. |
+
+> **Note:** `dpu-auto-recovery` is **not** a separate service or container. It is a feature flag entry in CONFIG_DB's `FEATURE` table, read by `chassisd` (running inside the `pmon` container) to determine whether automatic DPU power-cycle recovery is enabled. When `state` is `disabled`, `chassisd` still monitors and updates DPU states in CHASSIS_STATE_DB, but will not initiate automatic power-cycle recovery. Manual intervention is required to recover failed DPUs.
+
+---
+
+## DPU Recovery State Machine ##
+
+The following diagram shows the state transitions managed by `chassisd` for a single DPU. Each box represents a `chassisd`-observed DPU state; edges show the triggers and actions.
+
+```mermaid
+stateDiagram-v2
+    [*] --> Booting : DPU power on
+
+    Booting --> Ready : All states up
+    Booting --> PowerCycle : boot timeout expired [auto-recovery enabled]
+    Booting --> ManualIntervention : boot timeout expired [auto-recovery disabled]
+
+    Ready --> WaitForSelfRecovery : Control plane OR midplane down [auto-recovery enabled]
+    Ready --> ManualIntervention : Control plane OR midplane down [auto-recovery disabled]
+
+    WaitForSelfRecovery --> Booting : DPU self-recovered (midplane up OR control plane up)
+    WaitForSelfRecovery --> PowerCycle : self-recovery timeout expired
+
+    PowerCycle --> Booting : Power cycle issued
+    PowerCycle --> Unrecoverable : reset count >= dpu_reset_limit
+
+    ManualIntervention --> Booting : Operator power-cycle / module startup
+
+    Booting --> Offline : CLI module shutdown during boot
+    Ready --> PlannedShutdown : CLI module shutdown
+    Ready --> PlannedReboot : CLI reboot DPU
+
+    PlannedShutdown --> Offline : gNOI HALT then power down
+    PlannedReboot --> Booting : gNOI HALT then power cycle
+
+    Offline --> Booting : CLI module startup
+
+    Unrecoverable --> Booting : Operator module startup or chassisd restart
+```
+
+| State | `ready_status` | `recovery_status` | Key DB Indicators |
+| ----- | :------------: | :----------------: | ----------------- |
+| **Booting** | `false` | `recoverable` | `dpu_boot_timeout` timer running AND NOT (`dpu_midplane_link_state: up` AND `dpu_control_plane_state: up`) |
+| **Ready** | `true` | `recoverable` | `dpu_midplane_link_state: up`, `dpu_control_plane_state: up`, `dpu_data_plane_state: up` |
+| **WaitForSelfRecovery** | `false` | `recoverable` | `dpu_control_plane_state: down` OR `dpu_midplane_link_state: down`; `dpu_self_recovery_timeout` timer running |
+| **PowerCycle** | `false` | `recoverable` | `dpu_midplane_link_state: down`, `dpu_control_plane_state: down`; `reset_count` incremented; power-cycle in progress |
+| **ManualIntervention** | `false` | `recoverable` | `dpu_control_plane_state: down` OR `dpu_midplane_link_state: down`; `FEATURE|dpu-auto-recovery` `state`: `disabled` / `always_disabled` |
+| **Offline** | `false` | `recoverable` | `oper_status: Offline`; DPU admin-down via `config chassis module shutdown` |
+| **Unrecoverable** | `false` | `unrecoverable` | `reset_count` ≥ `dpu_reset_limit`; `dpu_control_plane_state: down`; `dpu_midplane_link_state: down` |
+
+---
+
+## Unplanned Failures ##
+
+### DPU Failure ###
+
+**Description:**
+Any unplanned event that causes `dpu_control_plane_state` or `dpu_midplane_link_state` to transition to `down`. This covers all DPU failure scenarios: process crashes on DPU, DPU hardware faults, power loss, PCIe failures, kernel panics, and memory exhaustion.
+
+**Detection (by PMON):**
+- `chassisd` on the NPU polls `dpu_control_plane_state`, `dpu_data_plane_state`, and `dpu_midplane_link_state` every 10 seconds.
+- A failure is detected when either `dpu_control_plane_state` or `dpu_midplane_link_state` transitions to `down`.
+
+**PMON Action:**
+1. `chassisd` sets `ready_status` to `false` and updates `last_down_time` for the affected DPU.
+2. `chassisd` enters the **WaitForSelfRecovery** state and starts the `dpu_self_recovery_timeout` timer (default 300 seconds). This grace period allows the DPU to recover from transient failures (e.g., process restart, HW watchdog-triggered reboot) without external intervention.
+3. `chassisd` continues polling the DPU state every 10 seconds during this period.
+4. **If the DPU self-recovers** (midplane comes back up OR control plane comes back up) within the `dpu_self_recovery_timeout`:
+   - `chassisd` transitions to the **Booting** state and starts the `dpu_boot_timeout` timer (default 600 seconds) to wait for full DPU readiness (all planes up).
+   - Once all states are up, `chassisd` sets `ready_status` to `true` and updates `last_ready_time`. `reset_count` is **not** incremented (the DPU recovered autonomously).
+   - If the DPU does not reach full readiness within `dpu_boot_timeout`, `chassisd` treats it as a boot failure and initiates a power-cycle (incrementing `reset_count`).
+5. **If both `dpu_control_plane_state` and `dpu_midplane_link_state` remain down** after `dpu_self_recovery_timeout` expires:
+   - `chassisd` transitions to **PowerCycle**, issues a power-cycle of the DPU, and increments `reset_count`.
+   - After the power-cycle, `chassisd` transitions to **Booting** and starts the `dpu_boot_timeout` timer.
+   - Once the DPU reaches full readiness, `chassisd` sets `ready_status` to `true` and updates `last_ready_time`.
+6. If `reset_count` reaches `dpu_reset_limit`, `chassisd` sets `recovery_status` to `"unrecoverable"` and stops further automatic power-cycle attempts.
+7. **When auto-recovery is disabled:** `chassisd` skips the self-recovery wait and power-cycle. The DPU remains in **ManualIntervention** with `ready_status: false`; operator must reset the DPU manually.
+
+**DB State Transition (DPU self-recovers within `dpu_self_recovery_timeout`):**
+
+| DB Field | Before | During WaitForSelfRecovery | After Recovery |
+| -------- | :----: | :------------------------: | :------------: |
+| `dpu_control_plane_state` | `up` | `down` | `up` |
+| `dpu_midplane_link_state` | `up` | `up` or `down` | `up` |
+| `ready_status` | `true` | `false` | `true` |
+| `last_down_time` | — | `<UTC timestamp>` | — |
+| `last_ready_time` | — | — | `<UTC timestamp>` |
+| `reset_count` | N | N | N (unchanged) |
+| `recovery_status` | `recoverable` | `recoverable` | `recoverable` |
+
+**DB State Transition (DPU does NOT self-recover — power-cycle issued):**
+
+| DB Field | Before | During WaitForSelfRecovery | After Power-Cycle Recovery |
+| -------- | :----: | :------------------------: | :------------------------: |
+| `dpu_control_plane_state` | `up` | `down` | `up` |
+| `dpu_midplane_link_state` | `up` | `down` | `up` |
+| `ready_status` | `true` | `false` | `true` |
+| `last_down_time` | — | `<UTC timestamp>` | — |
+| `last_ready_time` | — | — | `<UTC timestamp>` |
+| `reset_count` | N | N | N+1 |
+| `recovery_status` | `recoverable` | `recoverable` | `recoverable` (or `unrecoverable` if N+1 ≥ `dpu_reset_limit`) |
+
+> **Note:** `chassisd` polls `dpu_data_plane_state` alongside `dpu_control_plane_state` and `dpu_midplane_link_state`, but `dpu_data_plane_state` alone does not trigger recovery actions. The `dpu_data_plane_state` is used solely to determine full DPU readiness for setting `ready_status` to `true`.
+
+> **Note:** During **WaitForSelfRecovery**, if a planned operation (`config chassis module shutdown`) is requested, `chassisd` cancels the self-recovery timer, powers down the DPU, and transitions to **Offline**.
+
+---
+
+### NPU Ungraceful Reboot ###
+
+**Description:**
+The NPU reboots unexpectedly due to kernel panic, memory exhaustion, or other unplanned event. All DPUs on the switch are potentially impacted.
+
+**Detection (by PMON):**
+- On NPU recovery, `chassisd` reads the reboot cause from `/host/reboot-cause/reboot-cause.txt`.
+- If the reboot cause is `Kernel Panic` or `Unknown`, `chassisd` treats all DPU states as potentially stale and triggers recovery for all DPUs.
+
+**PMON Action:**
+- On recovery, `chassisd` resets `reset_count` to 0 for all DPUs, sets `ready_status` to `false`, and updates `last_down_time` for all DPUs.
+- For every admin-up DPU, `chassisd` issues a power-cycle (`power_down()` → `pci_detach()` → `power_up()` → `pci_reattach()`) to guarantee a known-good starting state after the ungraceful reboot, and increments `reset_count`.
+- Admin-down DPUs (`oper_status: Offline`) are left powered off; `chassisd` does not reset them.
+- After each DPU comes back, `chassisd` verifies all DPU states (midplane, control plane, data plane) and, on success, sets `ready_status` back to `true` and updates `last_ready_time`.
+- **When auto-recovery is disabled:** `chassisd` skips the power-cycle for admin-up DPUs. Each DPU is left in its post-crash state with `ready_status: false` and remains in **ManualIntervention** awaiting operator action.
+
+**DB State Transition:**
+
+| DB Field | Before Crash | On NPU Recovery | After DPU Recovery |
+| -------- | :----------: | :-------------: | :----------------: |
+| `ready_status` (all DPUs) | `true` | `false` | `true` (per DPU) |
+| `last_down_time` (all DPUs) | — | `<UTC timestamp>` | — |
+| `last_ready_time` (all DPUs) | — | — | `<UTC timestamp>` (per DPU) |
+| `reset_count` (all admin-up DPUs) | N | 0 | 1 |
+
+> **Note:** `reset_count` is reset to 0 on `chassisd` startup. This means a DPU that was close to `dpu_reset_limit` gets a fresh retry budget after an NPU reboot. This is by design — persistent hardware issues will be caught again within `dpu_reset_limit` attempts.
+
+> **Note — Multiple DPU recovery:** When multiple (or all) DPUs need recovery simultaneously, `chassisd` issues power-cycles sequentially (one DPU at a time) to avoid power-rail overload and PCIe bus contention. The `dpu_boot_timeout` is tracked per-DPU independently. If a platform supports parallel DPU power-cycle (declared in `platform.json` via `parallel_dpu_recovery: true`), `chassisd` may issue power-cycles in parallel batches.
+
+---
+
+## Planned Operations ##
+
+### DPU Graceful Shutdown ###
+
+**Description:**
+Orderly shutdown of a DPU via CLI command: `config chassis module shutdown DPU<x>`.
+
+**PMON Sequence:**
+1. `chassisd` calls `set_admin_state(down)` → `module_base.py` triggers `graceful_shutdown_handler()`.
+2. `CHASSIS_MODULE_TABLE` in STATE_DB updated:
+   - `state_transition_in_progress`: `True`
+   - `transition_start_time`: `<UTC timestamp>`
+   - `transition_type`: `shutdown`
+3. `chassisd` updates CHASSIS_STATE_DB:
+   - `DPU_STATE|DPU<dpu_index>`: `ready_status`: `false`, `last_down_time`: `<UTC timestamp>`
+4. `gnoi_reboot_daemon.py` detects the transition and sends gNOI Reboot RPC (Method: `HALT`) to DPU.
+5. DPU gracefully shuts down all services via `reboot -p`.
+6. NPU polls `gnoi_client -rpc RebootStatus` until `active=false` (services terminated).
+7. `state_transition_in_progress` set to `False`.
+8. `module_base.py` calls platform API `power_down()` to power off DPU.
+9. PCIe detach: platform vendor API `pci_detach()`.
+10. Sensor ignore configs added, sensord restarted.
+
+**DB State Transition:**
+
+| DB Field | Before | After Shutdown |
+| -------- | :----: | :------------: |
+| `ready_status` | `true` | `false` |
+| `last_down_time` | — | `<UTC timestamp>` |
+| `oper_status` | `Online` | `Offline` |
+| `state_transition_in_progress` | `False` | `True` → `False` |
+
+**Race Condition Handling:**
+- If module shutdown is requested during a DPU reboot: operation fails; retry after reboot completes.
+- If module shutdown is requested while DPU is in **Booting** state (e.g., during initial boot or after a power-cycle): `chassisd` cancels the `dpu_boot_timeout` timer, skips further recovery, powers down the DPU, and transitions directly to **Offline**.
+- If switch reboot is requested during module shutdown: graceful shutdown completes; switch reboot proceeds.
+- Concurrent startup/shutdown on the same module: fails; user retries later.
+- If `config chassis module shutdown` is issued while `chassisd` is in the middle of an auto-recovery power-cycle for the same DPU: `chassisd` detects the admin-down request, aborts the auto-recovery loop, and proceeds with the graceful shutdown sequence.
+- If `pcied` detects a PCIe failure and updates `PCIE_DETACH_INFO` at the same time `chassisd` initiates a power-cycle due to midplane loss: `chassisd` holds a per-DPU lock during the power-cycle sequence. `pcied` updates `PCIE_DETACH_INFO` independently (no lock contention). `chassisd` reads `PCIE_DETACH_INFO` during its power-cycle flow and performs PCIe rescan if `dpu_state` is `detached`. No conflicting actions occur because `pcied` is read-only from `chassisd`'s perspective — it only updates state, while `chassisd` acts on it.
+
+---
+
+### DPU Cold Reboot ###
+
+**Description:**
+Reboot a DPU with full power-cycle via CLI: `reboot -d <DPU_ID>`.
+
+**PMON Sequence:**
+1. NPU sends gNOI Reboot RPC (Method: `HALT`) to DPU.
+2. NPU polls gNOI `RebootStatus` until `active=false` and `Status=STATUS_SUCCESS`.
+3. Timeout: `dpu_halt_services_timeout` (Read from `platform.json`, default 60 seconds).
+4. PCIe detach: platform vendor API `pci_detach()`.
+5. Platform vendor reboot API invoked (DPU cold boot / power-cycle).
+6. PCIe reattach: platform vendor API `pci_reattach()`.
+7. DPU boots, services start, reports `dpu_control_plane_state=up`.
+8. `chassisd` verifies all DPU states and sets `ready_status` to `true`.
+
+> **Note:** `dpu_boot_timeout` applies to the Booting phase after a planned reboot as well. If the DPU fails to reach `Ready` within the timeout (e.g., broken image installed during upgrade), `chassisd` treats it as a boot failure and initiates another power-cycle, incrementing `reset_count`. The planned reboot's `state_transition_in_progress` flag is already cleared by step 5, so it does not suppress the boot-timeout recovery.
+
+**DB State Transition:**
+
+| DB Field | Before | During Reboot | After Recovery |
+| -------- | :----: | :-----------: | :------------: |
+| `ready_status` | `true` | `false` | `true` |
+| `last_down_time` | — | `<UTC timestamp>` | — |
+| `last_ready_time` | — | — | `<UTC timestamp>` |
+| `PCIE_DETACH_INFO` `dpu_state` | `reattached` | `detaching` → `detached` | `reattached` |
+
+**Error handling:**
+- If gNOI service is unreachable: detach PCIe and proceed after timeout.
+- If PCIe reattach fails: error handling + restoration mechanism triggered.
+- If DPU stuck: hardware watchdog triggers reset (vendor-specific).
+
+---
+
+### Full SmartSwitch Reboot ###
+
+**Description:**
+Planned reboot of the entire SmartSwitch (NPU + all DPUs) via CLI: `reboot`. All DPUs are gracefully shut down in parallel before the NPU reboots.
+
+**PMON Sequence:**
+1. NPU sends gNOI Reboot RPC (Method: `HALT`) to **all** DPUs in parallel (multiple threads).
+2. NPU polls gNOI `RebootStatus` for each DPU until `active=false` and `Status=STATUS_SUCCESS`.
+3. Timeout per DPU: `dpu_halt_services_timeout` (default from `platform.json`, typically 60 seconds).
+4. For each DPU: PCIe detach via platform vendor API `pci_detach()`.
+5. NPU proceeds with its own reboot sequence.
+6. On NPU boot, PCIe enumeration discovers all DPUs.
+7. `chassisd` power-cycles each DPU and performs PCIe reattach.
+8. Each DPU boots: midplane attach → SONiC boot → container startup → reports `dpu_control_plane_state=up`.
+
+> **Note — Multiple DPU recovery:** When multiple (or all) DPUs need recovery simultaneously, `chassisd` issues power-cycles sequentially (one DPU at a time) to avoid power-rail overload and PCIe bus contention. The `dpu_boot_timeout` is tracked per-DPU independently. If a platform supports parallel DPU power-cycle (declared in `platform.json` via `parallel_dpu_recovery: true`), `chassisd` may issue power-cycles in parallel batches.
+
+**DB State Transition:**
+
+| DB Field | Before | During Reboot | After Recovery |
+| -------- | :----: | :-----------: | :------------: |
+| `ready_status` (all DPUs) | `true` | `false` | `true` (per DPU) |
+| `last_down_time` (all DPUs) | — | `<UTC timestamp>` | — |
+| `last_ready_time` (all DPUs) | — | — | `<UTC timestamp>` (per DPU) |
+| `PCIE_DETACH_INFO` `dpu_state` (per DPU) | `reattached` | `detaching` → `detached` | `reattached` |
+
+**Error handling:**
+- If a DPU does not respond to gNOI Reboot RPC within the timeout: NPU proceeds with PCIe detach and continues the reboot. The unresponsive DPU is cold-booted on NPU recovery.
+- If a DPU fails to come back after the full switch reboot: `chassisd` retries power-cycle up to `dpu_reset_limit` (tracked via `reset_count`). If still unresponsive, `chassisd` sets `recovery_status` to `"unrecoverable"`.
+- If the NPU reboot is initiated while a DPU graceful shutdown is in progress: the graceful shutdown completes first, then the NPU reboot proceeds.
+
+---
+
+## Scenario DB State Summary ##
+
+| DPU Scenario | `dpu_control_plane_state` | `dpu_midplane_link_state` | `ready_status` | PMON Action |
+| ------------ | :-----------------------: | :-----------------------: | :-----------: | ----------- |
+| DPU booting – initial state | down | down → up | false | `chassisd` polls; midplane comes up first, then waits for control plane and data plane within `dpu_boot_timeout` |
+| DPU healthy and running | up | up | true | Set `ready_status=true` after verifying all states |
+| DPU failure (control plane or midplane down) | down | up or down | false | Enter **WaitForSelfRecovery**; wait `dpu_self_recovery_timeout` (300s) for self-recovery |
+| DPU self-recovers within timeout | down → up | down → up | false → true | Transition to **Booting**; wait `dpu_boot_timeout` for full readiness; `reset_count` unchanged |
+| DPU does NOT self-recover (timeout expires) | down | down | false | Power-cycle DPU; increment `reset_count`; wait `dpu_boot_timeout` |
+| DPU up after power-cycle | up | up | true | Set `ready_status=true` after verifying all states |
+| DPU dead – unrecoverable | down | down | false | `reset_count` reached `dpu_reset_limit`; `recovery_status` set to `"unrecoverable"`; raise alert |
+| NPU ungraceful reboot (Kernel Panic / Unknown) | down → up | down → up | false → true | Power-cycle all admin-up DPUs; increment `reset_count`; wait `dpu_boot_timeout` |
+| Full SmartSwitch reboot (planned) | down → up | down → up | false → true | gNOI halt; power-cycle; re-verify |
+
+---
+
+## CLI ##
+
+The existing `show chassis modules status` command is extended to include a `Ready-Status` column:
+
+```
+admin@sonic:~$ show chassis modules status
+  Name             Description    Physical-Slot    Oper-Status    Admin-Status        Serial    Ready-Status
+------  ----------------------  ---------------  -------------  --------------  ------------  --------------
+  DPU0  <DPU Sku>              N/A         Online              up  <DPU serial #>            true
+  DPU1  <DPU Sku>              N/A         Online              up  <DPU serial #>            true
+  ...
+```
+
+A new `show chassis modules recovery` command exposes detailed recovery state:
+
+```
+admin@sonic:~$ show chassis modules recovery
+  Name    Ready-Status    Recovery-Status    Reset-Count    Last-Down-Time              Last-Ready-Time
+------  --------------  -----------------  -------------  --------------------------  --------------------------
+  DPU0            true    recoverable                  2    2026-05-28 10:15:30 UTC      2026-05-28 10:18:45 UTC
+  DPU1            true    recoverable                  0    —                           2026-05-28 09:00:12 UTC
+  DPU2           false    unrecoverable                2    2026-05-28 11:02:00 UTC      —
+  ..
+```
+
+| Column | Source DB Field | Description |
+| ------ | --------------- | ----------- |
+| `Ready-Status` | `CHASSIS_STATE_DB: DPU_STATE\|DPU<N>: ready_status` | Whether the DPU is fully up and serving traffic |
+| `Recovery-Status` | `CHASSIS_STATE_DB: DPU_STATE\|DPU<N>: recovery_status` | `recoverable` or `unrecoverable` |
+| `Reset-Count` | `CHASSIS_STATE_DB: DPU_STATE\|DPU<N>: reset_count` | Number of unplanned power-cycles since last `chassisd` restart |
+| `Last-Down-Time` | `CHASSIS_STATE_DB: DPU_STATE\|DPU<N>: last_down_time` | UTC timestamp of last DPU failure |
+| `Last-Ready-Time` | `CHASSIS_STATE_DB: DPU_STATE\|DPU<N>: last_ready_time` | UTC timestamp of last successful DPU recovery |
+
+---
+
+## Testing ##
+
+All DPU failure mode tests run with **auto-recovery enabled** by default (the production configuration). Specific tests explicitly disable and re-enable auto-recovery to validate the `ManualIntervention` path. The existing SmartSwitch test suites (e.g., `test_reload_dpu`) continue to run unmodified — auto-recovery does not interfere with planned reboot/shutdown tests because `chassisd` checks `state_transition_in_progress` before triggering recovery.
+
+Test implementation: [`tests/smartswitch/platform_tests/test_dpu_failure_modes.py`](https://github.com/sonic-net/sonic-mgmt/blob/master/tests/smartswitch/platform_tests/test_dpu_failure_modes.py)
+
+| Test Class | Scenario | Validates |
+| ---------- | -------- | --------- |
+| `TestDatabaseDpuCrash` | Kill per-DPU Redis instance (`databasedpuN`) on NPU | `chassisd` detects loss (`ready_status=false`), enters **WaitForSelfRecovery**; `systemd` restarts service; DPU self-recovers within timeout (`ready_status=true`) |
+| `TestPcieFailure` | Remove DPU PCIe device via sysfs | `pcied` detects detach (`PCIE_DETACH_INFO dpu_state=detached`), `chassisd` marks DPU not-ready, power-cycles, performs PCIe rescan, recovers |
+| `TestControlPlaneOnlyDown` | Stop critical container (`swss`) on DPU | `dpu_control_plane_state=down` while midplane stays up; `chassisd` enters **WaitForSelfRecovery**, waits `dpu_self_recovery_timeout`, then power-cycles DPU |
+| `TestAutoRecoveryDisabled` | Disable `FEATURE\|dpu-auto-recovery`, trigger failure | Confirms `chassisd` does NOT power-cycle (ManualIntervention); re-enable and verify recovery |
+| `TestUnrecoverableState` | Repeatedly trigger failures until `reset_count` ≥ `dpu_reset_limit` | `recovery_status=unrecoverable`; `chassisd` stops retrying |
+| `TestStateMachineTransitions` | Planned shutdown → offline → startup → ready | `last_down_time` and `last_ready_time` updated correctly; `recovery_status` stays `recoverable` |
+| `TestShutdownDuringAutoRecovery` | Issue module shutdown while `chassisd` is mid-recovery | `chassisd` aborts auto-recovery, DPU transitions to Offline cleanly |
+| `TestDpuFailureAfterConfigReload` | Config reload on NPU, then trigger DPU failure | `chassisd` recovery works post-reload; `reset_count` increments |
+
+**Test infrastructure:**
+- Shared `ensure_all_dpus_ready` fixture (in `tests/smartswitch/conftest.py`) ensures all testable DPUs are admin-up, online, and DB-ready before each test, and recovers any offline DPUs in teardown.
+- Tests use `assert_dpu_db_state_ready()` helper to verify full DPU readiness (`ready_status=true`, `recovery_status=recoverable`, all planes up).
+- Topology: `smartswitch` — requires NPU DUT with DPU SSH access via `dpuhosts`.
+
+---
+
+## Repository Change Summary ##
+
+| Repository | Component | Changes |
+| ---------- | --------- | ------- |
+| [sonic-platform-daemons](https://github.com/sonic-net/sonic-platform-daemons) | `chassisd` | DPU failure detection, automated power-cycle recovery, new CHASSIS_STATE_DB fields (`ready_status`, `recovery_status`, `reset_count`, `last_down_time`, `last_ready_time`) |
+| [sonic-buildimage](https://github.com/sonic-net/sonic-buildimage) | PMON container | Configuration updates for new `chassisd` failure recovery features |
+| [sonic-mgmt](https://github.com/sonic-net/sonic-mgmt) | --- | DPU failure mode tests (`test_dpu_failure_modes.py`) |
+
+---
+
+## References ##
+
+- [Smart Switch PMON](../pmon/smartswitch-pmon.md)
+- [Smart Switch Graceful Shutdown](../graceful-shutdown/graceful-shutdown.md)
+- [Smart Switch Reboot HLD](../reboot/reboot-hld.md)
+- [Smart Switch Database Architecture](../smart-switch-database-architecture/smart-switch-database-design.md)
+- [Smart Switch IP Address Assignment](../ip-address-assigment/smart-switch-ip-address-assignment.md)
+- [Smart Switch DPU Upgrade HLD](../upgrade/dpu-upgrade-hld.md)