Commit 4e6af6f

Merge pull request #2294 from judyjoseph/pmon_bmc_updates
Updates to BMC platform management and monitoring spec
2 parents f56e497 + cef555d commit 4e6af6f

1 file changed

Lines changed: 68 additions & 35 deletions

doc/bmc/sonicBMC/pmon-bmc-design.md

@@ -17,6 +17,7 @@
1717
* [2.1.4 BMC-Switch Host Interaction](#214-bmc-switch-host-interaction)
1818
* [2.1.5 BMC leak_detection_and_thermal policy](#215-bmc-leak-detection-and-thermal-policy)
1919
* [2.1.6 BMC event logging](#216-bmc-event-logging)
20+
* [2.1.7 RTC clock in BMC](#217-rtc-clock-in-bmc)
2021
* [2.2 BMC Platform Management](#22-bmc-platform-management)
2122
* [2.2.1 BMC controller-bmcctld](#221-bmc-controller---bmcctld)
2223
* [2.2.1.1 bmcctld on bmc](#2211-bmcctld-on-bmc)
@@ -84,19 +85,17 @@ The SONiC in BMC interoperates with the SONiC in Switch-Host as in below diagram
8485

8586
## 2. Detailed Architecture and workflows
8687
### 2.1 BMC platform
87-
Update the <vendor>/<platform>/platform_env.conf with the following flags,
88+
Update the [vendor]/[platform]/platform_env.conf with the following flags,
8889
```
8990
switch_host=1
90-
liquid_cooled=true
9191
```
9292
```
9393
switch_bmc=1
94-
liquid_cooled=true
9594
```
9695

97-
* "liquid_cooled" flag is set to true on a liquid cooled switch OR hybrid cooled switch.
9896
* "switch_host" flag is set to 1 on the switch host, "switch_bmc" flag is set to 1 on the switch BMC.
99-
97+
* "liquid_cooled" flag is not set in this file, because the same BMC platform/SKU can be used for both air-cooled and liquid-cooled devices.
98+
Instead, a platform API will be introduced to report whether the device is liquid cooled or air cooled.
10099

101100
#### 2.1.1 BMC platform power up
102101
When device is powered ON, the BMC powers first, boots up the sonic BMC which starts the various containers
@@ -292,6 +291,16 @@ The Leak detection is applicable only to Liquid cooling platform. The action is
292291
The general syslogs will be placed in /var/log/syslog where /var/log directory will be mounted on **tmpfs**. Syslogs will be sent to remote server as well.
293292
The Leak, Switch-Host state and interactions, Rack-manager interactions will be persistently stored on disk/eMMC in "/host/bmc/event.log" with log rotation enabled.
294293

294+
#### 2.1.7 RTC Clock in BMC
295+
On most vendor platforms, the BMC RTC does not have a battery backup. As a result, the clock does not retain time across power cycles.
296+
297+
When the BMC powers on, the system time is initialized as follows:
298+
299+
1. Use the clock epoch file at "/usr/lib/clock-epoch" as the initial system time, if available. This is read by systemd during startup. (This file is updated regularly by a systemd timer service; this feature is planned for SONiC release 202605.)
300+
2. Later in boot, the chrony systemd service starts and synchronizes with remote NTP servers to obtain and maintain accurate time.
301+
302+
This sequence provides a reasonable initial timestamp at boot, followed by synchronization to an accurate time source via NTP.
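
The first step can be illustrated with a minimal sketch. It assumes systemd's documented behaviour of treating the modification time of the epoch file as a lower bound for the system clock; the function name `initial_time_floor` and its parameters are hypothetical and not part of this design.

```python
import os

CLOCK_EPOCH_PATH = "/usr/lib/clock-epoch"  # path named in this spec


def initial_time_floor(rtc_now, epoch_path=CLOCK_EPOCH_PATH):
    """Return the boot-time clock value in seconds since the Unix epoch.

    The RTC reading is raised to the epoch file's mtime when the RTC is
    behind it (assumption: this mirrors what systemd does at startup).
    """
    try:
        epoch_mtime = os.stat(epoch_path).st_mtime
    except OSError:
        # No epoch file available: keep whatever the RTC reports.
        return rtc_now
    return max(rtc_now, epoch_mtime)
```

On a BMC without a battery-backed RTC, `rtc_now` is typically far in the past after a power cycle, so the epoch file's timestamp wins until chrony corrects the clock.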
303+
295304

296305
### 2.2 BMC Platform Management
297306

@@ -313,8 +322,9 @@ The bmc controller daemon "bmcctld" is started first in BMC pmon container. It a
313322
Detailed workflow below
314323

315324
```
316-
Sleep for SWITCH_HOST_POWER_ON_DELAY (this is configurable value in config_db)
317-
This is to make sure the Rack Manager is up and Liquid flow rate is good.
325+
if the previous reboot was a Cold Boot (Full Power Cycle, reboot cause : REBOOT_CAUSE_POWER_LOSS)
326+
- Sleep for power_on_delay configured in CHASSIS_MODULE|SWITCH-HOST (a configurable value in config_db)
327+
- This is to make sure the Rack Manager is up and Liquid flow rate is good.
318328
319329
Check for any CRITICAL alert/leak in RACK_MANAGER_ALERT* tables or system SYSTEM_LEAK_STATUS table (device_leak_status == CRITICAL_SYSTEM_LEAK) in STATE_DB
320330
NO External/System LEAK present
@@ -377,7 +387,7 @@ On an Event
377387
```
378388
- use GNOI framework to issue remote SOFT shutdown. The gnmi and sysmgr docker needs to be running on Switch-Host
379389
REF: https://github.com/sonic-net/SONiC/blob/master/doc/mgmt/gnmi/gnoi_system_hld.md, https://github.com/sonic-net/SONiC/pull/1489
380-
- start a timer based on graceful_shutdown_timeout configured in SWITCH_HOST_SHUTDOWN_TIMEOUT|default table.
390+
- start a timer based on graceful_shutdown_timeout configured in CHASSIS_MODULE|SWITCH-HOST.
381391
- if GNOI request came back SUCCESS or No response for GNOI request + Timer expired
382392
- call platform API module->set_admin_state(DOWN) to power down the Switch-Host
383393
- update the HOST_STATE|switch-host with the device_power_state.
@@ -396,24 +406,22 @@ On an Event
396406
This section covers the various tables which this daemon creates/uses in Redis DB on BMC
397407

398408
```
399-
key = SWITCH_HOST_POWER_ON_DELAY |default ; Config DB on BMC
409+
key = CHASSIS_MODULE|SWITCH-HOST ; Config DB on BMC
400410
; field = value
401-
power_on_delay = integer ; Time in secs after power on the device, switch BMC can power on the Switch-Host. ( default = -1, Switch-Host remain powered off ).
402-
; If non-zero and BMC receives POWER ON from Rack manager before this timeout + there are no critical events, BMC will power on Switch-Host.
411+
admin_status = up | down ; default is down, keeps SWITCH-HOST powered down when device powers up.
412+
power_on_delay = integer ; Time in secs BMC waits before powering on Switch-Host when device is powered ON.
413+
; If BMC receives POWER ON from Rack manager before this timeout + there are no critical events, BMC will power on Switch-Host.
414+
graceful_shutdown_timeout = integer ; Time in secs BMC waits for graceful shutdown before forcing power-off (default = 120sec).
415+
; if set to 0, BMC will NOT do a graceful shutdown, instead will do POWER_OFF with platform API.
403416
404417
key = HOST_STATE|switch-host ; STATE_DB on BMC to store state of Switch-Host
405418
; field = value
406-
device_power_state = POWER_ON | POWER_OFF| GRACEFUL_SHUT | POWER_CYCLE ; What was the last action done on Switch-Host
407-
device_status = ONLINE | OFFLINE ; current oper status of device, can use the platform API module->get_oper_state()
419+
device_power_state = POWERED_ON | POWERED_OFF | GRACEFUL_SHUTDOWN | ; Represents the final and transitional power state of Switch-host
420+
POWER_CYCLE | POWERING_ON | POWERING_OFF |
421+
GRACEFUL_SHUTTING_DOWN | POWER_CYCLING
422+
device_status = ONLINE | OFFLINE ; current oper status of device, from platform API module->get_oper_status()
408423
last_change_timestamp = STR
409424
410-
411-
key = SWITCH_HOST_SHUTDOWN_TIMEOUT|default ; Config DB on BMC
412-
; field = value
413-
graceful_shutdown_timeout = integer ; Time in secs the BMC will wait after SHUTDOWN command sent to Switch-Host. ( default = 120 sec ).
414-
; if this timer expires, BMC will go ahead and direct POWER OFF switch-host with platform API
415-
; if shutdown_timeout is 0, BMC will NOT do a graceful shutdown, instead will do POWER_OFF with platform API
416-
417425
```
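
As an illustration of how the CHASSIS_MODULE|SWITCH-HOST fields could drive the shutdown path, here is a minimal sketch. The class `SwitchHostConfig`, the function `shutdown_steps` and the step names are hypothetical; the defaults are the ones stated in this document (admin_status down, power_on_delay 0, graceful_shutdown_timeout 120).

```python
from dataclasses import dataclass


@dataclass
class SwitchHostConfig:
    """Fields of CHASSIS_MODULE|SWITCH-HOST with the defaults from this spec."""
    admin_status: str = "down"
    power_on_delay: int = 0
    graceful_shutdown_timeout: int = 120

    @classmethod
    def from_fields(cls, fields):
        # Redis hash values arrive as strings, so convert the integers.
        return cls(
            admin_status=fields.get("admin_status", "down"),
            power_on_delay=int(fields.get("power_on_delay", 0)),
            graceful_shutdown_timeout=int(
                fields.get("graceful_shutdown_timeout", 120)),
        )


def shutdown_steps(cfg):
    """Ordered steps for powering down the Switch-Host (names illustrative)."""
    if cfg.graceful_shutdown_timeout == 0:
        # A timeout of 0 skips the graceful path entirely.
        return ["platform_power_off"]
    return [
        "gnoi_soft_shutdown",
        f"wait_for_success_or_{cfg.graceful_shutdown_timeout}s_timeout",
        "platform_power_off",
    ]
```

Keeping the decision in a pure function makes the timeout semantics easy to unit test without a Redis instance or platform hardware.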
418426

419427
#### 2.2.2 thermalctld
@@ -496,11 +504,11 @@ This base class is already defined in sonic-platform-common, additional new plat
496504
| Method | Present | Action |
497505
|---------|---------|----------|
498506
| get_name() | Y | Get leak sensor name |
499-
| is_leak() | Y | Is there a leak detected? **Applies debounce logic defined by <vendor>platform before reporting or clearing a leak** |
507+
| is_leak() | Y | Is there a leak detected? **Only stable leak conditions** are asserted or cleared. This could be done by debounce logic in <vendor>platform/firmware |
500508
| is_leak_sensor_ok() | New | Is the leak sensor OK or faulty ? |
501509
| get_leak_sensor_type() | New | What type of leak sensor is this rope, flex_pcb, spot etc |
502510
| get_leak_sensor_location() | New | Location of leak sensor |
503-
| get_leak_severity() | New | Get the severity based on the criticality of the zone or how severe the leak is for a sensor for eg: more liquid presence |
511+
| get_leak_severity() | New | Get the severity based on the criticality of the location/zone or on how severe the leak is for a sensor, e.g. more liquid present |
504512
| get_leak_profile() | New | Returns the leak sensor profile associated with this leak sensor type. there will be a profile created per leak sensor type rope, flex_pcb, spot etc |
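
The "only stable leak conditions" behaviour of is_leak() can be sketched as a simple sample-count debounce. This is one possible approach, not the vendor or firmware implementation; the class `DebouncedLeak` and the `stable_samples` threshold are hypothetical.

```python
class DebouncedLeak:
    """Report a leak state change only after N consecutive agreeing samples."""

    def __init__(self, stable_samples=3):
        self.stable_samples = stable_samples
        self._state = False   # last reported (stable) leak state
        self._count = 0       # consecutive samples that differ from _state

    def update(self, raw):
        """Feed one raw sensor reading and return the debounced leak state."""
        if raw == self._state:
            # Reading agrees with the reported state: discard any pending change.
            self._count = 0
        else:
            self._count += 1
            if self._count >= self.stable_samples:
                self._state = raw
                self._count = 0
        return self._state
```

The same counter handles both asserting and clearing a leak, so a single spurious reading in either direction is ignored.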
505513

506514
**Note**
@@ -611,7 +619,9 @@ This base class is already defined in sonic-platform-common.
611619
| Method | Present | Action |
612620
|---------|---------|----------|
613621
| get_all_modules() | Y | Fetch managed modules here, Switch-Host Module object |
614-
622+
| get_reboot_cause() | Y | Fetch the previous reboot cause; used to check whether it was a Cold Boot (Full Power Cycle, reboot cause: REBOOT_CAUSE_POWER_LOSS) |
623+
| is_bmc() | New | Reports whether the SONiC chassis instance is, or has, a BMC module |
624+
| is_liquid_cooled() | New | Is this chassis liquid/hybrid cooled ? |
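
A rough sketch of how a vendor chassis class might expose the methods in the table above. The class `ExampleBmcChassis`, the `cooling_strap` input and the helper `was_cold_boot` are hypothetical. The class deliberately does not derive from ChassisBase so that the sketch runs without sonic-platform-common installed, and the value of `REBOOT_CAUSE_POWER_LOSS` is assumed to mirror the ChassisBase constant.

```python
REBOOT_CAUSE_POWER_LOSS = "Power Loss"  # assumed to match the ChassisBase constant


class ExampleBmcChassis:
    """Stand-in for a vendor Chassis class running on the BMC."""

    def __init__(self, cooling_strap, reboot_cause):
        self._cooling_strap = cooling_strap  # hypothetical hardware ID value
        self._reboot_cause = reboot_cause    # (cause, description) tuple

    def is_bmc(self):
        # This instance runs on the BMC itself.
        return True

    def is_liquid_cooled(self):
        # Hybrid-cooled devices are reported as liquid cooled as well.
        return self._cooling_strap in ("liquid", "hybrid")

    def get_reboot_cause(self):
        return self._reboot_cause


def was_cold_boot(chassis):
    """True when the previous reboot was a full power cycle."""
    cause, _description = chassis.get_reboot_cause()
    return cause == REBOOT_CAUSE_POWER_LOSS
```

Reading the cooling type from hardware at run time, rather than from platform_env.conf, is what allows one BMC platform/SKU to serve both air-cooled and liquid-cooled devices.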
615625

616626
### 2.3 BMC CLI Commands
617627

@@ -626,19 +636,22 @@ CLI to enable user to graceful power on/off the Switch-Host, and to configure po
626636
Applicable to (LC, AC)
627637

628638
```
629-
config chassis modules startup <Switch-Host>
639+
config chassis modules startup SWITCH-HOST
630640
- This command is to POWER ON the Switch Host from BMC
641+
- Sets the "admin_status" attribute to up
631642
632-
config chassis modules shutdown <Switch-Host>
643+
config chassis modules shutdown SWITCH-HOST
633644
- This command is to graceful POWER OFF the Switch Host from BMC
645+
- Sets the "admin_status" attribute to down
646+
- Default admin_status of SWITCH-HOST is down which keeps SWITCH-HOST powered down when device powers up.
634647
635-
config chassis modules power-on-delay <Switch-Host> <seconds>
648+
config chassis modules power-on-delay SWITCH-HOST <seconds>
636649
- Configure the delay (in seconds) BMC waits after power-on before powering on the Switch-Host.
637-
- Default = -1, Switch-Host remain powered off. This default value is selected as -1 so that in SI phase Switch-Host needs to be powered on manually.
650+
- Default = 0 secs, which powers on the Switch-Host immediately if admin_status is up
638651
- If non-zero and BMC receives a POWER ON from Rack Manager before this timeout elapses (and no critical events exist),
639652
Switch-Host will be powered on immediately.
640653
641-
config chassis modules shutdown-timeout <Switch-Host> <seconds>
654+
config chassis modules shutdown-timeout SWITCH-HOST <seconds>
642655
- Configure the graceful-shutdown timeout (in seconds) BMC waits after sending a shutdown command
643656
to the Switch-Host before forcing a hard power-off via the platform API.
644657
- Default = 120sec.
@@ -650,13 +663,15 @@ config chassis modules shutdown-timeout <Switch-Host> <seconds>
650663
##### DB schema
651664

652665
```
653-
"CHASSIS_MODULE": {
666+
"CHASSIS_MODULE": { ; In CONFIG_DB
654667
"SWITCH-HOST": {
655-
"admin_status": "up",
656-
"power_on_delay": "300", ; Time in secs BMC waits before powering on Switch-Host (default = -1, Switch-Host remain powered off)
657-
"graceful_shutdown_timeout" : "120" ; Time in secs BMC waits for graceful shutdown before forcing power-off (default = 120sec)
668+
"admin_status": "up", ; admin_status up/down; default is down which keeps SWITCH-HOST powered down when device powers up.
669+
"power_on_delay": "300", ; Time in secs BMC waits before powering on Switch-Host when device is powered ON.
670+
; If BMC receives POWER ON from Rack manager before this timeout + there are no critical events, BMC will power on Switch-Host.
671+
"graceful_shutdown_timeout" : "120" ; Time in secs BMC waits for graceful shutdown before forcing power-off (default = 120sec).
672+
; if set to 0, BMC will NOT do a graceful shutdown, instead will do POWER_OFF with platform API.
658673
}
659-
}
674+
}
660675
```
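
The power-on side of this schema can be sketched as below. It is a simplified reading of the workflow in this document: the delay applies only after a cold boot, and admin_status down keeps the Switch-Host off. The function name `power_on_wait_seconds` is hypothetical, and the leak/alert checks and the early POWER ON from Rack Manager are intentionally left out.

```python
def power_on_wait_seconds(admin_status, power_on_delay, cold_boot):
    """Return None to keep the Switch-Host off, else seconds to wait first."""
    if admin_status != "up":
        # Default admin_status is down: Switch-Host stays powered off.
        return None
    if cold_boot and power_on_delay > 0:
        # Give the Rack Manager and liquid flow time to settle after a
        # full power cycle.
        return power_on_delay
    return 0
```

For example, with the values shown in the schema above (`admin_status` up, `power_on_delay` 300), the BMC would wait 300 seconds after a cold boot and not at all after a BMC-only reboot.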
661676

662677
* **config liquid-cool leak-control**
@@ -680,14 +695,14 @@ Applicable to (LC)
680695
config liquid-cool leak-action [system|rack_mgr] [critical|minor] [syslog_only|graceful_shutdown|power_off]
681696
682697
- syslog_only : Log the event; no Switch-Host power action taken.
683-
- graceful_shutdown: Issue a graceful GNOI shutdown to Switch-Host; force power-off after SWITCH_HOST_SHUTDOWN_TIMEOUT/graceful_shutdown_timeout if unresponsive.
698+
- graceful_shutdown: Issue a graceful GNOI shutdown to Switch-Host; force power-off after graceful_shutdown_timeout (CHASSIS_MODULE|SWITCH-HOST) if unresponsive.
684699
- power_off : Immediately power off Switch-Host via platform API module->set_admin_state(DOWN).
685700
```
686701

687702
##### DB schema
688703

689704
```
690-
"LEAK_CONTROL_POLICY": {
705+
"LEAK_CONTROL_POLICY": { ; In CONFIG_DB
691706
"system_leak_policy" : "enabled | disabled", ; enabled by default
692707
"system_critical_leak_action" : "power_off", ; default is power_off
693708
"system_minor_leak_action" : "syslog_only", ; default is syslog_only
@@ -741,6 +756,21 @@ show chassis module status
741756
742757
```
743758

759+
##### DB schema
760+
761+
```
762+
"CHASSIS_MODULE_TABLE": { ; In STATE_DB
763+
"SWITCH-HOST": {
764+
"name": "SWITCH-HOST",
764+
"desc": "Switch Host managed by BMC",
765+
"slot": "N/A",
766+
"serial": "[Serial-number]",
767+
"oper_status": "ONLINE",
768+
"admin_status": "up"
770+
}
771+
}
772+
```
773+
744774
* **show platform leak control-policy**
745775

746776
Command to show leak control policy configuration
@@ -883,4 +913,7 @@ In case of a firmware upgrade which needs reboot of both Switch-Host and BMC, wi
883913
2. Add support for more Rack manager commands via Redfish for reset_type like ForceRestart, GracefulRestart
884914
3. Add support for ipv6 address to Host-Bmc-Link
885915
4. Introduced the Hybrid cooling skus in this design document. Add more details on requirements and actions of various platform daemons in Switch-Host.
916+
5. Add more details on the actions (e.g. DC personnel checkup, RMA, etc.) in case a leak sensor is faulty.
917+
- Can we run the switch with one or more faulty sensors?
918+
- If there is a faulty sensor and its location is close to the CPU/ASIC, should the switch be powered down and RMAed?
886919
