Config/Show commands (newly added) for BMC, leak detection, policy#4417
Conversation
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
- show/platform.py: extract local variables to shorten lines >120 chars - tests/liquidcool_test.py: add noqa: E402 for post-sys.path import - tests/show_platform_leak_test.py: remove unused pytest import (F401), add noqa: E402 for post-sys.path import Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Signed-off-by: Judy Joseph <jujoseph@microsoft.com>
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
…NG_INFO Replace LIQUID_COOLING_DEVICE table with LIQUID_COOLING_INFO everywhere to use a single unified table for liquid cooling/leak sensor state. - show/platform.py: rename LIQUID_COOLING_DEVICE_TABLE constant to LIQUID_COOLING_INFO_TABLE and update leak_status() query - tests/show_platform_leak_test.py: update mock table keys - tests/leakage_status_test.py: update mock table keys Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Signed-off-by: Judy Joseph <jujoseph@microsoft.com>
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
There was a problem hiding this comment.
Pull request overview
Adds new SONiC CLI surfaces for liquid-cooling leak detection/policy controls and extends chassis module configuration for BMC-managed switch-host timing, aligning config/ and show/ command sets with the referenced SONiC changes.
Changes:
- Introduces
config liquidcool {leak-control, leak-action}to write leak policy/action settings toLEAK_CONTROL_POLICY. - Adds
show platform leak {control-policy, profiles, status, rack-manager alerts}to display leak-related config/state. - Extends
config chassis moduleswith BMC-onlypower-on-delayandshutdown-timeoutforSWITCH-HOST, plus unit tests.
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
show/platform.py |
Adds new show platform leak ... command group and DB access helpers. |
config/liquidcool.py |
New config liquidcool group for leak policy/action configuration. |
config/main.py |
Registers the new liquidcool config command group. |
config/chassis_modules.py |
Allows SWITCH-HOST module name and adds BMC-only timing commands. |
tests/show_platform_leak_test.py |
New unit tests for the show platform leak ... commands. |
tests/liquidcool_test.py |
New unit tests for config liquidcool ... commands. |
tests/leakage_status_test.py |
Updates leak status test to target the new show platform leak status. |
tests/chassis_modules_test.py |
Adds unit tests for power-on-delay / shutdown-timeout commands. |
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
…imeout field, power-on-delay default -1 - Rename DB field 'shutdown_timeout' to 'graceful_shutdown_timeout' to match CHASSIS_MODULE schema in SONiC#2215 HLD - Update power-on-delay default to -1 (Switch-Host remains powered off); allow -1 as a valid sentinel value while rejecting values < -1 - Update tests to match new field name and -1 sentinel behavior Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Signed-off-by: Judy Joseph <jujoseph@microsoft.com>
…unction - Change Click argument type from float to int for power-on-delay and shutdown-timeout - Rename set_shutdown_timeout -> set_graceful_shutdown_timeout - Update test assertions to use integer string values (not float) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Signed-off-by: Judy Joseph <jujoseph@microsoft.com>
… decide Switch-Host powering on behavior on first boot
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
yxieca
left a comment
There was a problem hiding this comment.
I feel that these issues need to be addressed:
- power_on_delay default set to -1 but CLI enforces non‑negative; make sure defaults and -- rxpected semantics match (doc says default 0 in help string).,
- config/chassis_modules.py still imports datetime, timezone but not used (minor).
… the code and used IntRange
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 12 out of 12 changed files in this pull request and generated 3 comments.
Comments suppressed due to low confidence (1)
config/chassis_modules.py:70
get_config_module_state()returnsfvs['admin_status']when aCHASSIS_MODULEentry exists. The new BMC timing commands (power-on-delay/shutdown-timeout) can create an entry that contains onlypower_on_delay/graceful_shutdown_timeoutand noadmin_status, which will raiseKeyErrorand break startup/shutdown flows. Please makeget_config_module_state()robust to missingadmin_status(e.g., use.get()with the same platform-specific defaults), and/or ensure the timing commands populateadmin_statuswhen creating a new entry.
def get_config_module_state(db, chassis_module_name):
config_db = db.cfgdb
fvs = config_db.get_entry('CHASSIS_MODULE', chassis_module_name)
if not fvs:
if is_smartswitch():
return 'down'
elif is_bmc() and chassis_module_name.startswith("SWITCH-HOST"):
# On BMC, SWITCH-HOST default is 'down' to keep it powered off on boot
return 'down'
else:
return 'up'
else:
return fvs['admin_status']
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 12 out of 12 changed files in this pull request and generated 1 comment.
Comments suppressed due to low confidence (1)
config/chassis_modules.py:69
get_config_module_state()assumes that any existingCHASSIS_MODULE|<name>entry always containsadmin_status(return fvs['admin_status']). With the newly added BMC timing commands (power-on-delay/shutdown-timeout) it’s possible to create/update an entry that only containspower_on_delayorgraceful_shutdown_timeout, which will make this line raiseKeyErrorand breakstartup/shutdownand any other caller. Consider usingfvs.get('admin_status', <platform-appropriate default>)here and/or ensuring the timing commands always set a defaultadmin_statuswhen missing (e.g.,downfor SWITCH-HOST on BMC).
return 'down'
else:
return 'up'
else:
return fvs['admin_status']
|
@benle7 @oleksandrivantsiv Could you reveiew this PR, just wanted to make sure it doesn't break any existing CLI for leak/liquid cooling |
taking this change in, will raise a follow up PR to do adjustments |
What I did
Introduce config/show CLI commands for BMC chassis module management and liquid cooling leak detection, based on sonic-net/SONiC#2215, sonic-net/SONiC#2294
CLI Commands Added / Changed
NEW —
config chassis modules power-on-delay SWITCH-HOSTmodule_nameSWITCH-HOSTseconds>= 0(default:0)0= power on immediately.Writes
power_on_delaytoCHASSIS_MODULE|<module_name>in CONFIG_DB.NEW —
config chassis modules shutdown-timeout SWITCH-HOSTmodule_nameSWITCH-HOSTseconds>= 0(default:120)0= immediate power-off, no graceful shutdown.Writes
graceful_shutdown_timeouttoCHASSIS_MODULE|<module_name>in CONFIG_DB.CHANGED —
config chassis modules startup SWITCH-HOSTOn BMC, uses
mod_entry(instead ofset_entry) to setadmin_status = upwhile preservingpower_on_delayandgraceful_shutdown_timeoutfields in the sameCHASSIS_MODULEentry.CHANGED —
config chassis modules shutdown SWITCH-HOSTOn BMC, uses
mod_entryto setadmin_status = downwhile preserving other fields.admin_status = downis the default forSWITCH-HOST(keeps it powered off at boot until explicitly started).NEW —
config liquid-cool leak-controlpolicy_typesystem,rack_mgrstateenabled,disabledWrites
system_leak_policyorrack_mgr_leak_policytoLEAK_CONTROL_POLICY|policyin CONFIG_DB.NEW —
config liquid-cool leak-actionpolicy_typesystem,rack_mgrseveritycritical,minoractionsyslog_only,graceful_shutdown,power_offWrites the corresponding action field to
LEAK_CONTROL_POLICY|policyin CONFIG_DB.NEW —
show platform leak statusDisplays leak sensor state from
LIQUID_COOLING_INFOin STATE_DB.NEW —
show platform leak control-policyDisplays leak control policy from
LEAK_CONTROL_POLICY|policyin CONFIG_DB.NEW —
show platform leak rack-manager alertsDisplays rack-manager leak alerts from STATE_DB.
NEW —
show platform leak profilesDisplays leak sensor profiles (sensor type → max minor duration) from CONFIG_DB.
CHANGED —
show chassis modules status [module_name]On BMC,
Oper-Statusis now sourced from the platform API (ModuleHelper.get_module_oper_status()) with fallback to STATE_DB. On non-BMC platforms, behaviour is unchanged.How I did it
config/chassis_modules.pyis_switch_bmcwithis_bmc()fromutilities_common.chassisat all call sitesdatetimeandtimezoneimportspower-on-delaysubcommand (BMC-only,IntRange(min=0), default0)shutdown-timeoutsubcommand (BMC-only,IntRange(min=0),0= immediate power-off, default120)startup/shutdownusesmod_entryto preserve sibling fieldsconfig/liquid_cool.py(new file)liquid-coolwithleak-controlandleak-actionsubcommandsLEAK_CONTROL_POLICY|policyin CONFIG_DBshow/chassis_modules.pyModuleHelper.get_module_oper_status()(platform API) per module; falls back to STATE_DB value ifN/Ais returnedshow/platform.pyleakcommand group undershow platformwith subcommands:status,control-policy,rack-manager alerts,profilesLEAK_CONTROL_POLICY_KEY='policy'utilities_common/chassis.pyis_bmc()helper wrappingdevice_info.is_switch_bmc()withhasattrguardutilities_common/module.pyget_module_oper_status(module_name)toModuleHelper; returnsN/Aon any failureHow to verify it
Tested the new CLIs
Previous command output (if the output of a command-line utility has changed)
New command output (if the output of a command-line utility has changed)
~