doc/xrcvd/cpo-dom-monitoring-hld.md
+48 -80 (48 additions & 80 deletions)
@@ -76,7 +76,7 @@ This design introduces:

1. A **new per-device DB key structure** for CPO transceiver tables, extending the existing key from `TRANSCEIVER_DOM_SENSOR|<ifname>` to `TRANSCEIVER_DOM_SENSOR|<ifname>|<device>`.
2. **New ELSFP-specific DB fields** within the existing `TRANSCEIVER_*` table family.
-3. **Multi-device result support in xcvrd's DB utility methods**, allowing `CpoApi` to return per-device data that the existing `DomInfoUpdateTask` publishes to compound DB keys without any task-level changes.
+3. **Multi-device result support in xcvrd's utility methods**, allowing `CpoApi` to return per-device data that xcvrd publishes to compound DB keys without any task-level changes to `DomInfoUpdateTask`, `SfpStateUpdateTask`, or `CmisManagerTask`.
4. **CLI updates** to display per-device DOM data for CPO interfaces.

---
@@ -99,68 +99,7 @@ This design introduces:
The following diagram shows the information flow between the CLI, Redis, xcvrd, and the CPO hardware devices:

-```
-                 Network Operator
-                        |
-                        | (show interfaces transceiver eeprom -d ...)
-                        v
-+-----------------------------------------------+
-|             sonic-utilities (CLI)             |
-+-----------------------------------------------+
-                        |
-                        | (reads TRANSCEIVER_* tables)
-                        v
-+-----------------------------------------------+
-|                Redis STATE_DB                 |
-|                                               |
-|  TRANSCEIVER_DOM_SENSOR|Ethernet0|OE1         |
-|  TRANSCEIVER_DOM_SENSOR|Ethernet0|ELS1        |
-|  TRANSCEIVER_DOM_FLAG|Ethernet0|OE1           |
-|  TRANSCEIVER_DOM_FLAG|Ethernet0|ELS1 ...      |
-+-----------------------------------------------+
-                        ^
-                        | (SET <ifname>|<device>, fvs)
-                        | - compound key per device
-+-----------------------------------------------+
-|            pmon container - xcvrd             |
-|                                               |
-|   DomInfoUpdateTask                           |
-|        |                                      |
-|        v                                      |
-|   +-------------------------------------+     |
-|   |  DB utility libraries               |     |
-|   |  (post_diagnostic_values_to_db,     |     |
-|   |   post_port_dom_flags_to_db, …)     |     |
-|   |                                     |     |
-|   |  Detects MultiDeviceResult and      |     |
-|   |  writes one compound DB key per     |     |
-|   |  scope: <ifname>|<device>           |     |
-|   +-------------------------------------+     |
-|        ^                                      |
-|        | MultiDeviceResult                    |
-|        | {"OE1": {...}, "ELS1": {...}}        |
-|        |                                      |
-|   +-------------------------------------+     |
-|   |  CpoApi                             |     |
-|   |                                     |     |
-|   |  get_transceiver_dom_real_value()   |     |
-|   |  get_transceiver_dom_flags()        |     |
-|   |  get_transceiver_threshold_info()   |     |
-|   |  get_transceiver_status() …         |     |
-|   |                                     |     |
-|   |  Aggregates OE + ELSFP data into    |     |
-|   |  a MultiDeviceResult keyed by       |     |
-|   |  device scope ID.                   |     |
-|   +-------------------------------------+     |
-+-----------------------------------------------+
-        |                          |
-        | (i2c reads)              | (i2c reads)
-        v                          v
-+----------------+        +----------------+
-| Optical Engine |        |     ELSFP      |
-|  (OE1, OE2,…)  |        | (ELS1, ELS2,…) |
-+----------------+        +----------------+
-```
+

---

@@ -178,19 +117,20 @@ The following diagram shows the information flow between the CLI, Redis, xcvrd,

##### 7.2.1 Composite Key Structure

-For CPO interfaces, all `TRANSCEIVER_*` STATE_DB tables use a compound key that includes both the interface name and the device name:
+For CPO interfaces, the following `TRANSCEIVER_*` STATE_DB tables use a compound key that includes both the interface name and the device name:

```
TRANSCEIVER_DOM_SENSOR|<ifname>|<device>
TRANSCEIVER_DOM_THRESHOLD|<ifname>|<device>
TRANSCEIVER_DOM_FLAG|<ifname>|<device>
TRANSCEIVER_STATUS|<ifname>|<device>
TRANSCEIVER_STATUS_FLAG|<ifname>|<device>
-TRANSCEIVER_INFO|<ifname>|<device>
TRANSCEIVER_FIRMWARE_INFO|<ifname>|<device>
TRANSCEIVER_VDM_*|<ifname>|<device>
```
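
As an illustration of the key layout listed above, the following minimal Python sketch builds and splits such keys. The helper names `make_compound_key` and `split_compound_key` are hypothetical and do not come from xcvrd; the sketch only assumes that `|` separates the interface name from the device name, as shown in the key list.

```python
# Hypothetical helpers for illustration only; they are not part of xcvrd.
KEY_SEPARATOR = "|"


def make_compound_key(ifname, device=None):
    """Return '<ifname>|<device>' for a CPO device, or the flat '<ifname>' key."""
    if device is None:
        return ifname
    return f"{ifname}{KEY_SEPARATOR}{device}"


def split_compound_key(key):
    """Split a key into (ifname, device); device is None for a flat key."""
    ifname, separator, device = key.partition(KEY_SEPARATOR)
    return ifname, (device if separator else None)


if __name__ == "__main__":
    print(make_compound_key("Ethernet0", "OE1"))
    print(split_compound_key("Ethernet0|ELS1"))
    print(split_compound_key("Ethernet0"))
```

A consumer that only understands flat keys can use the `None` device returned by `split_compound_key` to tell the two formats apart.
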

+**Exception - TRANSCEIVER_INFO:** The `TRANSCEIVER_INFO` table retains the original flat key format (`TRANSCEIVER_INFO|<ifname>`) on CPO platforms. Downstream consumers such as `orchagent` subscribe to this table and use the key to look up ports in their internal port registries (see `PortsOrch::doTransceiverPresenceCheck`). Compound keys would not match any known port and would cause silent failures. Instead, ELSFP-specific fields are merged into the existing flat entry with an `elsfp_` prefix (see Section 7.2.2).
+
The `<device>` component identifies the physical device. Device names use a type prefix and a **global** index (not per-interface). The naming scheme for devices is defined by the [CPO Port Mapping HLD](https://github.com/sonic-net/SONiC/pull/2211); that is, the network switch vendor decides the scheme.

| Device Type | Key Component | Example | Description |
-The following fields are added to `TRANSCEIVER_INFO` for ELSFP entries. Fields already present in `TRANSCEIVER_INFO` for conventional transceivers (`type`, `manufacturer`, `serial`, etc.) are retained without modification.
+Because `TRANSCEIVER_INFO` uses the flat key format (see Section 7.2.1), OE and ELSFP data coexist in a single entry per interface. Existing OE/transceiver fields (`type`, `manufacturer`, `serial`, etc.) are retained without modification. The following ELSFP-specific fields are added with an `elsfp_` prefix to avoid name collisions:

| Field Name | Type | Description |
|---|---|---|
-| `module_function_type` | STRING | Module function type, e.g. "Resource Module" |
-| `lane_count` | INTEGER | Number of optical lanes supported by the ELSFP |
-| `control_mode` | STRING | Laser control mode: "APC" or "ACC" |
-| `max_optical_power` | FLOAT | Maximum supported output optical power (dBm) |
-| `min_optical_power` | FLOAT | Minimum supported output optical power (dBm) |
-| `max_laser_bias` | FLOAT | Maximum laser bias current (mA) |
-| `min_laser_bias` | FLOAT | Minimum laser bias current (mA) |
-| `lane_to_fiber_mapping` | STRING | Mapping of lane index to fiber index |
-| `lane_frequency` | FLOAT | Nominal per-lane optical frequency (THz) |
+| `elsfp_module_function_type` | STRING | Module function type, e.g. "Resource Module" |
+| `elsfp_lane_count` | INTEGER | Number of optical lanes supported by the ELSFP |
+| `elsfp_control_mode` | STRING | Laser control mode: "APC" or "ACC" |
+| `elsfp_max_optical_power` | FLOAT | Maximum supported output optical power (dBm) |
+| `elsfp_min_optical_power` | FLOAT | Minimum supported output optical power (dBm) |
+| `elsfp_max_laser_bias` | FLOAT | Maximum laser bias current (mA) |
+| `elsfp_min_laser_bias` | FLOAT | Minimum laser bias current (mA) |
+| `elsfp_lane_to_fiber_mapping` | STRING | Mapping of lane index to fiber index |
+| `elsfp_lane_frequency` | FLOAT | Nominal per-lane optical frequency (THz) |
+| `elsfp_manufacturer` | STRING | ELSFP vendor name |
+| `elsfp_serial` | STRING | ELSFP serial number |
+| `elsfp_model` | STRING | ELSFP model/part number |

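The prefixing scheme in the table above could be sketched roughly as follows. This is an illustration under assumptions, not the HLD's implementation: `merge_transceiver_info` is a hypothetical helper, and every sample value (vendor names, serial numbers, lane count) is made up.

```python
# Hypothetical merge helper; the function name and sample values are illustrative.
ELSFP_PREFIX = "elsfp_"


def merge_transceiver_info(oe_info, elsfp_info):
    """Merge OE and ELSFP info into one flat dict.

    OE fields keep their original names; every ELSFP field gets the
    'elsfp_' prefix so that shared names such as 'serial' do not collide.
    """
    merged = dict(oe_info)
    for field, value in elsfp_info.items():
        merged[f"{ELSFP_PREFIX}{field}"] = value
    return merged


# Made-up sample data for the flat TRANSCEIVER_INFO|<ifname> entry.
oe_info = {"type": "CPO", "manufacturer": "VendorA", "serial": "OE0001"}
elsfp_info = {
    "manufacturer": "VendorB",
    "serial": "ELS0001",
    "lane_count": 8,
    "control_mode": "APC",
}

merged_info = merge_transceiver_info(oe_info, elsfp_info)

if __name__ == "__main__":
    for field in sorted(merged_info):
        print(field, "=", merged_info[field])
```

Because the OE fields are copied unchanged, a reader of the flat entry that knows nothing about ELSFP still sees `manufacturer` and `serial` with their usual meaning.
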
###### TRANSCEIVER_DOM_SENSOR
@@ -337,9 +280,9 @@ The design avoids introducing any CPO-specific task or subclass in xcvrd. Instea

1. Having `CpoApi` override standard DOM query methods to return **multi-device results**: a dict-of-dicts keyed by device ID (e.g., `{"OE1": {...}, "ELS1": {...}}`).
2. Having `ElsfpApi` implement the standard DOM interface methods (`get_transceiver_dom_real_value()`, `get_transceiver_dom_flags()`, etc.) so that ELSFP data can be queried through the same interface as OE data.
-3. Teaching the existing `post_*_to_db` helper methods in xcvrd to detect multi-device results and write compound DB keys.
+3. Teaching the existing utility functions in xcvrd that write to STATE_DB to detect multi-device results and write compound DB keys.

-`DomInfoUpdateTask` itself requires **no modifications**. Non-CPO platforms observe no change in behavior.
+No task-level logic (`DomInfoUpdateTask`, `SfpStateUpdateTask`, `CmisManagerTask`) requires modification. All changes are confined to the utility/library methods that perform platform API calls and DB writes. Non-CPO platforms observe no change in behavior.

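The three steps above can be sketched in a few lines of Python. Everything here is a stand-in chosen for illustration: `MultiDeviceResult` is modelled as a plain `dict` subclass, `FakeTable` imitates only a `set(key, fvs)` call, and `post_values_to_db` is a hypothetical name. The definitions in the HLD and in xcvrd may differ.

```python
# Illustrative stand-ins; the HLD's real definitions may differ.
class MultiDeviceResult(dict):
    """Marker type: maps a device ID (e.g. 'OE1') to that device's field dict."""


class FakeTable:
    """In-memory substitute for a STATE_DB table, for demonstration only."""

    def __init__(self):
        self.rows = {}

    def set(self, key, fvs):
        self.rows[key] = dict(fvs)


def post_values_to_db(ifname, result, table):
    """Write one flat key, or one compound key per device for multi-device data."""
    if result is None:
        return
    if isinstance(result, MultiDeviceResult):
        # CPO path: one '<ifname>|<device>' key per device.
        for device, fields in result.items():
            fvs = [(name, str(value)) for name, value in fields.items()]
            table.set(f"{ifname}|{device}", fvs)
    else:
        # Non-CPO path: unchanged flat '<ifname>' key.
        fvs = [(name, str(value)) for name, value in result.items()]
        table.set(ifname, fvs)


cpo_table = FakeTable()
post_values_to_db(
    "Ethernet0",
    MultiDeviceResult({"OE1": {"temperature": 45.0}, "ELS1": {"laser_bias": 80.5}}),
    cpo_table,
)

flat_table = FakeTable()
post_values_to_db("Ethernet8", {"temperature": 40.0}, flat_table)

if __name__ == "__main__":
    print(sorted(cpo_table.rows))
    print(sorted(flat_table.rows))
```

Using a distinct type as the marker, rather than inspecting whether the values happen to be dicts, keeps the non-CPO path untouched for any API that already returns nested data.
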
##### 7.3.2 Multi-Device Results
@@ -387,6 +330,23 @@ class CpoApi(CmisApi):

The device IDs (e.g., `"OE1"`, `"ELS1"`) are dictated by the naming scheme defined in the [CPO Port Mapping HLD](https://github.com/sonic-net/SONiC/pull/2211) (i.e., the vendor decides the naming scheme).

+**Exception - `get_transceiver_info()`:** As described in Section 7.2.1, `TRANSCEIVER_INFO` is exempt from compound keys. `CpoApi` overrides `get_transceiver_info()` to return a single flat dict (not a `MultiDeviceResult`) that merges OE and ELSFP info. ELSFP fields are prefixed with `elsfp_` to avoid name collisions:
+Because this returns a plain dict, `post_port_sfp_info_to_db()` writes it to the flat key `TRANSCEIVER_INFO|<ifname>` with no code changes required.
+

##### 7.3.4 ElsfpApi Standard Methods
`ElsfpApi` implements the standard DOM interface methods by translating its existing ELSFP-specific primitives into the standard dict formats:
@@ -403,7 +363,7 @@ Since scoping places OE and ELSFP data under separate DB keys, field names withi

##### 7.3.5 Multi-Device-Aware DB Writes

-The existing `post_*_to_db` helper methods in xcvrd are modified to detect `MultiDeviceResult` and write compound keys. The changes are confined to the DB utility layer; `DomInfoUpdateTask` is not modified.
+The existing utility functions in xcvrd that write to STATE_DB are modified to detect `MultiDeviceResult` and write compound keys. The changes are confined to the utility/library layer; no task-level logic is modified.

**`post_diagnostic_values_to_db()`** (the generic helper used by most post methods):
**Table lifecycle**: `xcvrd` will be updated so that, upon port removal, it deletes all device-specific tables for that interface.
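
One way the cleanup on port removal might look is sketched below, assuming the keys of each table can be enumerated. The nested dict `state_db` stands in for STATE_DB, and `keys_for_port` and `remove_port` are hypothetical names. The prefix match includes the `|` separator so that removing `Ethernet1` does not touch `Ethernet10`.

```python
# Hypothetical cleanup helpers operating on an in-memory stand-in for STATE_DB.
def keys_for_port(table_keys, ifname):
    """Return the flat key and all '<ifname>|<device>' keys of one interface."""
    prefix = ifname + "|"
    return [key for key in table_keys if key == ifname or key.startswith(prefix)]


def remove_port(tables, ifname):
    """Delete every entry belonging to ifname from every table."""
    for rows in tables.values():
        for key in keys_for_port(list(rows), ifname):
            del rows[key]


# Made-up table contents for demonstration.
state_db = {
    "TRANSCEIVER_DOM_SENSOR": {
        "Ethernet1|OE1": {},
        "Ethernet1|ELS1": {},
        "Ethernet10|OE2": {},
    },
    "TRANSCEIVER_INFO": {"Ethernet1": {}, "Ethernet10": {}},
}

remove_port(state_db, "Ethernet1")

if __name__ == "__main__":
    for table_name, rows in state_db.items():
        print(table_name, sorted(rows))
```
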

-##### 7.3.6 Sequence Diagram
+##### 7.3.6 SfpStateUpdateTask and CmisManagerTask
+
+`DomInfoUpdateTask` is not the only xcvrd task that writes to STATE_DB transceiver tables. `SfpStateUpdateTask` writes `TRANSCEIVER_INFO` and `TRANSCEIVER_DOM_THRESHOLD` at transceiver insertion time, and `CmisManagerTask` updates the active ApSel field in `TRANSCEIVER_INFO`. Since CPO hardware does not have a traditional pluggable transceiver, the ELSFP insertion event serves as the trigger for these writes. The same principle applies: all changes are confined to utility functions; no task-level logic is modified.
+
+**TRANSCEIVER_DOM_THRESHOLD (SfpStateUpdateTask):** `SfpStateUpdateTask` writes DOM thresholds once at transceiver insertion via `DOMDBUtils.post_port_dom_thresholds_to_db()`, which delegates to `post_diagnostic_values_to_db()`. Because `CpoApi.get_transceiver_threshold_info()` returns a `MultiDeviceResult`, this table acquires compound keys automatically through the changes described in Section 7.3.5. No additional work is required.
+
+**TRANSCEIVER_INFO (SfpStateUpdateTask and CmisManagerTask):** As described in Section 7.2.1 and Section 7.3.3, `TRANSCEIVER_INFO` is exempt from compound keys. `CpoApi.get_transceiver_info()` returns a flat merged dict with `elsfp_`-prefixed fields, so `post_port_sfp_info_to_db()` writes to the standard flat key without any code changes. Similarly, `CmisManagerTask`'s `update_active_apsel_in_info_table()` continues to update the flat key: ApSel is an OE-only concept and the OE fields are not prefixed, so the existing update logic works unchanged.
-- The `DomInfoUpdateTask` is used as-is, so the polling cadence remains the same. The time to process each interface may be slightly increased, since more information must be read from the hardware by necessity.
+- All existing xcvrd tasks (`DomInfoUpdateTask`, `SfpStateUpdateTask`, `CmisManagerTask`) are used as-is, so the polling cadence and insertion-time behavior remain the same. The time to process each interface may be slightly increased, since more information must be read from the hardware by necessity.
- For shared devices, the same OE/ELSFP i2c registers are read once per interface that shares the device (once per bank). Future optimization via caching and/or adding xcvrd awareness of shared devices is possible but not required for initial implementation.
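
The design leaves the caching optimization mentioned above as future work. Purely as an illustration of the idea, the sketch below reads each shared device once per polling cycle and reuses the result for every interface that shares it. `DeviceReadCache`, `fake_i2c_read`, and the interface-to-device mapping are all hypothetical, and the sketch ignores per-bank data, which a real implementation would have to account for.

```python
# Hypothetical per-poll-cycle cache; not part of the proposed design.
class DeviceReadCache:
    """Caches one hardware read per device until clear() is called."""

    def __init__(self, read_device):
        self._read_device = read_device
        self._cache = {}

    def get(self, device):
        if device not in self._cache:
            self._cache[device] = self._read_device(device)
        return self._cache[device]

    def clear(self):
        """Drop cached values; call at the end of each polling cycle."""
        self._cache.clear()


read_log = []


def fake_i2c_read(device):
    """Stand-in for a hardware read; records which devices were accessed."""
    read_log.append(device)
    return {"temperature": 40.0}


# Made-up mapping: two interfaces sharing the same OE and ELSFP.
interface_devices = {
    "Ethernet0": ["OE1", "ELS1"],
    "Ethernet8": ["OE1", "ELS1"],
}

cache = DeviceReadCache(fake_i2c_read)
results = {
    ifname: {device: cache.get(device) for device in devices}
    for ifname, devices in interface_devices.items()
}
cache.clear()

if __name__ == "__main__":
    print("hardware reads:", read_log)
```
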