Commit 7a0b43e

Update HLD to account for SfpStateUpdateTask and CmisManagerTask writing to TRANSCEIVER_INFO
Signed-off-by: Brian Gallagher <bgallagher@nexthop.ai>
1 parent 3d6ecb2 commit 7a0b43e

2 files changed

Lines changed: 48 additions & 80 deletions


doc/xrcvd/cpo-dom-monitoring-hld.md

Lines changed: 48 additions & 80 deletions
@@ -76,7 +76,7 @@ This design introduces:
7676

7777
1. A **new per-device DB key structure** for CPO transceiver tables, extending the existing key from `TRANSCEIVER_DOM_SENSOR|<ifname>` to `TRANSCEIVER_DOM_SENSOR|<ifname>|<device>`.
7878
2. **New ELSFP-specific DB fields** within the existing `TRANSCEIVER_*` table family.
79-
3. **Multi-device result support in xcvrd's DB utility methods**, allowing `CpoApi` to return per-device data that the existing `DomInfoUpdateTask` publishes to compound DB keys without any task-level changes.
79+
3. **Multi-device result support in xcvrd's utility methods**, allowing `CpoApi` to return per-device data that xcvrd publishes to compound DB keys without any task-level changes to `DomInfoUpdateTask`, `SfpStateUpdateTask`, or `CmisManagerTask`.
8080
4. **CLI updates** to display per-device DOM data for CPO interfaces.
8181

8282
---
@@ -99,68 +99,7 @@ This design introduces:
9999

100100
The following diagram shows the information flow between the CLI, Redis, xcvrd, and the CPO hardware devices:
101101

102-
```
103-
Network Operator
104-
|
105-
| (show interfaces transceiver eeprom -d ...)
106-
v
107-
+-----------------------------------------------+
108-
| sonic-utilities (CLI) |
109-
+-----------------------------------------------+
110-
|
111-
| (reads TRANSCEIVER_* tables)
112-
v
113-
+-----------------------------------------------+
114-
| Redis STATE_DB |
115-
| |
116-
| TRANSCEIVER_DOM_SENSOR|Ethernet0|OE1 |
117-
| TRANSCEIVER_DOM_SENSOR|Ethernet0|ELS1 |
118-
| TRANSCEIVER_DOM_FLAG|Ethernet0|OE1 |
119-
| TRANSCEIVER_DOM_FLAG|Ethernet0|ELS1 ... |
120-
+-----------------------------------------------+
121-
^
122-
| (SET <ifname>|<device>, fvs)
123-
| — compound key per device
124-
+-----------------------------------------------+
125-
| pmon container — xcvrd |
126-
| |
127-
| DomInfoUpdateTask |
128-
| | |
129-
| v |
130-
| +-------------------------------------+ |
131-
| | DB utility libraries | |
132-
| | (post_diagnostic_values_to_db, | |
133-
| | post_port_dom_flags_to_db, …) | |
134-
| | | |
135-
| | Detects MultiDeviceResult and | |
136-
| | writes one compound DB key per | |
137-
| | scope: <ifname>|<device> | |
138-
| +-------------------------------------+ |
139-
| ^ |
140-
| | MultiDeviceResult |
141-
| | {"OE1": {...}, "ELS1": {...}} |
142-
| | |
143-
| +-------------------------------------+ |
144-
| | CpoApi | |
145-
| | | |
146-
| | get_transceiver_dom_real_value() | |
147-
| | get_transceiver_dom_flags() | |
148-
| | get_transceiver_threshold_info() | |
149-
| | get_transceiver_status() … | |
150-
| | | |
151-
| | Aggregates OE + ELSFP data into | |
152-
| | a MultiDeviceResult keyed by | |
153-
| | device scope ID. | |
154-
| +-------------------------------------+ |
155-
+-----------------------------------------------+
156-
| |
157-
| (i2c reads) | (i2c reads)
158-
v v
159-
+----------------+ +----------------+
160-
| Optical Engine| | ELSFP |
161-
| (OE1, OE2,…)| | (ELS1, ELS2,…) |
162-
+----------------+ +----------------+
163-
```
102+
![](./cpo_dom_flow.png)
164103

165104
---
166105

@@ -178,19 +117,20 @@ The following diagram shows the information flow between the CLI, Redis, xcvrd,
178117

179118
##### 7.2.1 Composite Key Structure
180119

181-
For CPO interfaces, all `TRANSCEIVER_*` STATE_DB tables use a compound key that includes both the interface name and the device name:
120+
For CPO interfaces, the following `TRANSCEIVER_*` STATE_DB tables use a compound key that includes both the interface name and the device name:
182121

183122
```
184123
TRANSCEIVER_DOM_SENSOR|<ifname>|<device>
185124
TRANSCEIVER_DOM_THRESHOLD|<ifname>|<device>
186125
TRANSCEIVER_DOM_FLAG|<ifname>|<device>
187126
TRANSCEIVER_STATUS|<ifname>|<device>
188127
TRANSCEIVER_STATUS_FLAG|<ifname>|<device>
189-
TRANSCEIVER_INFO|<ifname>|<device>
190128
TRANSCEIVER_FIRMWARE_INFO|<ifname>|<device>
191129
TRANSCEIVER_VDM_*|<ifname>|<device>
192130
```
193131

132+
**Exception — TRANSCEIVER_INFO:** The `TRANSCEIVER_INFO` table retains the original flat key format (`TRANSCEIVER_INFO|<ifname>`) on CPO platforms. Downstream consumers such as `orchagent` subscribe to this table and use the key to look up ports in their internal port registries (see `PortsOrch::doTransceiverPresenceCheck`). Compound keys would not match any known port and would cause silent failures. Instead, ELSFP-specific fields are merged into the existing flat entry with an `elsfp_` prefix (see Section 7.2.2).
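The lookup failure described above can be illustrated with a minimal sketch. It is not the `orchagent` implementation (which is C++); `KNOWN_PORTS` and `lookup_port` are hypothetical names used only to show why a compound key cannot match a port name.

```python
# Hypothetical port registry; real consumers such as orchagent keep their own.
KNOWN_PORTS = {"Ethernet0", "Ethernet8"}


def lookup_port(db_key, known_ports=KNOWN_PORTS):
    """Return the port name if the part after the table name is a known port."""
    # Split off the table name; everything after the first "|" is the key.
    _table, _, key = db_key.partition("|")
    return key if key in known_ports else None


flat = lookup_port("TRANSCEIVER_INFO|Ethernet0")           # resolves to a port
compound = lookup_port("TRANSCEIVER_INFO|Ethernet0|ELS1")  # no matching port
```

With the flat key the consumer finds `Ethernet0`; with a compound key the key portion is `Ethernet0|ELS1`, which matches no registered port.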
133+
194134
The `<device>` component identifies the physical device. Device names use a type prefix and a **global** index (not per-interface). The device naming scheme is defined by the [CPO Port Mapping HLD](https://github.com/sonic-net/SONiC/pull/2211); that is, the network switch vendor decides the scheme.
195135

196136
| Device Type | Key Component | Example | Description |
@@ -217,19 +157,22 @@ TRANSCEIVER_DOM_SENSOR|Ethernet8|ELS1
217157

218158
###### TRANSCEIVER_INFO
219159

220-
The following fields are added to `TRANSCEIVER_INFO` for ELSFP entries. Fields already present in `TRANSCEIVER_INFO` for conventional transceivers (`type`, `manufacturer`, `serial`, etc.) are retained without modification.
160+
Because `TRANSCEIVER_INFO` uses the flat key format (see Section 7.2.1), OE and ELSFP data coexist in a single entry per interface. Existing OE/transceiver fields (`type`, `manufacturer`, `serial`, etc.) are retained without modification. The following ELSFP-specific fields are added with an `elsfp_` prefix to avoid name collisions:
221161

222162
| Field Name | Type | Description |
223163
|---|---|---|
224-
| `module_function_type` | STRING | Module function type, e.g. "Resource Module" |
225-
| `lane_count` | INTEGER | Number of optical lanes supported by the ELSFP |
226-
| `control_mode` | STRING | Laser control mode: "APC" or "ACC" |
227-
| `max_optical_power` | FLOAT | Maximum supported output optical power (dBm) |
228-
| `min_optical_power` | FLOAT | Minimum supported output optical power (dBm) |
229-
| `max_laser_bias` | FLOAT | Maximum laser bias current (mA) |
230-
| `min_laser_bias` | FLOAT | Minimum laser bias current (mA) |
231-
| `lane_to_fiber_mapping` | STRING | Mapping of lane index to fiber index |
232-
| `lane_frequency` | FLOAT | Nominal per-lane optical frequency (THz) |
164+
| `elsfp_module_function_type` | STRING | Module function type, e.g. "Resource Module" |
165+
| `elsfp_lane_count` | INTEGER | Number of optical lanes supported by the ELSFP |
166+
| `elsfp_control_mode` | STRING | Laser control mode: "APC" or "ACC" |
167+
| `elsfp_max_optical_power` | FLOAT | Maximum supported output optical power (dBm) |
168+
| `elsfp_min_optical_power` | FLOAT | Minimum supported output optical power (dBm) |
169+
| `elsfp_max_laser_bias` | FLOAT | Maximum laser bias current (mA) |
170+
| `elsfp_min_laser_bias` | FLOAT | Minimum laser bias current (mA) |
171+
| `elsfp_lane_to_fiber_mapping` | STRING | Mapping of lane index to fiber index |
172+
| `elsfp_lane_frequency` | FLOAT | Nominal per-lane optical frequency (THz) |
173+
| `elsfp_manufacturer` | STRING | ELSFP vendor name |
174+
| `elsfp_serial` | STRING | ELSFP serial number |
175+
| `elsfp_model` | STRING | ELSFP model/part number |
233176

234177
###### TRANSCEIVER_DOM_SENSOR
235178

@@ -337,9 +280,9 @@ The design avoids introducing any CPO-specific task or subclass in xcvrd. Instea
337280

338281
1. Having `CpoApi` override standard DOM query methods to return **multi-device results** — a dict-of-dicts keyed by device ID (e.g., `{"OE1": {...}, "ELS1": {...}}`).
339282
2. Having `ElsfpApi` implement the standard DOM interface methods (`get_transceiver_dom_real_value()`, `get_transceiver_dom_flags()`, etc.) so that ELSFP data can be queried through the same interface as OE data.
340-
3. Teaching the existing `post_*_to_db` helper methods in xcvrd to detect multi-device results and write compound DB keys.
283+
3. Teaching the existing utility functions in xcvrd that write to STATE_DB to detect multi-device results and write compound DB keys.
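The three steps above can be sketched as a single key-expansion helper. This is an illustration under stated assumptions, not the xcvrd code: `MultiDeviceResult` is modeled here as a hypothetical `dict` subclass, and `expand_to_db_entries` is an invented name.

```python
class MultiDeviceResult(dict):
    """Hypothetical marker type: maps device ID -> that device's field dict."""


def expand_to_db_entries(ifname, result):
    """Return a list of (db_key, fields) pairs for one interface."""
    if result is None:
        return []
    if isinstance(result, MultiDeviceResult):
        # One compound key per device: <ifname>|<device>
        return [(f"{ifname}|{device}", fields) for device, fields in result.items()]
    # Conventional transceivers keep the flat key.
    return [(ifname, result)]


multi = MultiDeviceResult({"OE1": {"temperature": "41.0"},
                           "ELS1": {"temperature": "39.5"}})
entries = expand_to_db_entries("Ethernet0", multi)
```

A plain dict yields one entry under the flat key, which is why non-CPO platforms see no change in this sketch.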
341284

342-
`DomInfoUpdateTask` itself requires **no modifications**. Non-CPO platforms observe no change in behavior.
285+
No task-level logic (`DomInfoUpdateTask`, `SfpStateUpdateTask`, `CmisManagerTask`) requires modification. All changes are confined to the utility/library methods that perform platform API calls and DB writes. Non-CPO platforms observe no change in behavior.
343286

344287
##### 7.3.2 Multi-Device Results
345288

@@ -387,6 +330,23 @@ class CpoApi(CmisApi):
387330

388331
The device IDs (e.g., `"OE1"`, `"ELS1"`) are dictated by the naming scheme defined in the [CPO Port Mapping HLD](https://github.com/sonic-net/SONiC/pull/2211) (i.e., the vendor decides the naming scheme).
389332

333+
**Exception — `get_transceiver_info()`:** As described in Section 7.2.1, `TRANSCEIVER_INFO` is exempt from compound keys. `CpoApi` overrides `get_transceiver_info()` to return a single flat dict (not a `MultiDeviceResult`) that merges OE and ELSFP info. ELSFP fields are prefixed with `elsfp_` to avoid name collisions:
334+
335+
```python
336+
def get_transceiver_info(self):
337+
oe_info = self.optical_engine_xcvr_api.get_transceiver_info()
338+
els_info = self.external_laser_source_xcvr_api.get_transceiver_info()
339+
if oe_info is None:
340+
return None
341+
result = dict(oe_info)
342+
if els_info:
343+
for key, value in els_info.items():
344+
result[f"elsfp_{key}"] = value
345+
return result
346+
```
347+
348+
Because this returns a plain dict, `post_port_sfp_info_to_db()` writes it to the flat key `TRANSCEIVER_INFO|<ifname>` with no code changes required.
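As a worked example of the merge, the standalone sketch below applies the same prefixing rule to two sample dicts. The function name `merge_transceiver_info` and all field values are placeholders chosen for illustration, not real vendor data.

```python
def merge_transceiver_info(oe_info, els_info):
    """Merge OE and ELSFP info into one flat dict, prefixing ELSFP keys."""
    if oe_info is None:
        return None
    result = dict(oe_info)
    for key, value in (els_info or {}).items():
        result[f"elsfp_{key}"] = value
    return result


# Placeholder values, for illustration only.
oe = {"type": "CPO OE", "manufacturer": "VendorA", "serial": "OE123"}
els = {"manufacturer": "VendorB", "serial": "ELS456", "lane_count": 8}
merged = merge_transceiver_info(oe, els)
```

Here `merged` keeps the OE `manufacturer` and `serial` untouched and adds `elsfp_manufacturer`, `elsfp_serial`, and `elsfp_lane_count`, so both devices fit in the single `TRANSCEIVER_INFO|<ifname>` entry.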
349+
390350
##### 7.3.4 ElsfpApi Standard Methods
391351

392352
`ElsfpApi` implements the standard DOM interface methods by translating its existing ELSFP-specific primitives into the standard dict formats:
@@ -403,7 +363,7 @@ Since scoping places OE and ELSFP data under separate DB keys, field names withi
403363

404364
##### 7.3.5 Multi-Device-Aware DB Writes
405365

406-
The existing `post_*_to_db` helper methods in xcvrd are modified to detect `MultiDeviceResult` and write compound keys. The changes are confined to the DB utility layer — `DomInfoUpdateTask` is not modified.
366+
The existing utility functions in xcvrd that write to STATE_DB are modified to detect `MultiDeviceResult` and write compound keys. The changes are confined to the utility/library layer — no task-level logic is modified.
407367

408368
**`post_diagnostic_values_to_db()`** (the generic helper used by most post methods):
409369

@@ -433,7 +393,15 @@ def post_diagnostic_values_to_db(self, logical_port_name, table, get_values_func
433393

434394
**Table Lifecycle**: `xcvrd` will be updated so that, upon port removal, it deletes all device-specific tables for that interface.
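One way to select the entries to delete is sketched below. It assumes keys are given relative to the table (that is, `<ifname>|<device>` without the table name), and `device_keys_for_port` is a hypothetical helper rather than an existing xcvrd function. Matching on the exact interface component avoids treating `Ethernet80` as belonging to `Ethernet8`.

```python
def device_keys_for_port(table_keys, ifname):
    """Return the compound keys (<ifname>|<device>) that belong to ifname."""
    selected = []
    for key in table_keys:
        parts = key.split("|")
        # Compound keys have exactly two components; compare the first exactly.
        if len(parts) == 2 and parts[0] == ifname:
            selected.append(key)
    return selected


keys = ["Ethernet0|OE1", "Ethernet0|ELS1", "Ethernet0",
        "Ethernet8|OE2", "Ethernet80|OE3"]
to_delete = device_keys_for_port(keys, "Ethernet8")
```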
435395

436-
##### 7.3.6 Sequence Diagram
396+
##### 7.3.6 SfpStateUpdateTask and CmisManagerTask
397+
398+
`DomInfoUpdateTask` is not the only xcvrd task that writes to STATE_DB transceiver tables. `SfpStateUpdateTask` writes `TRANSCEIVER_INFO` and `TRANSCEIVER_DOM_THRESHOLD` at transceiver insertion time, and `CmisManagerTask` updates the active ApSel field in `TRANSCEIVER_INFO`. Because CPO hardware does not have a traditional pluggable transceiver, the ELSFP insertion event serves as the trigger for these writes. The same principle applies: all changes are confined to utility functions, and no task-level logic is modified.
399+
400+
**TRANSCEIVER_DOM_THRESHOLD (SfpStateUpdateTask):** `SfpStateUpdateTask` writes DOM thresholds once at transceiver insertion via `DOMDBUtils.post_port_dom_thresholds_to_db()`, which delegates to `post_diagnostic_values_to_db()`. Because `CpoApi.get_transceiver_threshold_info()` returns a `MultiDeviceResult`, this table acquires compound keys automatically through the changes described in Section 7.3.5. No additional work is required.
401+
402+
**TRANSCEIVER_INFO (SfpStateUpdateTask and CmisManagerTask):** As described in Section 7.2.1 and Section 7.3.3, `TRANSCEIVER_INFO` is exempt from compound keys. `CpoApi.get_transceiver_info()` returns a flat merged dict with `elsfp_`-prefixed fields, so `post_port_sfp_info_to_db()` writes to the standard flat key without any code changes. Similarly, `CmisManagerTask`'s `update_active_apsel_in_info_table()` continues to update the flat key — ApSel is an OE-only concept and the OE fields are not prefixed, so the existing update logic works unchanged.
403+
404+
##### 7.3.7 Sequence Diagram
437405

438406
```
439407
DomInfoUpdateTask post_*_to_db utils CpoApi OE (i2c) ELSFP (i2c) STATE_DB
@@ -456,7 +424,7 @@ DomInfoUpdateTask post_*_to_db utils CpoApi OE (i2c) ELSF
456424

457425
#### 7.4 Scalability and Performance
458426

459-
- The `DomInfoUpdateTask` is used as-is, so the polling cadence remains the same. The time to process each interface may be slightly increased, since more information must be read from the hardware by necessity.
427+
- All existing xcvrd tasks (`DomInfoUpdateTask`, `SfpStateUpdateTask`, `CmisManagerTask`) are used as-is, so the polling cadence and insertion-time behavior remain the same. The time to process each interface may increase slightly, because more data must be read from the hardware.
460428
- For shared devices, the same OE/ELSFP i2c registers are read once per interface that shares the device (once per bank). Future optimization via caching and/or adding xcvrd awareness of shared devices is possible but not required for initial implementation.
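The caching optimization mentioned above is not part of this design, but a minimal sketch shows the idea: read each shared device once per poll cycle and reuse the result for every interface that shares it. `PollCycleCache` and `fake_read` are hypothetical names, and the read function stands in for the real i2c access.

```python
class PollCycleCache:
    """Hypothetical per-poll-cycle cache keyed by device ID."""

    def __init__(self):
        self._values = {}

    def get(self, device_id, read_fn):
        # Read from hardware only on the first request in this cycle.
        if device_id not in self._values:
            self._values[device_id] = read_fn(device_id)
        return self._values[device_id]

    def clear(self):
        # Call at the start of each poll cycle so data does not go stale.
        self._values.clear()


calls = []


def fake_read(device_id):
    """Stand-in for an i2c read; records each call for demonstration."""
    calls.append(device_id)
    return {"temperature": "40.0"}


cache = PollCycleCache()
first = cache.get("ELS1", fake_read)   # triggers a read
second = cache.get("ELS1", fake_read)  # served from the cache
```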
461429

462430
#### 7.6 Platform Dependency

doc/xrcvd/cpo_dom_flow.png

831 KB
