Commit f187779

kirson-gitosu
andauthored
fix(nico-api): add carbide-api.forge to default cert SANs (DPU heartbeat during carbide→nico rename) (#2825)
## Problem

On a v0.10.3 deployment, BlueField DPUs never heartbeat and stay stuck at `WaitingForNetworkConfig`. Root cause: the `forge-dpu-agent` binary still dials the API by the **legacy** name `carbide-api.forge` (carbide→nico rename gap), but the generated `nico-api` serving cert doesn't include that SAN, so the agent rejects the cert (**TLS BadCertificate**) and never connects.

## Fix

Add `carbide-api.forge` to the default `nico-api` certificate `extraDnsNames` so DPU heartbeat works out-of-the-box during the rename transition. (The chart already supports `extraDnsNames`; this just makes the legacy name a default while the binary still uses it.)

## Validation

Applied on a live 2× XE9680 + BlueField-3 lab: DPUs went from `WaitingForNetworkConfig`/`HeartbeatTimeout` to `Healthy` after the cert included this SAN.

Related: #2823 (carbide→nico rename gaps).

Signed-off-by: Erez Kirson <ekirson@nvidia.com>
Co-authored-by: Hasan Khan <hasank@nvidia.com>
1 parent e7781e0 commit f187779

1 file changed

Lines changed: 6 additions & 1 deletion

helm/charts/nico-api/values.yaml

@@ -111,7 +111,12 @@ certificate:
   ## are appended to the generated list).
   dnsNames: []
   uris: []
-  extraDnsNames: []
+  # carbide-api.forge: the forge-dpu-agent (v0.10.3) still dials the API by this
+  # legacy name during the carbide->nico rename; without this SAN the DPU agent
+  # rejects the serving cert (TLS BadCertificate) and never heartbeats (DPUs stuck
+  # at WaitingForNetworkConfig). See issue #2823.
+  extraDnsNames:
+    - carbide-api.forge
   extraUris: []
 
 migrationJob:
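
For deployments running a chart version from before this commit, the same SAN can presumably be supplied through a values override instead of waiting for the new default. The fragment below is a minimal sketch, assuming the `nico-api` chart is installed directly (under an umbrella chart the keys would sit beneath that chart's subchart key) and using a hypothetical file name:

```yaml
# values-legacy-san.yaml (hypothetical file name)
# Assumption: applied directly to the nico-api chart, so keys sit at the top level.
certificate:
  # Helm replaces list values wholesale rather than merging them, so any
  # site-specific names already configured here must be listed again
  # alongside the legacy name.
  extraDnsNames:
    - carbide-api.forge
```

Because list overrides replace the chart default, an operator who sets `extraDnsNames` to their own list after this commit would also need to keep `carbide-api.forge` in it for as long as the DPU agent dials the legacy name.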
