Commit 0e64f2d

[mcast] audit-pass + reconciler updates + opte hardening + config tuning
Final, pre-review pass on this work. It stacks atop #10070 and inherits the multicast-to-physical (M2P) underlay forwarding and VMM-keyed instance subscription endpoints. This also builds on and integrates #10381.

Above these foundations, this work includes the final pass on mgd-ddmd integration:

* Reconciler correctness:
  * `set_mcast_m2p` rolls back the xde M2P entry on per-NIC join failure, so the reconciler converges on a retry instead of leaving stale state pointing at the wrong underlay address.
  * `propolis_id` is threaded end-to-end through the sled-agent multicast endpoints to deal with live migration ambiguity.
  * MRIB advertisement is gated on a flag rather than running unconditionally after the DPD match arm, so that a DPD failure no longer leaves a route advertised via DDM with no programmed forwarding state.
* OPTE hardening (illumos-utils):
  * M2P entries are upserted into a `BTreeMap<IpAddr, MulticastUnderlay>` rather than a `Vec` on the non-illumos mock, eliminating duplicate-key corner cases the production map already avoided.
  * `MulticastFilterMap` encapsulates the per-NIC filter socket and refcount state previously open-coded inside `PortManagerInner`, concentrating the "join socket per underlay group per NIC" invariant into a single type.
  * `underlay_nics` is typed as `&[AddrObject]` rather than `&[String]`.
  * Per-NIC `IPV6_JOIN_GROUP` calls are converted from `libc::setsockopt` to `nix::sys::socket::setsockopt` for the typed bind.
* Sled-agent (real and sim):
  * Sim v7 multicast endpoints fall through to the trait defaults instead of overriding with just `unimplemented!()`, matching how other versioned endpoints behave in the sim.
  * Sim VMM existence check on join/leave is restored.
* Configuration:
  * `MulticastGroupReconcilerConfig` gains `group_concurrency_limit` and `member_concurrency_limit`, which bound the per-pass fan-out of the RPW's `buffer_unordered` streams.
* Test infra:
  * `populate_ddm_peers` no longer caches the peer map. The previous cache was keyed by sled-id set, but the synthesized port names embedded each sled's `sp_slot` from inventory, so cache reuse within the same sled set could produce stale port mappings.
* Documentation cleanup across the RPW, sled-agent multicast paths, and the new(er) sled-agent types module.
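
The rollback behavior described for `set_mcast_m2p` can be sketched roughly as follows. This is only an illustration under stated assumptions: `set_mcast_m2p_sketch`, `McastError`, the `join` callback, and the use of a plain `BTreeMap` with `Ipv6Addr` values are hypothetical stand-ins, not the sled-agent or xde API. The sketch also assumes that rolling back means removing the entry that was just installed; the actual change may handle earlier NIC joins or a pre-existing mapping differently.

```rust
use std::collections::BTreeMap;
use std::net::{IpAddr, Ipv6Addr};

/// Hypothetical error type, loosely modeled on `UnderlayMcastJoinFailed`.
#[derive(Debug, PartialEq)]
enum McastError {
    UnderlayMcastJoinFailed(Ipv6Addr),
}

/// Install the group -> underlay mapping, then join the underlay group on
/// every NIC. If any join fails, remove the mapping again so that a later
/// retry starts from a clean state instead of a half-applied one.
///
/// `join` is a hypothetical callback standing in for the per-NIC
/// `IPV6_JOIN_GROUP` call; it returns `true` on success.
fn set_mcast_m2p_sketch<F>(
    m2p: &mut BTreeMap<IpAddr, Ipv6Addr>,
    group: IpAddr,
    underlay: Ipv6Addr,
    nics: &[&str],
    mut join: F,
) -> Result<(), McastError>
where
    F: FnMut(&str, Ipv6Addr) -> bool,
{
    // Step 1: program the mapping (upsert by group).
    m2p.insert(group, underlay);

    // Step 2: join on each NIC, rolling back the mapping on first failure.
    for &nic in nics {
        if !join(nic, underlay) {
            m2p.remove(&group);
            return Err(McastError::UnderlayMcastJoinFailed(underlay));
        }
    }
    Ok(())
}
```

With this shape, a caller that sees `Err(UnderlayMcastJoinFailed(..))` can simply call the function again on the next pass, because no mapping is left behind by the failed attempt.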
1 parent 401cc43 commit 0e64f2d

36 files changed

Lines changed: 961 additions & 32746 deletions

File tree

clients/ddm-admin-client/src/lib.rs

Lines changed: 0 additions & 2 deletions
@@ -2,8 +2,6 @@
 // License, v. 2.0. If a copy of the MPL was not distributed with this
 // file, You can obtain one at https://mozilla.org/MPL/2.0/.
 
-// Copyright 2026 Oxide Computer Company
-
 #![allow(clippy::redundant_closure_call)]
 #![allow(clippy::needless_lifetimes)]
 #![allow(clippy::match_single_binding)]

illumos-utils/src/opte/illumos.rs

Lines changed: 6 additions & 0 deletions
@@ -76,6 +76,12 @@ pub enum Error {
         "address {0} is not within the underlay multicast subnet (ff04::/16)"
     )]
     InvalidMcastUnderlay(Ipv6Addr),
+
+    #[error(
+        "failed to install NIC multicast MAC filter for underlay {0}, \
+         caller should retry"
+    )]
+    UnderlayMcastJoinFailed(Ipv6Addr),
 }
 
 /// Delete all xde devices on the system.

illumos-utils/src/opte/mod.rs

Lines changed: 1 addition & 1 deletion
@@ -45,7 +45,7 @@ use std::net::Ipv6Addr;
 // `omicron_common::api::external::Vni::DEFAULT_MULTICAST_VNI` live in sibling
 // crates that cannot reference each other's constant. They must stay
 // numerically equal: the MRIB, M2P mappings, and OPTE all route on this
-// value, so any divergence would black-hole multicast traffic.
+// value, so any divergence would silently drop multicast traffic.
 const _: () = assert!(
     oxide_vpc::api::DEFAULT_MULTICAST_VNI
         == omicron_common::api::external::Vni::DEFAULT_MULTICAST_VNI.as_u32(),

illumos-utils/src/opte/non_illumos.rs

Lines changed: 18 additions & 9 deletions
@@ -4,7 +4,7 @@
 
 //! Mock / dummy versions of the OPTE module, for non-illumos platforms.
 //!
-//! Most methods are either `unimplemented!()` or silent no-ops.
+//! Most methods are either `unimplemented!()` or silent noops.
 //! Multicast subscribe/unsubscribe is an exception, as it maintains real
 //! in-memory state because port manager tests assert on subscription contents.
 
@@ -37,6 +37,7 @@ use oxide_vpc::api::SourceFilter;
 use oxide_vpc::api::VpcCfg;
 use sled_agent_types::inventory::NetworkInterfaceKind;
 use slog::Logger;
+use std::collections::BTreeMap;
 use std::collections::HashMap;
 use std::collections::hash_map::Entry;
 use std::net::IpAddr;
@@ -94,6 +95,12 @@ pub enum Error {
         "address {0} is not within the underlay multicast subnet (ff04::/16)"
     )]
     InvalidMcastUnderlay(std::net::Ipv6Addr),
+
+    #[error(
+        "failed to install NIC multicast MAC filter for underlay {0}, \
+         caller should retry"
+    )]
+    UnderlayMcastJoinFailed(std::net::Ipv6Addr),
 }
 
 pub fn initialize_xde_driver(
@@ -198,11 +205,12 @@ pub(crate) struct PortData {
 pub(crate) struct State {
     pub ports: HashMap<String, PortData>,
     pub underlay_initialized: bool,
-    /// Multicast-to-physical mappings, keyed on (group, underlay).
+    /// Multicast-to-physical mappings, keyed by group.
     ///
     /// Persisted across [`Handle`] lifetimes to simulate xde kernel state
-    /// surviving sled-agent restarts.
-    pub m2p: Vec<(oxide_vpc::api::IpAddr, MulticastUnderlay)>,
+    /// surviving sled-agent restarts. Mirrors the upsert-by-group
+    /// semantics of xde's `Mcast2Phys`.
+    pub m2p: BTreeMap<oxide_vpc::api::IpAddr, MulticastUnderlay>,
 }
 
 const NO_RESPONSE: NoResp = NoResp { unused: 99 };
@@ -213,7 +221,7 @@ fn opte_state() -> &'static Mutex<State> {
         Mutex::new(State {
             ports: HashMap::new(),
             underlay_initialized: false,
-            m2p: Vec::new(),
+            m2p: BTreeMap::new(),
         })
     })
 }
@@ -391,9 +399,8 @@ impl Handle {
     /// Set a multicast-to-physical mapping.
     pub fn set_m2p(&self, req: &SetMcast2PhysReq) -> Result<NoResp, OpteError> {
         let mut state = opte_state().lock().unwrap();
-        // Deduplicate by replacing existing entry for the same group.
-        state.m2p.retain(|(g, _)| *g != req.group);
-        state.m2p.push((req.group, req.underlay));
+        // Upsert: replaces any existing entry for the same group.
+        state.m2p.insert(req.group, req.underlay);
         Ok(NO_RESPONSE)
     }
 
@@ -403,7 +410,9 @@ impl Handle {
         req: &ClearMcast2PhysReq,
     ) -> Result<NoResp, OpteError> {
         let mut state = opte_state().lock().unwrap();
-        state.m2p.retain(|(g, u)| !(*g == req.group && *u == req.underlay));
+        if state.m2p.get(&req.group) == Some(&req.underlay) {
+            state.m2p.remove(&req.group);
+        }
         Ok(NO_RESPONSE)
     }
 
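
The upsert and conditional-clear semantics that the mock adopts in this file can be illustrated in isolation. The sketch below is a standalone approximation, not the mock itself: the free functions `set_m2p` and `clear_m2p` are hypothetical simplifications of the `Handle` methods, and `Ipv6Addr` stands in for `MulticastUnderlay` so the example needs only the standard library.

```rust
use std::collections::BTreeMap;
use std::net::{IpAddr, Ipv6Addr};

/// Upsert by group: at most one underlay per group, and a later write
/// replaces the earlier one (this is what `BTreeMap::insert` does).
fn set_m2p(m2p: &mut BTreeMap<IpAddr, Ipv6Addr>, group: IpAddr, underlay: Ipv6Addr) {
    m2p.insert(group, underlay);
}

/// Clear only when the stored underlay matches the request, so a stale
/// clear cannot delete a mapping that has since been replaced.
/// Returns whether an entry was actually removed (the mock itself
/// returns a fixed response instead).
fn clear_m2p(m2p: &mut BTreeMap<IpAddr, Ipv6Addr>, group: IpAddr, underlay: Ipv6Addr) -> bool {
    if m2p.get(&group) == Some(&underlay) {
        m2p.remove(&group);
        true
    } else {
        false
    }
}
```

Keying the map by group alone makes the "one underlay per group" invariant structural, whereas the previous `Vec` of pairs relied on every writer remembering to `retain` before `push`.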
