Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
24 commits
Select commit Hold shift + click to select a range
dfb34d4
ls-apis needs to detect cycles in dependency unit graph
davepacheco Jan 22, 2026
172dfef
remove eprintln
davepacheco Jan 22, 2026
d133557
move to api-manifest.toml
sunshowers Jan 23, 2026
5e04994
make check exhaustive
sunshowers Jan 23, 2026
d714b7d
use the dependency_filter_rules list, and add more notes and permalinks
sunshowers Jan 26, 2026
0e08578
self-review fixes
sunshowers Jan 29, 2026
c74b429
more self-review fixes
sunshowers Jan 29, 2026
c2e885a
wind back the clock to when the new metadata was a new type of thing …
davepacheco Feb 5, 2026
b7ac647
localhost -> intra-deployment-unit
davepacheco Feb 5, 2026
9103432
tool could be less wordy
davepacheco Feb 5, 2026
591a205
remove ClientMatcher
davepacheco Feb 5, 2026
66a2c18
use explicit enum instead of Option-with-semantics
davepacheco Feb 5, 2026
58e20d3
clean up docs and user messaging around IDU validation
davepacheco Feb 5, 2026
6d11b14
validation
davepacheco Feb 6, 2026
af29995
comment nit
davepacheco Feb 6, 2026
b742226
update metadata to reflect updates made before winding the clock back
davepacheco Feb 6, 2026
5916ad6
pull forward another change that was wound back
davepacheco Feb 6, 2026
a25117a
review the actual metadata
davepacheco Feb 7, 2026
d4dae9c
add issue links; fix clippy
davepacheco Feb 7, 2026
42cfff2
Merge branch 'main' into dap/ls-apis-fix
davepacheco Feb 19, 2026
6fca775
update this PR for addition of bootstrap-agent-lockstep API
davepacheco Feb 19, 2026
9ff04d2
expectorate
davepacheco Feb 20, 2026
3c13a2e
Merge branch 'main' into dap/ls-apis-fix
davepacheco Mar 2, 2026
21716da
Merge branch 'main' into dap/ls-apis-fix
davepacheco Mar 3, 2026
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions Cargo.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

1 change: 1 addition & 0 deletions dev-tools/ls-apis/Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -14,6 +14,7 @@ cargo_metadata.workspace = true
clap.workspace = true
iddqd.workspace = true
indent_write.workspace = true
itertools.workspace = true
newtype_derive.workspace = true
parse-display.workspace = true
petgraph.workspace = true
Expand Down
241 changes: 238 additions & 3 deletions dev-tools/ls-apis/api-manifest.toml
Original file line number Diff line number Diff line change
Expand Up @@ -181,10 +181,18 @@ versioned_how_reason = "depends on itself (i.e., instances call each other)"
client_package_name = "bootstrap-agent-lockstep-client"
label = "Bootstrap Agent Lockstep API"
server_package_name = "bootstrap-agent-lockstep-api"
versioned_how = "server"
versioned_how = "client"
versioned_how_reason = "wicketd can wind up calling into sled agent on different sleds"
notes = """
Lockstep API for rack initialization. Only used by wicketd, which is in the \
same deployment unit (Host OS) as the sled-agent that exposes this API.
Lockstep API for rack initialization. Only used by wicketd. Generally
speaking, Wicketd only talks to the sled agent on the same sled, which is in the
same deployment unit as Wicketd itself (host OS). However, it may pick a
different one if for some reason the local baseboard information is not
available.

Either way, none of this matters for API versioning or software update because
we expect that during rack initialization, all components are running the same
system release.
"""

[[apis]]
Expand Down Expand Up @@ -612,3 +620,230 @@ note = """
Sled Agent uses the Crucible Agent client types only, and only in the simulated
sled agent.
"""

################################################################################
# Intra-deployment-unit-only edges
#
# These edges are excluded from the deployment unit dependency graph because
# they represent communication that only happens locally within a *single
# instance* of a single deployment unit, not across deployment unit boundaries.
# A single deployment unit is updated atomically, so there's no risk of version
# skew with intra_deployment_unit-only edges.
#
# Note that things do not belong here when they cross different instances of the
# same deployment unit. For example, Sled Agent and MGS are in the same
# deployment unit (the host OS), but Sled Agent's use of MGS crosses different
# instances of the deployment unit (i.e., different sleds). Those don't belong
# in this category.
################################################################################

[[intra_deployment_unit_only_edges]]
server = "ddmd"
client = "dpd-client"
note = """
ddmd runs in both the global zone and the switch zone. The switch zone is
configured with dpd_host=[::1] (services.rs:3127). The global zone configures
"dendrite" to false, meaning it does not expose a dendrite server
(ddm/manifest.xml:25).
"""
permalinks = [
"https://github.com/oxidecomputer/omicron/blob/6cef874/sled-agent/src/services.rs#L3127",
"https://github.com/oxidecomputer/maghemite/blob/396bb3c/smf/ddm/manifest.xml#L25",
]
Comment on lines +640 to +652

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@internet-diglett or @rcgoodfellow: when you get a minute, can you confirm our understanding here? it looks like ddmd in the GZ never talks to dpd and ddmd in the switch zone only ever talks to its own localhost. Does that sound right? We're just trying to understand these relationships to understand the upgrade impact when these APIs change.

@rcgoodfellow rcgoodfellow Feb 17, 2026

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it looks like ddmd in the GZ never talks to dpd

That's correct.

ddmd in the switch zone only ever talks to its own localhost

If this is referring to dpd specifically then yes. However, ddmd in the switch zone will talk to other global zone ddmd instances on other sleds.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @rcgoodfellow:

ddmd in the switch zone will talk to other global zone ddmd instances on other sleds

definitely sounds like something I missed. Where do those calls happen?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, we were just talking about ddmd talking to dpd via OpenAPI. @rcgoodfellow I think you're talking about a non-OpenAPI interface, right?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah. The ddm to ddm comms are a bit odd. Ddm instances communicate over IPv6 link-local addresses which means we cannot use progenitor clients. At least the last time I looked progenitor was using a set of libraries that depend on the url crate, and that crate has decided that scoped IPv6 link local communication is out of its scope. So while ddm does have a dropshot-based API for these comms, we use a hyper-based client.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rcgoodfellow could you link to the code which does that when you have a chance? Also, do we pass in an api-version header today? I ran rg -i api-version in dendrite and didn't see any results.

@rcgoodfellow rcgoodfellow Feb 17, 2026

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. Here is some hyper client code. Here are the corresponding dropshot server handlers.

Versions are determined via discovery packets that are exchanged over link-local multicast.

There is not currently an api-version header. The versions are explicitly in the path.


[[intra_deployment_unit_only_edges]]
server = "dpd"
client = "gateway-client"

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at the referenced source: it looks like someone could override this by providing a CLI argument to dpd. What do you think about seeing if we can put that CLI option behind a dev-only feature flag so that shipping code can't do this? (I'd of course talk to Levon about this before proceeding.)

(I'd do this as a follow up and just mention it in the note for now.)

Edit: maybe a simpler solution would be to remove these from the SMF start script, if that's possible? I noticed that the mgd -> dendrite dependency looks similar (can be overridden by a CLI argument), but the method script doesn't ever set the --mgs-addr option, which makes it pretty unlikely this would get broken in production.

note = """
dpd defaults to [::1]:MGS_PORT (switch_identifiers.rs:60), and the SMF start
script doesn't pass in --mgs-address. The comment also explains that it
explicitly wants the local MGS and this can only be overridden for testing.
"""
permalinks = [
"https://github.com/oxidecomputer/dendrite/blob/606c0be/dpd/src/switch_identifiers.rs#L60",
"https://github.com/oxidecomputer/dendrite/blob/606c0be/dpd/misc/dpd.xml",
"https://github.com/oxidecomputer/dendrite/blob/606c0be/tools/svc-dpd",
]

[[intra_deployment_unit_only_edges]]
server = "lldpd"
client = "dpd-client"
note = """
lldpd defaults to localhost for dpd (dendrite.rs:194), and the SMF start
script doesn't pass in --host.
"""
permalinks = [
"https://github.com/oxidecomputer/lldp/blob/61479b6/lldpd/src/dendrite.rs#L194",
"https://github.com/oxidecomputer/lldp/blob/61479b6/lldpd/misc/lldpd.xml",
"https://github.com/oxidecomputer/lldp/blob/61479b6/lldpd/misc/svc-lldpd",
]

[[intra_deployment_unit_only_edges]]
server = "lldpd"
client = "gateway-client"
note = """
lldpd defaults to localhost for gateway (main.rs:194), and the SMF start
script doesn't override it.
"""
permalinks = [
"https://github.com/oxidecomputer/lldp/blob/d22509dfdb051321b859e924948605115691b93c/lldpd/src/main.rs#L148-L154",
"https://github.com/oxidecomputer/lldp/blob/d22509dfdb051321b859e924948605115691b93c/lldpd/misc/svc-lldpd",
]

[[intra_deployment_unit_only_edges]]
server = "mgd"
client = "ddm-admin-client"
note = "mg-lower hardcodes localhost:8000."
permalinks = [
"https://github.com/oxidecomputer/maghemite/blob/396bb3c/mg-lower/src/ddm.rs#L110",
]

[[intra_deployment_unit_only_edges]]
server = "mgd"
client = "dpd-client"
note = "mg-lower hardcodes localhost."
permalinks = [
"https://github.com/oxidecomputer/maghemite/blob/396bb3c/mg-lower/src/dendrite.rs#L491",
]

[[intra_deployment_unit_only_edges]]
server = "mgd"
client = "gateway-client"
note = """
mgd defaults to [::1]:12225 (mgd main.rs:93), and the SMF start script
doesn't set --mgs-addr.
"""
permalinks = [
"https://github.com/oxidecomputer/maghemite/blob/396bb3c/mgd/src/main.rs#L93",
"https://github.com/oxidecomputer/maghemite/blob/396bb3c/smf/mgd/manifest.xml",
"https://github.com/oxidecomputer/maghemite/blob/396bb3c/smf/mgd_method_script.sh",
]

[[intra_deployment_unit_only_edges]]
server = "omicron-sled-agent"
client = "ddm-admin-client"
note = "Sled Agent uses Client::localhost() to connect to ddm."
permalinks = [
"https://github.com/oxidecomputer/omicron/blob/6cef874/sled-agent/src/ddm_reconciler.rs#L56",
"https://github.com/oxidecomputer/omicron/blob/6cef874/sled-agent/src/bootstrap/bootstore_setup.rs#L92",
"https://github.com/oxidecomputer/omicron/blob/6cef874/sled-agent/src/rack_setup/service.rs#L1086",
"https://github.com/oxidecomputer/omicron/blob/6cef874/sled-agent/src/sled_agent.rs#L1378",
]

[[intra_deployment_unit_only_edges]]
server = "omicron-sled-agent"
client = "mg-admin-client"
note = """
Sled Agent only queries the switch zone on the same sled,
which is within the same deployment unit as the global
zone.
"""
permalinks = [
"https://github.com/oxidecomputer/omicron/blob/6cef874/sled-agent/src/bootstrap/early_networking.rs#L483",
]

[[intra_deployment_unit_only_edges]]
server = "omicron-sled-agent"
client = "propolis-client"
note = """
Only queries the Propolis server on the same sled, so Propolis is in the
same deployment unit.
"""
permalinks = [
"https://github.com/oxidecomputer/omicron/blob/6cef874/sled-agent/src/instance.rs#L2283",
]

# NOTE: The following sled-agent edges are bugs: they cross deployment units.
# They are kept here to avoid breaking the build, but need to be fixed, either
# through client-side versioning or by some other means.

[[intra_deployment_unit_only_edges]]
server = "omicron-sled-agent"
client = "gateway-client"
note = """
BUG: Sled Agent creates two MGS clients. One of them
(early_networking.rs:376) queries the switch zone on
the same sled, which is within the same deployment unit
as the global zone, so this is okay.

The other one (early_networking.rs:277) queries both
switch zones, so it crosses deployment units. This needs
to be fixed.

Reference: https://github.com/oxidecomputer/omicron/issues/9708
"""
permalinks = [
"https://github.com/oxidecomputer/omicron/blob/6cef874/sled-agent/src/bootstrap/early_networking.rs#L376",
"https://github.com/oxidecomputer/omicron/blob/6cef874/sled-agent/src/bootstrap/early_networking.rs#L277-L280",
]

[[intra_deployment_unit_only_edges]]
server = "omicron-sled-agent"
client = "dpd-client"
note = """
BUG: Sled Agent creates two DPD clients. One of them (early_networking.rs:419)
is always to the switch zone on the same sled, which is in the same deployment
unit.

The other one (services.rs:1090) queries both switch zones, so it crosses
deployment units. This needs to be fixed.

Reference: https://github.com/oxidecomputer/omicron/issues/9708
"""
permalinks = [
"https://github.com/oxidecomputer/omicron/blob/6cef874/sled-agent/src/bootstrap/early_networking.rs#L419",
"https://github.com/oxidecomputer/omicron/blob/6cef874/sled-agent/src/services.rs#L1090",
]

[[intra_deployment_unit_only_edges]]
server = "tfportd"
client = "dpd-client"
note = """
The tfportd service has the dpd_host SMF property (tfport.xml:52) set to [::1]
(services.rs:2852). tfportd has a --dpd-host option (main.rs:81) but the SMF
start script (svc-tfportd:12) does not use it.
"""
permalinks = [
"https://github.com/oxidecomputer/omicron/blob/6cef874/sled-agent/src/services.rs#L2852",
"https://github.com/oxidecomputer/dendrite/blob/606c0be/tfportd/misc/tfport.xml#L52-L53",
"https://github.com/oxidecomputer/dendrite/blob/cf31be9/tfportd/src/main.rs#L81-L83",
"https://github.com/oxidecomputer/dendrite/blob/606c0be/tools/svc-tfportd#L12",
]

[[intra_deployment_unit_only_edges]]
server = "tfportd"
client = "lldpd-client"
note = "tfportd hardcodes localhost for its lldpd-client (tfport.rs:88)."
permalinks = [
"https://github.com/oxidecomputer/dendrite/blob/606c0be/tfportd/src/tfport.rs#L88",
]

[[intra_deployment_unit_only_edges]]
server = "wicketd"
client = "ddm-admin-client"
note = "wicketd uses DdmAdminClient::localhost() (bootstrap_addrs.rs:162)."
permalinks = [
"https://github.com/oxidecomputer/omicron/blob/6cef874/wicketd/src/bootstrap_addrs.rs#L162",
]

[[intra_deployment_unit_only_edges]]
server = "wicketd"
client = "dpd-client"
note = "wicketd hardcodes [::1] (uplink.rs:83)."
permalinks = [
"https://github.com/oxidecomputer/omicron/blob/6cef874/wicketd/src/preflight_check/uplink.rs#L83",
]

[[intra_deployment_unit_only_edges]]
server = "wicketd"
client = "gateway-client"
note = """
wicketd's mgs-address config (wicketd.rs:35) is always set to [::1]
(services.rs:2586). This config is passed into wicketd at startup
(manifest.xml:20).
"""
permalinks = [
"https://github.com/oxidecomputer/omicron/blob/6cef874/sled-agent/src/services.rs#L2685-L2689",
"https://github.com/oxidecomputer/omicron/blob/6cef874/wicketd/src/bin/wicketd.rs#L35",
"https://github.com/oxidecomputer/omicron/blob/6cef874/smf/wicketd/manifest.xml#L20",
]
62 changes: 61 additions & 1 deletion dev-tools/ls-apis/src/api_metadata.rs
Original file line number Diff line number Diff line change
Expand Up @@ -30,6 +30,7 @@ pub struct AllApiMetadata {
deployment_units: BTreeMap<DeploymentUnitName, DeploymentUnitInfo>,
dependency_rules: BTreeMap<ClientPackageName, Vec<DependencyFilterRule>>,
ignored_non_clients: BTreeSet<ClientPackageName>,
intra_deployment_unit_only_edges: Vec<IntraDeploymentUnitOnlyEdge>,
}

impl AllApiMetadata {
Expand Down Expand Up @@ -73,6 +74,13 @@ impl AllApiMetadata {
&self.ignored_non_clients
}

/// Returns the list of intra-deployment-unit-only edges
pub fn intra_deployment_unit_only_edges(
&self,
) -> &[IntraDeploymentUnitOnlyEdge] {
&self.intra_deployment_unit_only_edges
}

/// Returns how we should filter the given dependency
pub(crate) fn evaluate_dependency(
&self,
Expand Down Expand Up @@ -138,6 +146,7 @@ struct RawApiMetadata {
deployment_units: Vec<DeploymentUnitInfo>,
dependency_filter_rules: Vec<DependencyFilterRule>,
ignored_non_clients: Vec<ClientPackageName>,
intra_deployment_unit_only_edges: Vec<IntraDeploymentUnitOnlyEdge>,
}

impl TryFrom<RawApiMetadata> for AllApiMetadata {
Expand Down Expand Up @@ -188,17 +197,41 @@ impl TryFrom<RawApiMetadata> for AllApiMetadata {
for client_pkg in raw.ignored_non_clients {
if !ignored_non_clients.insert(client_pkg.clone()) {
bail!(
"entry in ignored_non_clients appearead twice: {:?}",
"entry in ignored_non_clients appeared twice: {:?}",
&client_pkg
);
}
}

// Validate that IDU-only edges reference only known server components
// and APIs.
let known_components: BTreeSet<_> =
deployment_units.values().flat_map(|u| u.packages.iter()).collect();
for edge in &raw.intra_deployment_unit_only_edges {
if !known_components.contains(&edge.server) {
bail!(
"intra_deployment_unit_only_edges: \
unknown server component {:?}",
edge.server
);
}

if !apis.contains_key(&edge.client) {
bail!(
"intra_deployment_unit_only_edges: \
unknown client {:?}",
edge.client,
);
}
}

Ok(AllApiMetadata {
apis,
deployment_units,
dependency_rules,
ignored_non_clients,
intra_deployment_unit_only_edges: raw
.intra_deployment_unit_only_edges,
})
}
}
Expand Down Expand Up @@ -416,3 +449,30 @@ pub enum Evaluation {
/// This dependency should be part of the update DAG
Dag,
}

/// An edge that should be excluded from the deployment unit dependency graph
/// because it represents communication that only happens locally within a
/// single instance of a single deployment unit.
#[derive(Deserialize)]
#[serde(deny_unknown_fields)]
pub struct IntraDeploymentUnitOnlyEdge {
/// The server component that consumes the API.
pub server: ServerComponentName,
/// The client package consumed.
pub client: ClientPackageName,
/// Explanation of why this edge is intra-deployment-unit-only.
pub note: String,
/// Permalinks to source code referenced by `note`
pub permalinks: Vec<String>,
}

impl IntraDeploymentUnitOnlyEdge {
/// Returns true if this rule matches the given (server, client) pair.
pub fn matches(
&self,
server: &ServerComponentName,
client: &ClientPackageName,
) -> bool {
self.server == *server && self.client == *client
}
}
9 changes: 8 additions & 1 deletion dev-tools/ls-apis/src/bin/ls-apis.rs
Original file line number Diff line number Diff line change
Expand Up @@ -252,7 +252,14 @@ fn print_server_components<'a>(
);
}
for (c, path) in apis.component_apis_consumed(s, filter)? {
println!("{} consumes: {}", prefix, c);
if apis.idu_only_edge_note(s, c).is_some() {
println!(
"{} consumes: {} (intra-deployment-unit-only)",
prefix, c
);
} else {
println!("{} consumes: {}", prefix, c);
}
if show_deps {
for p in path.nodes() {
println!("{} via: {}", prefix, p);
Expand Down
Loading
Loading