
Delta npds#1836

Open
jrajahalme wants to merge 4 commits into main from delta-npds


@jrajahalme (Member) commented Apr 6, 2026

Add a new NetworkPolicyResourceDiscoveryService that implements delta updates for policies and selectors, and where policies refer to selectors by their resource name.

NPRDS adds a top-level oneof wrapper that wraps either a Selector or a NetworkPolicy. The NetworkPolicy definition is shared with NPDS, but PortNetworkPolicyRule gains a new selectors field that is only used with NPRDS.
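The wrapper and the by-name selector references described above can be pictured with a small C++ model. This is only a sketch under stated assumptions: `Selector`, `PortRule`, `NetworkPolicy`, `Resource`, and `ResourceStore` are hypothetical stand-ins, the `identities` payload is invented for illustration, and none of this mirrors the actual proto definitions or the code in this PR.

```cpp
#include <algorithm>
#include <cstdint>
#include <map>
#include <string>
#include <variant>
#include <vector>

// Hypothetical stand-ins for the NPRDS messages; names and fields are
// illustrative and do not mirror the actual proto definitions.
struct Selector {
  std::string name;                       // resource name that rules refer to
  std::vector<std::uint32_t> identities;  // assumed payload: selected identities
};

struct PortRule {
  std::vector<std::string> selectors;  // selector resource names
};

struct NetworkPolicy {
  std::uint64_t endpoint_id;
  std::vector<PortRule> rules;
};

// Models the top-level oneof: each resource is either a Selector or a policy.
using Resource = std::variant<Selector, NetworkPolicy>;

class ResourceStore {
public:
  // Upsert a resource, dispatching on which alternative the wrapper holds.
  void apply(const Resource& resource) {
    if (const auto* selector = std::get_if<Selector>(&resource)) {
      selectors_.insert_or_assign(selector->name, *selector);
    } else {
      const auto& policy = std::get<NetworkPolicy>(resource);
      policies_.insert_or_assign(policy.endpoint_id, policy);
    }
  }

  // True if any rule of the endpoint's policy names a selector that currently
  // contains `identity`. Names are resolved at lookup time, so updating a
  // selector takes effect without resending the policy.
  bool allows(std::uint64_t endpoint_id, std::uint32_t identity) const {
    const auto policy = policies_.find(endpoint_id);
    if (policy == policies_.end()) return false;
    for (const auto& rule : policy->second.rules) {
      for (const auto& name : rule.selectors) {
        const auto selector = selectors_.find(name);
        if (selector == selectors_.end()) continue;  // not delivered yet
        const auto& ids = selector->second.identities;
        if (std::find(ids.begin(), ids.end(), identity) != ids.end()) return true;
      }
    }
    return false;
  }

private:
  std::map<std::string, Selector> selectors_;
  std::map<std::uint64_t, NetworkPolicy> policies_;
};
```

In this model a delta update that replaces only the `Selector` named by a rule changes the verdict for every policy that refers to it, which is the motivation for referring to selectors by resource name.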

NetworkPolicyMap switches to delta mode eagerly when there is evidence that the agent is capable (via the BpfMetadata listener filter config), but switches to SotW mode only when the xDS stream transport has failed to connect or has closed. This should work for Cilium Agent upgrades and downgrades, as the agent expresses the desired mode, and upgraded agents listen for both SotW NPDS and Delta NPRDS.
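The asymmetric switching rule can be summarized as a small decision function. This is a minimal sketch, not the PR's implementation: `Mode`, `StreamState`, and `shouldResubscribe` are hypothetical names, and the `healthy` flag is an assumed simplification of "the transport connected and has not closed".

```cpp
// Hypothetical model of the subscription mode; not the actual Envoy/Cilium types.
enum class Mode { SotW, Delta };

struct StreamState {
  Mode current;  // mode of the active subscription
  bool healthy;  // false once the transport failed to connect or has closed
};

// Returns true when the policy map should (re-)subscribe using `desired`.
inline bool shouldResubscribe(const StreamState& state, Mode desired) {
  // A broken stream is always replaced, using the latest desired config.
  if (!state.healthy) return true;
  // A healthy stream is only disrupted to upgrade from SotW to Delta.
  return desired == Mode::Delta && state.current != Mode::Delta;
}
```

Under this rule a healthy Delta stream is never torn down just because the desired mode changed to SotW; the downgrade waits until the stream fails or closes.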

Start from an empty network policy resource map on the first update on a new stream. This fixes NACK cases where further updates on the stream would have IP collisions with resources that were kept from the previous stream, originating from the previous instance of the restarted Cilium Agent.

The network policy map maintains a stream generation number to detect new streams. This is implemented using a new stream events callback added to upstream Envoy gRPC Mux classes via a new patch.
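The interplay between the generation number and the "start from empty on the first update of a new stream" behavior might look roughly like the following. This is a sketch under assumptions: `PolicyResourceMap`, `onNewStream`, and `onUpdate` are hypothetical names, resources are reduced to name/address string pairs, and the real callback signature added by the Envoy patch is not shown here.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical model of a resource map that resets on the first update of a
// new stream; it does not reproduce the actual NetworkPolicyMap interface.
class PolicyResourceMap {
public:
  // Assumed hook, called from the stream events callback on each new stream.
  void onNewStream() { ++stream_generation_; }

  // Apply an update. If this is the first update seen on the current stream,
  // drop everything kept from the previous stream before applying it.
  void onUpdate(const std::map<std::string, std::string>& added) {
    if (applied_generation_ != stream_generation_) {
      resources_.clear();
      applied_generation_ = stream_generation_;
    }
    for (const auto& [name, address] : added) {
      resources_.insert_or_assign(name, address);
    }
  }

  std::size_t size() const { return resources_.size(); }
  bool contains(const std::string& name) const { return resources_.count(name) != 0; }

private:
  std::uint64_t stream_generation_ = 0;   // bumped on every new stream
  std::uint64_t applied_generation_ = 0;  // generation of the last applied update
  std::map<std::string, std::string> resources_;
};
```

Note that in this sketch the old resources stay in place until the first update arrives on the new stream, so a resource from the restarted agent can reuse an address without colliding with a stale entry.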

NOTE: This includes commits from the following PRs that should be merged first:

@jrajahalme jrajahalme requested a review from a team as a code owner April 6, 2026 18:42
@jrajahalme jrajahalme added the preview-only Preview only label Apr 6, 2026
@jrajahalme jrajahalme requested a review from sayboras April 6, 2026 18:42
@jrajahalme jrajahalme marked this pull request as draft April 6, 2026 18:42
@jrajahalme jrajahalme requested review from fristonio and nezdolik and removed request for sayboras April 6, 2026 18:42
Comment thread cilium/api/bpf_metadata.proto Outdated
@jrajahalme jrajahalme force-pushed the delta-npds branch 5 times, most recently from 6ff6c15 to db12d3e Compare April 19, 2026 09:36
@jrajahalme jrajahalme force-pushed the delta-npds branch 2 times, most recently from fba7008 to b337bce Compare May 6, 2026 15:44
@jrajahalme jrajahalme added enhancement New feature or request and removed preview-only Preview only labels May 6, 2026
@jrajahalme jrajahalme marked this pull request as ready for review May 6, 2026 15:51
@jrajahalme jrajahalme requested a review from nezdolik May 6, 2026 15:52
@jrajahalme jrajahalme force-pushed the delta-npds branch 3 times, most recently from 698f973 to f2c684a Compare May 23, 2026 20:34
@jrajahalme (Member, Author)

Changed to clear the resource map on new streams before parsing policies. This forces unchanged policies to be re-parsed so that they use the current subscription's ConfigSource for Secret watchers.

@nezdolik (Contributor) left a comment

Did one quick pass; will need to do one more round.

Comment thread cilium/host_map.cc Outdated
Comment thread cilium/host_map.cc Outdated
Comment thread cilium/host_map.h
@jrajahalme jrajahalme force-pushed the delta-npds branch 2 times, most recently from 30e03f7 to 2dc2c93 Compare May 25, 2026 23:23
@jrajahalme jrajahalme requested a review from nezdolik May 25, 2026 23:24
@jrajahalme jrajahalme force-pushed the delta-npds branch 5 times, most recently from 541b0fd to ded3e9f Compare May 26, 2026 20:46
Make sure each test case has the endpoint ID field in the NetworkPolicy so
that we can validate it.

Signed-off-by: Jarno Rajahalme <jarno@isovalent.com>

Replace the Config::GrpcMuxImpl wrapper with a stream event callback patch on
upstream Envoy so that new stream detection works on all the needed Mux types
for SotW, Delta, and ADS.

New stream detection is the means by which we detect Cilium Agent
restarts, which generally requires the ipcache bpf map to be
reopened. Delta updates also depend on this detection to force
synchronization as the restarted agent may not know which resources to
remove.

Signed-off-by: Jarno Rajahalme <jarno@isovalent.com>

Add a Delta RPC to the APIs so that NPDS and NPHDS can also run via
Delta xDS.

Signed-off-by: Jarno Rajahalme <jarno@isovalent.com>

Add a new generic container, cilium/versioned.h, for transactional selector
updates.

Add a new NetworkPolicyResourceDiscoveryService that implements delta
(and SotW) updates for policies and selectors, and where policies refer
to selectors by their resource name.

NPRDS adds a top-level oneof wrapper that wraps either a Selector or a
NetworkPolicy. NetworkPolicy definition is shared with NPDS, but
PortNetworkPolicyRule adds a new selectors field that is only used with
NPRDS.

Add 'policy_type' enum to BpfMetadata config to control whether NPDS
(default) or NPRDS is used.

Store the latest desired ConfigSource in the policy map and use it for:
- the initial policy map subscription
- re-subscription when the connection under the current subscription is terminated

A healthy network policy stream is not disrupted unless the desired
config is for delta xDS and the current one is not.

This means that we switch to NPRDS (Delta) mode eagerly when we have
evidence that the agent is capable, but we switch to NPDS (SotW) mode
only when the xDS stream transport has failed to connect or has closed.

This should work for Cilium Agent upgrades and downgrades, as the agent
expresses the desired mode, and listens for both.

Clear the resource map on the first update on a new stream. This fixes NACK
cases where further updates on the stream would have IP collisions with
resources that were kept from the previous stream.

Signed-off-by: Jarno Rajahalme <jarno@isovalent.com>

Labels

dont-merge/preview-only DON'T MERGE enhancement New feature or request


2 participants