Skip to content

bfdd, lib: extend ZEBRA_BFD_DEST_REGISTER with SBFD/SRv6 fields#22068

Open
rajshekhar-nexthop wants to merge 2 commits into
FRRouting:masterfrom
rajshekhar-nexthop:zapi-bfd-dest-register-sbfd-fields
Open

bfdd, lib: extend ZEBRA_BFD_DEST_REGISTER with SBFD/SRv6 fields#22068
rajshekhar-nexthop wants to merge 2 commits into
FRRouting:masterfrom
rajshekhar-nexthop:zapi-bfd-dest-register-sbfd-fields

Conversation

@rajshekhar-nexthop
Copy link
Copy Markdown

@rajshekhar-nexthop rajshekhar-nexthop commented May 26, 2026

Summary

Append an optional flag-gated tail to the ZEBRA_BFD_DEST_REGISTER /
DEST_UPDATE / DEST_DEREGISTER ZAPI payload so external BFD clients
(e.g. pathd) can register S-BFD over SRv6 sessions without
introducing a new opcode.

Modeled on the ZAPI_MESSAGE_* pattern in lib/zclient.h: a leading
uint16 flags word indicates which optional fields follow, so
"field absent" and "field set to zero" remain distinguishable
(remote_discr == 0 for sbfd_init is a legitimate value), and new
optional fields can be added by claiming a new bit without reshuffling
positional probing on either side.

Tail format:

  w  bfd_regext_flags          (always, when tail is emitted)
  if FLAG_BFD_MODE:     c      bfd_mode
  if FLAG_REMOTE_DISCR: l      remote_discr
  if FLAG_SRV6_SOURCE:  16     srv6_source_ipv6
  if FLAG_SEG_LIST:     c      seg_num + 16*seg_num bytes seg_list
  if FLAG_BFD_NAME:     c      bfd_name_len + X bytes bfd_name

Wire compatibility: classical-BFD callers leave bfd_regext_flags
zero and produce byte-identical wire frames to the pre-extension
layout. The decoder detects the no-tail case via STREAM_READABLE;
on a non-empty tail, it reads the flags word and conditionally reads
each flagged field. Bytes past the last flagged field this decoder
recognises are silently ignored, so newer senders may append fields
by claiming new flag bits.

Implementation notes:

  • lib/bfd.c::zclient_bfd_command emits the flags word + flagged
    fields iff bfd_regext_flags != 0 — including on deregister frames,
    because bfd_name participates in gen_bfd_key and a named session
    cannot be located for deletion without it. Encoder also rejects
    FLAG_REMOTE_DISCR without FLAG_BFD_MODE, and FLAG_BFD_MODE with
    classical mode (0).

  • bfdd/ptm_adapter.c::_ptm_msg_read populates
    bpc->bfd_regext_flags and conditionally reads each flagged field.
    Bounds-checks seg_num, bfd_name_len, and bfd_mode. Rejects
    SBFD_INIT mode without FLAG_REMOTE_DISCR (RFC 7881 §3,
    discriminator 0 reserved).

  • ptm_bfd_sess_new propagates the new fields onto the freshly
    allocated bfd_session before bs_registrate, since
    bs_registrate -> bfd_session_enable dispatches on bs->bfd_mode
    to pick the SRH vs classical UDP socket, and bs_peer_find keys on
    bfd_name. Gating is on bpc->bfd_regext_flags bits.

  • _bfd_session_update propagates out_sip6, seg_list[], and
    remote_discr on re-register, also gated by flag bits — so a client
    can change a session's SRv6 path without tearing it down. bfd_mode
    and bfd_name are deliberately not mutated on update (would require
    socket re-open / key re-hash; callers wanting that must deregister

    • re-register).

Only the S-BFD echo path is exercised end-to-end. The wire layout
reserves room for SBFD_INIT but the init-session path is not driven
by this change.

Commits:

  1. bfdd, lib: extend ZEBRA_BFD_DEST_REGISTER with SBFD/SRv6 fields
  2. tests: topotest for ZEBRA_BFD_DEST_REGISTER SBFD tail — three
    cases: SBFD register, classical (no-tail) register, and same-name
    SBFD re-register dedupe.

Related Issue

N/A

Components

bfdd, lib, tests

@frrbot frrbot Bot added bfd libfrr tests Topotests, make check, etc labels May 26, 2026
@rajshekhar-nexthop rajshekhar-nexthop force-pushed the zapi-bfd-dest-register-sbfd-fields branch 2 times, most recently from b393cb7 to ba45a6b Compare May 26, 2026 12:48
Comment thread bfdd/bfd.h Outdated

/*
* SBFD over SRv6 fields carried in the optional tail of
* the ZEBRA_BFD_DEST_REGISTER payload. Defaults of zero match
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's the relationship with the BFD_DEST_UPDATE message, which has a handler in lib/bfd.c ?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the review, and apologies for the delay in getting back.

This is ZEBRA_INTERFACE_BFD_DEST_UPDATE — the bfdd → client state-notification path (handled in lib/bfd.c by zclient_bfd_session_updatebfd_get_peer_info). This PR doesn't touch it; only the client → bfdd register/update/deregister side is extended.

The notification already carries bfd_name (written in ptm_bfd_notify, parsed by bfd_get_peer_info), and SBFD sessions are keyed on bfd_name in gen_bfd_key, so clients can match the notification back to their session without any change there.

If you feel SBFD-aware fields should also ride on the notification, happy to take that up as a follow-up.

Comment thread bfdd/ptm_adapter.c Outdated
* - c: bfd_name length
* - X bytes: bfd_name
*
* The encoder emits the tail only when at least one SBFD field is
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, I agree with the comment about the fragility of this design. it would be better to have explicit indications about the additional optional fields, as we do with the route zapi messages.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the review and the pointer to the route-zapi pattern, and sorry for the delay in turning this around.

Agreed, reworked in v2 (force-pushed). Optional tail now starts with a uint16 bfd_regext_flags word and per-field BFD_REGEXT_FLAG_* bits, modeled on ZAPI_MESSAGE_* in lib/zclient.h. remote_discr == 0 for sbfd_init is now unambiguous, bfd_name is no longer coupled to SBFD mode, and new fields can just claim a new bit.

While at it, also fixed three issues the old gate-by-value design had:

  • DEST_DEREGISTER now carries the tail too — without bfd_name, bs_peer_find couldn't locate named sessions for deletion.
  • _bfd_session_update propagates out_sip6 / seg_list / remote_discr on re-register (gated by flag bits) — so pathd can change a session's SR path without bouncing it.
  • Decoder rejects SBFD_INIT mode without FLAG_REMOTE_DISCR (RFC 7881 §3, discriminator 0 reserved).

Please have a look when you get a chance.

@rajshekhar-nexthop rajshekhar-nexthop force-pushed the zapi-bfd-dest-register-sbfd-fields branch from ba45a6b to 850a93d Compare June 2, 2026 09:07
@rajshekhar-nexthop rajshekhar-nexthop force-pushed the zapi-bfd-dest-register-sbfd-fields branch from 850a93d to c2cabaa Compare June 2, 2026 16:57
@github-actions github-actions Bot added the master label Jun 2, 2026
Append an optional flag-gated tail to the ZEBRA_BFD_DEST_REGISTER /
DEST_UPDATE / DEST_DEREGISTER ZAPI payload so external BFD clients
(e.g. pathd) can register S-BFD sessions over ZAPI without introducing
a new opcode.

Modeled on the ZAPI_MESSAGE_* pattern in lib/zclient.h: a leading
uint16 flags word in the optional tail indicates which optional fields
follow, so "field absent" and "field set to zero" remain
distinguishable (remote_discr == 0 on sbfd_init is a legitimate value)
and new optional fields can be added by claiming a new bit without
reshuffling positional probing.

The wire-format extension is backward compatible:

* lib/bfd.c::zclient_bfd_command emits the flags word + flagged
  fields iff `bfd_regext_flags != 0`. Includes deregister frames so
  a named session (whose key includes `bfd_name`) can be located for
  deletion; classical-BFD callers leave `bfd_regext_flags` zero and
  produce byte-identical wire frames. Rejects nonsensical flag
  combinations (FLAG_REMOTE_DISCR without FLAG_BFD_MODE,
  FLAG_BFD_MODE with mode 0) at the encoder boundary.

* bfdd/ptm_adapter.c::_ptm_msg_read detects the no-tail case via
  STREAM_READABLE; on a non-empty tail, reads the flags word into
  `bpc->bfd_regext_flags` and conditionally reads each flagged field.
  Bounds-checks `seg_num` and `bfd_name_len` on tail-present frames.

* ptm_bfd_sess_new propagates the new fields onto the freshly-
  allocated bfd_session before bs_registrate (because bs_registrate
  -> bfd_session_enable dispatches on bs->bfd_mode and bs_peer_find
  keys on bfd_name). Gating is on `bpc->bfd_regext_flags` bits so an
  explicit `remote_discr == 0` is honoured.

* _bfd_session_update propagates `out_sip6`, `seg_list[]`, and
  `remote_discr` on re-register, also gated by flag bits. `bfd_mode`
  and `bfd_name` are deliberately not mutated on update (changing
  either requires socket re-open / key re-hash, which only the
  initial-create path performs).

Only the S-BFD echo path is exercised end-to-end. The wire layout
reserves room for SBFD_INIT but the init-session path is not driven
by this change.
Verifies the wire-payload extension landed in the previous patch.
The test process spawns a Python ZAPI client inside r1's network
namespace, performs the ZEBRA_HELLO / ZEBRA_BFD_CLIENT_REGISTER /
ZEBRA_BFD_DEST_REGISTER handshake against zebra, and asserts that
bfdd's session table reflects what was sent.

Three cases:

* test_zapi_sbfd_register_creates_session - register frame carries
  the full SBFD tail (bfd_regext_flags + bfd_mode + remote_discr +
  srv6_source_ipv6 + seg_list[2] + bfd_name="zapi-sbfd-test").
  Asserts a bfdd peer with the expected destination materialises
  and that bfd_name was preserved through the tail.

* test_zapi_classical_register_still_accepted - register frame omits
  the SBFD tail entirely (no flags word), exercising both the encoder
  gate (lib/bfd.c emits no tail when bfd_regext_flags == 0) and the
  decoder's STREAM_READABLE-gated tail handling.

* test_zapi_sbfd_register_deduplicates_by_bfd_name - repeats the
  SBFD register from the first test and asserts the peer count for
  the same destination does not grow. This catches a key-table
  asymmetry (bs_peer_find uses bpc->bfd_name in the lookup key, so
  ptm_bfd_sess_new must populate bs->key.bfdname on insert - see
  the production change in the previous patch).
@rajshekhar-nexthop rajshekhar-nexthop force-pushed the zapi-bfd-dest-register-sbfd-fields branch from c2cabaa to 38c67d5 Compare June 2, 2026 17:34
@rajshekhar-nexthop rajshekhar-nexthop marked this pull request as ready for review June 2, 2026 17:46
@greptile-apps
Copy link
Copy Markdown

greptile-apps Bot commented Jun 2, 2026

Greptile Summary

This PR extends the ZEBRA_BFD_DEST_REGISTER / DEST_UPDATE / DEST_DEREGISTER ZAPI payload with an optional flag-gated tail carrying SBFD/SRv6 fields (bfd_mode, remote_discr, srv6_source_ipv6, seg_list[], bfd_name), modeled on the existing ZAPI_MESSAGE_* pattern. Classical-BFD callers emit no tail and remain wire-compatible; the decoder detects the tail via STREAM_READABLE.

  • lib/bfd.c and lib/bfd.h define the BFD_REGEXT_FLAG_* bitmask, extend bfd_session_arg, and encode the tail with cross-flag validation.
  • bfdd/ptm_adapter.c decodes the tail with bounds checks and rejects SBFD_INIT without a discriminator; ptm_bfd_sess_new propagates the new fields before bs_registrate.
  • A new topotest exercises SBFD register, classical backward-compat, and same-name re-register dedupe via raw ZAPI frames.

Confidence Score: 3/5

The new-session creation path is correctly ordered and bounds-checked, but a live SBFD session receiving a DEST_UPDATE with new seg_list or out_sip6 will have those fields silently discarded, which could cause incorrect forwarding while BFD still reports Up.

The encoder/decoder and new-session path look well-constructed. In bfdd_dest_register the bs != NULL branch only applies the BFD profile and never calls bfd_session_update, so the SBFD update code added to _bfd_session_update is unreachable from the normal ZAPI re-register path. The PR description advertises live path updates as a supported operation but the implementation does not wire this up.

bfdd/ptm_adapter.c — the bs != NULL branch in bfdd_dest_register and the comment block at lines 611-625 need reconciliation with the intent to support live SBFD path updates.

Important Files Changed

Filename Overview
bfdd/ptm_adapter.c Adds SBFD/SRv6 optional-tail decoding to _ptm_msg_read with correct bounds checks; however bfdd_dest_register's bs != NULL branch never calls bfd_session_update, so decoded SBFD fields are silently discarded on re-register.
bfdd/bfd.c ptm_bfd_sess_new sets bfd_mode, bfd_name, out_sip6, seg_list, and remote_discr before bs_registrate (correct ordering); _bfd_session_update SBFD propagation is only reachable via ptm_bfd_sess_new's internal duplicate detection, not from bfdd_dest_register re-registers.
bfdd/bfd.h Adds bfd_regext_flags, bfd_mode, remote_discr, srv6_source_ipv6, seg_num, and seg_list to bfd_peer_cfg; types and array sizes match bfd_session counterparts.
lib/bfd.c Encoder emits the optional tail when bfd_regext_flags != 0 with correct cross-flag validation and seg_num bounds check.
lib/bfd.h Defines BFD_REGEXT_FLAG_* bitmask constants and extends bfd_session_arg; layout matches the encoder and decoder.
tests/topotests/bfd_zapi_sbfd_topo1/test_bfd_zapi_sbfd_topo1.py New topotest covers SBFD register, classical backward-compat, and dedupe-by-bfd_name; uses hardcoded positional ZAPI opcode values fragile to upstream enum changes.
tests/topotests/bfd_zapi_sbfd_topo1/r1/frr.conf Minimal single-router config enabling zebra, bfdd, and an IPv6 test interface — no issues.
tests/topotests/bfd_zapi_sbfd_topo1/init.py Empty init file for topotest module discovery — no issues.

Sequence Diagram

sequenceDiagram
    participant C as ZAPI Client
    participant Z as zebra
    participant B as bfdd
    C->>Z: ZEBRA_BFD_DEST_REGISTER fixed fields + optional SBFD tail
    Z->>B: forward message
    B->>B: _ptm_msg_read decode fixed fields
    B->>B: "if STREAM_READABLE>0 read flags_w then flagged fields"
    B->>B: bfdd_dest_register bs_peer_find
    alt session not found
        B->>B: ptm_bfd_sess_new set bfd_mode name sids before bs_registrate
        B->>B: bs_registrate opens SRH or UDP socket
        B->>B: _bfd_session_update timers and profile
    else session found
        B->>B: bfd_profile_apply only SBFD fields NOT updated
    end
    B->>C: ZEBRA_INTERFACE_BFD_DEST_UPDATE state notification
Loading

Comments Outside Diff (2)

  1. bfdd/ptm_adapter.c, line 596-625 (link)

    P1 SBFD field updates silently dropped on re-register

    When bs_peer_find finds an existing session, the bs != NULL branch applies only the BFD profile and then falls through to pcn_new — it never calls bfd_session_update or _bfd_session_update. The SBFD fields decoded by _ptm_msg_read (out_sip6, seg_list, remote_discr) are therefore silently discarded for any DEST_REGISTER/DEST_UPDATE that targets a live session.

    The inline comment here states this is "intentionally immutable from the ZAPI surface", while the PR description states the opposite — "_bfd_session_update propagates out_sip6, seg_list[], and remote_discr on re-register … so a client can change a session's SRv6 path without tearing it down." The changes to _bfd_session_update in bfdd/bfd.c are effectively dead code from the perspective of this handler, since ptm_bfd_sess_new is only reached when bs == NULL.

    If live SBFD path updates are the intended contract, _bfd_session_update(bs, &bpc) (or an equivalent call) must be added to the else branch here. If the "deregister + register" constraint is intentional, the new SBFD propagation code in _bfd_session_update should be removed and the PR description corrected.

    Prompt To Fix With AI
    This is a comment left during a code review.
    Path: bfdd/ptm_adapter.c
    Line: 596-625
    
    Comment:
    **SBFD field updates silently dropped on re-register**
    
    When `bs_peer_find` finds an existing session, the `bs != NULL` branch applies only the BFD profile and then falls through to `pcn_new` — it never calls `bfd_session_update` or `_bfd_session_update`. The SBFD fields decoded by `_ptm_msg_read` (`out_sip6`, `seg_list`, `remote_discr`) are therefore silently discarded for any DEST_REGISTER/DEST_UPDATE that targets a live session.
    
    The inline comment here states this is "intentionally immutable from the ZAPI surface", while the PR description states the opposite — "`_bfd_session_update` propagates `out_sip6`, `seg_list[]`, and `remote_discr` on re-register … so a client can change a session's SRv6 path without tearing it down." The changes to `_bfd_session_update` in `bfdd/bfd.c` are effectively dead code from the perspective of this handler, since `ptm_bfd_sess_new` is only reached when `bs == NULL`.
    
    If live SBFD path updates are the intended contract, `_bfd_session_update(bs, &bpc)` (or an equivalent call) must be added to the `else` branch here. If the "deregister + register" constraint is intentional, the new SBFD propagation code in `_bfd_session_update` should be removed and the PR description corrected.
    
    How can I resolve this? If you propose a fix, please make it concise.
  2. tests/topotests/bfd_zapi_sbfd_topo1/test_bfd_zapi_sbfd_topo1.py, line 481-492 (link)

    P2 Hardcoded positional ZAPI opcodes will silently misfire on rebase

    ZEBRA_HELLO = 19, ZEBRA_BFD_DEST_REGISTER = 27, and ZEBRA_BFD_CLIENT_REGISTER = 35 are positional values from enum zclient_msg_type. If any new command is inserted before these entries in lib/zclient.h, the wrong ZAPI commands will be sent. Zebra issues no wire-level NAK, so the only symptom will be a _wait_for_peer timeout with no diagnostic pointing to the stale opcode. The comment documents this risk and provides a verification command, but deriving the values at test build time (e.g. from the installed zclient.h) would make the test self-healing.

    Prompt To Fix With AI
    This is a comment left during a code review.
    Path: tests/topotests/bfd_zapi_sbfd_topo1/test_bfd_zapi_sbfd_topo1.py
    Line: 481-492
    
    Comment:
    **Hardcoded positional ZAPI opcodes will silently misfire on rebase**
    
    `ZEBRA_HELLO = 19`, `ZEBRA_BFD_DEST_REGISTER = 27`, and `ZEBRA_BFD_CLIENT_REGISTER = 35` are positional values from `enum zclient_msg_type`. If any new command is inserted before these entries in `lib/zclient.h`, the wrong ZAPI commands will be sent. Zebra issues no wire-level NAK, so the only symptom will be a `_wait_for_peer` timeout with no diagnostic pointing to the stale opcode. The comment documents this risk and provides a verification command, but deriving the values at test build time (e.g. from the installed `zclient.h`) would make the test self-healing.
    
    How can I resolve this? If you propose a fix, please make it concise.
Prompt To Fix All With AI
Fix the following 2 code review issues. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 2
bfdd/ptm_adapter.c:596-625
**SBFD field updates silently dropped on re-register**

When `bs_peer_find` finds an existing session, the `bs != NULL` branch applies only the BFD profile and then falls through to `pcn_new` — it never calls `bfd_session_update` or `_bfd_session_update`. The SBFD fields decoded by `_ptm_msg_read` (`out_sip6`, `seg_list`, `remote_discr`) are therefore silently discarded for any DEST_REGISTER/DEST_UPDATE that targets a live session.

The inline comment here states this is "intentionally immutable from the ZAPI surface", while the PR description states the opposite — "`_bfd_session_update` propagates `out_sip6`, `seg_list[]`, and `remote_discr` on re-register … so a client can change a session's SRv6 path without tearing it down." The changes to `_bfd_session_update` in `bfdd/bfd.c` are effectively dead code from the perspective of this handler, since `ptm_bfd_sess_new` is only reached when `bs == NULL`.

If live SBFD path updates are the intended contract, `_bfd_session_update(bs, &bpc)` (or an equivalent call) must be added to the `else` branch here. If the "deregister + register" constraint is intentional, the new SBFD propagation code in `_bfd_session_update` should be removed and the PR description corrected.

### Issue 2 of 2
tests/topotests/bfd_zapi_sbfd_topo1/test_bfd_zapi_sbfd_topo1.py:481-492
**Hardcoded positional ZAPI opcodes will silently misfire on rebase**

`ZEBRA_HELLO = 19`, `ZEBRA_BFD_DEST_REGISTER = 27`, and `ZEBRA_BFD_CLIENT_REGISTER = 35` are positional values from `enum zclient_msg_type`. If any new command is inserted before these entries in `lib/zclient.h`, the wrong ZAPI commands will be sent. Zebra issues no wire-level NAK, so the only symptom will be a `_wait_for_peer` timeout with no diagnostic pointing to the stale opcode. The comment documents this risk and provides a verification command, but deriving the values at test build time (e.g. from the installed `zclient.h`) would make the test self-healing.

Reviews (1): Last reviewed commit: "tests: topotest for ZEBRA_BFD_DEST_REGIS..." | Re-trigger Greptile

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants