Skip to content

Dual-backend 1/3 — RNS core + protocol seam#79

Merged
torlando-tech merged 17 commits into
mainfrom
pr1-rns-core
May 31, 2026
Merged

Dual-backend 1/3 — RNS core + protocol seam#79
torlando-tech merged 17 commits into
mainfrom
pr1-rns-core

Conversation

@torlando-tech

Copy link
Copy Markdown
Owner

Part 1 of 3 of a path-scoped split of feat/dual-backend-arch, sliced to stay under the 100-file Greptile review limit. The whole feature is the Python-RNS migration + the RnsBackend protocol seam (embedded-Python and native-Swift backends) + a runtime engine toggle. Net diff vs main is 190 files; this slice is 44.

This slice — RNS core + protocol seam

RNSAPI (seam + Compat), RNSBackendPy / RNSBackendSwift, PythonBridge, the app/*.py Python RNS sources, Columba.xcodeproj, build / support scripts, CI + docs.

Stacked series (merge in order)

  1. this PRmain
  2. app services + BLE bridge → base pr1-rns-core (67 files)
  3. UI screens + tests → base pr2-app-services (79 files)

The three slices' union is byte-identical to feat/dual-backend-arch. They are chunks of one feature, not independent units: merge 1 → 2 → 3, and an individual slice is not independently buildable (fine for static review; not for per-PR CI).

Deferred (carried here, intentionally not applied)

  • The LXMF-swift 0.4.0 / reticulum-swift 0.3.0 bump is not applied — the current pins are kept (the native Swift backend is verified against them); do the bump separately with a Swift-backend rebuild.
  • main's newer interop test files are present on disk but not yet wired into the ColumbaAppTests target.

🤖 Generated with Claude Code

Path-scoped slice of feat/dual-backend-arch for review (Greptile <=100 files):
RNSAPI seam + Compat, RNSBackendPy/Swift, PythonBridge, app/*.py Python RNS
sources, xcodeproj, build/support scripts, CI/docs. Stacked: PR2 (app services)
and PR3 (UI) build on this. Not independently buildable.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@greptile-apps

greptile-apps Bot commented May 30, 2026

Copy link
Copy Markdown
Contributor

Greptile Summary

This PR introduces the RNS protocol seam — the RnsBackend protocol and its two concrete implementations (PythonRNSBackend wrapping embedded CPython, SwiftRNSBackend wrapping reticulum-swift/LXMF-swift) — plus the full Python rns_bridge.py module, PythonBridge C-API embedding, build/CI scripts, and the RNSAPI Compat/DTO layer. It is the first of three stacked slices and is not independently buildable.

  • RnsBackend protocol seam with neutral DTOs (BackendEvent, PropagationSyncResult, StartParams) replaces all Python-namespaced types in the host; both backends conform identically from AppServices's perspective.
  • PythonRNSBackend / PythonBridge add blockingQueue separation for long-poll calls (propagationSync, fetchNomadNetPage) and explicit drain-loop restart in start() — addressing previously-flagged drain starvation and drain-loop-death issues.
  • SwiftRNSBackend adds linkLock protecting nextLinkId/links/linkStateTasks, explicit task cancellation in stop()/linkTeardown(), and a racing withThrowingTaskGroup in propagationSync to bound the wait.

Confidence Score: 3/5

The CI workflow clones sibling packages without branch/commit pinning, meaning it can silently test against different library versions than Package.resolved specifies — including the wrong branch for LXST-swift.

The CI clone step for LXST-swift omits -b feat/transport-agnostic, so xcodebuild on CI resolves a different branch than what Package.resolved pins. Prior review rounds surfaced a cluster of concurrency bugs many of which the current code addresses; remaining open items are the primary obstacle to production-readiness.

.github/workflows/tests.yml (clone pinning), Sources/RNSBackendSwift/SwiftRNSBackend.swift and Sources/PythonBridge/PythonBridge.swift (open items from prior rounds), app/rns_bridge.py (same).

Important Files Changed

Filename Overview
.github/workflows/tests.yml Adds Python framework/wheel-fetch step before xcodebuild (fixing the previously-missing step), but the Clone sibling packages step clones all three repos without branch or commit pinning — LXST-swift should use -b feat/transport-agnostic and all three should be pinned to match Package.resolved.
Sources/RNSBackendSwift/SwiftRNSBackend.swift New native Swift backend conforming to RnsBackend. linkLock, explicit linkStateTasks cancellation, and a racing task group in propagationSync address the data-race and unbounded-block issues from prior review rounds.
Sources/RNSBackendPy/PythonRNSBackend.swift Python-backed RnsBackend adapter. The drain-loop restart fix is present. Clean mapping layer between PythonBridge enums and RNSAPI neutrals.
Sources/PythonBridge/PythonBridge.swift CPython embedding bridge. fetchNomadNetPage and propagationSync now run on blockingQueue avoiding drain starvation. Some C-API allocations (PyFloat_FromDouble, PyLong_FromLongLong) are still passed directly to PyTuple_SetItem without NULL guards.
app/rns_bridge.py Core Python RNS/LXMF bridge. alloc_link_id race, link callback ordering, deadlock on teardown, and reset_identity cleanup are all addressed in the current code. _propagation_state_name rebuilds its mapping dict on every poll call.
Sources/RNSAPI/Protocols/RnsBackend.swift Clean protocol seam definition; BackendEvent, PropagationSyncResult, and the composed RnsBackend protocol mirror the Android :rns-api layout. No issues found.
support/fetch-wheels.sh Wheel-fetch script with explicit env-var overrides for all three Python dependencies. ble-reticulum is now pinned to a specific commit hash via BLE_RETICULUM_REF.

Sequence Diagram

sequenceDiagram
    participant UI as SwiftUI / AppServices
    participant B as RnsBackend (protocol)
    participant Py as PythonRNSBackend
    participant Br as PythonBridge (queue / blockingQueue)
    participant Py2 as rns_bridge.py (CPython)
    participant Sw as SwiftRNSBackend
    participant RS as reticulum-swift / LXMFSwift

    UI->>B: start(params)
    B->>Py: start(params)
    Py->>Br: bridge.start(...) [serial queue]
    Br->>Py2: PyObject_CallObject(start)
    Py2-->>Br: identity_hash, destination_hash
    Br-->>Py: LocalInfo
    Py-->>UI: LocalInfo

    loop Drain loop 200ms
        Py->>Br: bridge.drainEvents() [serial queue]
        Br->>Py2: drain_events()
        Py2-->>Br: event dicts
        Br-->>Py: PythonBridge.Event list
        Py->>UI: eventContinuation.yield(BackendEvent)
    end

    UI->>B: propagationSync(timeout)
    B->>Py: propagationSync
    Py->>Br: bridge.propagationSync [blockingQueue]
    Br->>Py2: propagation_sync() blocks up to timeout
    Py2-->>Br: PropagationSyncResult dict
    Br-->>Py: PropagationSyncResult
    Py-->>UI: PropagationSyncResult

    Note over Sw,RS: Swift backend path
    UI->>B: start(params)
    B->>Sw: start(params)
    Sw->>RS: ReticulumTransport + LXMRouter init
    RS-->>Sw: transport / router
    Sw->>Sw: startAnnouncePolling()
    Sw-->>UI: LocalInfo

    RS->>Sw: RouterDelegate.didReceiveMessage
    Sw->>UI: eventContinuation.yield(.inbound)
Loading

Comments Outside Diff (1)

  1. .github/workflows/tests.yml, line 21-23 (link)

    P1 Sibling-package clones use wrong branch and no commit pin

    git clone without -b <branch> clones the repository's default branch. Package.resolved pins lxst-swift to branch feat/transport-agnostic at commit b192964e14, but the clone here picks up the default branch instead. If the Xcode project uses these sibling directories as local path dependencies (which the clone step implies), then CI builds against the wrong lxst-swift branch. The same applies to reticulum-swift (pinned at 0.2.3 / b8ae6491) and lxmf-swift (pinned at 0.3.4 / b31a14a8): unless the default branch happens to be at exactly those commits, the local clone diverges from the Package.resolved pins. At minimum the LXST-swift clone should add -b feat/transport-agnostic; all three should be pinned with --branch <tag-or-commit> or cloned at the specific SHAs from Package.resolved.

    Prompt To Fix With AI
    This is a comment left during a code review.
    Path: .github/workflows/tests.yml
    Line: 21-23
    
    Comment:
    **Sibling-package clones use wrong branch and no commit pin**
    
    `git clone` without `-b <branch>` clones the repository's default branch. `Package.resolved` pins `lxst-swift` to branch `feat/transport-agnostic` at commit `b192964e14`, but the clone here picks up the default branch instead. If the Xcode project uses these sibling directories as local path dependencies (which the clone step implies), then CI builds against the wrong `lxst-swift` branch. The same applies to `reticulum-swift` (pinned at 0.2.3 / `b8ae6491`) and `lxmf-swift` (pinned at 0.3.4 / `b31a14a8`): unless the default branch happens to be at exactly those commits, the local clone diverges from the `Package.resolved` pins. At minimum the LXST-swift clone should add `-b feat/transport-agnostic`; all three should be pinned with `--branch <tag-or-commit>` or cloned at the specific SHAs from `Package.resolved`.
    
    How can I resolve this? If you propose a fix, please make it concise.

    Fix in Claude Code

Fix All in Claude Code

Prompt To Fix All With AI
Fix the following 2 code review issues. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 2
.github/workflows/tests.yml:21-23
**Sibling-package clones use wrong branch and no commit pin**

`git clone` without `-b <branch>` clones the repository's default branch. `Package.resolved` pins `lxst-swift` to branch `feat/transport-agnostic` at commit `b192964e14`, but the clone here picks up the default branch instead. If the Xcode project uses these sibling directories as local path dependencies (which the clone step implies), then CI builds against the wrong `lxst-swift` branch. The same applies to `reticulum-swift` (pinned at 0.2.3 / `b8ae6491`) and `lxmf-swift` (pinned at 0.3.4 / `b31a14a8`): unless the default branch happens to be at exactly those commits, the local clone diverges from the `Package.resolved` pins. At minimum the LXST-swift clone should add `-b feat/transport-agnostic`; all three should be pinned with `--branch <tag-or-commit>` or cloned at the specific SHAs from `Package.resolved`.

### Issue 2 of 2
app/rns_bridge.py:523-540
`_propagation_state_name` builds its lookup dict on every call. The `mapping.get(val, f"state_{val}")` fallback returns a non-canonical string like `"state_-1"` that `PropagationSyncResult.State(rawValue:)` cannot decode — it falls to `.unknown` and `ok == (state_name == "complete")` returns `false` even after a successful sync if any `PR_*` attribute is missing. Caching the mapping at module level avoids the repeated `getattr` cost and makes the fallback behaviour explicit.

```suggestion
_PROPAGATION_STATE_MAP: dict = {}

def _build_propagation_state_map() -> dict:
    return {
        getattr(LXMF.LXMRouter, "PR_IDLE", -1): "idle",
        getattr(LXMF.LXMRouter, "PR_PATH_REQUESTED", -2): "path_requested",
        getattr(LXMF.LXMRouter, "PR_LINK_ESTABLISHING", -3): "link_establishing",
        getattr(LXMF.LXMRouter, "PR_LINK_ESTABLISHED", -4): "link_established",
        getattr(LXMF.LXMRouter, "PR_REQUEST_SENT", -5): "request_sent",
        getattr(LXMF.LXMRouter, "PR_RECEIVING", -6): "receiving",
        getattr(LXMF.LXMRouter, "PR_RESPONSE_RECEIVED", -7): "response_received",
        getattr(LXMF.LXMRouter, "PR_COMPLETE", -8): "complete",
        getattr(LXMF.LXMRouter, "PR_NO_PATH", -9): "no_path",
        getattr(LXMF.LXMRouter, "PR_TRANSFER_FAILED", -10): "transfer_failed",
    }

def _propagation_state_name(val: Any) -> str:
    if val is None:
        return "idle"
    if not _PROPAGATION_STATE_MAP:
        _PROPAGATION_STATE_MAP.update(_build_propagation_state_map())
    return _PROPAGATION_STATE_MAP.get(val, f"state_{val}")
```

Reviews (22): Last reviewed commit: "fix: guard interfaceIds with linkLock + ..." | Re-trigger Greptile

Comment thread support/fetch-wheels.sh Outdated
Comment thread Sources/PythonBridge/PythonBridge.swift Outdated
Comment thread app/rns_bridge.py Outdated
Comment thread support/fetch-wheels.sh Outdated
- fetch-wheels.sh: require explicit BLE_RETICULUM_LOCAL env-var for a local
  ble-reticulum checkout instead of implicitly picking up ~/repos/ble-reticulum
  (build reproducibility — matches RETICULUM_LOCAL / LXMF_LOCAL).
- PythonBridge.swift: replace force-unwraps on CPython allocation calls
  (PyTuple_New / PyUnicode_FromString / PyDict_New / PyLong) with guard-let
  throwing / return-false, matching start()'s safe pattern — NULL under iOS
  memory pressure no longer crashes the process.
- rns_bridge.py: drop the dead `terminal` set in the propagation-sync poll loop.
- fetch-wheels.sh: correct the stale cryptography `dev1` comment.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@torlando-tech

Copy link
Copy Markdown
Owner Author

@greptile review

Comment thread app/rns_bridge.py
…ter 2)

reset_identity called exit_handler() but, unlike stop(), never cleared
RNS.Reticulum._Reticulum__instance (+ the exit-handler flags). Its docstring
requires the caller to start() again afterwards, but that follow-on start()
hit __init__'s 'Attempt to reinitialise Reticulum' guard, leaving the app
unable to restart without a process relaunch. Mirror stop()'s cleanup.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
(cherry picked from commit 1a17901)
@torlando-tech

Copy link
Copy Markdown
Owner Author

@greptile review

Comment thread app/rns_bridge.py
Comment thread app/rns_bridge.py
…tive (greploop iter 3)

- reset_identity now also calls router.exit_handler() and tears down open
  _links / clears _telephony_destination (mirrors stop()), not just the
  Reticulum singleton — otherwise a follow-on start() spawns a 2nd LXMRouter
  racing the same lxmf-storage SQLite.
- open-link path: move link.identify() to AFTER link_ready.wait() — identify
  sends an encrypted proof that a PENDING link can't, so identifying first
  silently no-op'd and the remote never learned our identity (breaks nomadnet
  stateful pages).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@torlando-tech

Copy link
Copy Markdown
Owner Author

@greptile review

Comment thread app/rns_bridge.py
Comment thread Sources/PythonBridge/PythonBridge.swift
Comment thread Sources/RNSBackendSwift/SwiftRNSBackend.swift
…ter 4)

_on_closed pops _links[link_id]; if the link closed before the entry existed
the pop missed and open_link then inserted a permanently-zombie entry. Add to
_links first, then wire callbacks — matching the inbound (_on_inbound_link) path.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@torlando-tech

Copy link
Copy Markdown
Owner Author

@greptile review

Comment thread .github/workflows/tests.yml
torlando-agent Bot and others added 2 commits May 29, 2026 22:52
SwiftRNSBackend.propagationSync hardcoded receivedMessages: 0, so the UI's
"N new messages" after a propagation-node sync always showed 0 even when
messages were pulled down. Read router.syncState.receivedMessages (the
router increments it per retrieved message) after the await instead.

Verified end-to-end on the interop harness (red→green): a peer queues a
PROPAGATED message on lxmd; with this fix the sync reports received>=1,
and reverting to the hardcoded 0 makes the same test fail at received=0.
The interop regression test lives on the integration branch (it needs the
full Sideband-peer + simulator harness, not present on this branch).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
PythonBridge routes every call through one serial DispatchQueue. propagationSync
blocks up to its timeout (default 60s) polling the router, so on that shared
queue it stalled every other bridge call — announce, send, setPropagationNode —
for the full duration.

Move propagationSync onto a dedicated blockingQueue. The underlying Python poll
releases the GIL between iterations (time.sleep), so short calls on the main
queue acquire the GIL and run within ~0.5s while a sync is in flight. Two queues
into CPython is safe — withGIL (PyGILState_Ensure/Release) serializes all Python
access regardless of thread, the same pattern the BLE callback path already uses.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Comment thread app/rns_bridge.py Outdated
Comment thread Sources/RNSBackendSwift/SwiftRNSBackend.swift
* PythonBridge.fetchNomadNetPage ran on the main serial queue, so a 30s
  NomadNet fetch stalled drainEvents + every other bridge call — the same
  starvation propagationSync was just fixed for. Route it to blockingQueue too.

* _alloc_link_id incremented a counter with a non-atomic read-modify-write
  while being called from both the bridge thread and RNS callback threads,
  so two links could get the same id and overwrite each other in _links.
  Guard it with a dedicated lock (not the shared _lock — some callers already
  hold that, which would deadlock).

* SwiftRNSBackend.stop() called eventContinuation.finish(), permanently
  terminating the AsyncStream; a later start() on the same instance dropped
  every event. Keep the continuation open across stop/start, matching
  PythonRNSBackend.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Comment thread Sources/RNSBackendSwift/SwiftRNSBackend.swift
…s in CI

* SwiftRNSBackend.propagationSync ignored its `timeout` parameter —
  syncFromPropagationNode() has only per-request timeouts, no overall deadline,
  so a stalled link/response blocked the caller forever. Race the sync against
  the timeout in a task group; the loser is cancelled.

* support/fetch-wheels.sh pulled ble-reticulum from a bare repo URL (HEAD),
  making builds non-reproducible. Pin to a commit (origin/main@07d9413; the
  local checkout is 49 commits past v0.2.2, so a tag would regress).

* .github/workflows/tests.yml built the Python-backend scheme without ever
  fetching Python.xcframework + the wheels, so the copy build-phase had nothing
  to copy. Add a fetch-python.sh + fetch-wheels.sh step before the builds.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@torlando-tech

Copy link
Copy Markdown
Owner Author

@greptile review

Comment thread Sources/RNSBackendPy/PythonRNSBackend.swift
Comment thread Sources/RNSBackendSwift/SwiftRNSBackend.swift Outdated
* PythonRNSBackend.start() relied on the one-shot lazy `events` initializer to
  spin up the drain loop, but stop() cancels eventDrainTask — so after any
  stop/start (identity switch, reconnect) the loop never restarted and no
  Python events reached the host. Call startDrainLoop() in start() (no-ops if
  already running).

* SwiftRNSBackend's per-link stateUpdates drain Task was fire-and-forget and
  untracked, so stop() (which only cleared `links`) left them running, holding
  the kept-open eventContinuation and yielding stale linkState events into the
  next session. Track them in a [Int: Task] and cancel on linkTeardown + stop().

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@torlando-tech

Copy link
Copy Markdown
Owner Author

@greptile review

Comment thread app/rns_bridge.py
torlando-agent Bot and others added 2 commits May 30, 2026 00:48
reset_identity cleared the Reticulum singleton + _links but left RNS.Transport's
process-global class state (destinations, interfaces, path tables, announce
handlers, identities) populated. Its docstring requires the caller to call
start() next — but that start() would raise "Attempt to register an already
registered destination" (Transport.register_destination scans the stale
Transport.destinations list), recoverable only by a process restart.

stop() already did this teardown inline; reset_identity diverged and omitted it.
Extract the shared block into _clear_transport_class_state() and call it from
both, so the two teardown paths can't drift again.

Note: resetIdentity is currently dead code (no callers in the app — identity
switch goes through switchIdentity), so this hardens a latent bug rather than a
live crash; grounded by mirroring the proven stop() path. Verified: py_compile
clean; behaviour-preserving refactor of the live stop() teardown.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Both stop() and reset_identity() iterated _links and called link.teardown()
while holding _lock. link.teardown() can fire the _on_closed callback
synchronously on the calling thread (once exit_handler() has stopped the
transport's dispatch threads), and _on_closed does `with _lock: _links.pop(...)`.
threading.Lock is non-reentrant, so that re-acquire would deadlock the bridge
queue and the user couldn't disconnect/restart without killing the process.

Snapshot _links + clear the dict inside the lock, then tear the snapshot down
outside it — the same in-lock-snapshot / out-of-lock-side-effect ordering used
elsewhere.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@torlando-tech

Copy link
Copy Markdown
Owner Author

On the flagged SwiftRNSBackend telephony inbound-link gap: inbound voice links are already wired by CallManager.initialize, which calls transport.registerDestinationLinkCallback(for: telephonyDestHash)handleIncomingLink (CallManager.swift) against the same lxst.telephony destination hash. Adding a second link-established callback on SwiftRNSBackend.telephonyDestination for that hash would register a competing handler for the same hash and risks breaking the working ring path. Whether the network-layer gap greptile describes is real depends on the legacy-transport/double-stack wiring (AppServices' transport vs the backend's), which can only be settled with an actual inbound device voice call — deferring rather than blind-fixing to avoid a regression. Tracked for device-voice verification.

@torlando-tech

Copy link
Copy Markdown
Owner Author

@greptile review

…data (#79)

Two greptile-flagged defects on pr1:
- rns_bridge.py: stop() cleared BLE callbacks but not RNode; reset_identity()
  cleared neither. Stale closures from a torn-down IOSRNodeInterface/IOSBLE
  driver survived a stop+start / identity-reset. Add clear_rnode_callbacks()
  to stop() and clear_ble_callbacks()+clear_rnode_callbacks()+handle-nil to
  reset_identity(), symmetric with the existing BLE cleanup.
- SwiftRNSBackend.announce: the delivery-destination announce set appData to
  raw UTF-8; canonical LXMF (>=0.5.0) is msgpack([name_utf8, stamp_cost]).
  Emit that (stamp_cost nil, matching the Python backend's
  register_delivery_identity(display_name=…)) so Sideband/Android decode the
  name via the msgpack path and read the (absent) stamp cost, instead of the
  legacy raw-utf8 fallback. Telephony announce keeps its raw LXST format.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@torlando-tech

Copy link
Copy Markdown
Owner Author

@greptile review

- reset_identity(): include telephony_destination in the _state.update so it
  matches stop(); previously _state['telephony_destination'] kept a dead
  RNS.Destination after a reset, diverging from stop()'s teardown.
- Snapshot _state['identity'] under _lock before link.identify(): a concurrent
  reset_identity()/stop() could null it between the None-check and the call,
  silently no-op'ing identify(None). Mirrors link_identify().
- Cancel the delayed re-announce daemon across teardown/restart via an
  _announce_generation token: start() captures the current generation;
  stop()/reset_identity() bump it; the thread bails before announcing a
  torn-down (or previous-session) destination.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Comment thread Sources/RNSBackendSwift/SwiftRNSBackend.swift
torlando-agent Bot and others added 3 commits May 31, 2026 03:39
openLink is async with no actor isolation, so concurrent callers could
interleave the nextLinkId read-increment and the links[linkId] write: two
callers grab the same id and the second write orphans the first link
(never tracked in links, never cancelled by linkTeardown/stop). Same class
of bug the Python backend fixed with _link_id_lock. Guard nextLinkId, links,
and linkStateTasks with an NSLock at every access (openLink/linkSend/
linkIdentify/linkTeardown/stop), never held across an await — keeping the
backend a class so linkSend stays actor-hop-free on the audio path.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The link registration and its stateUpdates drain task were stored in two
separate linkLock sections split by an await. A stop() racing that window
cleared linkStateTasks (not yet holding the new task) and links, then
openLink re-inserted the task into the emptied map — orphaning it (link
actor kept alive, stale events into the next session, linkId absent from
links so linkTeardown couldn't reach it). Build the Task and store both
writes inside one critical section; the constructor is synchronous so it's
lock-safe, and setPacketCallback's await now follows the unlock.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- interfaceIds had no synchronization while links/linkStateTasks did:
  statusSnapshot() read it (.first(where:)) concurrently with start()'s
  bring-up loop and addInterface/removeInterface writes from other tasks.
  Swift Dictionary read+write is UB (release-build crash). Extend linkLock
  to every interfaceIds access; statusSnapshot takes a COW snapshot under
  the lock so the map stays lock-free.
- startAnnouncePolling() overwrote announcePoller without cancelling it.
  start() has no idempotency guard, so a second start() without stop()
  leaked the old poller (still polling its captured, now-orphaned PathTable)
  and double-yielded every announce. Cancel before reassigning.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@torlando-tech torlando-tech merged commit 5461539 into main May 31, 2026
1 of 2 checks passed
@torlando-tech torlando-tech deleted the pr1-rns-core branch May 31, 2026 23:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant