| 1 | +--- |
| 2 | +name: rust-p2p |
| 3 | +description: "Use when implementing, reviewing, debugging, or explaining Pluto Rust libp2p code: Node/P2PContext ownership, PlutoBehaviour composition, NetworkBehaviour and ConnectionHandler protocols, relay/force-direct/quic-upgrade behaviours, peerinfo/parsigex protocol handlers, DKG protocol handlers, and P2P tests." |
| 4 | +--- |
| 5 | + |
| 6 | +# Rust P2P |
| 7 | + |
| 8 | +Use this skill for Pluto Rust libp2p work. Bias toward the existing networking |
| 9 | +architecture; do not invent a parallel swarm, context, or handler model. |
| 10 | + |
| 11 | +## Start Here |
| 12 | + |
| 13 | +Classify the task and read the matching implementation first. |
| 14 | + |
| 15 | +| Area | Files | Look for | |
| 16 | +| --- | --- | --- | |
| 17 | +| Node runtime | `crates/p2p/src/p2p.rs` | `Node`, builder closure, listen/dial setup | |
| 18 | +| Shared peer state | `crates/p2p/src/p2p_context.rs` | local peer binding, known peers, peer store | |
| 19 | +| Common wrapper | `crates/p2p/src/behaviours/pluto.rs` | common behaviours plus `inner` | |
| 20 | +| Connection tracking | `crates/p2p/src/conn_logger.rs` | peer-store updates on connection lifecycle | |
| 21 | +| Connection gating | `crates/p2p/src/gater.rs` | allow/deny policy | |
| 22 | +| Bootnode | `crates/p2p/src/bootnode.rs` | bootnode run loop and shutdown | |
| 23 | +| Relay | `crates/p2p/src/relay.rs` | reservations and relay-route dialing | |
| 24 | +| Direct routing | `crates/p2p/src/force_direct.rs` | direct-connection enforcement | |
| 25 | +| QUIC upgrade | `crates/p2p/src/quic_upgrade.rs` | TCP-to-QUIC retry/close flow | |
| 26 | +| Optional wrapper | `crates/p2p/src/behaviours/optional.rs` | enabled/disabled behaviour routing | |
| 27 | +| Framing | `crates/p2p/src/proto.rs` | protobuf read/write helpers and limits | |
| 28 | +| Peer info | `crates/peerinfo/src/` | simple behaviour/handler protocol | |
| 29 | +| Parsigex | `crates/parsigex/src/` | handle -> behaviour -> handler broadcast protocol | |
| 30 | +| DKG broadcast | `crates/dkg/src/bcast/` | command-driven behaviour/handler protocol | |
| 31 | +| DKG sync | `crates/dkg/src/sync/` | long-lived stream, waiters, cancellation | |
| 32 | +| Examples | `crates/*/examples/` | expected public construction patterns | |
| 33 | + |
| 34 | +Before editing, draw the ownership path for the specific protocol: |
| 35 | + |
| 36 | +```text |
| 37 | +application / protocol flow |
| 38 | + | |
| 39 | + | handle, command channel, waiter, or event consumer |
| 40 | + v |
| 41 | +feature Behaviour shared P2PContext |
| 42 | + | ^ |
| 43 | + | ToSwarm::{Dial, ListenOn, | conn_logger + identify update it |
| 44 | + | NotifyHandler, GenerateEvent} |
| 45 | + v |
| 46 | +PlutoBehaviour wrapper |
| 47 | + | |
| 48 | + v |
| 49 | +libp2p Swarm |
| 50 | + | |
| 51 | + | connection events + negotiated substreams |
| 52 | + v |
| 53 | +feature ConnectionHandler |
| 54 | + | |
| 55 | + | async stream read/write futures |
| 56 | + v |
| 57 | +protocol state / protocol event |
| 58 | +``` |
| 59 | + |
| 60 | +## Core Ownership |
| 61 | + |
| 62 | +- `Node` owns the `Swarm`. |
| 63 | +- `PlutoBehaviour<B>` wraps common networking behaviours around feature |
| 64 | + behaviour `B`. |
| 65 | +- One node must have one canonical `P2PContext` shared by `Node`, |
| 66 | + `PlutoBehaviour`, and inner behaviours that read peer state. |
| 67 | +- Feature `Behaviour` owns swarm-level protocol orchestration. |
| 68 | +- Feature `ConnectionHandler` owns one connection's negotiated streams. |
| 69 | +- User-facing handles do not own the swarm; they use channels, waiters, shared |
| 70 | + protocol state, or emitted events. |
| 71 | + |
| 72 | +## Node And Context |
| 73 | + |
| 74 | +Use `Node::new` for normal client nodes. It builds TCP/QUIC transports and |
| 75 | +includes relay-client support. Use `Node::new_server` for server-style nodes |
| 76 | +that should not include relay-client behaviour. |
| 77 | + |
| 78 | +Rules: |
| 79 | + |
| 80 | +- Pass the intended `P2PContext` into `Node` construction. |
| 81 | +- Inside the builder closure, use `builder.p2p_context()` for inner behaviours |
| 82 | + that need peer state. |
| 83 | +- Do not create a fresh/default `P2PContext` inside a node builder closure. |
| 84 | +- Treat `P2PContext::default()` as suitable only for standalone node construction
| 85 | + where no known-peer state is needed and no component must share peer state.
| 86 | +- Do not reintroduce APIs that replace the context after the builder is created.
| 87 | +- Let `Node` bind the local peer ID and preserve `LocalPeerIdMismatch` fail-fast |
| 88 | + behaviour. |
| 89 | +- `filter_private_addrs` affects advertised addresses, not listen addresses. |
| 90 | + |
| 91 | +`P2PContext` is for runtime peer connectivity: |
| 92 | + |
| 93 | +- known peer set, |
| 94 | +- local peer ID once bound, |
| 95 | +- active/inactive connection records, |
| 96 | +- peer addresses learned from identify. |
| 97 | + |
| 98 | +Do not store protocol progress in `P2PContext`. |
| 99 | + |
| 100 | +**Critical failure mode:** if `conn_logger` writes context A while an inner |
| 101 | +behaviour reads context B, the inner behaviour will treat connected peers as |
| 102 | +disconnected. |
| 103 | + |
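The failure mode above can be reproduced with a small std-only model. This is a sketch, not the real `P2PContext`: `Context`, `ConnLogger`, `Inner`, and the `u64` peer key are hypothetical stand-ins, and it assumes the real context behaves like a cloneable handle to shared state.

```rust
use std::collections::HashSet;
use std::sync::{Arc, RwLock};

/// Hypothetical stand-in for `P2PContext`: cloning shares the same peer state.
#[derive(Clone, Default)]
struct Context {
    // `u64` stands in for a peer ID.
    connected: Arc<RwLock<HashSet<u64>>>,
}

impl Context {
    fn mark_connected(&self, peer: u64) {
        self.connected.write().unwrap().insert(peer);
    }

    fn is_connected(&self, peer: u64) -> bool {
        self.connected.read().unwrap().contains(&peer)
    }
}

/// Stand-in for `conn_logger`: the component that writes connection state.
struct ConnLogger {
    ctx: Context,
}

/// Stand-in for an inner behaviour: the component that reads connection state.
struct Inner {
    ctx: Context,
}
```

An `Inner` built from a clone of the logger's context observes `mark_connected`; an `Inner` built from `Context::default()` never does, which is the "context A vs. context B" bug in miniature.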
| 104 | +## PlutoBehaviour |
| 105 | + |
| 106 | +`PlutoBehaviour<B>` composes common behaviour with feature behaviour: |
| 107 | + |
| 108 | +- `conn_logger`: lifecycle logging, peer-store updates, metrics, |
| 109 | +- `gater`: connection allow/deny policy, |
| 110 | +- `identify`: address/protocol exchange; `Node` stores listen addresses, |
| 111 | +- `ping`: latency/keepalive metrics, |
| 112 | +- `autonat`: reachability, |
| 113 | +- `quic_upgrade`: optional TCP -> QUIC upgrade attempts, |
| 114 | +- `inner`: feature-specific behaviour. |
| 115 | + |
| 116 | +**Composition invariant:** `#[derive(NetworkBehaviour)]` delegates behaviour |
| 117 | +methods in struct field order. Keep `conn_logger` before behaviours that may |
| 118 | +read `P2PContext.peer_store`, so connection events update shared peer state |
| 119 | +before later fields handle the same swarm event or get polled afterward. |
| 120 | + |
| 121 | +## Behaviour Pattern |
| 122 | + |
| 123 | +A feature `Behaviour` is a swarm-level coordinator. |
| 124 | + |
| 125 | +Common responsibilities: |
| 126 | + |
| 127 | +- receive user commands through `tokio::sync::mpsc`, |
| 128 | +- queue `ToSwarm` events in `VecDeque`, |
| 129 | +- create a real handler for peers in protocol scope, |
| 130 | +- return `dummy::ConnectionHandler` when the behaviour has no per-connection |
| 131 | + protocol work or when a peer is out of scope, |
| 132 | +- inspect `FromSwarm` connection/dial events, |
| 133 | +- poll retry/routing timers, |
| 134 | +- translate handler events into feature events. |
| 135 | + |
| 136 | +`poll` rules: |
| 137 | + |
| 138 | +- never `.await`, |
| 139 | +- never block, |
| 140 | +- never spin on an immediately failing condition, |
| 141 | +- drain ready commands before sleeping, |
| 142 | +- emit at most one queued `ToSwarm` event per poll, |
| 143 | +- return `Poll::Pending` when no work is ready. |
| 144 | + |
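The `poll` rules above can be sketched without libp2p. This is a simplified model under stated assumptions: `Command`, `Event`, and this `Behaviour` are hypothetical, and `std::sync::mpsc` stands in for `tokio::sync::mpsc`. A real `poll` would call the tokio receiver's `poll_recv(cx)` so the waker is registered before returning `Poll::Pending`.

```rust
use std::collections::VecDeque;
use std::sync::mpsc::{Receiver, TryRecvError};
use std::task::Poll;

/// Hypothetical user command.
enum Command {
    Broadcast(u64),
}

/// Hypothetical stand-in for a queued `ToSwarm` event.
#[derive(Debug, PartialEq)]
enum Event {
    Notify(u64),
}

struct Behaviour {
    commands: Receiver<Command>,
    pending: VecDeque<Event>,
}

impl Behaviour {
    fn poll(&mut self) -> Poll<Event> {
        // Drain every command that is ready right now; never block or await.
        loop {
            match self.commands.try_recv() {
                Ok(Command::Broadcast(peer)) => self.pending.push_back(Event::Notify(peer)),
                Err(TryRecvError::Empty) | Err(TryRecvError::Disconnected) => break,
            }
        }
        // Emit at most one queued event per call; otherwise report no work.
        match self.pending.pop_front() {
            Some(event) => Poll::Ready(event),
            None => Poll::Pending,
        }
    }
}
```

Two queued commands therefore surface as two consecutive `Poll::Ready` results, followed by `Poll::Pending`.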
| 145 | +Use `ToSwarm::NotifyHandler` when an existing handler should open a substream. |
| 146 | +Use `ToSwarm::GenerateEvent` for application-visible events. |
| 147 | + |
| 148 | +## Handler Pattern |
| 149 | + |
| 150 | +A `ConnectionHandler` is per connection, not per peer. |
| 151 | + |
| 152 | +Common responsibilities: |
| 153 | + |
| 154 | +- define inbound and outbound protocol upgrades, |
| 155 | +- handle `FullyNegotiatedInbound` by starting inbound stream work, |
| 156 | +- handle `FullyNegotiatedOutbound` by starting matching outbound stream work, |
| 157 | +- report results with `ConnectionHandlerEvent::NotifyBehaviour`, |
| 158 | +- request outbound streams with |
| 159 | + `ConnectionHandlerEvent::OutboundSubstreamRequest`, |
| 160 | +- translate negotiation, timeout, and I/O failures into typed protocol failures. |
| 161 | + |
| 162 | +Do not block in handler `poll`. Store async stream work as boxed futures or a |
| 163 | +`FuturesUnordered`, then poll them. |
| 164 | + |
| 165 | +Do not assume only one handler exists for a peer. If a protocol requires one |
| 166 | +outbound loop per peer, keep an explicit claim in shared protocol state and |
| 167 | +release it on every terminal path. |
| 168 | + |
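One way to make "release it on every terminal path" hard to forget is a drop guard. The sketch below is an assumption about shape, not the existing code: `OutboundClaims`, `ClaimGuard`, and the `u64` peer key are hypothetical names.

```rust
use std::collections::HashSet;
use std::sync::{Arc, Mutex};

/// Stand-in for a peer ID.
type PeerKey = u64;

/// Shared protocol state: which peers already have an outbound loop.
#[derive(Clone, Default)]
struct OutboundClaims(Arc<Mutex<HashSet<PeerKey>>>);

/// Held by the single handler that runs the outbound loop for `peer`.
struct ClaimGuard {
    claims: OutboundClaims,
    peer: PeerKey,
}

impl OutboundClaims {
    /// Returns a guard only if no other handler currently holds the claim.
    fn try_claim(&self, peer: PeerKey) -> Option<ClaimGuard> {
        let mut set = self.0.lock().unwrap();
        if set.insert(peer) {
            Some(ClaimGuard { claims: self.clone(), peer })
        } else {
            None
        }
    }
}

impl Drop for ClaimGuard {
    /// Runs on success, error, and early return alike, so the claim is always released.
    fn drop(&mut self) {
        self.claims.0.lock().unwrap().remove(&self.peer);
    }
}
```

A second handler for the same peer gets `None` from `try_claim` until the first guard is dropped. Releasing the claim does not wake anyone by itself, so a retry timer or notification is still needed.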
| 169 | +## Handles, Waiters, Cancellation |
| 170 | + |
| 171 | +Keep user-facing APIs small and keep swarm internals inside `Node`. |
| 172 | + |
| 173 | +Existing shapes: |
| 174 | + |
| 175 | +- command handle -> `mpsc` -> behaviour (`parsigex::Handle`, |
| 176 | + `bcast::Component`, `sync::Client`), |
| 177 | +- behaviour event -> caller observes completion/failure (`parsigex`, `bcast`), |
| 178 | +- shared protocol state -> wait API (`sync::Server`). |
| 179 | + |
| 180 | +Waiter rules: |
| 181 | + |
| 182 | +- Use `Notify` to wake waiters after state changes. |
| 183 | +- Check terminal error and cancellation before awaiting another notification. |
| 184 | +- Use `CancellationToken` for long-running waits and protocol tasks that must |
| 185 | + stop during shutdown. |
| 186 | +- Tests around waiters should still use `tokio::time::timeout`; cancellation is |
| 187 | + part of the contract, not a replacement for test bounds. |
| 188 | + |
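The waiter rules follow the classic "check state, then wait" loop. The sketch below uses `std::sync::Condvar` as a blocking stand-in for `Notify` and a plain flag in place of `CancellationToken`; `State`, `Waiter`, and the error strings are hypothetical. An async version would have the same ordering of checks.

```rust
use std::sync::{Condvar, Mutex};

#[derive(Default)]
struct State {
    done: bool,
    error: Option<String>,
    cancelled: bool,
}

#[derive(Default)]
struct Waiter {
    state: Mutex<State>,
    changed: Condvar,
}

impl Waiter {
    /// Apply a state change, then wake every waiter.
    fn update(&self, f: impl FnOnce(&mut State)) {
        let mut state = self.state.lock().unwrap();
        f(&mut *state);
        drop(state);
        self.changed.notify_all();
    }

    /// Block until the protocol is done, failed, or cancelled.
    fn wait_done(&self) -> Result<(), String> {
        let mut state = self.state.lock().unwrap();
        loop {
            // Terminal error and cancellation are checked before every wait.
            if let Some(err) = &state.error {
                return Err(err.clone());
            }
            if state.cancelled {
                return Err("cancelled".to_string());
            }
            if state.done {
                return Ok(());
            }
            state = self.changed.wait(state).unwrap();
        }
    }
}
```

Because the checks run before each wait, a waiter that arrives after the error or cancellation was recorded returns immediately instead of sleeping on a notification that will never come.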
| 189 | +## Stream And Framing |
| 190 | + |
| 191 | +Choose the protocol primitive intentionally: |
| 192 | + |
| 193 | +- one stream protocol -> `ReadyUpgrade<StreamProtocol>`, |
| 194 | +- multiple stream protocols on one handler -> `SelectUpgrade`, |
| 195 | +- request/response over a negotiated stream -> helpers in `pluto_p2p::proto`. |
| 196 | + |
| 197 | +Framing variants: |
| 198 | + |
| 199 | +- varint length framing: `write_protobuf` / |
| 200 | + `read_protobuf_with_max_size`, |
| 201 | +- fixed `i64` little-endian length framing: `write_fixed_size_protobuf` / |
| 202 | + `read_fixed_size_protobuf_with_max_size`. |
| 203 | + |
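For orientation, a varint length prefix can be sketched with the standard library alone. This assumes the usual unsigned LEB128 encoding used for length-delimited protobuf; `encode_varint` and `read_varint` are hypothetical helpers, not the functions in `pluto_p2p::proto`.

```rust
use std::io::{self, Read};

/// Append `value` as an unsigned LEB128 varint (7 payload bits per byte).
fn encode_varint(mut value: u64, out: &mut Vec<u8>) {
    while value >= 0x80 {
        out.push((value as u8 & 0x7f) | 0x80); // continuation bit set
        value >>= 7;
    }
    out.push(value as u8);
}

/// Read one varint, rejecting encodings longer than ten bytes.
fn read_varint<R: Read>(reader: &mut R) -> io::Result<u64> {
    let mut value = 0u64;
    for shift in (0..64).step_by(7) {
        let mut byte = [0u8; 1];
        reader.read_exact(&mut byte)?;
        value |= u64::from(byte[0] & 0x7f) << shift;
        if byte[0] & 0x80 == 0 {
            return Ok(value);
        }
    }
    Err(io::Error::new(io::ErrorKind::InvalidData, "varint too long"))
}
```

The decoded length is still untrusted input; it must be compared against the protocol's max size before any buffer is allocated.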
| 204 | +Safety rules: |
| 205 | + |
| 206 | +- Use explicit protocol-specific max sizes where practical. |
| 207 | +- Do not allocate from an untrusted frame length before checking the max. |
| 208 | +- Add per-message `tokio::time::timeout` when a slow peer can otherwise hold a |
| 209 | + stream task indefinitely. |
| 210 | +- Preserve `io::ErrorKind` until retry/terminal decisions are made. |
| 211 | + |
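The "check the max before allocating" rule looks like this for the fixed `i64` little-endian variant. This is a blocking std-only sketch; `read_fixed_frame` is a hypothetical name and not the signature of `read_fixed_size_protobuf_with_max_size`.

```rust
use std::io::{self, Read};

/// Read one frame with an 8-byte little-endian `i64` length prefix.
fn read_fixed_frame<R: Read>(reader: &mut R, max_size: usize) -> io::Result<Vec<u8>> {
    let mut prefix = [0u8; 8];
    reader.read_exact(&mut prefix)?;
    let declared = i64::from_le_bytes(prefix);

    // Reject negative or oversized lengths before allocating anything.
    if declared < 0 || declared as u64 > max_size as u64 {
        return Err(io::Error::new(
            io::ErrorKind::InvalidData,
            "frame length out of range",
        ));
    }

    // Only now is it safe to size a buffer from the peer-supplied length.
    let mut payload = vec![0u8; declared as usize];
    reader.read_exact(&mut payload)?;
    Ok(payload)
}
```

Errors stay as `io::Error`, so the caller can still distinguish `InvalidData` (a protocol violation) from `UnexpectedEof` (a truncated stream) when deciding whether to retry.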
| 212 | +## Dialing And Retry |
| 213 | + |
| 214 | +Prefer swarm-owned dialing through `ToSwarm::Dial`. |
| 215 | + |
| 216 | +Use `PeerCondition::DisconnectedAndNotDialing` when activation should not open |
| 217 | +duplicate dials. |
| 218 | + |
| 219 | +Typical dial-failure handling: |
| 220 | + |
| 221 | +- `Transport(_)`: retry only if the protocol still wants reconnect. |
| 222 | +- `NoAddresses`: retry only after a delay; identify, relay routing, or |
| 223 | + bootstrap may populate addresses later. |
| 224 | +- `DialBackoff`: usually let libp2p backoff stand unless the protocol has a |
| 225 | + specific recovery path. |
| 226 | +- `NegotiationFailed`: usually terminal for that stream protocol; the peer does |
| 227 | + not support it. |
| 228 | + |
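The list above amounts to a small decision table. The sketch below encodes it with hypothetical types: `DialFailure` and `NextStep` only mirror the names used in this section and are not libp2p's own error enums.

```rust
use std::time::Duration;

/// Hypothetical, simplified mirror of the failure cases listed above.
#[derive(Debug, Clone, Copy, PartialEq)]
enum DialFailure {
    Transport,
    NoAddresses,
    DialBackoff,
    NegotiationFailed,
}

/// What the behaviour does next.
#[derive(Debug, PartialEq)]
enum NextStep {
    Retry,
    RetryAfter(Duration),
    LeaveToBackoff,
    Stop,
}

fn next_step(failure: DialFailure, wants_reconnect: bool, delay: Duration) -> NextStep {
    match failure {
        // Retry only while the protocol still wants the connection.
        DialFailure::Transport if wants_reconnect => NextStep::Retry,
        DialFailure::Transport => NextStep::Stop,
        // Addresses may appear later via identify, relay routing, or bootstrap.
        DialFailure::NoAddresses => NextStep::RetryAfter(delay),
        // Let the existing backoff stand.
        DialFailure::DialBackoff => NextStep::LeaveToBackoff,
        // The peer does not support this stream protocol.
        DialFailure::NegotiationFailed => NextStep::Stop,
    }
}
```

Keeping the decision in one pure function makes the retry policy unit-testable without a swarm.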
| 229 | +Reconnect loops need a wakeup path. If a handler releases an outbound claim, |
| 230 | +ensure another handler can eventually retry by timer, notification, or a new |
| 231 | +swarm event. |
| 232 | + |
| 233 | +## Relay And Routing |
| 234 | + |
| 235 | +Relay reservation and relay routing are separate: |
| 236 | + |
| 237 | +- `MutableRelayReservation` dials relay servers directly, waits for relay |
| 238 | + connection establishment, then listens on `/p2p-circuit` addresses. |
| 239 | +- `RelayRouter` periodically builds relay circuit addresses for known peers and |
| 240 | + queues dials through those relays. |
| 241 | + |
| 242 | +Address rules: |
| 243 | + |
| 244 | +- Dialing a relay server directly: append `/p2p/<relay-id>`, not |
| 245 | + `/p2p-circuit`. |
| 246 | +- Listening through a relay or dialing a target through a relay: include |
| 247 | + `/p2p-circuit`. |
| 248 | +- Do not assume relay peer data is available at startup; mutable relay peers can |
| 249 | + resolve later. |
| 250 | + |
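The three address shapes differ only in their suffix. The sketch below builds them as plain strings for illustration; the function names are hypothetical, and real code would presumably build `Multiaddr` values rather than format strings.

```rust
/// Address for dialing the relay server itself: no `/p2p-circuit`.
fn relay_dial_addr(relay_addr: &str, relay_id: &str) -> String {
    format!("{}/p2p/{}", relay_addr, relay_id)
}

/// Address for listening through the relay.
fn relay_listen_addr(relay_addr: &str, relay_id: &str) -> String {
    format!("{}/p2p-circuit", relay_dial_addr(relay_addr, relay_id))
}

/// Address for dialing `target_id` through the relay.
fn relay_target_addr(relay_addr: &str, relay_id: &str, target_id: &str) -> String {
    format!("{}/p2p/{}", relay_listen_addr(relay_addr, relay_id), target_id)
}
```

With the placeholder IDs `RELAY` and `TARGET` (not valid peer IDs) and the relay at `/ip4/10.0.0.1/tcp/4001`, these produce `/ip4/10.0.0.1/tcp/4001/p2p/RELAY`, `/ip4/10.0.0.1/tcp/4001/p2p/RELAY/p2p-circuit`, and `/ip4/10.0.0.1/tcp/4001/p2p/RELAY/p2p-circuit/p2p/TARGET`.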
| 251 | +## Testing |
| 252 | + |
| 253 | +Use the smallest test that exercises the real boundary. |
| 254 | + |
| 255 | +Good boundaries: |
| 256 | + |
| 257 | +- behaviour queueing/retry: unit-test `NetworkBehaviour::poll` with |
| 258 | + `noop_waker_ref`, |
| 259 | +- handler state machines: poll handlers/futures directly where possible, |
| 260 | +- real connectivity: use `Node` and real swarms, |
| 261 | +- multi-swarm tests: use `#[tokio::test(flavor = "multi_thread")]`, |
| 262 | +- async barriers and shutdown: wrap waits in `tokio::time::timeout`, |
| 263 | +- ports: prefer `/ip4/127.0.0.1/tcp/0` plus `SwarmEvent::NewListenAddr`. |
| 264 | + |
| 265 | +Regression tests should cover the contract boundary: |
| 266 | + |
| 267 | +- success path, |
| 268 | +- error propagation to waiters or events, |
| 269 | +- cancellation/teardown, |
| 270 | +- retry or dial-failure behaviour, |
| 271 | +- protocol validation failure for authenticated peers. |