AeroUDP Protocol Specification — v1

AeroUDP is a connection-oriented, reliable, ordered, bidirectional transport protocol layered on top of unreliable UDP datagrams. It provides TCP-like guarantees with a simpler header, modern congestion control, and a clean async implementation in Rust.

1. Goals

  • Reliable, in-order, bidirectional byte streams of bounded messages.
  • Detect and recover from packet loss, duplication, reordering, and corruption.
  • Sliding-window flow control with explicit advertised receive window.
  • Congestion control modeled after TCP NewReno: slow start, congestion avoidance, fast retransmit, fast recovery.
  • RTT-driven adaptive retransmission timeout (RFC 6298 style).
  • Connection lifecycle: 3-way handshake, graceful FIN exchange, RST abort, TIME_WAIT, keepalives, idle timeout.

2. Wire Format

Every AeroUDP datagram is exactly one UDP payload containing one packet:

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  magic=0xAE   |  version=1    | packet_type   |   flags       |
+---------------+---------------+---------------+---------------+
|                          conn_id (u32)                        |
+---------------------------------------------------------------+
|                          seq_num (u32)                        |
+---------------------------------------------------------------+
|                          ack_num (u32)                        |
+---------------------------------------------------------------+
|                          window (u32)                         |
+---------------------------------------------------------------+
|         payload_len (u16)     |          reserved (u16)       |
+---------------+---------------+---------------+---------------+
|           checksum (u32, CRC32 over header+payload)           |
+---------------------------------------------------------------+
|                          payload ...                          |
+---------------------------------------------------------------+
  • Total header size: 28 bytes.
  • Maximum payload: 1200 bytes (configurable, chosen to fit within common MTUs).
  • All multi-byte fields are big-endian.
  • The checksum field is set to zero before computing CRC32 over the entire packet (header + payload), then patched in.
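The layout and checksum rule above can be sketched as a pair of encode/decode helpers. This is a minimal illustration, not the reference implementation: the `Header` struct, the function names, and the choice of the reflected CRC-32 polynomial 0xEDB88320 (the zlib/Ethernet variant) are assumptions, since the spec only says "CRC32".

```rust
const HEADER_LEN: usize = 28;
const MAGIC: u8 = 0xAE;
const VERSION: u8 = 1;

/// Bitwise CRC-32, reflected polynomial 0xEDB88320 (assumed variant).
fn crc32(data: &[u8]) -> u32 {
    let mut crc = 0xFFFF_FFFFu32;
    for &byte in data {
        crc ^= byte as u32;
        for _ in 0..8 {
            let mask = (crc & 1).wrapping_neg(); // all ones when the low bit is set
            crc = (crc >> 1) ^ (0xEDB8_8320 & mask);
        }
    }
    !crc
}

/// Hypothetical in-memory view of the header fields the caller controls.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Header {
    packet_type: u8,
    flags: u8,
    conn_id: u32,
    seq_num: u32,
    ack_num: u32,
    window: u32,
}

/// Serializes header + payload, big-endian, then patches the checksum in.
/// The caller is assumed to have enforced max_payload already.
fn encode(h: &Header, payload: &[u8]) -> Vec<u8> {
    let mut buf = Vec::with_capacity(HEADER_LEN + payload.len());
    buf.extend_from_slice(&[MAGIC, VERSION, h.packet_type, h.flags]);
    buf.extend_from_slice(&h.conn_id.to_be_bytes());
    buf.extend_from_slice(&h.seq_num.to_be_bytes());
    buf.extend_from_slice(&h.ack_num.to_be_bytes());
    buf.extend_from_slice(&h.window.to_be_bytes());
    buf.extend_from_slice(&(payload.len() as u16).to_be_bytes());
    buf.extend_from_slice(&[0, 0]); // reserved
    buf.extend_from_slice(&[0, 0, 0, 0]); // checksum field is zero while hashing
    buf.extend_from_slice(payload);
    let sum = crc32(&buf);
    buf[24..28].copy_from_slice(&sum.to_be_bytes());
    buf
}

/// Parses and validates a packet; returns None on any mismatch.
fn decode(pkt: &[u8]) -> Option<(Header, &[u8])> {
    if pkt.len() < HEADER_LEN || pkt[0] != MAGIC || pkt[1] != VERSION {
        return None;
    }
    let be32 = |i: usize| u32::from_be_bytes([pkt[i], pkt[i + 1], pkt[i + 2], pkt[i + 3]]);
    let payload_len = u16::from_be_bytes([pkt[20], pkt[21]]) as usize;
    if pkt.len() != HEADER_LEN + payload_len {
        return None;
    }
    // Recompute the CRC with the checksum field zeroed, as the sender did.
    let mut zeroed = pkt.to_vec();
    zeroed[24..28].fill(0);
    if crc32(&zeroed) != be32(24) {
        return None;
    }
    let header = Header {
        packet_type: pkt[2],
        flags: pkt[3],
        conn_id: be32(4),
        seq_num: be32(8),
        ack_num: be32(12),
        window: be32(16),
    };
    Some((header, &pkt[HEADER_LEN..]))
}
```

A packet built with `encode` should round-trip through `decode`, and flipping any byte should make `decode` return `None`.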

2.1 Packet types

Code  Type     Description
1     SYN      Initiates connection
2     SYN_ACK  Response to SYN, also acknowledges
3     ACK      Pure ACK (no payload, no new SEQ)
4     DATA     Carries application payload (cumulative ACK in header)
5     FIN      Half-close from sender
6     FIN_ACK  Acknowledges FIN
7     RST      Aborts connection
8     PING     Keepalive probe
9     PONG     Keepalive response

2.2 Flags

Bit   Name  Meaning
0x01  ACK   ack_num field is valid
0x02  SYN   Synchronize sequence numbers
0x04  FIN   No more data from sender
0x08  RST   Reset
0x10  PSH   Hint: deliver immediately (advisory, currently unused)
0x20  ECN   Reserved for explicit congestion notification
0x40  RETX  Retransmitted packet (observability only)
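One way to model the packet type codes and flag bits in Rust is an enum plus a small lookup table. The type name `PacketType`, the `from_code` constructor, and the `describe_flags` helper are illustrative names, not taken from the AeroUDP source.

```rust
/// Packet type codes from section 2.1.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(u8)]
enum PacketType {
    Syn = 1,
    SynAck = 2,
    Ack = 3,
    Data = 4,
    Fin = 5,
    FinAck = 6,
    Rst = 7,
    Ping = 8,
    Pong = 9,
}

impl PacketType {
    /// Maps a wire code to a packet type; unknown codes yield None.
    fn from_code(code: u8) -> Option<PacketType> {
        Some(match code {
            1 => PacketType::Syn,
            2 => PacketType::SynAck,
            3 => PacketType::Ack,
            4 => PacketType::Data,
            5 => PacketType::Fin,
            6 => PacketType::FinAck,
            7 => PacketType::Rst,
            8 => PacketType::Ping,
            9 => PacketType::Pong,
            _ => return None,
        })
    }
}

/// Flag bits from section 2.2, paired with their names.
const FLAG_NAMES: [(u8, &str); 7] = [
    (0x01, "ACK"),
    (0x02, "SYN"),
    (0x04, "FIN"),
    (0x08, "RST"),
    (0x10, "PSH"),
    (0x20, "ECN"),
    (0x40, "RETX"),
];

/// Lists the names of all flag bits set in `flags`, lowest bit first.
fn describe_flags(flags: u8) -> Vec<&'static str> {
    FLAG_NAMES
        .iter()
        .filter(|(bit, _)| flags & bit != 0)
        .map(|(_, name)| *name)
        .collect()
}
```

Rejecting unknown codes at parse time keeps the rest of the engine free of "impossible" packet types; how the real implementation treats them is not stated in this document.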

3. Sequence Numbers

  • Each side selects a random 32-bit initial sequence number (ISN) at connection time.
  • SYN, FIN, and DATA each consume exactly one sequence-number slot. Pure ACKs do not consume sequence space.
  • seq_num numbers DATA packets starting at ISN + 1.
  • ack_num is cumulative: it indicates the next in-order sequence the receiver expects.
  • Comparisons use modular 32-bit arithmetic.
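Modular comparison is commonly implemented by interpreting the wrapped difference as a signed value, in the style of serial number arithmetic. The helper names below are hypothetical; the sketch assumes the two numbers being compared are never more than 2^31 apart.

```rust
/// True when `a` precedes `b` in modular 32-bit sequence space.
/// Valid as long as the two values are less than 2^31 apart.
fn seq_lt(a: u32, b: u32) -> bool {
    (a.wrapping_sub(b) as i32) < 0
}

/// True when `a` precedes or equals `b`.
fn seq_le(a: u32, b: u32) -> bool {
    a == b || seq_lt(a, b)
}

/// Next sequence number after `a`, wrapping at 2^32.
fn seq_next(a: u32) -> u32 {
    a.wrapping_add(1)
}
```

With these helpers, a sequence number just below the wrap point correctly compares as "before" a small number just after it, which a plain `<` would get wrong.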

4. Connection Lifecycle

   CLOSED            CLOSED
     |                  |
     | connect()        | listen()
     v                  v
   SYN_SENT  -----SYN---->  LISTEN
     |  <----SYN+ACK----   SYN_RECEIVED
     |  ------ACK------>      |
     v                        v
   ESTABLISHED  <-------->  ESTABLISHED

   ESTABLISHED  --FIN-->  CLOSE_WAIT
       |                     |
   FIN_WAIT_1                |
       |  <--FIN_ACK--       |
   FIN_WAIT_2 <----FIN----   |
       |                     |
   TIME_WAIT  --ACK---->   LAST_ACK --> CLOSED
       |
   (after 2*MSL/close_wait)
       v
     CLOSED

4.1 Three-way handshake

  1. Active opener sends SYN(seq=ISN_a).
  2. Passive side replies with SYN_ACK(seq=ISN_b, ack=ISN_a+1).
  3. Active opener sends ACK(ack=ISN_b+1) and transitions to ESTABLISHED.

The handshake itself uses retransmission timeouts and limited retries (initial_rto, exponential backoff, max_retries).
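The handshake transitions and the retry schedule can be sketched as a pure transition function plus a backoff helper. The `State` and `Event` enums and both function names are illustrative, and capping the backed-off timeout at max_rto is an assumption; the spec does not say whether the cap applies during the handshake.

```rust
use std::time::Duration;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum State {
    Closed,
    Listen,
    SynSent,
    SynReceived,
    Established,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Event {
    Connect,     // application calls connect(); a SYN is sent
    StartListen, // application calls listen()
    RecvSyn,     // passive side replies with SYN_ACK
    RecvSynAck,  // active side replies with ACK
    RecvAck,     // passive side completes the handshake
}

/// Handshake transitions only; unexpected events leave the state unchanged.
fn step(state: State, event: Event) -> State {
    match (state, event) {
        (State::Closed, Event::Connect) => State::SynSent,
        (State::Closed, Event::StartListen) => State::Listen,
        (State::Listen, Event::RecvSyn) => State::SynReceived,
        (State::SynSent, Event::RecvSynAck) => State::Established,
        (State::SynReceived, Event::RecvAck) => State::Established,
        (s, _) => s,
    }
}

/// Timeout before retry number `attempt` (0-based): initial_rto doubled per
/// attempt, capped at max_rto (the cap is an assumption of this sketch).
fn backoff_rto(initial_rto: Duration, max_rto: Duration, attempt: u32) -> Duration {
    let factor = 1u32 << attempt.min(16);
    initial_rto
        .checked_mul(factor)
        .map_or(max_rto, |d| d.min(max_rto))
}
```

With the reference defaults (initial_rto of 300 ms, max_rto of 10 s), the schedule under this sketch runs 300 ms, 600 ms, 1.2 s and so on until it reaches the cap.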

4.2 Graceful close

The initiator sends FIN. The peer acknowledges it with FIN_ACK and may continue sending data (half-close). When the peer finishes, it sends its own FIN. The initiator acknowledges that FIN and enters TIME_WAIT for close_wait, which absorbs late retransmissions, before moving to CLOSED.

4.3 Abort

Either side may send RST to immediately terminate; the receiver moves to CLOSED and surfaces a PeerReset event.

4.4 Keepalive

If no packet has been received for keepalive_interval, the engine sends PING. The peer responds with PONG. After keepalive_probes consecutive unanswered probes, the connection is closed.
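The keepalive rule can be sketched as a small timer-driven tracker. The `Keepalive` type and its method names are hypothetical, and the sketch assumes one probe is sent per keepalive_interval of continued silence; the spec does not state the spacing between probes.

```rust
use std::time::{Duration, Instant};

#[derive(Debug, PartialEq, Eq)]
enum KeepaliveAction {
    Idle,
    SendPing,
    Close,
}

struct Keepalive {
    interval: Duration,
    max_probes: u32,
    quiet_since: Instant, // last receive, or last probe sent
    unanswered: u32,      // consecutive probes without any reply
}

impl Keepalive {
    fn new(now: Instant, interval: Duration, max_probes: u32) -> Self {
        Keepalive { interval, max_probes, quiet_since: now, unanswered: 0 }
    }

    /// Any received packet (PONG included) proves the peer is alive.
    fn on_packet_received(&mut self, now: Instant) {
        self.quiet_since = now;
        self.unanswered = 0;
    }

    /// Called from the engine's timer tick.
    fn poll(&mut self, now: Instant) -> KeepaliveAction {
        if now.duration_since(self.quiet_since) < self.interval {
            return KeepaliveAction::Idle;
        }
        if self.unanswered >= self.max_probes {
            return KeepaliveAction::Close;
        }
        self.unanswered += 1;
        self.quiet_since = now; // wait a full interval before the next probe
        KeepaliveAction::SendPing
    }
}
```

Under these assumptions and the reference defaults (15 s interval, 3 probes), a silent peer is declared dead about 60 s after the last packet it sent.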

5. Reliability

  • Every DATA packet is tracked in the retransmission queue, keyed by sequence number, with the original packet type and flags preserved so retransmissions are bit-identical except for the RETX flag.
  • Cumulative ACKs evict everything strictly below the ack_num.
  • If the retransmission timer (RTO) elapses for any in-flight packet:
    • Backoff: RTO *= 2 (Karn's algorithm — no RTT sample is taken from a retransmitted segment).
    • Congestion event: ssthresh = max(cwnd/2, 2), cwnd = 1, return to slow start.
    • All expired segments are retransmitted with RETX set.
    • Exceeding max_retries for a single segment closes the connection.
  • Fast retransmit: fast_retransmit_threshold duplicate ACKs (three by default) trigger immediate retransmission of the lowest-seq unacknowledged segment without waiting for the RTO, and the controller enters fast recovery.
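The cumulative-ACK eviction and duplicate-ACK counting described above can be sketched with a queue of in-flight sequence numbers. `RetxQueue` and `AckOutcome` are illustrative names, and the sketch tracks only sequence numbers; a real queue would also hold the packet bytes, send time, and retry count.

```rust
use std::collections::VecDeque;

/// Modular "a precedes b" comparison for 32-bit sequence numbers.
fn seq_lt(a: u32, b: u32) -> bool {
    (a.wrapping_sub(b) as i32) < 0
}

#[derive(Debug, PartialEq, Eq)]
enum AckOutcome {
    Advanced(usize),     // number of segments evicted
    Duplicate,           // ACK did not advance; threshold not reached
    FastRetransmit(u32), // resend this sequence number now
}

struct RetxQueue {
    in_flight: VecDeque<u32>, // unacknowledged seqs, oldest first
    dup_acks: u32,
    dup_threshold: u32,
}

impl RetxQueue {
    fn new(dup_threshold: u32) -> Self {
        RetxQueue { in_flight: VecDeque::new(), dup_acks: 0, dup_threshold }
    }

    fn on_send(&mut self, seq: u32) {
        self.in_flight.push_back(seq);
    }

    fn on_ack(&mut self, ack_num: u32) -> AckOutcome {
        // Evict everything strictly below ack_num.
        let mut evicted = 0;
        while let Some(&seq) = self.in_flight.front() {
            if !seq_lt(seq, ack_num) {
                break;
            }
            self.in_flight.pop_front();
            evicted += 1;
        }
        if evicted > 0 {
            self.dup_acks = 0;
            return AckOutcome::Advanced(evicted);
        }
        // No progress: count a duplicate and fire once at the threshold.
        self.dup_acks += 1;
        match self.in_flight.front() {
            Some(&seq) if self.dup_acks == self.dup_threshold => {
                AckOutcome::FastRetransmit(seq)
            }
            _ => AckOutcome::Duplicate,
        }
    }
}
```

Firing only when the counter equals the threshold, rather than on every later duplicate, avoids resending the same segment repeatedly during one loss episode.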

6. RTT and RTO

RTT estimation follows RFC 6298:

SRTT_0   = R
RTTVAR_0 = R / 2
RTTVAR   = (1 - 1/4) * RTTVAR + (1/4) * |SRTT - R'|    (uses SRTT before its update)
SRTT     = (1 - 1/8) * SRTT   + (1/8) * R'
RTO      = clamp(SRTT + 4 * RTTVAR, min_rto, max_rto)

Only non-retransmitted segments contribute samples (Karn's algorithm). After each timeout the RTO is multiplied by an exponential backoff factor that doubles per timeout; the factor resets to 1 when a fresh RTT sample arrives.
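A minimal estimator following these rules might look like the sketch below. The `RttEstimator` type is hypothetical; RTTVAR is updated before SRTT, as RFC 6298 requires. Applying the clamp before the backoff factor, and capping the backed-off value at max_rto, are assumptions of this sketch.

```rust
use std::time::Duration;

struct RttEstimator {
    srtt: Option<Duration>, // None until the first sample arrives
    rttvar: Duration,
    initial_rto: Duration,
    min_rto: Duration,
    max_rto: Duration,
    backoff: u32, // multiplier applied after timeouts
}

impl RttEstimator {
    fn new(initial_rto: Duration, min_rto: Duration, max_rto: Duration) -> Self {
        RttEstimator {
            srtt: None,
            rttvar: Duration::ZERO,
            initial_rto,
            min_rto,
            max_rto,
            backoff: 1,
        }
    }

    /// Feed a sample from a segment that was never retransmitted.
    fn on_sample(&mut self, r: Duration) {
        match self.srtt {
            None => {
                self.srtt = Some(r);
                self.rttvar = r / 2;
            }
            Some(srtt) => {
                let diff = if srtt > r { srtt - r } else { r - srtt };
                // RTTVAR first, using the SRTT from before this sample.
                self.rttvar = (self.rttvar * 3 + diff) / 4;
                self.srtt = Some((srtt * 7 + r) / 8);
            }
        }
        self.backoff = 1; // a fresh sample cancels the backoff
    }

    /// Double the backoff factor after a retransmission timeout.
    fn on_timeout(&mut self) {
        self.backoff = self.backoff.saturating_mul(2);
    }

    fn rto(&self) -> Duration {
        let base = match self.srtt {
            Some(srtt) => srtt + self.rttvar * 4,
            None => self.initial_rto,
        };
        base.clamp(self.min_rto, self.max_rto)
            .saturating_mul(self.backoff)
            .min(self.max_rto)
    }
}
```

For example, a first sample of 100 ms gives SRTT = 100 ms and RTTVAR = 50 ms, so the RTO is 300 ms; a second sample of 200 ms moves RTTVAR to 62.5 ms and SRTT to 112.5 ms, for an RTO of 362.5 ms.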

7. Flow Control

  • The receiver advertises an available window (window field) measured in packets. The sender must keep in_flight <= min(cwnd, peer_window).
  • When the receive buffer fills, window shrinks toward zero, forcing the sender to pause until application reads drain the buffer.
  • Out-of-order packets are buffered until the gap fills, then delivered in-order to the application.
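The send gate and the out-of-order buffer can be sketched as follows. `can_send` and `ReorderBuffer` are hypothetical names, and the advertised window here counts only packets waiting for a gap to fill; a real receiver would also subtract delivered data the application has not yet read.

```rust
use std::collections::HashMap;

/// Sender-side gate: one more packet may go out only while
/// in_flight stays within min(cwnd, peer_window) afterwards.
fn can_send(in_flight: u32, cwnd: u32, peer_window: u32) -> bool {
    in_flight < cwnd.min(peer_window)
}

struct ReorderBuffer {
    next_expected: u32,
    capacity: usize, // receive window, in packets
    pending: HashMap<u32, Vec<u8>>,
}

impl ReorderBuffer {
    fn new(next_expected: u32, capacity: usize) -> Self {
        ReorderBuffer { next_expected, capacity, pending: HashMap::new() }
    }

    /// Accepts one DATA packet and returns every payload that is now
    /// deliverable in order (possibly none).
    fn on_data(&mut self, seq: u32, payload: Vec<u8>) -> Vec<Vec<u8>> {
        // Modular offset from the next expected sequence number. Old
        // duplicates wrap to a huge offset and are dropped with the rest.
        let offset = seq.wrapping_sub(self.next_expected) as usize;
        if offset >= self.capacity {
            return Vec::new();
        }
        self.pending.entry(seq).or_insert(payload);
        let mut ready = Vec::new();
        while let Some(p) = self.pending.remove(&self.next_expected) {
            ready.push(p);
            self.next_expected = self.next_expected.wrapping_add(1);
        }
        ready
    }

    /// Window to advertise, in packets (simplified, see lead-in).
    fn advertised_window(&self) -> u32 {
        (self.capacity - self.pending.len()) as u32
    }
}
```

After each call, `next_expected` is the value the receiver would place in ack_num of its next ACK.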

8. Congestion Control

A NewReno-style controller:

  • cwnd starts at initial_cwnd packets, ssthresh at initial_ssthresh.
  • Slow start (cwnd < ssthresh): for each ACK that advances cumulative-ack, cwnd += segments_acked. Exit to congestion avoidance when cwnd >= ssthresh.
  • Congestion avoidance: cwnd grows by approximately one packet per RTT (additive increase).
  • Fast retransmit / recovery: after fast_retransmit_threshold duplicate ACKs, set ssthresh = max(cwnd/2, 2) and cwnd = ssthresh, then enter recovery. Each further duplicate ACK during recovery inflates cwnd by one; a new cumulative ACK past recovery_point exits recovery.
  • Timeout: ssthresh = max(cwnd/2, 2), cwnd = 1, slow start.
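The controller rules above can be sketched as a small state holder. The `Congestion` type and its method names are hypothetical. Two details are assumptions not stated in this document: recovery_point is taken to be the highest sequence number sent when recovery starts, and cwnd is deflated back to ssthresh on leaving recovery (as RFC 6582 does).

```rust
/// Modular "a precedes b" comparison for 32-bit sequence numbers.
fn seq_lt(a: u32, b: u32) -> bool {
    (a.wrapping_sub(b) as i32) < 0
}

struct Congestion {
    cwnd: u32, // in packets
    ssthresh: u32,
    max_cwnd: u32,
    dup_threshold: u32,
    dup_acks: u32,
    acked_in_ca: u32, // packets acked since the last additive increase
    recovery_point: Option<u32>,
}

impl Congestion {
    fn new(initial_cwnd: u32, initial_ssthresh: u32, max_cwnd: u32, dup_threshold: u32) -> Self {
        Congestion {
            cwnd: initial_cwnd,
            ssthresh: initial_ssthresh,
            max_cwnd,
            dup_threshold,
            dup_acks: 0,
            acked_in_ca: 0,
            recovery_point: None,
        }
    }

    /// An ACK that advanced the cumulative ack by `segments_acked` packets.
    fn on_new_ack(&mut self, ack_num: u32, segments_acked: u32) {
        self.dup_acks = 0;
        if let Some(point) = self.recovery_point {
            if seq_lt(point, ack_num) {
                self.recovery_point = None;
                self.cwnd = self.ssthresh; // assumed deflation on exit
            }
            return;
        }
        if self.cwnd < self.ssthresh {
            // Slow start: grow by the number of packets acknowledged.
            self.cwnd = (self.cwnd + segments_acked).min(self.max_cwnd);
        } else {
            // Congestion avoidance: +1 packet per cwnd packets acked (~1 RTT).
            self.acked_in_ca += segments_acked;
            if self.acked_in_ca >= self.cwnd {
                self.acked_in_ca -= self.cwnd;
                self.cwnd = (self.cwnd + 1).min(self.max_cwnd);
            }
        }
    }

    /// Returns true when the caller should fast-retransmit now.
    fn on_dup_ack(&mut self, highest_sent: u32) -> bool {
        if self.recovery_point.is_some() {
            self.cwnd = (self.cwnd + 1).min(self.max_cwnd); // inflate
            return false;
        }
        self.dup_acks += 1;
        if self.dup_acks < self.dup_threshold {
            return false;
        }
        self.ssthresh = (self.cwnd / 2).max(2);
        self.cwnd = self.ssthresh;
        self.recovery_point = Some(highest_sent);
        true
    }

    fn on_timeout(&mut self) {
        self.ssthresh = (self.cwnd / 2).max(2);
        self.cwnd = 1;
        self.dup_acks = 0;
        self.acked_in_ca = 0;
        self.recovery_point = None;
    }
}
```

Counting acknowledged packets and adding one to cwnd per full window is a common way to approximate "one packet per RTT" without measuring RTT boundaries directly.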

9. Observability

The ConnectionMetrics snapshot exposes:

  • Packet/byte counters in both directions
  • Retransmissions, fast retransmissions, duplicate ACKs, duplicate packets, out-of-order packets, checksum failures, timeouts
  • Mean RTT, smoothed RTT, RTTVAR, current RTO
  • Current cwnd, ssthresh, in-flight count
  • Derived loss rate

The engine emits structured tracing events at aeroudp::state, aeroudp::engine, aeroudp::driver, aeroudp::handshake, and aeroudp::listener targets.

10. Reference defaults

initial_rto                = 300 ms
min_rto / max_rto          = 100 ms / 10 s
max_retries                = 12
handshake_timeout          = 5 s
idle_timeout               = 60 s
keepalive_interval         = 15 s
keepalive_probes           = 3
initial_cwnd / initial_ssthresh = 10 / 64 packets
max_cwnd                   = 4096 packets
receive_window             = 1024 packets
send_buffer_packets        = 4096
recv_buffer_packets        = 4096
max_payload                = 1200 bytes
fast_retransmit_threshold  = 3
close_wait                 = 2 s
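In Rust, defaults like these are often carried by a configuration struct with a `Default` implementation. The sketch below mirrors the table; the struct name `Config` and the field types are assumptions, while the field names and values come from the list above.

```rust
use std::time::Duration;

/// Hypothetical configuration struct mirroring the reference defaults.
#[derive(Debug, Clone, PartialEq)]
struct Config {
    initial_rto: Duration,
    min_rto: Duration,
    max_rto: Duration,
    max_retries: u32,
    handshake_timeout: Duration,
    idle_timeout: Duration,
    keepalive_interval: Duration,
    keepalive_probes: u32,
    initial_cwnd: u32,
    initial_ssthresh: u32,
    max_cwnd: u32,
    receive_window: u32,
    send_buffer_packets: usize,
    recv_buffer_packets: usize,
    max_payload: usize,
    fast_retransmit_threshold: u32,
    close_wait: Duration,
}

impl Default for Config {
    fn default() -> Self {
        Config {
            initial_rto: Duration::from_millis(300),
            min_rto: Duration::from_millis(100),
            max_rto: Duration::from_secs(10),
            max_retries: 12,
            handshake_timeout: Duration::from_secs(5),
            idle_timeout: Duration::from_secs(60),
            keepalive_interval: Duration::from_secs(15),
            keepalive_probes: 3,
            initial_cwnd: 10,
            initial_ssthresh: 64,
            max_cwnd: 4096,
            receive_window: 1024,
            send_buffer_packets: 4096,
            recv_buffer_packets: 4096,
            max_payload: 1200,
            fast_retransmit_threshold: 3,
            close_wait: Duration::from_secs(2),
        }
    }
}
```

Callers can then override individual fields with struct update syntax, for example `Config { max_retries: 5, ..Config::default() }`.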

11. Limitations

  • No cryptographic authentication or encryption. AeroUDP is a transport, not a security layer.
  • Single byte-stream per connection. No multiplexed streams (à la QUIC).
  • No selective acknowledgements (SACK) — only cumulative ACKs.
  • 32-bit sequence space, counted in packets. At the default 1200-byte max_payload, wrap-around occurs after 2^32 packets (2^32 × 1200 bytes ≈ 5.15 TB of traffic per connection) and is not specially protected against.
  • Path MTU is fixed at compile time.