Architecture

hyper-uds is intentionally small. The crate is a stack of four layers plus two public type modules; each layer has one job and knows only about the layers directly below it.

        +-----------------------------------------+
        |  Builder  (public, src/builder.rs)      |   configuration entry
        +-----------------------------------------+
                          |
                          v
        +-----------------------------------------+
        |  Connection / serve loop                |   per-connection state
        |  (private, src/conn.rs)                 |   machine
        +-----------------------------------------+
                          |
            +-------------+--------------+
            |                            |
            v                            v
  +---------------------+      +-----------------------+
  |  codec              |      |  fd_stream            |   SCM_RIGHTS-aware
  |  (private,          |      |  (private,            |   recvmsg/sendmsg
  |   src/codec.rs)     |      |   src/fd_stream.rs)   |
  +---------------------+      +-----------------------+
            ^                            ^
            |                            |
            +----- shared types: --------+
                  message::Request, message::Response
                  service::Service, service::ServiceFn
                  error::Error

Layer responsibilities

fd_stream — the transport

fd_stream::FdStream wraps a std::os::unix::net::UnixStream in tokio's AsyncFd and goes straight to rustix::net::recvmsg / rustix::net::sendmsg. It is the only place in the crate that touches ancillary data.

Responsibilities:

  • Drive recvmsg(2) with a stack-allocated ancillary buffer sized for up to 32 fds (RECV_ANCILLARY_FDS). Any SCM_RIGHTS control messages are drained into a VecDeque<OwnedFd> queue (pending_fds) before the byte count is returned, so cmsg ordering is preserved relative to the byte stream.
  • Drive sendmsg(2) with MSG_NOSIGNAL and an optional SCM_RIGHTS ancillary block. The caller decides which sendmsg call carries the cmsg (see conn.rs — only the first chunk of a response head does).
  • Use RecvFlags::CMSG_CLOEXEC so received fds are atomically marked O_CLOEXEC by the kernel. Application code never sees a leakable raw fd.
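The "first sendmsg call carries the cmsg" rule can be sketched without touching a real socket. The following is a minimal illustration, not the crate's code: the SendMsg trait, the Mock transport, the send_all helper, and the use of plain i32 values in place of real fds are all assumptions made for the example.

```rust
/// Hypothetical transport: a single call may accept only part of `bytes`.
/// Returns the number of bytes accepted.
trait SendMsg {
    fn sendmsg(&mut self, bytes: &[u8], fds: &[i32]) -> usize;
}

/// Write all of `bytes`, attaching `fds` to the first call only.
/// Partial-write retries carry the remaining bytes and no ancillary data.
/// Assumes the transport accepts at least one byte per call.
fn send_all<T: SendMsg>(transport: &mut T, mut bytes: &[u8], mut fds: &[i32]) {
    while !bytes.is_empty() {
        let n = transport.sendmsg(bytes, fds);
        bytes = &bytes[n..];
        fds = &[]; // never re-send the fds on a retry
    }
}

/// Mock that accepts at most `chunk` bytes per call and records
/// (bytes accepted, fds attached) for every call.
struct Mock {
    chunk: usize,
    calls: Vec<(usize, usize)>,
}

impl SendMsg for Mock {
    fn sendmsg(&mut self, bytes: &[u8], fds: &[i32]) -> usize {
        let n = bytes.len().min(self.chunk);
        self.calls.push((n, fds.len()));
        n
    }
}

fn main() {
    let mut mock = Mock { chunk: 4, calls: Vec::new() };
    send_all(&mut mock, b"0123456789", &[7, 8]);
    // Three calls: only the first one carries the two fds.
    assert_eq!(mock.calls, vec![(4, 2), (4, 0), (2, 0)]);
}
```

Re-attaching the fds on a retry would deliver duplicate descriptors to the peer, which is why the sketch clears the slice unconditionally after the first call.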

Why a queue rather than returning fds inline with recv()? A single recvmsg may pull bytes that span the tail of one logical request and the head of the next, so the byte stream and the cmsg list are not synchronized with each other. Buffering fds and exposing take_fds() lets the upper layer ask "what fds belong to this request?" only after it has finished parsing the request head and body.
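A minimal sketch of that buffering, using only the standard library. The PendingFds type and push_received are hypothetical names for the example; only pending_fds and take_fds() come from the text above, and the real signatures may differ.

```rust
use std::collections::VecDeque;
use std::fs::File;
use std::os::fd::OwnedFd;

/// Hypothetical stand-in for the fd-buffering half of FdStream.
struct PendingFds {
    pending_fds: VecDeque<OwnedFd>,
}

impl PendingFds {
    fn new() -> Self {
        Self { pending_fds: VecDeque::new() }
    }

    /// Called after each recvmsg: queue whatever fds arrived, in arrival order.
    fn push_received(&mut self, fds: impl IntoIterator<Item = OwnedFd>) {
        self.pending_fds.extend(fds);
    }

    /// Called by the upper layer once a request is fully parsed:
    /// hands over every queued fd and leaves the queue empty.
    fn take_fds(&mut self) -> Vec<OwnedFd> {
        self.pending_fds.drain(..).collect()
    }
}

fn main() -> std::io::Result<()> {
    let mut queue = PendingFds::new();
    // Simulate two recvmsg calls, each delivering one fd.
    queue.push_received([OwnedFd::from(File::open("/dev/null")?)]);
    queue.push_received([OwnedFd::from(File::open("/dev/null")?)]);

    // The request is only now fully parsed, so the fds are claimed here.
    let fds = queue.take_fds();
    assert_eq!(fds.len(), 2);
    assert!(queue.take_fds().is_empty());
    Ok(())
}
```

Because the queue holds OwnedFd values, any fd that is never claimed is closed when the queue is dropped.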

codec — HTTP/1.1 framing

codec does pure byte-in / struct-out work. It performs no I/O, owns no buffers, and carries no state across calls.

  • decode_head(&[u8]) -> DecodeStatus { Complete | Partial } parses a request line + header block via httparse into DecodedHead (method, uri, version, headers, X-FD names, content-length, keep-alive).
  • encode_response(&Response, keep_alive, &mut Vec<u8>) writes a status line, response headers, the framing headers we own (content-length, connection, x-fd), then the body.
  • split_body(&mut BytesMut, n) is a one-liner that exists to keep the per-iteration cleanup obvious in the connection loop.

The codec deliberately does not enforce limits or talk to the socket; those concerns belong to conn. See docs/protocol.md for the exact wire-level dialect.
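To make the Complete / Partial contract concrete, here is a deliberately simplified sketch. It does not use httparse and is not the crate's decode_head: HeadStatus, scan_head, and the hand-rolled Content-Length lookup are assumptions for illustration only. Like the codec, it enforces no limits and does no I/O.

```rust
/// Hypothetical, simplified mirror of the codec's DecodeStatus.
#[derive(Debug, PartialEq)]
enum HeadStatus {
    /// The blank line ending the header block was found.
    /// `head_len` counts every byte up to and including that blank line.
    Complete { head_len: usize, content_length: usize },
    /// More bytes are needed before the head can be parsed.
    Partial,
}

/// Pure function over a byte slice: no I/O, no owned buffers, no state.
fn scan_head(buf: &[u8]) -> HeadStatus {
    let Some(pos) = buf.windows(4).position(|w| w == b"\r\n\r\n") else {
        return HeadStatus::Partial;
    };
    let head = String::from_utf8_lossy(&buf[..pos]);
    let content_length = head
        .lines()
        .skip(1) // skip the request line
        .filter_map(|line| line.split_once(':'))
        .find(|(name, _)| name.trim().eq_ignore_ascii_case("content-length"))
        .and_then(|(_, value)| value.trim().parse::<usize>().ok())
        .unwrap_or(0);
    HeadStatus::Complete { head_len: pos + 4, content_length }
}

fn main() {
    let buf = b"POST /x HTTP/1.1\r\nContent-Length: 5\r\n\r\nhello";

    // A truncated read yields Partial; the caller reads more and retries.
    assert_eq!(scan_head(&buf[..20]), HeadStatus::Partial);

    match scan_head(buf) {
        HeadStatus::Complete { head_len, content_length } => {
            assert_eq!(content_length, 5);
            assert_eq!(&buf[head_len..], &b"hello"[..]);
        }
        HeadStatus::Partial => unreachable!("the full buffer contains a complete head"),
    }
}
```

Since the function keeps nothing between calls, the connection loop can call it again on the grown buffer after every read.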

conn — the per-connection state machine

conn::serve glues the two lower layers together and is where every operational policy lives:

  1. Read head. Loop on FdStream::recv and codec::decode_head until a complete head is parsed or cfg.max_head_size is exceeded.
  2. Read body. Pull exactly Content-Length bytes (chunked is not supported); enforce cfg.max_body_size.
  3. Take fds. Drain queued fds from the transport. Reject if more than cfg.max_fds_per_request arrived; reject if fewer fds are present than the X-FD header named.
  4. Dispatch to the user Service and await its future.
  5. Encode response and sendmsg. Attach response fds (if any) on the first sendmsg call only; subsequent partial-write retries send the remaining bytes with no cmsg.
  6. Loop if HTTP/1.1 keep-alive is in effect and shutdown has not been requested; otherwise return.
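The two rejection rules in step 3 can be sketched as a pure check. The Limits struct, the FdPolicyError enum, and check_fds are hypothetical names; only max_fds_per_request and the two rules come from the list above. The text does not say what happens to fds beyond those named by X-FD, so this sketch accepts them as long as the cap is respected.

```rust
/// Hypothetical subset of the connection's Config.
struct Limits {
    max_fds_per_request: usize,
}

/// Hypothetical error type for the two step-3 rejections.
#[derive(Debug, PartialEq)]
enum FdPolicyError {
    /// More fds arrived than the configured cap allows.
    TooMany { got: usize, max: usize },
    /// The X-FD header named more fds than actually arrived.
    Missing { got: usize, named: usize },
}

/// Apply the step-3 policy to the fds drained from the transport.
fn check_fds(limits: &Limits, received: usize, fd_names: &[&str]) -> Result<(), FdPolicyError> {
    if received > limits.max_fds_per_request {
        return Err(FdPolicyError::TooMany { got: received, max: limits.max_fds_per_request });
    }
    if received < fd_names.len() {
        return Err(FdPolicyError::Missing { got: received, named: fd_names.len() });
    }
    Ok(())
}

fn main() {
    let limits = Limits { max_fds_per_request: 2 };

    assert_eq!(check_fds(&limits, 2, &["stdin", "stdout"]), Ok(()));
    assert_eq!(
        check_fds(&limits, 3, &["stdin"]),
        Err(FdPolicyError::TooMany { got: 3, max: 2 })
    );
    assert_eq!(
        check_fds(&limits, 1, &["stdin", "stdout"]),
        Err(FdPolicyError::Missing { got: 1, named: 2 })
    );
}
```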

The connection holds a small Config clone and an Arc<AtomicBool> shutdown flag (graceful_shutdown flips it; the loop checks it at the top of each iteration). The future itself is Pin<Box<...>> — a single allocation per connection. Allocator pressure on the hot path comes from HeaderMap and the body Bytes, not from this box.
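The shutdown check can be sketched synchronously. This is an illustration only: serve_loop and its closure-based handler are hypothetical, and the Acquire/Release orderings are an assumption, since the text does not say which ordering the crate uses.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

/// Runs at most `requests` iterations, checking the shutdown flag at the
/// top of each one. Returns how many requests were actually served.
fn serve_loop(shutdown: &AtomicBool, requests: usize, mut handle: impl FnMut(usize)) -> usize {
    let mut served = 0;
    for i in 0..requests {
        if shutdown.load(Ordering::Acquire) {
            break; // graceful: no new request is started
        }
        handle(i);
        served += 1;
    }
    served
}

fn main() {
    let shutdown = Arc::new(AtomicBool::new(false));
    let flag = Arc::clone(&shutdown);

    // The flag is flipped while request 1 is in flight: that request still
    // completes, and the loop exits before request 2 starts.
    let served = serve_loop(&shutdown, 3, |i| {
        if i == 1 {
            flag.store(true, Ordering::Release);
        }
    });
    assert_eq!(served, 2);
}
```

Checking only at the top of the iteration means an in-flight request is never cut off mid-response; the cost is that shutdown latency is bounded by the slowest handler.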

Builder — configuration

Builder is the only public way to construct a Connection. It is intentionally shaped like hyper::server::conn::http1::Builder:

Builder::new()
    .keep_alive(true)
    .max_head_size(16 * 1024)
    .max_body_size(16 * 1024 * 1024)
    .max_fds_per_request(16)
    .serve_connection(std_unix_stream, service);

serve_connection returns a Connection, which is a Future<Output = Result<()>>. The caller is responsible for owning the listener, calling accept(), and spawning that future onto a runtime. We deliberately do not provide a "run a server" helper; the caller's accept loop is where graceful shutdown, connection limits, backpressure, and per-tenant tracing all live.

Public types — message and service

These two modules are the API surface a service author actually writes against:

  • message::Request — owned Method, Uri, Version, HeaderMap, body Bytes, plus fds: Vec<OwnedFd> and the matching fd_names: Vec<String>. fd_by_name() is the ergonomic accessor; take_fds() moves all fds out without re-running header lookups.
  • message::Response — symmetric, plus a small ResponseBuilder for ergonomic construction. fd("name", fd) appends to both fds and fd_names in lock-step so they cannot drift.
  • service::Service declares fn call(&self, Request) -> Self::Future. Taking &self is intentional: a Service can be shared across requests on a connection without forcing interior mutability into every implementor. Blanket impls cover &S, Box<S>, Arc<S>, Rc<S>. service_fn adapts a closure for tests and one-off handlers.
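The shape of the Service trait can be sketched with the standard library alone. Everything here is a stand-in rather than the crate's definitions: the Request and Response structs are reduced to one field each, the future's output is assumed to be a bare Response (the real trait may return a Result), only the Arc<S> blanket impl is shown, and poll_once exists only so the example runs without an async runtime.

```rust
use std::future::{ready, Future};
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Stand-in message types; the real ones carry headers, body and fds.
struct Request { path: String }
struct Response { status: u16 }

/// Sketch of the trait: `&self`, so one value can serve many requests.
trait Service {
    type Future: Future<Output = Response>;
    fn call(&self, req: Request) -> Self::Future;
}

/// One of the blanket impls: a shared handle is itself a Service.
impl<S: Service + ?Sized> Service for Arc<S> {
    type Future = S::Future;
    fn call(&self, req: Request) -> Self::Future {
        (**self).call(req)
    }
}

/// Closure adapter in the spirit of service_fn.
struct ServiceFn<F>(F);

impl<F, Fut> Service for ServiceFn<F>
where
    F: Fn(Request) -> Fut,
    Fut: Future<Output = Response>,
{
    type Future = Fut;
    fn call(&self, req: Request) -> Fut {
        (self.0)(req)
    }
}

fn service_fn<F, Fut>(f: F) -> ServiceFn<F>
where
    F: Fn(Request) -> Fut,
    Fut: Future<Output = Response>,
{
    ServiceFn(f)
}

/// Waker that does nothing; enough to poll an already-ready future.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Polls a future once and returns its output if it was immediately ready.
fn poll_once<F: Future>(fut: F) -> Option<F::Output> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    match fut.as_mut().poll(&mut cx) {
        Poll::Ready(out) => Some(out),
        Poll::Pending => None,
    }
}

fn main() {
    let svc = Arc::new(service_fn(|req: Request| {
        ready(Response { status: if req.path == "/health" { 200 } else { 404 } })
    }));

    // Two calls through the same shared handle; no &mut access is needed.
    let a = poll_once(svc.call(Request { path: "/health".into() })).expect("ready");
    let b = poll_once(svc.call(Request { path: "/nope".into() })).expect("ready");
    assert_eq!((a.status, b.status), (200, 404));
}
```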

Concurrency model

A Connection is a single future that owns a single FdStream. Requests on the same connection are processed strictly in order:

read head -> read body -> take fds -> service.call -> write response
                                                              |
                                                              v
                                                    next iteration ...

The dispatcher does not pipeline. This is a hard constraint of SCM_RIGHTS semantics, not a temporary simplification — see docs/fd-passing.md. Cross-connection concurrency is achieved by tokio::spawn-ing one Connection future per accepted UDS stream, exactly as the integration test and the benchmark do. Connection<F> is Send whenever the user's Service and its Future are Send, so it composes with multi-threaded runtimes without any LocalSet ceremony.

What lives outside the crate

The library does not own:

  • The listener (UnixListener).
  • The accept loop.
  • The runtime.
  • TLS (there is no TLS).
  • Connection limits, tracing, metrics — wrap the Service or wrap each serve_connection invocation.

This is the same split hyper's server::conn module makes, and for the same reasons: every embedder ends up wanting a slightly different accept-loop policy, and bundling one means everybody else has to opt out of it.