Add default fdtable handlers to grate-rs (or new lib) — auto-translate, override per-syscall #115

@rennergade

Follow-up to #114 (audit of fd-translation bugs across grates).

Motivation

Every Rust grate that maintains its own fdtables entries has been duplicating the same fd-translation boilerplate — and getting it wrong. The IPC grate work on fix-ipc-grate-fd-translation wired ~30 syscall handlers just to translate the user-supplied grate-vfd to the runtime-vfd before forwarding. The audit found that 5 of 6 grates that allocate their own fdtable entries are broken in the same way.

The boilerplate should live in grate-rs (or a small companion crate), with sane defaults and per-syscall override.

Proposed design

Builder API

GrateBuilder::with_fdtable()                // opts into defaults
    .fdkind(IPC_PIPE)
        .on_close(ipc_pipe_close_handler)   // wraps fdtables::register_close_handlers
    .fdkind(IPC_SOCKET)
        .on_close(ipc_socket_close_handler)
    .register(SYS_PIPE, ipc_pipe_handler)   // overrides default (last-write-wins)
    .register(SYS_READ, ipc_read_handler)
    .register(SYS_WRITE, ipc_write_handler)
    // ... only the syscalls the grate actually wants to customize
    .run(argv);

GrateBuilder::new() stays minimal so trivial grates (geteuid, strace) don't pay for handlers they don't need.
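The override semantics above can be illustrated with a minimal sketch. It assumes the builder keeps one handler per syscall number in a map, so a later register() call replaces the default installed by with_fdtable(). SketchBuilder, Handler, dispatch, and the stub handlers are hypothetical names for illustration only, not the grate-rs API.

```rust
use std::collections::HashMap;

// Hypothetical handler signature; the real grate-rs handler type may differ.
pub type Handler = fn(u64, [u64; 6]) -> i32;

// x86-64 Linux syscall numbers, used here only as map keys.
pub const SYS_READ: u64 = 0;
pub const SYS_WRITE: u64 = 1;

// Stand-ins for the library defaults (which would translate the fd and forward).
fn default_read(_cage: u64, _args: [u64; 6]) -> i32 { 0 }
fn default_write(_cage: u64, _args: [u64; 6]) -> i32 { 0 }

// Stand-in for a grate-specific override such as ipc_read_handler.
pub fn custom_read(_cage: u64, _args: [u64; 6]) -> i32 { 42 }

pub struct SketchBuilder {
    handlers: HashMap<u64, Handler>,
}

impl SketchBuilder {
    /// Minimal builder: no handlers registered.
    pub fn new() -> Self {
        SketchBuilder { handlers: HashMap::new() }
    }

    /// Opt-in builder: starts with the default fd-translating handlers.
    pub fn with_fdtable() -> Self {
        let mut b = Self::new();
        b.handlers.insert(SYS_READ, default_read);
        b.handlers.insert(SYS_WRITE, default_write);
        b
    }

    /// Last write wins: inserting replaces any handler already present.
    pub fn register(mut self, nr: u64, handler: Handler) -> Self {
        self.handlers.insert(nr, handler);
        self
    }

    /// Runs the handler for `nr`, or returns None if nothing is registered.
    pub fn dispatch(&self, nr: u64, cage: u64, args: [u64; 6]) -> Option<i32> {
        self.handlers.get(&nr).map(|h| h(cage, args))
    }
}
```

In this sketch, a builder created with new() has no handler for SYS_READ at all, which mirrors the goal that trivial grates carry no fd-translation handlers.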

Defaults registered by with_fdtable()

fd-creating — forward, then register_kernel_fd (allocates fresh grate-vfd over the runtime-vfd):

  • SYS_OPEN, SYS_OPENAT, SYS_DUP, SYS_DUP2, SYS_DUP3
  • SYS_SOCKET, SYS_ACCEPT, SYS_ACCEPT4, SYS_PIPE, SYS_PIPE2, SYS_SOCKETPAIR
  • SYS_EPOLL_CREATE, SYS_EPOLL_CREATE1

fd-using (arg1) — translate arg1 to underfd, forward:

  • SYS_READ, SYS_WRITE, SYS_CLOSE, SYS_FCNTL, SYS_LSEEK, SYS_IOCTL
  • SYS_FSTAT, SYS_FSYNC, SYS_FDATASYNC, SYS_FTRUNCATE, SYS_FLOCK
  • SYS_FCHDIR, SYS_FCHMOD, SYS_GETDENTS, SYS_FSTATFS, SYS_SYNC_FILE_RANGE
  • SYS_PREAD, SYS_PWRITE, SYS_READV, SYS_WRITEV, SYS_PREADV, SYS_PWRITEV
  • SYS_BIND, SYS_LISTEN, SYS_CONNECT, SYS_SHUTDOWN
  • SYS_SENDTO, SYS_RECVFROM, SYS_SENDMSG, SYS_RECVMSG
  • SYS_SETSOCKOPT, SYS_GETSOCKOPT, SYS_GETSOCKNAME, SYS_GETPEERNAME
  • SYS_EPOLL_WAIT

Special-arg defaults:

  • SYS_MMAP — translate arg5, skip if MAP_ANON
  • SYS_EPOLL_CTL — translate arg1 (epfd) and arg3 (target fd)
  • SYS_OPENAT and other *at syscalls — translate dirfd unless it's AT_FDCWD (-100)
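A rough sketch of the three special-arg rules, assuming a plain HashMap stands in for the fdtables lookup and that syscall arguments arrive as six raw u64 values. FdMap, translate_mmap_args, and translate_epoll_ctl_args are illustrative names; the translate_dirfd here takes an explicit map and is not the signature of the proposed helper. MAP_ANONYMOUS uses the Linux value 0x20.

```rust
use std::collections::HashMap;

pub const AT_FDCWD: i32 = -100;
pub const MAP_ANONYMOUS: u64 = 0x20; // Linux value

/// Stand-in for the fdtables lookup: (cage, grate-vfd) -> runtime-vfd.
pub type FdMap = HashMap<(u64, u64), u64>;

/// *at syscalls: pass AT_FDCWD through untouched, translate anything else.
pub fn translate_dirfd(map: &FdMap, cage: u64, dirfd_arg: u64) -> Option<u64> {
    // The dirfd arrives in a raw register; read the low 32 bits as signed.
    if dirfd_arg as i32 == AT_FDCWD {
        return Some(dirfd_arg);
    }
    map.get(&(cage, dirfd_arg)).copied()
}

/// mmap(addr, len, prot, flags, fd, offset): fd is arg5, flags is arg4.
pub fn translate_mmap_args(map: &FdMap, cage: u64, mut args: [u64; 6]) -> Option<[u64; 6]> {
    if args[3] & MAP_ANONYMOUS == 0 {
        // File-backed mapping: the fd must be a known grate-vfd.
        args[4] = *map.get(&(cage, args[4]))?;
    }
    // Anonymous mapping: fd is ignored (often -1), so leave it alone.
    Some(args)
}

/// epoll_ctl(epfd, op, fd, event): translate both arg1 and arg3.
pub fn translate_epoll_ctl_args(map: &FdMap, cage: u64, mut args: [u64; 6]) -> Option<[u64; 6]> {
    args[0] = *map.get(&(cage, args[0]))?;
    args[2] = *map.get(&(cage, args[2]))?;
    Some(args)
}
```

Each function returns None for an unknown fd so the caller can decide which errno to report.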

Embedded-in-cage-memory defaults:

  • SYS_POLL, SYS_PPOLL — translate each fd in the pollfd[] buffer; reverse on return
  • SYS_SELECT — translate fds in the fd_set bitmaps; reverse-map on return; pass runtime_nfds = max(underfd) + 1
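The translate, forward, restore cycle for poll can be sketched as follows, assuming the pollfd array has already been copied out of cage memory into a slice. The sketch saves the original grate-vfds and puts them back afterwards rather than reverse-mapping, which sidesteps ambiguity when two grate-vfds share one runtime-vfd; that choice is an assumption, not something the proposal specifies. PollFd, translate_pollfds, restore_pollfds, and runtime_nfds are illustrative names. The fd_set bitmaps for select would follow the same pattern and are not shown.

```rust
use std::collections::HashMap;

/// Same field layout as the C `struct pollfd`.
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct PollFd {
    pub fd: i32,
    pub events: i16,
    pub revents: i16,
}

/// Rewrites each fd to its runtime-vfd and returns the original fds.
/// Negative fds are left as-is (poll ignores them). Returns None without
/// modifying anything if any fd is unknown.
pub fn translate_pollfds(map: &HashMap<i32, i32>, fds: &mut [PollFd]) -> Option<Vec<i32>> {
    let translated = fds
        .iter()
        .map(|p| if p.fd < 0 { Some(p.fd) } else { map.get(&p.fd).copied() })
        .collect::<Option<Vec<i32>>>()?;
    let originals: Vec<i32> = fds.iter().map(|p| p.fd).collect();
    for (p, under) in fds.iter_mut().zip(translated) {
        p.fd = under;
    }
    Some(originals)
}

/// Puts the caller's grate-vfds back after the forwarded call returns;
/// revents written by the runtime are preserved.
pub fn restore_pollfds(fds: &mut [PollFd], originals: &[i32]) {
    for (p, orig) in fds.iter_mut().zip(originals) {
        p.fd = *orig;
    }
}

/// nfds for the forwarded select: highest runtime-vfd plus one.
pub fn runtime_nfds(underfds: &[i32]) -> i32 {
    underfds.iter().copied().max().map_or(0, |m| m + 1)
}
```

The nfds recomputation matters because a small grate-vfd can map to a larger runtime-vfd, so the caller's nfds may be too low for the translated set.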

Lifecycle defaults:

  • SYS_CLONE: copy_fdtable_for_cage + copy_handler_table_to_cage + re-overlay every parent entry whose fdkind ≠ FDKIND_KERNEL (so RawPOSIX populating the child first doesn't matter)
  • SYS_EXIT, SYS_EXIT_GROUP: remove_cage_from_fdtable so registered close handlers fire on cage exit
  • preexec hook — install identity 0/1/2 entries for stdio (the IPC, devnull, fdtables-test grates all duplicate this today)
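The re-overlay step of the clone default might look like the sketch below. It assumes a per-cage table modeled as a HashMap from grate-vfd to entry, and that the child table may already hold FDKIND_KERNEL entries written by the runtime's own fork path. Entry, CageTable, overlay_custom_entries, and the numeric fdkind values are placeholders, not the fdtables API.

```rust
use std::collections::HashMap;

pub const FDKIND_KERNEL: u32 = 0; // assumed value
pub const IPC_PIPE: u32 = 1; // hypothetical custom fdkind

#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Entry {
    pub fdkind: u32,
    pub underfd: u64,
}

/// One cage's table: grate-vfd -> entry.
pub type CageTable = HashMap<u64, Entry>;

/// Copies every custom-fdkind entry from parent to child, replacing
/// whatever the child already holds at that grate-vfd. Kernel entries
/// in the child are left untouched.
pub fn overlay_custom_entries(parent: &CageTable, child: &mut CageTable) {
    for (&vfd, &entry) in parent {
        if entry.fdkind != FDKIND_KERNEL {
            child.insert(vfd, entry);
        }
    }
}
```

Because the overlay only writes custom-fdkind entries, its result is the same whether it runs before or after the runtime populates the child's kernel entries, as long as it runs last for the slots it owns.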

Custom-fdkind handling

The default read_handler is fine for FDKIND_KERNEL entries (translate + forward). But if a grate has IPC_PIPE entries, the default would treat the stored pipe_id as if it were a runtime-vfd, and the forwarded call would fail with EBADF.

v1 approach: the default checks entry.fdkind and only translates for FDKIND_KERNEL; for any other fdkind it returns -EBADF. The grate overrides with its own handler that dispatches on kind. (Strict subset of v2; users can layer their own dispatch.)
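A minimal sketch of the v1 rule, assuming each table entry records its fdkind and underfd, and that forwarding can be modeled as a closure so the example stays self-contained. Entry, default_fd1_handler, the table type, and the numeric constants are illustrative placeholders.

```rust
use std::collections::HashMap;

pub const FDKIND_KERNEL: u32 = 0; // assumed value
pub const IPC_PIPE: u32 = 1; // hypothetical custom fdkind
pub const EBADF: i32 = 9;

#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Entry {
    pub fdkind: u32,
    pub underfd: u64,
}

/// v1 default for fd-in-arg1 syscalls: translate and forward only when
/// the entry is FDKIND_KERNEL; unknown fds and custom kinds get -EBADF.
pub fn default_fd1_handler(
    table: &HashMap<(u64, u64), Entry>,
    cage: u64,
    mut args: [u64; 6],
    forward: impl Fn([u64; 6]) -> i32,
) -> i32 {
    match table.get(&(cage, args[0])) {
        Some(entry) if entry.fdkind == FDKIND_KERNEL => {
            args[0] = entry.underfd;
            forward(args)
        }
        // Custom fdkinds: underfd is not a runtime-vfd, so never forward it.
        _ => -EBADF,
    }
}
```

Returning -EBADF for custom kinds means a grate that forgets to override fails fast instead of forwarding a pipe_id to the runtime as though it were a descriptor.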

v2 sugar (follow-up): per-fdkind callbacks

.fdkind_read(IPC_PIPE, my_pipe_read)
.fdkind_write(IPC_PIPE, my_pipe_write)

Default handler dispatches on kind. Cleaner for grates with rich custom kinds.
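One way the v2 dispatch could work, sketched under the assumption that the default keeps a per-fdkind callback map and falls back to -EBADF for kinds with no callback. ReadDispatch, KindRead, and the stub callbacks are hypothetical; only the fdkind_read name comes from the proposal, and its real signature may differ.

```rust
use std::collections::HashMap;

pub const FDKIND_KERNEL: u32 = 0; // assumed value
pub const IPC_PIPE: u32 = 1; // hypothetical custom fdkind
pub const EBADF: i32 = 9;

// Hypothetical per-kind read callback: (underfd, requested length) -> result.
pub type KindRead = fn(u64, usize) -> i32;

// Stand-in for "translate + forward to the runtime".
pub fn forward_read(_underfd: u64, len: usize) -> i32 { len as i32 }

// Stand-in for a grate's pipe read; here an empty pipe reporting EOF.
pub fn my_pipe_read(_pipe_id: u64, _len: usize) -> i32 { 0 }

pub struct ReadDispatch {
    by_kind: HashMap<u32, KindRead>,
}

impl ReadDispatch {
    /// Starts with only the FDKIND_KERNEL path installed.
    pub fn new(kernel_read: KindRead) -> Self {
        let mut by_kind = HashMap::new();
        by_kind.insert(FDKIND_KERNEL, kernel_read);
        ReadDispatch { by_kind }
    }

    /// Registers the callback used for entries of the given fdkind.
    pub fn fdkind_read(mut self, kind: u32, cb: KindRead) -> Self {
        self.by_kind.insert(kind, cb);
        self
    }

    /// Dispatches on the entry's fdkind; kinds without a callback get -EBADF.
    pub fn read(&self, fdkind: u32, underfd: u64, len: usize) -> i32 {
        match self.by_kind.get(&fdkind) {
            Some(cb) => cb(underfd, len),
            None => -EBADF,
        }
    }
}
```

With no per-kind callbacks registered this reduces to the v1 behavior, which is consistent with v1 being a strict subset of v2.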

Helpers to expose publicly

So override handlers don't reimplement these:

pub fn translate_to_underfd(cage: u64, fd: u64) -> Option<u64>;
pub fn translate_dirfd(cage: u64, fd: u64) -> Option<u64>;
pub fn register_kernel_fd(cage: u64, runtime_vfd: i32, cloexec: bool, perfdinfo: u64) -> i32;
pub fn forward_with_fd1(syscall: u64, cage: u64, args: [u64; 6], arg_cages: [u64; 6]) -> i32;
pub fn forward_with_dirfd1(...);

These are already in rust-grates/ipc-grate/src/handlers.rs on the fix-ipc-grate-fd-translation branch — pull them up into lib/grate-rs/src/.

Where it lives

Extend grate-rs rather than spinning up a new crate. Most users already pull in grate-rs, and "if you have an fdtable, opt into this" is one builder call.

Quirks the implementation must respect

These are the patterns the IPC grate had to special-case; the audit (#114) confirms other grates kept getting them wrong:

  1. dup2 / dup3 — never forward grate-vfd as newfd to the runtime. Pattern: SYS_DUP(old_under) to get a fresh runtime-vfd, then get_specific_virtual_fd(cage, newfd, FDKIND_KERNEL, new_runtime_vfd, …) on the grate side.
  2. *at syscalls — AT_FDCWD (-100) must NOT be translated.
  3. mmap — fd is in arg5, skip translation when MAP_ANON is set (fd may be -1).
  4. epoll_ctl — translate arg1 AND arg3.
  5. poll/ppoll/select — fds are inside the cage buffer/bitmask, not in the syscall args. Translate, forward, reverse-translate on return so the user sees their original grate-vfds.
  6. AF_INET loopback take-over (or any "convert FDKIND_KERNEL into custom fdkind in place"): look up the runtime-vfd via translate_to_underfd and close it first, then let get_specific_virtual_fd overwrite the entry.
  7. Fork — the parent's grate-side custom-fdkind entries must be re-overlaid in the child after copy_fdtable_for_cage, because RawPOSIX may have already populated the child cage from its own fork path.
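The dup2 pattern from item 1 can be sketched as below. The runtime's SYS_DUP is modeled as a closure, the table is a plain HashMap, and the insert stands in for get_specific_virtual_fd. The function returns any entry it displaced at newfd so the caller can close that runtime-vfd; that return value is an assumption of this sketch, since the real fdtables call may handle it internally. Entry, dup2_sketch, and the constants are illustrative.

```rust
use std::collections::HashMap;

pub const FDKIND_KERNEL: u32 = 0; // assumed value
pub const EBADF: i32 = 9;

#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Entry {
    pub fdkind: u32,
    pub underfd: u64,
}

/// dup2 without ever passing the grate-vfd `newfd` to the runtime:
/// 1. translate oldfd to its runtime-vfd,
/// 2. ask the runtime for a fresh duplicate of it (plain dup),
/// 3. pin the duplicate at `newfd` on the grate side.
/// Returns the entry previously at `newfd`, if any.
pub fn dup2_sketch(
    table: &mut HashMap<(u64, u64), Entry>,
    cage: u64,
    oldfd: u64,
    newfd: u64,
    mut runtime_dup: impl FnMut(u64) -> u64,
) -> Result<Option<Entry>, i32> {
    let old = *table.get(&(cage, oldfd)).ok_or(EBADF)?;
    if old.fdkind != FDKIND_KERNEL {
        // Custom kinds need a grate-specific override; not handled here.
        return Err(EBADF);
    }
    if oldfd == newfd {
        // dup2 with equal, valid fds is a no-op.
        return Ok(None);
    }
    let new_under = runtime_dup(old.underfd);
    let fresh = Entry { fdkind: FDKIND_KERNEL, underfd: new_under };
    Ok(table.insert((cage, newfd), fresh))
}
```

The runtime only ever sees runtime-vfds in this flow, which is the property the quirk is meant to protect.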

Validation plan

Once landed, port the at-risk grates from #114 onto the new defaults — that converts the audit punch list into a regression test:

  1. devnull-grate — should shrink dramatically.
  2. write-filter-grate — should shrink dramatically.
  3. mtls-grate — replace half-translated handlers with the defaults; keep TLS-specific overrides.
  4. imfs-grate — keep IMFS dispatch on IMFS_FDKIND, drop the host-libc stdio bypass.
  5. resource-grate — remove identity-pinning hack; fix underfd to be the runtime-vfd.
  6. ipc-grate — replace the ~500 lines of translation boilerplate on fix-ipc-grate-fd-translation with overrides on the defaults.

Reference

Working example of all the translation logic (just embedded in one grate today): rust-grates/ipc-grate/src/handlers.rs on the fix-ipc-grate-fd-translation branch — see forward_with_fd1, translate_to_underfd, register_kernel_fd, translate_fd1_handler! macro.
