Last Updated: 2026-04-01 · Target Architecture: x86_64 (AMD64) · Current Release: v0.0.6
This document defines the phased development plan for the Serix hybrid kernel. Each phase specifies concrete deliverables, acceptance criteria, and subsystem dependencies. Phases are ordered by the project's critical path; no calendar estimates are provided.
For architectural context on the subsystems referenced below, see ARCHITECTURE.md.
The kernel has completed Phases 1–2, the four core features of Phase 3, and the VFS/FAT32 portion of Phase 4 (Ext4 and the page cache are deferred). Current operational capabilities:
- Boot: Limine v10.x (BIOS + UEFI), higher-half kernel with HHDM at `0xFFFF_8000_0000_0000`
- Interrupts: LAPIC + I/O APIC fully operational; legacy PIC disabled; LAPIC timer at ~625 Hz (vector 49); PS/2 keyboard (vector 33); VirtIO block (vector 34, IRQ 11)
- Memory: 4-level paging (PML4), `StaticBootFrameAllocator`, 1 MiB kernel heap (`linked_list_allocator`), SLUB allocator for large objects and 1 MiB task stacks (`0xFFFF_D000_0000_0000` VA range)
- Scheduling: Preemptive round-robin; LAPIC timer invokes `schedule()` at ~625 Hz; `TaskCB` with SLUB-allocated stacks; callee-saved GPR + CR3 context switch; `block_current_and_switch()` for blocking primitives
- Syscalls: `SYSCALL`/`SYSRET` via MSR; `SYS_READ` (0), `SYS_WRITE` (1), `SYS_OPEN` (2), `SYS_CLOSE` (3), `SYS_SEEK` (8), `SYS_SEND` (20), `SYS_RECV` (21), `SYS_RECV_BLOCK` (22), `SYS_YIELD` (24), `SYS_EXIT` (60), `SYS_MKDIR` (83), `SYS_UNLINK` (87)
- IPC: Port-based message passing; blocking `receive_blocking()` with wait queues; `send()` wakes blocked receivers; producer/consumer validated
- Storage: VirtIO 1.0 block device (PCI modern, two-phase init); virtqueue with DMA-safe HHDM frame allocation; interrupt-driven sector read/write (IRQ via IOAPIC); `BlockDevice` VFS INode for byte-oriented access; 32 MiB disk, write→read verified
- Filesystem: FAT32 driver (`fs/` crate) with BPB parsing, cluster chain traversal/allocation, directory entry creation (8.3 + LFN), file read/write, `mkdir`, `unlink` (with LFN cleanup), duplicate filename rejection, LAPIC-tick timestamps; 32 MiB disk formatted via `mkfs.vfat -F 32`; files created by Serix are visible when mounting `disk.img` on Linux
- File Descriptors: Global FD table (`kernel/src/fd.rs`) keyed by `(task_id, fd)`; `open()`/`close()`/`seek()` operations; FDs 0-2 backed by stdio INodes (fd 0 → PS/2 keyboard, fd 1 → framebuffer console, fd 2 → serial); user files start at fd 3
- Subsystems: VFS (ramdisk + RamDir/RamFile/BlockDevice INodes), ELF loader, IPC, async executor, capability store (not yet enforced), PCI enumeration, serial + framebuffer console, fs (FAT32)
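
The syscall numbers listed above can be captured in a single enum so the dispatcher rejects unknown numbers in one place. This is a minimal sketch, not the kernel's actual dispatch code: the `Syscall` type and `from_raw` helper are hypothetical names, and only the numeric values come from this document.

```rust
/// Syscall numbers as listed in the capability summary.
/// The enum and its method names are illustrative, not the kernel's own types.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(u64)]
pub enum Syscall {
    Read = 0,
    Write = 1,
    Open = 2,
    Close = 3,
    Seek = 8,
    Send = 20,
    Recv = 21,
    RecvBlock = 22,
    Yield = 24,
    Exit = 60,
    Mkdir = 83,
    Unlink = 87,
}

impl Syscall {
    /// Decode a raw syscall number; unknown numbers yield `None`
    /// so the caller can return an error instead of dispatching.
    pub fn from_raw(n: u64) -> Option<Self> {
        Some(match n {
            0 => Self::Read,
            1 => Self::Write,
            2 => Self::Open,
            3 => Self::Close,
            8 => Self::Seek,
            20 => Self::Send,
            21 => Self::Recv,
            22 => Self::RecvBlock,
            24 => Self::Yield,
            60 => Self::Exit,
            83 => Self::Mkdir,
            87 => Self::Unlink,
            _ => return None,
        })
    }
}
```
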
Phase 1, Core Foundation (boot, memory, HAL). Status: Complete
- Limine v10.x request/response protocol integration
- Memory map parsing from `MemoryMapRequest` response
- Framebuffer initialization via `FramebufferRequest`
- HHDM offset retrieval via `HhdmRequest`
- Page table initialization using bootloader-provided CR3
- `StaticBootFrameAllocator` from Limine `USABLE` memory regions
- Heap allocator (1 MiB at `0xFFFF_8000_4444_0000`) via `linked_list_allocator`
- `OffsetPageTable` wrapper for virtual memory manipulation
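
With a higher-half direct map, translating between physical and virtual addresses is a single offset addition. The sketch below hard-codes the HHDM base quoted in this document; in the kernel the value would come from the bootloader's HHDM response, and the function names here are hypothetical.

```rust
/// HHDM base as quoted in this roadmap. A real kernel would read this
/// from the bootloader response instead of hard-coding it (assumption).
pub const HHDM_OFFSET: u64 = 0xFFFF_8000_0000_0000;

/// Map a physical address into the direct map.
/// Returns `None` if the addition would overflow the 64-bit address space.
pub fn phys_to_virt(phys: u64) -> Option<u64> {
    HHDM_OFFSET.checked_add(phys)
}

/// Recover the physical address behind a direct-map virtual address.
/// Returns `None` for addresses below the HHDM base.
pub fn virt_to_phys(virt: u64) -> Option<u64> {
    virt.checked_sub(HHDM_OFFSET)
}
```
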
- CPU exception handlers (divide-by-zero, page fault, double fault, GPF)
- APIC bring-up (Local APIC enable, I/O APIC redirection table programming)
- Legacy PIC mask-all and disable
- LAPIC timer driver (~625 Hz periodic, vector 49)
- Serial console (COM1, `0x3F8`, 115200 baud 8N1)
Phase 2, System Infrastructure (tasks, capabilities, syscalls). Status: Complete
- `TaskCB` (Task Control Block) with `TaskId`, `TaskState`, `SchedClass`, `CPUContext`
- Async task creation using Rust `Future` trait objects
- Cooperative round-robin executor (`VecDeque`-based polling loop)
- Low-level `context_switch()` assembly (callee-saved GPRs + CR3 + segment registers)
- 128-bit `CapabilityHandle` generation (RDTSC-seeded entropy)
- `CapabilityStore` (`BTreeMap<CapabilityHandle, Capability>`) with `spin::Mutex`
- `CapabilityType` enum: `Task`, `MemoryRegion`, `IODevice`, `FileDescriptor`
- `grant()`/`revoke()` operations
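
The store described above is a locked map from handle to capability. The following sketch shows that shape under stated assumptions: it uses `std::sync::Mutex` in place of `spin::Mutex` so it runs on a host, the `Capability` fields are invented for illustration, and `lookup()` is a hypothetical helper rather than a documented operation.

```rust
use std::collections::BTreeMap;
use std::sync::Mutex;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CapabilityType { Task, MemoryRegion, IODevice, FileDescriptor }

/// 128-bit opaque handle (the kernel seeds these from RDTSC entropy).
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub struct CapabilityHandle(pub u128);

/// Field layout is an assumption made for this example.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Capability { pub kind: CapabilityType, pub owner_task: u64 }

pub struct CapabilityStore {
    inner: Mutex<BTreeMap<CapabilityHandle, Capability>>,
}

impl CapabilityStore {
    pub fn new() -> Self {
        Self { inner: Mutex::new(BTreeMap::new()) }
    }

    /// Register a capability; refuses to overwrite an existing handle.
    pub fn grant(&self, handle: CapabilityHandle, cap: Capability) -> bool {
        let mut map = self.inner.lock().unwrap();
        if map.contains_key(&handle) {
            return false;
        }
        map.insert(handle, cap);
        true
    }

    /// Remove a capability, returning it if it existed.
    pub fn revoke(&self, handle: CapabilityHandle) -> Option<Capability> {
        self.inner.lock().unwrap().remove(&handle)
    }

    /// Hypothetical read-only lookup used by the example below.
    pub fn lookup(&self, handle: CapabilityHandle) -> Option<Capability> {
        self.inner.lock().unwrap().get(&handle).copied()
    }
}
```
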
- `SYSCALL`/`SYSRET` MSR configuration (`EFER.SCE`, `LSTAR`, `STAR`, `SFMASK`)
- Naked ASM entry trampoline with kernel stack swap
- Userspace pointer validation (`0x0`–`0x8000_0000_0000` range check)
- `SYS_WRITE`, `SYS_READ`, `SYS_EXIT`, `SYS_YIELD`, `SYS_SEND`, `SYS_RECV` dispatch
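
The userspace pointer check needs to validate the whole buffer, not just its start, and must not be fooled by integer wrap-around. A minimal sketch, assuming the upper bound above is exclusive; the constant and function names are hypothetical.

```rust
/// Exclusive upper bound of the user half, taken from the range check above.
pub const USER_SPACE_END: u64 = 0x8000_0000_0000;

/// Returns true only if `[ptr, ptr + len)` lies entirely in the user range.
/// `checked_add` rejects lengths that would wrap around the address space.
pub fn user_range_ok(ptr: u64, len: u64) -> bool {
    match ptr.checked_add(len) {
        Some(end) => end <= USER_SPACE_END,
        None => false,
    }
}
```

A syscall handler would call this before dereferencing any pointer argument and return an error to the caller when it fails.
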
- CPUID leaf parsing (vendor string, feature flags, cache topology)
- CPU topology detection (cores, threads, packages)
- Hybrid core classification infrastructure (P-core/E-core via CPUID leaf `0x1A`)
Phase 3, Preemptive Scheduling & IPC Hardening. Status: Core features complete; SMP and advanced scheduler features deferred to Phase 7
- `SchedClass` enum (`Realtime`, `Fair`, `Batch`, `Iso`)
- LAPIC timer-driven preemption at ~625 Hz invoking `schedule()` (vector 49)
- SLUB-allocated 1 MiB per-task kernel stacks with guard pages
- Callee-saved GPR + CR3 context switch; `block_current_and_switch()` for blocking
- Per-CPU run queues with `GS_BASE` MSR pointing to per-CPU data area
- `TSS.RSP0` swap on context switch for per-task kernel stacks
- Weighted Fair Queueing (WFQ) for `Fair` class with virtual-runtime tracking
- Priority inheritance protocol for capability-holding tasks in critical sections
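
Virtual-runtime tracking for the `Fair` class can be illustrated with a small model: each task's virtual clock advances more slowly the higher its weight, and the scheduler always picks the task with the smallest virtual runtime. This is a sketch of the general technique, not the planned Serix implementation; `BASE_WEIGHT`, `FairTask`, and `pick_next` are hypothetical.

```rust
/// Reference weight at which virtual time equals real time (assumed value).
pub const BASE_WEIGHT: u64 = 1024;

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct FairTask {
    pub id: u64,
    pub weight: u64,
    pub vruntime: u64,
}

impl FairTask {
    /// Charge `ticks` of CPU time, scaled inversely by the task's weight.
    /// A weight of zero is treated as one to avoid division by zero.
    pub fn charge(&mut self, ticks: u64) {
        self.vruntime += ticks * BASE_WEIGHT / self.weight.max(1);
    }
}

/// Select the runnable task with the smallest virtual runtime;
/// ties are broken by the lower task id for determinism.
pub fn pick_next(tasks: &[FairTask]) -> Option<&FairTask> {
    tasks.iter().min_by_key(|t| (t.vruntime, t.id))
}
```

With two tasks of weight 1024 and 2048 each charged 10 ticks, the heavier task accumulates half the virtual runtime and is picked next, so it receives roughly twice the CPU share over time.
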
- AP (Application Processor) bootstrap via INIT-SIPI-SIPI IPI sequence through LAPIC ICR
- Per-AP GDT, IDT, TSS, and kernel stack allocation
- Per-AP LAPIC initialization and timer calibration
- `MP_TRAMPOLINE` real-mode stub at sub-1 MiB physical address for AP wake
- Port-based message passing (`send`/`receive` via `IPC_GLOBAL`)
- Blocking `receive_blocking()` with `TaskState::Blocked`, per-port wait queues, and scheduler re-entry
- `SYS_RECV_BLOCK (22)` syscall for userspace blocking receive
- `send()` wakes first blocked receiver; producer/consumer validated in QEMU
- IPC fastpath: direct register transfer when receiver is blocked at `receive()` call site
- Capability validation on every `send()`: enforce `CapabilityHandle` ownership for target port
- Asynchronous notification ports (bitmask-based, non-queuing) for interrupt forwarding to Ring 3 servers
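
The interaction between a port's message queue and its wait queue can be modelled without a scheduler: a blocking receive either returns a message or records the caller as blocked, and `send()` reports which task to wake. This is a simplified sketch under that assumption; `Port`, `receive_or_block`, and the byte-vector message type are hypothetical stand-ins for the kernel's own types.

```rust
use std::collections::VecDeque;

pub type TaskId = u64;

#[derive(Default)]
pub struct Port {
    queue: VecDeque<Vec<u8>>,
    waiters: VecDeque<TaskId>,
}

impl Port {
    /// Enqueue a message and return the first blocked receiver, if any.
    /// The caller (the scheduler, in a kernel) would mark that task runnable.
    pub fn send(&mut self, msg: Vec<u8>) -> Option<TaskId> {
        self.queue.push_back(msg);
        self.waiters.pop_front()
    }

    /// Non-blocking receive.
    pub fn receive(&mut self) -> Option<Vec<u8>> {
        self.queue.pop_front()
    }

    /// One step of a blocking receive: returns a message if available,
    /// otherwise records `task` on the wait queue. A kernel would then
    /// switch away from the task until `send()` wakes it.
    pub fn receive_or_block(&mut self, task: TaskId) -> Option<Vec<u8>> {
        let msg = self.queue.pop_front();
        if msg.is_none() && !self.waiters.contains(&task) {
            self.waiters.push_back(task);
        }
        msg
    }
}
```
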
- PCI BAR enumeration and VirtIO 1.0 capability structure parsing (COMMON_CFG, NOTIFY_CFG, ISR_CFG, DEVICE_CFG)
- Two-phase init: PCI/feature negotiation before SLUB; virtqueue setup + DRIVER_OK after SLUB
- Virtqueue (descriptor table, available ring, used ring) with DMA-safe HHDM frame allocation
- `read_sector()`/`write_sector()` with polled completion (`spin_loop`) (IRQ 11 → vector 34, IOAPIC)
- `BlockDevice` VFS INode: byte-oriented read/write with sector-aligned translation and read-modify-write
- 32 MiB disk (65536 sectors), write→read verified via VFS interface
- Ring 3 driver server process with MMIO BAR mapped into userspace
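
Byte-oriented access on top of a sector device requires read-modify-write whenever a write covers only part of a sector. The sketch below demonstrates the translation against an in-memory disk; `RamDisk` is a hypothetical stand-in for the VirtIO driver, and the 512-byte sector size matches the 32 MiB / 65536-sector figure above.

```rust
pub const SECTOR_SIZE: usize = 512;

/// In-memory stand-in for the sector-level driver (illustrative only).
pub struct RamDisk {
    sectors: Vec<[u8; SECTOR_SIZE]>,
}

impl RamDisk {
    pub fn new(count: usize) -> Self {
        Self { sectors: vec![[0u8; SECTOR_SIZE]; count] }
    }

    fn read_sector(&self, lba: usize, buf: &mut [u8; SECTOR_SIZE]) -> Result<(), ()> {
        *buf = *self.sectors.get(lba).ok_or(())?;
        Ok(())
    }

    fn write_sector(&mut self, lba: usize, buf: &[u8; SECTOR_SIZE]) -> Result<(), ()> {
        *self.sectors.get_mut(lba).ok_or(())? = *buf;
        Ok(())
    }

    /// Write `data` at an arbitrary byte offset.
    pub fn write_bytes(&mut self, mut offset: usize, mut data: &[u8]) -> Result<(), ()> {
        while !data.is_empty() {
            let lba = offset / SECTOR_SIZE;
            let within = offset % SECTOR_SIZE;
            let n = (SECTOR_SIZE - within).min(data.len());
            let mut buf = [0u8; SECTOR_SIZE];
            if n < SECTOR_SIZE {
                // Partial sector: read first so untouched bytes are preserved.
                self.read_sector(lba, &mut buf)?;
            }
            buf[within..within + n].copy_from_slice(&data[..n]);
            self.write_sector(lba, &buf)?;
            offset += n;
            data = &data[n..];
        }
        Ok(())
    }

    /// Read `out.len()` bytes starting at an arbitrary byte offset.
    pub fn read_bytes(&self, mut offset: usize, out: &mut [u8]) -> Result<(), ()> {
        let mut done = 0;
        while done < out.len() {
            let lba = offset / SECTOR_SIZE;
            let within = offset % SECTOR_SIZE;
            let n = (SECTOR_SIZE - within).min(out.len() - done);
            let mut buf = [0u8; SECTOR_SIZE];
            self.read_sector(lba, &mut buf)?;
            out[done..done + n].copy_from_slice(&buf[within..within + n]);
            offset += n;
            done += n;
        }
        Ok(())
    }
}
```
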
Phase 4, Storage & Filesystem Stack. Status: Complete; Ext4/page cache deferred to Phase 7
- Path resolution engine (iterative component lookup through `INode::lookup()` chain)
- Mount table (`BTreeMap<VirtAddr, MountPoint>`) for overlaying filesystems on directory INodes
- File descriptor table (global table keyed by `(task_id, fd)`, not per-`TaskCB`)
- Standard fd allocation: fd 0 (stdin/PS/2 keyboard), fd 1 (stdout/console), fd 2 (stderr/serial)
- `SYS_OPEN`, `SYS_CLOSE`, `SYS_SEEK` syscall implementations
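
A global descriptor table keyed by `(task_id, fd)` keeps each task's descriptor numbers independent while using one map. The sketch below assumes a lowest-free-descriptor allocation policy starting at fd 3 (fds 0-2 being reserved for stdio); the `FdTable` and `OpenFile` types are hypothetical and do not mirror `kernel/src/fd.rs`.

```rust
use std::collections::BTreeMap;

pub type TaskId = u64;

/// First descriptor available to user files; 0-2 are the stdio INodes.
pub const FIRST_USER_FD: u32 = 3;

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct OpenFile {
    pub path: String,
    pub offset: u64,
}

#[derive(Default)]
pub struct FdTable {
    entries: BTreeMap<(TaskId, u32), OpenFile>,
}

impl FdTable {
    /// Allocate the lowest free descriptor for `task` (assumed policy).
    pub fn open(&mut self, task: TaskId, path: &str) -> u32 {
        let mut fd = FIRST_USER_FD;
        while self.entries.contains_key(&(task, fd)) {
            fd += 1;
        }
        self.entries.insert((task, fd), OpenFile { path: path.to_string(), offset: 0 });
        fd
    }

    /// Set the file offset; returns false if the descriptor is not open.
    pub fn seek(&mut self, task: TaskId, fd: u32, offset: u64) -> bool {
        match self.entries.get_mut(&(task, fd)) {
            Some(file) => {
                file.offset = offset;
                true
            }
            None => false,
        }
    }

    /// Close a descriptor; returns false if it was not open.
    pub fn close(&mut self, task: TaskId, fd: u32) -> bool {
        self.entries.remove(&(task, fd)).is_some()
    }
}
```
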
- BPB (BIOS Parameter Block) parsing from sector 0 (bytes_per_sector, sectors_per_cluster, reserved_sectors, fat_count, sectors_per_fat, root_cluster)
- FAT cluster chain traversal (`fat_read_entry`) and allocation (`fat_alloc_cluster` with `fat_write_entry`)
- Directory entry parsing with 8.3 SFN and Long File Name (LFN) support
- Directory entry creation (LFN + 8.3 pair with correct checksum and sequence numbers)
- File read path: cluster chain → sector read → byte-offset copy
- File write path: cluster chain extension, data write, directory entry size update
- VFS integration: `FatDirINode` and `FatFileINode` implementing `vfs::INode` trait
- `mount()` function parsing BPB from VirtIO block device sector 0
- Linux interop: `disk.img` mountable on Linux via `mount -o loop` to inspect files created by Serix
- Duplicate filename rejection in `insert()` (returns error if name already exists)
- `mkdir()`: allocate cluster, write `.`/`..` entries, insert parent directory entry; `SYS_MKDIR` (83)
- `unlink()`: mark SFN entry deleted (`0xE5`), wipe associated LFN entries; `SYS_UNLINK` (87)
- Directory entry timestamps encoded from LAPIC tick counter (creation + modified fields)
- DMA frame reuse: single pre-allocated DMA buffer per `VirtioBlock`; no per-request allocation
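
Two calculations sit underneath the FAT32 read path: mapping a cluster number to its first sector, and following the FAT to enumerate a file's clusters. The sketch below uses the standard FAT32 rules (data clusters start at 2, entries are 28 bits, values at or above `0x0FFF_FFF8` mark end of chain); the struct and function names are hypothetical and the FAT is an in-memory slice rather than disk sectors.

```rust
/// FAT32 entries use only the low 28 bits.
pub const FAT32_ENTRY_MASK: u32 = 0x0FFF_FFFF;
/// Values at or above this mark end-of-chain.
pub const FAT32_EOC_MIN: u32 = 0x0FFF_FFF8;

/// Subset of BPB fields needed for address translation.
pub struct Bpb {
    pub reserved_sectors: u32,
    pub fat_count: u32,
    pub sectors_per_fat: u32,
    pub sectors_per_cluster: u32,
}

impl Bpb {
    /// First sector of a data cluster; clusters 0 and 1 are reserved.
    pub fn cluster_to_sector(&self, cluster: u32) -> Option<u32> {
        let data_start = self.reserved_sectors + self.fat_count * self.sectors_per_fat;
        let index = cluster.checked_sub(2)?;
        Some(data_start + index * self.sectors_per_cluster)
    }
}

/// Follow the chain starting at `start` through an in-memory FAT.
/// Stops at end-of-chain, at a free or out-of-range entry, or after
/// `fat.len()` steps so a corrupted, cyclic chain cannot loop forever.
pub fn cluster_chain(fat: &[u32], start: u32) -> Vec<u32> {
    let mut chain = Vec::new();
    let mut cur = start;
    while cur >= 2 && (cur as usize) < fat.len() && chain.len() < fat.len() {
        chain.push(cur);
        let next = fat[cur as usize] & FAT32_ENTRY_MASK;
        if next >= FAT32_EOC_MIN {
            break;
        }
        cur = next;
    }
    chain
}
```
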
- Superblock parsing at device offset `0x400` (magic `0xEF53`, block size, inode count, feature flags)
- Block group descriptor table traversal
- Inode table lookup and inode struct parsing (mode, size, extent tree root)
- Extent tree traversal for file block mapping (`ext4_extent_header` → `ext4_extent` leaf nodes)
- Directory entry parsing (linear and HTree/dx_root indexed)
- File read path: inode → extent lookup → VirtIO-blk sector read → page cache insertion
- File write path: block allocation from bitmap, extent tree insertion, data writeback
- `mkdir()`/`rmdir()`/`unlink()`: directory entry manipulation with inode refcount management
- Superblock generation and formatting (mkfs equivalent) for blank VirtIO-blk devices
- Journal (JBD2): transaction commit for metadata consistency (initially ordered-mode)
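
The first step of the deferred Ext4 work, superblock detection, is small enough to sketch. The field offsets used here follow the on-disk Ext4 layout (`s_inodes_count` at 0x00, `s_log_block_size` at 0x18, `s_magic` at 0x38, all little-endian); the `SuperblockInfo` type and `parse_superblock` function are hypothetical and cover only the fields named in the list above.

```rust
pub const SUPERBLOCK_OFFSET: usize = 0x400;
pub const EXT4_MAGIC: u16 = 0xEF53;

#[derive(Debug, PartialEq, Eq)]
pub struct SuperblockInfo {
    pub inodes_count: u32,
    pub block_size: u32,
}

/// Parse the minimal superblock fields from the start of a device image.
/// Returns `None` if the image is too short, the magic does not match,
/// or the block-size exponent is implausible.
pub fn parse_superblock(device: &[u8]) -> Option<SuperblockInfo> {
    let sb = device.get(SUPERBLOCK_OFFSET..SUPERBLOCK_OFFSET + 1024)?;
    let le16 = |off: usize| u16::from_le_bytes([sb[off], sb[off + 1]]);
    let le32 = |off: usize| {
        u32::from_le_bytes([sb[off], sb[off + 1], sb[off + 2], sb[off + 3]])
    };

    if le16(0x38) != EXT4_MAGIC {
        return None;
    }
    // Block size is 1024 << s_log_block_size; cap at 64 KiB as a sanity check.
    let log_block_size = le32(0x18);
    if log_block_size > 6 {
        return None;
    }
    Some(SuperblockInfo {
        inodes_count: le32(0x00),
        block_size: 1024u32 << log_block_size,
    })
}
```
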
- Concurrent radix tree indexed by `(InodeId, page_offset)`: lockless read path via RCU-like epoch reclamation
- Demand paging integration: `#PF` handler dispatches synchronous IPC to VFS for file-backed VMAs
- Writeback: dirty page tracking via PTE accessed/dirty bits, periodic flush to Ext4 daemon
- `mmap()` file-backed mapping support (`MAP_SHARED`, `MAP_PRIVATE` with CoW)
Phase 5, Linux ABI Translation Layer (LES). Status: Planned
- File I/O: `open`, `read`, `write`, `close`, `lseek`, `pread64`, `pwrite64`, `readv`, `writev`
- Memory: `mmap`, `munmap`, `mprotect`, `brk`, `mremap`
- Process: `clone`, `execve`, `wait4`, `exit_group`, `getpid`, `getppid`, `gettid`
- Filesystem: `stat`, `fstat`, `lstat`, `access`, `getcwd`, `chdir`, `rename`, `link`, `symlink`, `readlink`
- Directory: `getdents64`, `mkdir`, `rmdir`
- Signals: `rt_sigaction`, `rt_sigprocmask`, `rt_sigreturn`, `kill`, `tgkill`
- I/O multiplexing: `epoll_create1`, `epoll_ctl`, `epoll_wait`, `poll`
- Misc: `ioctl` (terminal `TIOCGWINSZ`/`TCGETS`), `fcntl`, `dup`, `dup2`, `pipe2`
- `#[repr(C)]` re-declarations: `struct stat`, `struct iovec`, `struct sigaction`, `struct rusage`, `struct timespec`
- `unsafe` zero-copy pointer reinterpretation for aligned structs; field-by-field fallback for variable-length types
- `CapabilityHandle` injection into every translated request before internal dispatch
- `PT_INTERP` parsing: load runtime linker ELF from VFS
- Auxiliary vector (`auxv`) construction: `AT_PHDR`, `AT_PHENT`, `AT_PHNUM`, `AT_ENTRY`, `AT_BASE`, `AT_PAGESZ`, `AT_RANDOM`, `AT_SECURE`
- User stack layout: `argc` → `argv[]` → `NULL` → `envp[]` → `NULL` → `auxv[]`
- `VDSO` page mapping for `clock_gettime()`/`gettimeofday()` fast-path (avoids `SYSCALL` overhead)
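
The initial user stack layout listed above can be expressed as a flat sequence of 64-bit words, lowest address first. The sketch below builds that sequence on the host; the `AT_*` values are the standard Linux x86_64 constants, while `build_stack_words` is a hypothetical helper that assumes the argument and environment strings have already been copied into user memory and only their addresses are passed in. Stack-pointer alignment is not handled here.

```rust
// Standard Linux auxiliary vector keys.
pub const AT_NULL: u64 = 0;
pub const AT_PHDR: u64 = 3;
pub const AT_PHENT: u64 = 4;
pub const AT_PHNUM: u64 = 5;
pub const AT_PAGESZ: u64 = 6;
pub const AT_BASE: u64 = 7;
pub const AT_ENTRY: u64 = 9;
pub const AT_SECURE: u64 = 23;
pub const AT_RANDOM: u64 = 25;

/// Build the words of the initial stack image in ascending address order:
/// argc, argv pointers, NULL, envp pointers, NULL, auxv pairs, AT_NULL pair.
pub fn build_stack_words(argv: &[u64], envp: &[u64], auxv: &[(u64, u64)]) -> Vec<u64> {
    let mut words = Vec::new();
    words.push(argv.len() as u64); // argc
    words.extend_from_slice(argv);
    words.push(0); // argv terminator
    words.extend_from_slice(envp);
    words.push(0); // envp terminator
    for &(key, value) in auxv {
        words.push(key);
        words.push(value);
    }
    // The auxiliary vector is terminated by an AT_NULL entry.
    words.push(AT_NULL);
    words.push(0);
    words
}
```
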
- `CLONE_VM` → share PML4 (thread); absence → CoW-fork PML4
- `CLONE_FS` → share `cwd`/`umask`; `CLONE_FILES` → share fd table
- `CLONE_SIGHAND` → share signal handler table
- `CLONE_THREAD`/`CLONE_PARENT` → thread group semantics
- TLS setup: `set_tid_address()`, `arch_prctl(ARCH_SET_FS)` for `FS_BASE` MSR
Phase 6, Security Bridge & Capability Enforcement. Status: Planned
- Gate every syscall/IPC entry with `CapabilityStore::validate()`: reject unauthorized access with `EPERM`
- Per-task capability table (inherited on `clone()`, cleared on `execve()` unless marked inheritable)
- Capability delegation: tasks can `grant()` subsets of their capabilities to child tasks
- Revocation cascading: revoking a capability invalidates all delegated descendants
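
Revocation cascading implies the store tracks which handle each delegated capability was derived from. One possible shape, shown here purely as an illustration, is a parent map walked on revoke; `DelegationTree` and its methods are hypothetical, and the linear scan for children is chosen for brevity rather than performance.

```rust
use std::collections::BTreeMap;

pub type Handle = u128;

/// Maps each live handle to the handle it was delegated from
/// (`None` for a root capability).
#[derive(Default)]
pub struct DelegationTree {
    parent: BTreeMap<Handle, Option<Handle>>,
}

impl DelegationTree {
    /// Create a root capability.
    pub fn mint(&mut self, h: Handle) {
        self.parent.insert(h, None);
    }

    /// Derive `child` from `from`; fails if `from` is not live
    /// or `child` already exists.
    pub fn delegate(&mut self, from: Handle, child: Handle) -> bool {
        if !self.parent.contains_key(&from) || self.parent.contains_key(&child) {
            return false;
        }
        self.parent.insert(child, Some(from));
        true
    }

    pub fn is_valid(&self, h: Handle) -> bool {
        self.parent.contains_key(&h)
    }

    /// Revoke `h` and every capability transitively delegated from it.
    /// Returns the number of handles removed.
    pub fn revoke(&mut self, h: Handle) -> usize {
        if self.parent.remove(&h).is_none() {
            return 0;
        }
        let mut removed = 1;
        let mut frontier = vec![h];
        while let Some(cur) = frontier.pop() {
            let children: Vec<Handle> = self
                .parent
                .iter()
                .filter(|(_, p)| **p == Some(cur))
                .map(|(c, _)| *c)
                .collect();
            for c in children {
                self.parent.remove(&c);
                frontier.push(c);
                removed += 1;
            }
        }
        removed
    }
}
```
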
- DAC interception: hook `open()`, `access()`, `chmod()`, `chown()` in the LES layer
- Policy database: `(UID, GID, path_prefix, mode_mask)` → `(CapabilityType, permission_set)`
- Dynamic capability minting: time-bounded `CapabilityHandle` with fine-grained permissions (read, write, execute, append, seek)
- `/etc/serix/cap-policy.toml` configuration with hot-reload via `SIGHUP`
- Audit log: capability grants/denials logged to ring buffer exposed via `/proc/serix/cap-audit`
Phase 7, Hardware Enablement (SMP, IOMMU, ACPI, NVMe, XHCI). Status: Planned
- INIT-SIPI-SIPI sequence via LAPIC ICR for AP wake-up
- ACPI MADT parsing for LAPIC ID enumeration and I/O APIC base discovery
- `x2APIC` mode enable (MSR-based, no MMIO) when CPUID indicates support
- Per-CPU data structures (`PerCpuData`) accessed via `GS_BASE` MSR
- Inter-Processor Interrupt (IPI) primitives: TLB shootdown, scheduler kick, panic broadcast
- ACPI DMAR table parsing (DMA Remapping Hardware Unit Definition structures)
- IOMMU page table construction (4-level, analogous to CPU paging)
- Per-device DMA domain isolation — restrict each PCIe function's DMA to allocated frame ranges
- Interrupt remapping via IOMMU Interrupt Remapping Table (IRT) to prevent MSI injection attacks
- Fault logging: IOMMU fault events surfaced to Server Manager via IPC
- ACPI FADT parsing: `PM1a_CNT_BLK` for S5 (shutdown), `RESET_REG` for reboot
- C-States: `MWAIT` instruction with target C-state hint (CPUID leaf `0x05` for supported sub-states); the idle loop moves from `HLT` to `MWAIT`
- P-States (Intel HWP):
  - Enable HWP via `IA32_PM_ENABLE` (MSR `0x770`)
  - Configure `IA32_HWP_REQUEST` (MSR `0x774`): set `Minimum_Performance`, `Maximum_Performance`, `Desired_Performance`, `Energy_Performance_Preference`
  - Read `IA32_HWP_CAPABILITIES` (MSR `0x771`) for hardware performance bounds
- Thermal monitoring: `IA32_THERM_STATUS` MSR polling; throttle scheduler on thermal trip
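
The HWP registers named above pack their fields into consecutive bytes, which makes the encoding easy to check on a host before any `wrmsr` is issued. The sketch below assumes the Intel SDM layout for the low 32 bits of `IA32_HWP_REQUEST` (minimum, maximum, desired, energy/performance preference) and `IA32_HWP_CAPABILITIES` (highest, guaranteed, most efficient, lowest); the struct and function names are hypothetical, and the activity-window and package-control bits are left at zero.

```rust
pub const IA32_PM_ENABLE: u32 = 0x770;
pub const IA32_HWP_CAPABILITIES: u32 = 0x771;
pub const IA32_HWP_REQUEST: u32 = 0x774;

/// Fields of the low 32 bits of IA32_HWP_REQUEST.
pub struct HwpRequest {
    pub min: u8,
    pub max: u8,
    pub desired: u8,
    pub epp: u8,
}

impl HwpRequest {
    /// Pack the fields into the value that would be written to the MSR.
    pub fn encode(&self) -> u64 {
        (self.min as u64)
            | ((self.max as u64) << 8)
            | ((self.desired as u64) << 16)
            | ((self.epp as u64) << 24)
    }
}

/// Split a raw IA32_HWP_CAPABILITIES value into
/// (highest, guaranteed, most_efficient, lowest) performance levels.
pub fn decode_capabilities(raw: u64) -> (u8, u8, u8, u8) {
    (raw as u8, (raw >> 8) as u8, (raw >> 16) as u8, (raw >> 24) as u8)
}
```
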
- PCIe BAR0 MMIO mapping for NVMe controller registers (`CAP`, `VS`, `CC`, `CSTS`, `AQA`, `ASQ`, `ACQ`)
- Admin Queue pair setup (Submission Queue + Completion Queue in DMA-safe memory)
- `Identify Controller` and `Identify Namespace` command submission
- I/O Queue pair creation (one per CPU core for parallelism)
- `Read`/`Write` command submission with PRP (Physical Region Page) list scatter-gather
- Interrupt-driven completion via MSI-X vectors delivered directly to the target LAPIC (MSI-X bypasses the I/O APIC)
- PCIe BAR0 MMIO mapping for XHCI capability/operational/runtime registers
- Device Context Base Address Array (DCBAA) and Scratchpad Buffer allocation
- Command Ring, Event Ring, and Transfer Ring setup
- Port status change event handling (device attach/detach)
- HID class driver: USB keyboard/mouse report descriptor parsing and input event generation
Phase 8, Userspace & MVP Deliverables (shell, /proc, demo). Status: Planned
This phase targets a Minimum Viable Product (MVP) demonstrating the full kernel stack end-to-end.
Shell (rsh — github.com/gitcomit8/rsh)
Existing Rust shell to be ported from std to #![no_std] + ulib for Serix userspace.
Port from std to no_std:
- Replace `std::io` stdin/stdout with `ulib::read(STDIN)`/`ulib::write(STDOUT)` syscall wrappers
- Replace `HashMap` → `BTreeMap` (from `alloc`), keep `Vec`/`String` via `alloc`
- Replace `std::process::exit()` → `ulib::exit()`
- Add userspace heap allocator (bump allocator or `linked_list_allocator` over `brk`/fixed region)
- Remove ANSI escape sequences that depend on terminal emulation (adapt for framebuffer console)
- Build as `#![no_std]` `#![no_main]` binary linked with `user.ld`
Filesystem builtins (require working VFS syscalls):
- `ls`: list directory entries via `SYS_GETDENTS` or VFS directory read
- `cat`: `serix_open()` + `read()` + `write(STDOUT)` + `serix_close()`
- `mkdir`: create directory via `SYS_MKDIR`
- `rm`: unlink file via `SYS_UNLINK`
- `touch`: create empty file via `serix_open()` + `serix_close()`
- `pwd`/`cd`: working directory tracking (requires `SYS_CHDIR`/`SYS_GETCWD` or client-side state)
Process builtins (require clone/execve/waitpid syscalls):
- `ps`: list tasks (read `/proc/[pid]/stat` or dedicated `SYS_TASKINFO`)
- `shutdown`/`reboot`: trigger ACPI S5/reset via `SYS_REBOOT`
- External command execution via `fork()` + `execve()` with `PATH` resolution
Existing features to preserve:
- REPL loop with prompt
- Tokenizer with quoted strings, escape sequences, variable substitution
- Shell variables (`set`/`get`)
- Control flow (`if`, `repeat`)
- Command history
- Line editor: cursor movement, backspace, arrow-key history recall
- `/proc/meminfo`: frame allocator statistics (total frames, free frames, used frames, page cache occupancy)
- `/proc/stat`: per-CPU idle time accumulators (ticks spent in `MWAIT`/`HLT` idle loop vs. task execution)
- `/proc/cpuinfo`: CPUID-derived model name, frequency, core type (P-core/E-core), cache sizes
- `/proc/[pid]/stat`: per-task state, CPU time (user + system ticks), scheduling class, priority
- `/proc/[pid]/maps`: per-task VMA listing (start, end, permissions, backing INode)
- `/proc/uptime`: system uptime derived from LAPIC timer tick count
- Spawns N worker threads via `clone(CLONE_VM | CLONE_FS | CLONE_FILES)` to validate thread semantics
- Each worker performs a configurable compute-bound workload (e.g., matrix multiply, memory streaming)
- Reads architectural PMU counters (Performance Monitoring Unit) via `rdpmc` instruction:
  - `IA32_FIXED_CTR1` (unhalted core cycles) and `IA32_FIXED_CTR2` (unhalted reference cycles) for frequency estimation
  - `IA32_PERFEVTSEL0` programmed for LLC cache miss events (`event=0x2E`, `umask=0x41`)
- Displays cache warmth tracking: per-thread L1d/L2/LLC hit rates before and after core migration
- Measures and displays context switch latency: two threads ping-pong via IPC; timestamp delta via `RDTSC` with invariant TSC calibration
- Acceptance criterion: ≤ 500 ns context switch latency on P-cores (measured as 99th percentile over 10,000 iterations)
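
The acceptance criterion combines two calculations: converting TSC deltas to nanoseconds and taking the 99th percentile of the samples. The sketch below uses the nearest-rank percentile definition, which is an assumption since the document does not say which definition applies; all function names are hypothetical, and `tsc_hz` must be non-zero.

```rust
/// Nearest-rank percentile: the value at rank ceil(n * pct / 100)
/// in the sorted samples. Sorts `samples` in place.
pub fn percentile_nearest_rank(samples: &mut [u64], pct: u64) -> Option<u64> {
    if samples.is_empty() || pct == 0 || pct > 100 {
        return None;
    }
    samples.sort_unstable();
    let rank = (samples.len() as u64 * pct + 99) / 100; // ceil(n * pct / 100)
    Some(samples[rank as usize - 1])
}

/// Convert a TSC cycle delta to nanoseconds given the calibrated
/// TSC frequency in Hz. Uses 128-bit math to avoid overflow.
pub fn cycles_to_ns(cycles: u64, tsc_hz: u64) -> u64 {
    ((cycles as u128) * 1_000_000_000 / (tsc_hz as u128)) as u64
}

/// True if the 99th-percentile latency is at most 500 ns.
pub fn meets_criterion(samples_ns: &mut [u64]) -> bool {
    matches!(percentile_nearest_rank(samples_ns, 99), Some(p) if p <= 500)
}
```

Under this definition, a run of 10,000 iterations passes only if at most 100 samples exceed 500 ns.
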
Phase 9, Networking (VirtIO-net, zero-copy, TCP/IP). Status: Planned
- Virtqueue setup for TX and RX (separate queue pairs)
- MAC address read from VirtIO device configuration space
- RX: post buffer descriptors to available ring; process incoming frames from used ring
- TX: construct frame in descriptor buffer; submit to available ring; poll/interrupt for completion
- DMA buffer registration in IOMMU before driver process start
- `CapabilityHandle` grant from network driver to application for shared TX/RX buffer region
- Application-side `mmap()` of shared buffer into process VMA
mmap()of shared buffer into process VMA - Scatter-gather descriptor management: application fills TX descriptors in-place; driver submits to virtqueue
- RX zero-copy: DMA deposits frame directly into application-mapped buffer; notification via event channel
- Integration of `smoltcp` (or equivalent `#![no_std]` Rust TCP/IP library) as a userspace linkable crate
- Socket API shim: `socket()`, `bind()`, `listen()`, `accept()`, `connect()`, `send()`, `recv()`
- ARP table management, DHCP client for dynamic IP configuration
- Loopback interface for local IPC testing
Phase 10, Optimization & Tooling (perf, debug, CI/CD). Status: Planned
- Cache warmth heuristic: track `last_run_cpu` per `TaskCB`; apply migration penalty in WFQ virtual-runtime calculation
- NUMA-aware frame allocation: prefer frames from the NUMA node local to the scheduling CPU
- `XSAVE`/`XRSTOR` lazy FPU context switching: defer FPU state save until another task on the same CPU uses FPU
- GDB stub (`serix-dbg`): RSP (Remote Serial Protocol) over serial; register read/write, memory read, breakpoints
- Kernel panic handler: unwind stack via `.eh_frame`, resolve addresses to symbols via embedded symbol table
- `kdump`: on panic, snapshot kernel state to reserved memory region; Server Manager writes dump to Ext4 on next boot
- `/proc/serix/trace`: lightweight ring buffer tracing (syscall entry/exit, context switches, IPC sends)
- GitHub Actions workflow: `cargo build --release`, `cargo clippy`, `cargo fmt --check`
- Automated QEMU boot test: `make run` with timeout, grep serial output for `[CHECKPOINT]` markers
- `Miri` undefined behavior checks on `unsafe`-heavy crates (`memory/`, `task/`, `kernel/`)
- `Kani` bounded model checking for critical invariants (capability store, IPC port queue bounds)
| Phase | Description | Status |
|---|---|---|
| 1 | Core Foundation (boot, memory, HAL) | ✅ Complete |
| 2 | System Infrastructure (tasks, capabilities, syscalls) | ✅ Complete |
| 3 | Preemptive Scheduling & IPC Hardening | 🔄 Core complete; SMP/WFQ deferred |
| 4 | Storage & Filesystem Stack (VFS, FAT32) | ✅ Complete; Ext4/page cache deferred to Phase 7 |
| 5 | Linux ABI Translation Layer (LES) | 📋 Planned |
| 6 | Security Bridge & Capability Enforcement | 📋 Planned |
| 7 | Hardware Enablement (SMP, IOMMU, ACPI, NVMe, XHCI) | 📋 Planned |
| 8 | Userspace & MVP Deliverables (shell, /proc, demo) | 📋 Planned |
| 9 | Networking (VirtIO-net, zero-copy, TCP/IP) | 📋 Planned |
| 10 | Optimization & Tooling (perf, debug, CI/CD) | 📋 Planned |
See CONTRIBUTING.md for development guidelines. High-priority items for contributors:
- Shell (`rsh`) no_std port: port rsh to `#![no_std]` + `ulib` (Phase 8)
- Ext4 read path: superblock + extent tree parsing (Phase 4)
- Capability enforcement: gate syscalls/IPC on `CapabilityHandle` (Phase 6)
- SMP bring-up: INIT-SIPI-SIPI AP bootstrap + per-CPU run queues (Phase 7)
File issues or open draft PRs on GitHub to claim a task.