Skip to content

Commit 0000f1a

Browse files
committed
Harden signal safety and fix SIGSYS leak to guest
The supervisor's SIGSYS handler runs with SIGSYS blocked (SA_SIGINFO default). emulate_trap_rt_sigprocmask was writing the raw uc_sigmask, including that SIGSYS bit, back to the guest's old_ptr, leaking the reserved signal into the guest's visible signal state. Same leak existed in emulate_trap_rt_sigpending. This fixes both by stripping the reserved SIGSYS bit before writing to guest memory. Extract kbox_syscall_trap_sigset_strip_reserved() as the inverse of the existing kbox_syscall_trap_sigset_blocks_reserved(). Add signal safety contract documentation to procmem.c, syscall-trap.c, and seccomp-dispatch.c enumerating every signal-visible global, its protection invariant, and what breaks if multi-threaded guests are introduced. Change-Id: I67ff940a7b079990fca088ecc46a78c01162a5c1
1 parent 9a318a4 commit 0000f1a

10 files changed

Lines changed: 515 additions & 5 deletions

File tree

scripts/pre-commit.hook

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -184,6 +184,8 @@ build_cppcheck_suppressions() {
184184
"syntaxError:src/loader-transfer.c"
185185
"invalidFunctionArg:src/rewrite.c"
186186
"invalidFunctionArg:tests/unit/test-procmem.c"
187+
"invalidFunctionArg:tests/guest/signal-safety-test.c"
188+
"nullPointerOutOfMemory:tests/guest/signal-safety-test.c"
187189
"nullPointerArithmeticOutOfMemory:tests/unit/test-procmem.c"
188190
"nullPointerOutOfMemory:tests/unit/test-procmem.c"
189191
"knownConditionTrueFalse"

scripts/run-tests.sh

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -330,7 +330,7 @@ expect_success "umask-test" \
330330
echo ""
331331
echo "--- Guest test programs ---"
332332

333-
for test_prog in dup-test clock-test signal-test path-escape-test errno-test; do
333+
for test_prog in dup-test clock-test signal-test signal-safety-test path-escape-test errno-test; do
334334
if guest_has_test "$test_prog"; then
335335
expect_success "$test_prog" \
336336
"$KBOX" -S "$ROOTFS" -- "/opt/tests/${test_prog}"

src/image.c

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -833,6 +833,7 @@ int kbox_run_image(const struct kbox_image_args *args)
833833
command = args->command ? args->command : "/bin/sh";
834834
probe_mode = args->syscall_mode;
835835
rewrite_requested = args->syscall_mode == KBOX_SYSCALL_MODE_REWRITE;
836+
setenv("KBOX_SYSCALL_MODE", kbox_syscall_mode_name(args->syscall_mode), 1);
836837

837838
/* AUTO enables rewrite analysis for non-shell commands so the
838839
* auto_prefers_userspace_fast_path() selection function can see the

src/procmem.c

Lines changed: 60 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -3,6 +3,48 @@
33
/* Process memory access for seccomp-unotify.
44
*
55
* Wraps process_vm_readv/writev to read/write tracee memory without ptrace.
6+
*
7+
* Signal safety contract
8+
* ----------------------
9+
* Signal-visible globals in this file:
10+
*
11+
* fault_armed (__thread volatile sig_atomic_t)
12+
* Read by fault_handler (SIGSEGV/SIGBUS). Written only by
13+
* safe_memcpy / kbox_current_read_string on the same thread,
14+
* outside the handler. sig_atomic_t + volatile guarantees
15+
* atomic visibility. Thread-local: no cross-thread concern.
16+
*
17+
* fault_jmp (__thread sigjmp_buf)
18+
* Target of siglongjmp in fault_handler. siglongjmp is
19+
* async-signal-safe per POSIX. Thread-local.
20+
*
21+
* fault_handler_gen (volatile unsigned, __atomic ops)
22+
* Bumped by kbox_procmem_signal_changed() (dispatch thread)
23+
* when the guest calls rt_sigaction(SIGSEGV/SIGBUS). Read
24+
* by safe_memcpy via __atomic_load_n. Never accessed from
25+
* a signal handler.
26+
*
27+
* saved_guest_segv, saved_guest_bus (static struct sigaction)
28+
* Read by fault_handler to forward non-kbox faults to the
29+
* guest's handler. Written by install_fault_handler() on
30+
* the dispatch thread. Safe because the single-threaded
31+
* guest constraint (CLONE_THREAD returns ENOSYS) serializes
32+
* rt_sigaction dispatch and safe_memcpy through the same
33+
* thread. If multi-threaded guest support is added, these
34+
* must be protected by a lock or made thread-local.
35+
*
36+
* Functions called from signal handler context:
37+
* fault_handler -- reads fault_armed, calls siglongjmp or
38+
* forwards to guest handler.
39+
* restore_default_and_reraise -- sigaction + raise, both POSIX
40+
* async-signal-safe.
41+
*
42+
* The fault_armed window in safe_memcpy / kbox_current_read_string:
43+
* sigsetjmp -> fault_armed=1 -> memory access -> fault_armed=0
44+
* If a SIGSEGV/SIGBUS arrives while fault_armed=1, siglongjmp
45+
* returns to the sigsetjmp site and returns -EFAULT. If a signal
46+
* arrives while fault_armed=0, it is forwarded to the guest's
47+
* saved handler (or SIG_DFL -> core dump).
648
*/
749

850
#include <errno.h>
@@ -57,6 +99,24 @@ static __thread unsigned
5799

58100
/* Saved guest handlers: when kbox installs its fault handler, the guest's
59101
* prior handlers are preserved here and forwarded to when fault_armed is 0.
102+
*
103+
* Race safety (Go runtime SIGSEGV for stack growth):
104+
*
105+
* When the guest calls rt_sigaction(SIGSEGV, new_handler), kbox's dispatch:
106+
* 1. Bumps fault_handler_gen (kbox_procmem_signal_changed).
107+
* 2. Returns CONTINUE -- host kernel installs guest's new_handler.
108+
*
109+
* At this point kbox's fault_handler is no longer installed. However,
110+
* the next safe_memcpy detects the generation mismatch BEFORE arming
111+
* fault_armed, and re-installs kbox's handler (saving the guest's
112+
* new_handler here). Between steps 2 and the reinstall, no safe_memcpy
113+
* is in progress (fault_armed == 0), so any SIGSEGV in that window goes
114+
* directly to the guest's handler -- which is the correct behavior.
115+
*
116+
* This ordering guarantee depends on the single-threaded guest
117+
* constraint (CLONE_THREAD returns ENOSYS). With multi-threaded
118+
* guests, a concurrent safe_memcpy on another thread could see a
119+
* stale saved_guest_segv. These would need to become per-thread.
60120
*/
61121
static struct sigaction saved_guest_segv;
62122
static struct sigaction saved_guest_bus;

src/seccomp-dispatch.c

Lines changed: 43 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -7,6 +7,22 @@
77
* This is the beating heart of kbox: every file open, read, write, stat, and
88
* directory operation the tracee makes gets routed through here.
99
*
10+
* Single-threaded dispatch contract
11+
* ----------------------------------
12+
* All notification processing runs on a single thread (supervisor loop in
13+
* seccomp mode, SIGSYS handler in trap/rewrite mode). The following state is
14+
* protected only by this single-threaded invariant:
15+
*
16+
* dispatch_scratch[] Static I/O buffer (this file).
17+
* kbox_fd_table entries FD table (fd-table.c).
18+
* Path scratch buffers Translation/literal caches (path.c).
19+
* shadow_sockets[] SLIRP socket array (net-slirp.c).
20+
* saved_guest_segv/bus Fault handler saved actions (procmem.c).
21+
* fault_armed Thread-local; single-threaded guest means
22+
* only one thread ever arms it.
23+
*
24+
* If parallel dispatch or multi-threaded guest support is introduced,
25+
* every item above needs locking or per-thread allocation.
1026
*/
1127

1228
#include <errno.h>
@@ -126,6 +142,9 @@ static struct kbox_dispatch emulate_trap_rt_sigprocmask(
126142
unsigned char set_mask[sizeof(sigset_t)];
127143
size_t mask_len;
128144

145+
memset(current, 0, sizeof(current));
146+
memset(next, 0, sizeof(next));
147+
129148
if (sigset_size == 0 || sigset_size > sizeof(current))
130149
return kbox_dispatch_errno(EINVAL);
131150
mask_len = sigset_size;
@@ -153,8 +172,15 @@ static struct kbox_dispatch emulate_trap_rt_sigprocmask(
153172
}
154173

155174
if (old_ptr != 0) {
175+
/* Strip SIGSYS from the reported mask -- the guest must not
176+
* observe kbox's reserved signal in its signal state.
177+
*/
178+
unsigned char visible[sizeof(sigset_t)];
179+
180+
memcpy(visible, current, sizeof(visible));
181+
kbox_syscall_trap_sigset_strip_reserved(visible, sizeof(visible));
156182
int rc = guest_mem_write(ctx, kbox_syscall_request_pid(req), old_ptr,
157-
current, mask_len);
183+
visible, mask_len);
158184
if (rc < 0)
159185
return kbox_dispatch_errno(-rc);
160186
}
@@ -213,15 +239,18 @@ static struct kbox_dispatch emulate_trap_rt_sigpending(
213239
unsigned char pending[sizeof(sigset_t)];
214240
int rc;
215241

216-
(void) ctx;
217-
218242
if (set_ptr == 0)
219243
return kbox_dispatch_errno(EFAULT);
220244
if (sigset_size == 0 || sigset_size > sizeof(pending))
221245
return kbox_dispatch_errno(EINVAL);
222246
if (kbox_syscall_trap_get_pending(pending, sizeof(pending)) < 0)
223247
return kbox_dispatch_errno(EIO);
224248

249+
/* Strip SIGSYS from pending set -- the guest must not observe
250+
* kbox's reserved signal as pending.
251+
*/
252+
kbox_syscall_trap_sigset_strip_reserved(pending, sizeof(pending));
253+
225254
rc = guest_mem_write(ctx, kbox_syscall_request_pid(req), set_ptr, pending,
226255
sigset_size);
227256
if (rc < 0)
@@ -4405,7 +4434,17 @@ struct kbox_dispatch kbox_dispatch_request(
44054434
return kbox_dispatch_value(0);
44064435
}
44074436

4408-
/* Signals. */
4437+
/* Signals.
4438+
*
4439+
* rt_sigaction: only SIGSYS is denied (reserved for trap mode).
4440+
* All other signals -- including SIGURG (Go async preemption),
4441+
* SIGUSR1/2, SIGALRM, etc. -- pass through to the host kernel
4442+
* via CONTINUE. SIGSEGV/SIGBUS changes bump the fault handler
4443+
* generation counter so procmem.c reinstalls its handler.
4444+
*
4445+
* rt_sigprocmask: in trap/rewrite mode, emulated to keep the
4446+
* supervisor's SIGSYS unblocked. In seccomp mode, CONTINUE.
4447+
*/
44094448

44104449
if (nr == h->rt_sigaction) {
44114450
if (request_uses_trap_signals(req) &&

src/syscall-trap-signal.h

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -13,6 +13,7 @@ struct kbox_syscall_trap_ip_range {
1313
int kbox_syscall_trap_reserved_signal(void);
1414
int kbox_syscall_trap_signal_is_reserved(int signum);
1515
int kbox_syscall_trap_sigset_blocks_reserved(const void *mask, size_t len);
16+
void kbox_syscall_trap_sigset_strip_reserved(void *mask, size_t len);
1617
uintptr_t kbox_syscall_trap_host_syscall_ip(void);
1718
int kbox_syscall_trap_host_syscall_range(
1819
struct kbox_syscall_trap_ip_range *out);

src/syscall-trap.c

Lines changed: 49 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,42 @@
11
/* SPDX-License-Identifier: MIT */
22

3+
/* Syscall trap runtime: SIGSYS handler installation and dispatch.
4+
*
5+
* Signal safety contract
6+
* ----------------------
7+
* Signal-visible globals:
8+
*
9+
* active_trap_runtime (static pointer, atomic load/store)
10+
* Read by trap_sigsys_handler via __atomic_load_n (ACQUIRE).
11+
* Written by install/uninstall via __atomic_store_n (RELEASE).
12+
* Plain pointer load is async-signal-safe.
13+
*
14+
* have_fsgsbase (static int)
15+
* Written once at startup by probe_fsgsbase(). Read in
16+
* read/write_host_fs_base helpers. One-shot init; never
17+
* modified after the first probe.
18+
*
19+
* The SIGSYS handler (trap_sigsys_handler) runs on the guest thread.
20+
* It must avoid:
21+
* - Heap allocation (guest may hold glibc malloc locks).
22+
* All dispatch buffers are static (dispatch_scratch in
23+
* seccomp-dispatch.c).
24+
* - Stack protector (guest FS base != kbox FS base on x86_64).
25+
* Handler is __attribute__((no_stack_protector)).
26+
* - ASAN instrumentation (ASAN runtime syscalls hit the BPF
27+
* filter from unregistered IPs). Handler is
28+
* __attribute__((no_sanitize("address"))).
29+
*
30+
* The handler restores kbox's FS base before calling into C dispatch
31+
* code, then restores the guest's FS base before returning. This
32+
* swap is safe because the guest is single-threaded (CLONE_THREAD
33+
* returns ENOSYS in trap/rewrite mode).
34+
*
35+
* If multi-threaded guest support is added, the following must be
36+
* revisited: active_trap_runtime (must become per-thread), FS base
37+
* swap (must be per-thread), and all static dispatch buffers.
38+
*/
39+
340
#include <errno.h>
441
#include <limits.h>
542
#include <pthread.h>
@@ -878,6 +915,18 @@ int kbox_syscall_trap_sigset_blocks_reserved(const void *mask, size_t len)
878915
return (bytes[byte_index] & (1U << bit_index)) != 0;
879916
}
880917

918+
void kbox_syscall_trap_sigset_strip_reserved(void *mask, size_t len)
919+
{
920+
unsigned char *bytes = mask;
921+
unsigned int signo = (unsigned int) kbox_syscall_trap_reserved_signal();
922+
unsigned int bit = signo - 1U;
923+
unsigned int byte_index = bit / 8U;
924+
925+
if (!mask || len <= byte_index)
926+
return;
927+
bytes[byte_index] &= (unsigned char) ~(1U << (bit % 8U));
928+
}
929+
881930
uintptr_t kbox_syscall_trap_host_syscall_ip(void)
882931
{
883932
#if defined(__x86_64__) || defined(__aarch64__) || \

0 commit comments

Comments
 (0)