This repository was archived by the owner on Mar 24, 2022. It is now read-only.

Commit d65b251

Ensure a requested amount of stack is available before hostcalls (#567)
Add a configurable guarantee of stack space available for hostcalls. This is enforced by generating trampoline functions that check the guarantee can be upheld before performing calls to host code. Additionally, provide documentation around hostcalls and Lucet's implementation thereof.
1 parent 3b47c38 commit d65b251

19 files changed

Lines changed: 615 additions & 72 deletions


benchmarks/lucet-benchmarks/src/seq.rs

Lines changed: 4 additions & 16 deletions
Original file line numberDiff line numberDiff line change
@@ -80,10 +80,7 @@ fn instantiate_with_dense_heap<R: RegionCreate + 'static>(c: &mut Criterion) {
8080
region.new_instance(module).unwrap()
8181
}
8282

83-
let limits = Limits {
84-
heap_memory_size: 1024 * 1024 * 1024,
85-
..Limits::default()
86-
};
83+
let limits = Limits::default().with_heap_memory_size(1024 * 1024 * 1024);
8784

8885
let region = R::create(1, &limits).unwrap();
8986

@@ -103,10 +100,7 @@ fn instantiate_with_sparse_heap<R: RegionCreate + 'static>(c: &mut Criterion) {
103100
region.new_instance(module).unwrap()
104101
}
105102

106-
let limits = Limits {
107-
heap_memory_size: 1024 * 1024 * 1024,
108-
..Limits::default()
109-
};
103+
let limits = Limits::default().with_heap_memory_size(1024 * 1024 * 1024);
110104

111105
let region = R::create(1, &limits).unwrap();
112106

@@ -153,10 +147,7 @@ fn hello_drop_instance<R: RegionCreate + 'static>(c: &mut Criterion) {
153147
fn drop_instance_with_dense_heap<R: RegionCreate + 'static>(c: &mut Criterion) {
154148
fn body(_inst: InstanceHandle) {}
155149

156-
let limits = Limits {
157-
heap_memory_size: 1024 * 1024 * 1024,
158-
..Limits::default()
159-
};
150+
let limits = Limits::default().with_heap_memory_size(1024 * 1024 * 1024);
160151

161152
let region = R::create(1, &limits).unwrap();
162153

@@ -178,10 +169,7 @@ fn drop_instance_with_dense_heap<R: RegionCreate + 'static>(c: &mut Criterion) {
178169
fn drop_instance_with_sparse_heap<R: RegionCreate + 'static>(c: &mut Criterion) {
179170
fn body(_inst: InstanceHandle) {}
180171

181-
let limits = Limits {
182-
heap_memory_size: 1024 * 1024 * 1024,
183-
..Limits::default()
184-
};
172+
let limits = Limits::default().with_heap_memory_size(1024 * 1024 * 1024);
185173

186174
let region = R::create(1, &limits).unwrap();
187175

docs/src/SUMMARY.md

Lines changed: 1 addition & 0 deletions
@@ -10,6 +10,7 @@
1010
- [Module integrity and authentication](./Integrity-and-authentication.md)
1111
- [Lucet components](./Lucet-components.md)
1212
- [`lucetc`](./lucetc.md)
13+
- [`Hostcalls`](./lucetc/hostcalls.md)
1314
- [`lucet-runtime`](./lucet-runtime.md)
1415
- [`KillSwitch`](./lucet-runtime/killswitch.md)
1516
- [`lucet-wasi`](./lucet-wasi.md)

docs/src/lucetc/hostcalls.md

Lines changed: 130 additions & 0 deletions
@@ -0,0 +1,130 @@
1+
# Hostcalls
2+
3+
Hostcalls are how Lucet guests interact with the world outside the WebAssembly
4+
VM. For example, all functions in [WASI](https://github.com/bytecodealliance/wasmtime/blob/main/docs/WASI-intro.md) are implemented in terms of
5+
hostcalls that can be exposed to a WebAssembly guest. This chapter discusses
6+
implementation details of hostcalls as Lucet implements them.
7+
8+
Lucet implements hostcalls as imports of symbols specified by the
9+
`bindings.json` provided to `lucetc`. This maps namespaced functions provided
10+
to the WebAssembly module to symbol names the module should import when loaded.
11+
Functionally, `lucet-runtime` currently relies on the dynamic linker to be able
12+
to locate and fix up references to these imported functions, and will fail to
13+
load a module if the dynamic linker can't resolve all imports.
14+
15+
Hostcalls have an important intersection with safety properties Lucet seeks to
16+
uphold: if a fault occurs in a WebAssembly guest, it should be isolated to that
17+
guest, and the host should, generally, be able to continue execution. However,
18+
a fault in a hostcall is a fault _outside_ the WebAssembly guest, back in
19+
whatever code the host running `lucet-runtime` has provided. A fault here is
20+
well outside any guarantees WebAssembly can offer, and the only sound option
21+
Lucet has is to raise that issue in the host, if it has the option, and hope
22+
the host knows what to do with it.
23+
24+
## Stack overflows
25+
26+
Generally, "kick the problem to the host and hope they know what to do" works.
27+
For a general memory fault in host code, that gets handled no differently.
28+
Language-specific features, like Rust panics, work too; if a `ud2` is present
29+
in the hostcall's body, the `SIGILL` still causes the same kind of panic - not
30+
a recoverable issue, but Lucet doesn't cause new problems here.
31+
32+
Unfortunately, memory faults like `SIGBUS` and `SIGSEGV` are typically fatal to
33+
host applications. Memory faults in unpredictable locations even more so. A
34+
naive hostcall implementation that calls imported functions directly raises a real
35+
risk here: if a WebAssembly guest consumes most, but not all, of the Lucet
36+
guest's stack, _then_ makes a hostcall, the hostcall may consume the rest of
37+
the guest's stack space and experience a stack overflow.
38+
39+
To mitigate the risk of an unknown WebAssembly guest being able to cause host
40+
faults essentially on-demand, Lucet guards hostcalls by a trampoline function
41+
that performs safety checks before actually making the call into host code.
42+
Currently, there is one check: is there enough guest stack space remaining to
43+
uphold some guaranteed amount available for hostcalls?
44+
45+
By guaranteeing some minimum available space, the problem of hostcall stack use
46+
becomes the same as not overflowing stacks generally; if a host expects to
47+
handle stack overflows in some manner, it probably still can, and if it allows
48+
the system to do what it will on stack overflows, a hostcall overflowing the
49+
guaranteed space will still observe a normal stack overflow. Hostcalls must
50+
conform to the same requirements code would have without Lucet involved, except
51+
that the Lucet hostcall stack reservation may be more or less than the system's
52+
configured thread stack size.
53+
54+
The good news is that while the hostcall stack reservation is a fixed size, it
55+
is customizable: the field
56+
[hostcall_reservation](https://docs.rs/lucet-runtime/0.7.0/lucet_runtime/struct.Limits.html#structfield.heap_memory_size)
57+
in `Limits` specifies the space Lucet will require to be available, with a
58+
default of 32KiB. Lucet requires that `hostcall_reservation` is between zero
59+
and the guest's entire stack. Finally, a `hostcall_reservation` equal to the
60+
entire guest stack size is allowed, and is a de facto denial of hostcalls to the
61+
guest - a Lucet guest will always have some stack space reserved for the
62+
runtime-required backstop, so the availability check would always fail.
63+
64+
In terms of implementation, hostcall checks are done in entirely synthetic
65+
functions generated by lucetc, prefixed with `trampoline_`. The trampoline
66+
functions themselves are very simple, and have a shape like the following
67+
Cranelift IR:
68+
```
69+
; A trampoline has the same signature as its hostcall - args plus a vmctx.
70+
function %trampoline_$HOSTCALL($HOSTCALL_ARGS.., i64 vmctx) -> $HOSTCALL_RESULT {
71+
gv_vmctx = vmctx
72+
heap0 = static gv0
73+
74+
block0($HOSTCALL_ARGS.., vmctx: i64):
75+
; The stack limit is recorded as part of the instance, just before the start of the guest heap.
76+
stack_limit_addr = heap_addr.i64 heap0, gv_vmctx, -$STACK_LIMIT_OFFSET
77+
stack_limit = load.i64 stack_limit_addr
78+
; Compare the current stack pointer to stack_limit.
79+
; `stack_limit` is the LHS of this comparison, the stack pointer the RHS.
80+
stack_cmp = ifcmp_sp stack_limit
81+
; If the limit is greater than or equal to the stack pointer, there is
82+
; insufficient space for the hostcall. Branch to the fail block and trap.
83+
;
84+
; "greater than or equal" may be surprising - it might be more natural to
85+
; consider this comparison with arguments reversed; if the stack pointer is
86+
; less than the limit, there is insufficient space.
87+
;
88+
; Even phrased like this, "less than" may be surprising, but is correct since
89+
; the stack grows downward. Given an example layout:
90+
; 0x0000: start of stack - start of the stack's allocation
91+
; 0x2000: hostcall limit - reserve 0x2000 bytes
92+
; 0x8000: base of stack - initial guest stack pointer
93+
;
94+
; and knowledge that the stack grows downward, the space from 0x2000 to
95+
; 0x0000 is the reserved space, and a stack pointer below 0x2000 is in the
96+
; reserved area, and thus voids the guarantee of reserved space by Lucet.
97+
brif ugte stack_cmp, stack_check_fail
98+
jump do_hostcall($HOSTCALL_ARGS.., vmctx)
99+
100+
do_hostcall($HOSTCALL_ARGS.., vmctx: i64):
101+
$HOSTCALL_RESULT.. = call $HOSTCALL($HOSTCALL_ARGS.., vmctx)
102+
return $HOSTCALL_RESULT
103+
104+
stack_check_fail():
105+
; If the stack check fails, it's raised as a stack overflow in guest code.
106+
; This is reasonably close to the actual occurrence, and allows the guest to be
107+
; unwound.
108+
trap stk_ovf
109+
```
110+
111+
### Lucet implementation considerations
112+
113+
Why do this trampoline and stack usage test, instead of observing failures in
114+
guest stack guards? The issue here is twofold: first, we can't robustly
115+
distinguish a stack overflow from an unlucky errant memory access for other
116+
reasons. If hosts are built using stack probes, stack overflows will probably
117+
be observed in places we expect, with patterns we can expect. But this is by no
118+
means a guarantee, and stack accesses might not be through the stack pointer
119+
directly (perhaps the address is loaded into an alternate register, and a fault
120+
occurs without referencing the stack pointer at all). Second, if a hostcall
121+
fault were to be recovered by Lucet, the runtime may have to unwind a guest that
122+
already has exhausted its stack space. This would require temporarily making
123+
the stack guard writable to support instigating a guest unwind, and probably
124+
motivate a second "real" guard page for safety against unforeseen circumstances
125+
where unwinding might consume significant amounts of stack space.
126+
127+
Given the complexity and fallibility of trying to recover from a stack overflow
128+
in guest code, we've concluded that pushing the error out of host code, with
129+
hostcalls constrained to the same kind of boundaries as would otherwise be
130+
expected, is a reasonable compromise.
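
The comparison performed by the trampoline described in `docs/src/lucetc/hostcalls.md` can be sketched in plain Rust. This is only an illustration of the arithmetic in the IR comments, using the same example layout (a limit at 0x2000 and an initial stack pointer at 0x8000). The constant `STACK_LIMIT` and the function `hostcall_permitted` are hypothetical names and do not appear in Lucet.

```rust
/// Hypothetical limit from the example layout: 0x2000 bytes reserved above
/// the start of the stack allocation at 0x0000.
const STACK_LIMIT: u64 = 0x2000;

/// Hypothetical helper mirroring the trampoline's check: the trampoline
/// traps when the limit is greater than or equal to the stack pointer, so a
/// hostcall is permitted only while the stack pointer is strictly above it.
fn hostcall_permitted(stack_pointer: u64, limit: u64) -> bool {
    stack_pointer > limit
}

fn main() {
    // The stack grows downward, so smaller values mean a deeper stack.
    for sp in [0x8000u64, 0x2001, 0x2000, 0x1000] {
        println!(
            "sp={:#x} permitted={}",
            sp,
            hostcall_permitted(sp, STACK_LIMIT)
        );
    }
}
```

A stack pointer exactly at the limit is rejected, which matches the "greater than or equal" wording in the IR comments.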

lucet-module/src/runtime.rs

Lines changed: 1 addition & 0 deletions
@@ -5,4 +5,5 @@
55
pub struct InstanceRuntimeData {
66
pub globals_ptr: *mut i64,
77
pub instruction_count: u64,
8+
pub stack_limit: u64,
89
}

lucet-runtime/include/lucet_types.h

Lines changed: 4 additions & 0 deletions
@@ -120,6 +120,10 @@ struct lucet_alloc_limits {
120120
* Size of the guest stack. (default 128K)
121121
*/
122122
uint64_t stack_size;
123+
/**
124+
* Amount of the guest stack that must be available for hostcalls. (default 32K)
125+
*/
126+
uint64_t hostcall_reservation;
123127
/**
124128
* Size of the globals region in bytes; each global uses 8 bytes. (default 4K)
125129
*/

lucet-runtime/lucet-runtime-internals/src/alloc/mod.rs

Lines changed: 40 additions & 2 deletions
@@ -432,6 +432,8 @@ pub struct Limits {
432432
pub heap_address_space_size: usize,
433433
/// Size of the guest stack. (default 128K)
434434
pub stack_size: usize,
435+
/// Amount of the guest stack that must be available for hostcalls. (default 32K)
436+
pub hostcall_reservation: usize,
435437
/// Size of the globals region in bytes; each global uses 8 bytes. (default 4K)
436438
pub globals_size: usize,
437439
/// Size of the signal stack in bytes. (default SIGSTKSZ for release builds, at least 12K for
@@ -484,13 +486,42 @@ impl Limits {
484486
heap_memory_size: 16 * 64 * 1024,
485487
heap_address_space_size: 0x0002_0000_0000,
486488
stack_size: 128 * 1024,
489+
hostcall_reservation: 32 * 1024,
487490
globals_size: 4096,
488491
signal_stack_size: DEFAULT_SIGNAL_STACK_SIZE,
489492
}
490493
}
491-
}
492494

493-
impl Limits {
495+
pub const fn with_heap_memory_size(mut self, heap_memory_size: usize) -> Self {
496+
self.heap_memory_size = heap_memory_size;
497+
self
498+
}
499+
500+
pub const fn with_heap_address_space_size(mut self, heap_address_space_size: usize) -> Self {
501+
self.heap_address_space_size = heap_address_space_size;
502+
self
503+
}
504+
505+
pub const fn with_stack_size(mut self, stack_size: usize) -> Self {
506+
self.stack_size = stack_size;
507+
self
508+
}
509+
510+
pub const fn with_hostcall_reservation(mut self, hostcall_reservation: usize) -> Self {
511+
self.hostcall_reservation = hostcall_reservation;
512+
self
513+
}
514+
515+
pub const fn with_globals_size(mut self, globals_size: usize) -> Self {
516+
self.globals_size = globals_size;
517+
self
518+
}
519+
520+
pub const fn with_signal_stack_size(mut self, signal_stack_size: usize) -> Self {
521+
self.signal_stack_size = signal_stack_size;
522+
self
523+
}
524+
494525
pub fn total_memory_size(&self) -> usize {
495526
// Memory is laid out as follows:
496527
// * the instance (up to instance_heap_offset)
@@ -544,6 +575,13 @@ impl Limits {
544575
if self.stack_size <= 0 {
545576
return Err(Error::InvalidArgument("stack size must be greater than 0"));
546577
}
578+
// We allow `hostcall_reservation == self.stack_size`, a circumstance that guarantees
579+
// any hostcalls will fail with a StackOverflow.
580+
if self.hostcall_reservation > self.stack_size {
581+
return Err(Error::InvalidArgument(
582+
"hostcall reserved space must not be greater than stack size",
583+
));
584+
}
547585
if self.signal_stack_size < MINSIGSTKSZ {
548586
tracing::info!(
549587
"signal stack size of {} requires manual configuration of signal stacks",

lucet-runtime/lucet-runtime-internals/src/alloc/tests.rs

Lines changed: 1 addition & 0 deletions
@@ -666,6 +666,7 @@ macro_rules! alloc_tests {
666666
heap_memory_size: 4096,
667667
heap_address_space_size: 2 * 4096,
668668
stack_size: 4096,
669+
hostcall_reservation: 4096,
669670
globals_size: 4096,
670671
..Limits::default()
671672
};
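
The test above sets `hostcall_reservation` equal to `stack_size`, which the new validation in `alloc/mod.rs` accepts. A minimal sketch of that rule, together with the builder-style setters this commit introduces, might look like the following. The `Limits` type here is a reduced stand-in with only two fields, and `validate` is a simplified assumption about the shape of the check, not the real Lucet method.

```rust
/// Stand-in for the two `Limits` fields relevant here; not the Lucet type.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Limits {
    stack_size: usize,
    hostcall_reservation: usize,
}

impl Limits {
    /// Defaults as stated in the diff: 128 KiB stack, 32 KiB reservation.
    const fn default() -> Self {
        Limits {
            stack_size: 128 * 1024,
            hostcall_reservation: 32 * 1024,
        }
    }

    const fn with_stack_size(mut self, stack_size: usize) -> Self {
        self.stack_size = stack_size;
        self
    }

    const fn with_hostcall_reservation(mut self, hostcall_reservation: usize) -> Self {
        self.hostcall_reservation = hostcall_reservation;
        self
    }

    /// Simplified check (an assumption, not Lucet's full validation): a
    /// reservation equal to the stack size is allowed, a larger one is not.
    fn validate(&self) -> Result<(), &'static str> {
        if self.stack_size == 0 {
            return Err("stack size must be greater than 0");
        }
        if self.hostcall_reservation > self.stack_size {
            return Err("hostcall reserved space must not be greater than stack size");
        }
        Ok(())
    }
}

fn main() {
    let equal = Limits::default()
        .with_stack_size(4096)
        .with_hostcall_reservation(4096);
    let too_large = equal.with_hostcall_reservation(8192);
    println!("equal: {:?}", equal.validate());
    println!("too large: {:?}", too_large.validate());
}
```

Because each setter takes and returns `self` by value, callers can start from the defaults and override only the fields they care about.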

lucet-runtime/lucet-runtime-internals/src/c_api.rs

Lines changed: 10 additions & 7 deletions
@@ -142,6 +142,8 @@ pub struct lucet_alloc_limits {
142142
pub heap_address_space_size: u64,
143143
/// Size of the guest stack. (default 128K)
144144
pub stack_size: u64,
145+
/// Amount of the guest stack that must be available for hostcalls. (default 32K)
146+
pub hostcall_reservation: u64,
145147
/// Size of the globals region in bytes; each global uses 8 bytes. (default 4K)
146148
pub globals_size: u64,
147149
/// Size of the signal stack in bytes. (default SIGSTKSZ for release builds, at least 12K for
@@ -168,6 +170,7 @@ impl From<&Limits> for lucet_alloc_limits {
168170
heap_memory_size: limits.heap_memory_size as u64,
169171
heap_address_space_size: limits.heap_address_space_size as u64,
170172
stack_size: limits.stack_size as u64,
173+
hostcall_reservation: limits.hostcall_reservation as u64,
171174
globals_size: limits.globals_size as u64,
172175
signal_stack_size: limits.signal_stack_size as u64,
173176
}
@@ -182,13 +185,13 @@ impl From<lucet_alloc_limits> for Limits {
182185

183186
impl From<&lucet_alloc_limits> for Limits {
184187
fn from(limits: &lucet_alloc_limits) -> Limits {
185-
Limits {
186-
heap_memory_size: limits.heap_memory_size as usize,
187-
heap_address_space_size: limits.heap_address_space_size as usize,
188-
stack_size: limits.stack_size as usize,
189-
globals_size: limits.globals_size as usize,
190-
signal_stack_size: limits.signal_stack_size as usize,
191-
}
188+
Limits::default()
189+
.with_heap_memory_size(limits.heap_memory_size as usize)
190+
.with_heap_address_space_size(limits.heap_address_space_size as usize)
191+
.with_stack_size(limits.stack_size as usize)
192+
.with_hostcall_reservation(limits.hostcall_reservation as usize)
193+
.with_globals_size(limits.globals_size as usize)
194+
.with_signal_stack_size(limits.signal_stack_size as usize)
192195
}
193196
}
194197

lucet-runtime/lucet-runtime-internals/src/instance.rs

Lines changed: 19 additions & 0 deletions
@@ -851,6 +851,23 @@ impl Instance {
851851
pub fn set_instruction_count(&mut self, instruction_count: u64) {
852852
self.get_instance_implicits_mut().instruction_count = instruction_count;
853853
}
854+
855+
#[inline]
856+
pub fn set_hostcall_stack_reservation(&mut self) {
857+
let slot = self
858+
.alloc
859+
.slot
860+
.as_ref()
861+
.expect("reachable instance has a slot");
862+
863+
let reservation = slot.limits.hostcall_reservation;
864+
865+
// The `.stack` field is a pointer to the lowest address of the stack - the start of its
866+
// allocation. Because the stack grows downward, this is the end of the stack space. So the
867+
// limit we'll need to check for hostcalls is some reserved space upwards from here, to
868+
// meet some guest stack pointer early.
869+
self.get_instance_implicits_mut().stack_limit = slot.stack as u64 + reservation as u64;
870+
}
854871
}
855872

856873
// Private API
@@ -887,6 +904,8 @@ impl Instance {
887904
};
888905
inst.set_globals_ptr(globals_ptr);
889906
inst.set_instruction_count(0);
907+
// Ensure the hostcall limit tracked in this instance's guest-shared data is up-to-date.
908+
inst.set_hostcall_stack_reservation();
890909

891910
assert_eq!(mem::size_of::<Instance>(), HOST_PAGE_SIZE_EXPECTED);
892911
let unpadded_size = offset_of!(Instance, _padding);
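
The limit computed by `set_hostcall_stack_reservation` in the hunk above can be illustrated with a small sketch. The `Slot` struct below is a hypothetical stand-in carrying plain integers (the real slot stores a pointer and a `Limits` value), and the base address is made up for the example.

```rust
/// Hypothetical stand-in for the slot data used by the computation.
struct Slot {
    /// Lowest address of the stack allocation.
    stack: u64,
    stack_size: u64,
    hostcall_reservation: u64,
}

impl Slot {
    /// The hostcall limit sits `hostcall_reservation` bytes above the lowest
    /// stack address, because the stack grows downward toward `stack`.
    fn stack_limit(&self) -> u64 {
        self.stack + self.hostcall_reservation
    }

    /// Highest address of the stack allocation.
    fn stack_top(&self) -> u64 {
        self.stack + self.stack_size
    }
}

fn main() {
    let default_like = Slot {
        stack: 0x10_0000, // made-up base address
        stack_size: 128 * 1024,
        hostcall_reservation: 32 * 1024,
    };
    println!("limit: {:#x}", default_like.stack_limit());

    // Reserving the whole stack puts the limit at the top of the stack, so
    // no stack pointer inside the allocation can be strictly above it.
    let fully_reserved = Slot {
        hostcall_reservation: default_like.stack_size,
        ..default_like
    };
    println!(
        "limit == top: {}",
        fully_reserved.stack_limit() == fully_reserved.stack_top()
    );
}
```

The second case shows why a reservation equal to the stack size acts as a denial of hostcalls: the limit coincides with the top of the stack.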
Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
1+
{
2+
"env": {
3+
"onetwothree": "onetwothree"
4+
}
5+
}
