Commit 753ccd9

branchseer and claude authored
Support pre-existing LD_PRELOAD/DYLD_INSERT_LIBRARIES (#349)
## Summary

Fixes issue #340 by allowing fspy to coexist with user-supplied `LD_PRELOAD` (Linux) or `DYLD_INSERT_LIBRARIES` (macOS) environment variables. Previously, fspy rejected spawns when these variables were already set. Now it appends its tracer shim to the end of the preload list, preserving both the user's preload and the correct symbol interposition order.

## Key Changes

- **New `append_path_env()` function** in `fspy_shared_unix/src/exec/mod.rs`: Appends a value to colon-separated path environment variables (like `LD_PRELOAD`) instead of overwriting them. The function is idempotent and avoids leading colons. Includes unit tests covering the edge cases.
- **Updated Linux spawn handler** (`fspy_shared_unix/src/spawn/linux/mod.rs`): Changed from `ensure_env()` (which fails if the variable already exists) to `append_path_env()`, so fspy's shim is appended to any existing `LD_PRELOAD`.
- **Updated macOS spawn handler** (`fspy_shared_unix/src/spawn/macos.rs`): Likewise uses `append_path_env()` for `DYLD_INSERT_LIBRARIES`.
- **New test library** (`crates/preload_test_lib/src/lib.rs`): A Linux-only `LD_PRELOAD` library that intercepts the libc `open`/`openat` functions. It short-circuits paths containing the marker `preload_test_short_circuit` (returning `ENOENT` without forwarding) and forwards all other calls via `RTLD_NEXT`. This makes it possible to test that fspy correctly handles user preloads that short-circuit calls.
- **New e2e test fixture** (`preexisting_ld_preload`): End-to-end test that verifies:
  - Short-circuited file accesses are not tracked by fspy (cache hit when modified)
  - Real file accesses are tracked (cache miss when modified)
  - The preload chain works correctly with both user and fspy preloads
- **E2e test harness improvements** (`vite_task_bin/tests/e2e_snapshots/main.rs`): Added support for a Linux platform filter and for environment variable placeholder substitution (`<PRELOAD_TEST_LIB_PATH>`) to inject the test library path at runtime.

## Implementation Details

The append strategy is critical: by placing fspy's shim *last* in the preload list, user preloads that short-circuit calls (returning without forwarding to libc) remain invisible to fspy, accurately reflecting what the OS actually executed. This preserves cache correctness when user preloads intercept file operations.

https://claude.ai/code/session_018oB9YLpLUKWprpr1RFJq4k

---------

Co-authored-by: Claude <noreply@anthropic.com>
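The ordering argument in the commit message can be modelled without a dynamic loader. The sketch below is a toy model only: `Step`, `user_preload`, and `traced_paths` are hypothetical names that do not appear in the commit, and real interposition is done by the loader resolving symbols, not by a Rust `match`. It assumes a two-entry preload list with the user library first and the tracer last.

```rust
/// Outcome of one interposer looking at an `open`-style call.
pub enum Step {
    /// Answer the call here; libraries later in the list never see it.
    ShortCircuit(i32),
    /// Pass the call on to the next library in the preload list.
    Forward,
}

/// Hypothetical stand-in for a user preload: fails paths containing `marker`.
pub fn user_preload(path: &str, marker: &str) -> Step {
    if path.contains(marker) {
        Step::ShortCircuit(-1)
    } else {
        Step::Forward
    }
}

/// Walks the modelled list `[user, tracer]` and returns the paths that the
/// tracer, placed last, would record.
pub fn traced_paths(paths: &[&str], marker: &str) -> Vec<String> {
    let mut seen = Vec::new();
    for &path in paths {
        match user_preload(path, marker) {
            // The user preload answered the call, so the tracer is never reached.
            Step::ShortCircuit(_) => {}
            // The call was forwarded, so the tracer records it before libc runs.
            Step::Forward => seen.push(path.to_string()),
        }
    }
    seen
}
```

In this model, a path containing the marker never reaches the tracer, which matches the "cache hit when modified" expectation listed for the e2e fixture.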
1 parent e81fbc6 commit 753ccd9

15 files changed

Lines changed: 549 additions & 8 deletions

CHANGELOG.md

Lines changed: 1 addition & 0 deletions
@@ -1,5 +1,6 @@
 # Changelog

+- **Fixed** `vp run` no longer aborts with `failed to prepare the command for injection: Invalid argument` when the user environment already has `LD_PRELOAD` (Linux) or `DYLD_INSERT_LIBRARIES` (macOS) set. The tracer shim is now appended to any existing value and placed last, so user preloads keep their symbol-interposition precedence ([#340](https://github.com/voidzero-dev/vite-task/issues/340))
 - **Changed** Arguments passed after a task name (e.g. `vp run test some-filter`) are now forwarded only to that task. Tasks pulled in via `dependsOn` no longer receive them ([#324](https://github.com/voidzero-dev/vite-task/issues/324))
 - **Fixed** Windows file access tracking no longer panics when a task touches malformed paths that cannot be represented as workspace-relative inputs ([#330](https://github.com/voidzero-dev/vite-task/pull/330))
 - **Fixed** `vp run --cache` now supports running without a task specifier and opens the interactive task selector, matching bare `vp run` behavior ([#312](https://github.com/voidzero-dev/vite-task/pull/313))

Cargo.lock

Lines changed: 8 additions & 0 deletions
Some generated files are not rendered by default.

crates/fspy_shared_unix/src/exec/mod.rs

Lines changed: 119 additions & 0 deletions
@@ -175,3 +175,122 @@ pub fn ensure_env(
     envs.push((name.to_owned(), Some(value.to_owned())));
     Ok(())
 }
+
+/// Ensures `value` is the trailing colon-separated entry of env var `name`.
+///
+/// Used for `LD_PRELOAD` / `DYLD_INSERT_LIBRARIES`, which the dynamic loader
+/// treats as colon-separated lists. Appending (rather than overwriting)
+/// preserves any user-provided preload, and appending to the *end* keeps
+/// fspy's shim as the last interposer so a user preload that short-circuits
+/// a call (returning without forwarding to libc) stays invisible to fspy,
+/// mirroring what the OS actually did.
+///
+/// - Absent: inserts `(name, value)`.
+/// - Present with `value` already as the last colon-separated entry: no
+///   change (idempotent across nested execs within the preloaded shim).
+/// - Present otherwise: rewrites to `{existing}:{value}`. If `existing` is
+///   empty, sets to `value` alone to avoid a leading `:` (which glibc's
+///   `ld.so` interprets as the current directory).
+pub fn append_path_env(
+    envs: &mut Vec<(BString, Option<BString>)>,
+    name: impl AsRef<BStr>,
+    value: impl AsRef<BStr>,
+) {
+    let name = name.as_ref();
+    let value = value.as_ref();
+    if let Some(entry) = envs.iter_mut().find(|(n, _)| n == name) {
+        let existing: &[u8] = entry.1.as_deref().map_or(&[][..], |v| v.as_ref());
+        let value_bytes: &[u8] = value.as_ref();
+        let already_last = existing == value_bytes
+            || (existing.len() > value_bytes.len()
+                && existing.ends_with(value_bytes)
+                && existing[existing.len() - value_bytes.len() - 1] == b':');
+        if already_last {
+            return;
+        }
+        let mut new_value = Vec::with_capacity(existing.len() + 1 + value_bytes.len());
+        if !existing.is_empty() {
+            new_value.extend_from_slice(existing);
+            new_value.push(b':');
+        }
+        new_value.extend_from_slice(value_bytes);
+        entry.1 = Some(BString::from(new_value));
+    } else {
+        envs.push((name.to_owned(), Some(value.to_owned())));
+    }
+}
+
+#[cfg(test)]
+mod tests {
+    use bstr::BString;
+
+    use super::append_path_env;
+
+    fn env(envs: &[(BString, Option<BString>)], name: &[u8]) -> Option<Vec<u8>> {
+        envs.iter()
+            .find(|(n, _)| AsRef::<[u8]>::as_ref(n) == name)
+            .and_then(|(_, v)| v.as_ref().map(|v| AsRef::<[u8]>::as_ref(v).to_vec()))
+    }
+
+    #[test]
+    fn inserts_when_absent() {
+        let mut envs: Vec<(BString, Option<BString>)> = vec![];
+        append_path_env(&mut envs, "LD_PRELOAD", "/a.so");
+        assert_eq!(env(&envs, b"LD_PRELOAD"), Some(b"/a.so".to_vec()));
+    }
+
+    #[test]
+    fn noop_when_equal() {
+        let mut envs = vec![(BString::from("LD_PRELOAD"), Some(BString::from("/a.so")))];
+        append_path_env(&mut envs, "LD_PRELOAD", "/a.so");
+        assert_eq!(env(&envs, b"LD_PRELOAD"), Some(b"/a.so".to_vec()));
+    }
+
+    #[test]
+    fn noop_when_value_is_last_entry() {
+        let mut envs = vec![(BString::from("LD_PRELOAD"), Some(BString::from("/user.so:/a.so")))];
+        append_path_env(&mut envs, "LD_PRELOAD", "/a.so");
+        assert_eq!(env(&envs, b"LD_PRELOAD"), Some(b"/user.so:/a.so".to_vec()));
+    }
+
+    #[test]
+    fn appends_with_colon_when_present_and_different() {
+        let mut envs = vec![(BString::from("LD_PRELOAD"), Some(BString::from("/user.so")))];
+        append_path_env(&mut envs, "LD_PRELOAD", "/a.so");
+        assert_eq!(env(&envs, b"LD_PRELOAD"), Some(b"/user.so:/a.so".to_vec()));
+    }
+
+    #[test]
+    fn sets_without_leading_colon_when_existing_is_empty() {
+        let mut envs = vec![(BString::from("LD_PRELOAD"), Some(BString::from("")))];
+        append_path_env(&mut envs, "LD_PRELOAD", "/a.so");
+        assert_eq!(env(&envs, b"LD_PRELOAD"), Some(b"/a.so".to_vec()));
+    }
+
+    #[test]
+    fn idempotent_on_repeat() {
+        let mut envs: Vec<(BString, Option<BString>)> = vec![];
+        append_path_env(&mut envs, "LD_PRELOAD", "/a.so");
+        append_path_env(&mut envs, "LD_PRELOAD", "/a.so");
+        append_path_env(&mut envs, "LD_PRELOAD", "/a.so");
+        assert_eq!(env(&envs, b"LD_PRELOAD"), Some(b"/a.so".to_vec()));
+    }
+
+    #[test]
+    fn does_not_false_match_prefix_without_preceding_colon() {
+        // `/lib/a.so` ends with `a.so` as bytes, but the preceding byte is
+        // `/`, not `:`, so it must NOT be treated as already-present.
+        let mut envs = vec![(BString::from("LD_PRELOAD"), Some(BString::from("/lib/a.so")))];
+        append_path_env(&mut envs, "LD_PRELOAD", "a.so");
+        assert_eq!(env(&envs, b"LD_PRELOAD"), Some(b"/lib/a.so:a.so".to_vec()));
+    }
+
+    #[test]
+    fn inserts_when_present_with_none_value() {
+        // An env var present in the list but with `None` value (name without
+        // `=`) should be rewritten to `Some(value)`.
+        let mut envs = vec![(BString::from("LD_PRELOAD"), None)];
+        append_path_env(&mut envs, "LD_PRELOAD", "/a.so");
+        assert_eq!(env(&envs, b"LD_PRELOAD"), Some(b"/a.so".to_vec()));
+    }
+}
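For readers who want to try the append rule outside the crate, here is a minimal std-only sketch. It is not the code from this commit: `with_trailing_entry` is a hypothetical name, `Vec<u8>` stands in for `BString`, and the "already last" check is done by splitting from the right instead of inspecting the byte before the suffix.

```rust
/// Returns `existing` with `value` as its final colon-separated entry.
/// `None` models an env var that is absent or has no value.
pub fn with_trailing_entry(existing: Option<&[u8]>, value: &[u8]) -> Vec<u8> {
    let existing = existing.unwrap_or(&[]);
    // `rsplit` yields entries from the right, so the first item is the last entry.
    let last = existing.rsplit(|&b| b == b':').next().unwrap_or(&[]);
    if last == value {
        // Already the trailing entry: leave the list untouched.
        return existing.to_vec();
    }
    let mut out = Vec::with_capacity(existing.len() + 1 + value.len());
    if !existing.is_empty() {
        // Only add a separator when there is something to separate from,
        // which avoids a leading `:`.
        out.extend_from_slice(existing);
        out.push(b':');
    }
    out.extend_from_slice(value);
    out
}
```

The two checks agree for values that contain no `:`. A value that itself contains a colon would be treated differently by this sketch, so it should not be read as a drop-in replacement.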

crates/fspy_shared_unix/src/spawn/linux/mod.rs

Lines changed: 11 additions & 3 deletions
@@ -6,7 +6,11 @@ use fspy_seccomp_unotify::{payload::SeccompPayload, target::install_target};
 use memmap2::Mmap;

 #[cfg(not(target_env = "musl"))]
-use crate::{elf, exec::ensure_env, open_exec::open_executable};
+use crate::{
+    elf,
+    exec::{append_path_env, ensure_env},
+    open_exec::open_executable,
+};
 use crate::{
     exec::Exec,
     payload::{EncodedPayload, PAYLOAD_ENV_NAME},
@@ -40,11 +44,15 @@ pub fn handle_exec(
             nix::Error::try_from(io_error).unwrap_or(nix::Error::UnknownErrno)
         })?;
         if elf::is_dynamically_linked_to_libc(executable_mmap)? {
-            ensure_env(
+            // Append (don't overwrite) so a user-provided LD_PRELOAD keeps
+            // working. fspy's shim goes last so user preloads that
+            // short-circuit a libc call stay invisible to fspy: what the
+            // OS actually executed is what we want to record.
+            append_path_env(
                 &mut command.envs,
                 LD_PRELOAD,
                 encoded_payload.payload.preload_path.as_os_str().as_bytes(),
-            )?;
+            );
             ensure_env(&mut command.envs, PAYLOAD_ENV_NAME, &encoded_payload.encoded_string)?;
             return Ok(None);
         }

crates/fspy_shared_unix/src/spawn/macos.rs

Lines changed: 7 additions & 3 deletions
@@ -8,7 +8,7 @@ use std::{
 use phf::{Set, phf_set};

 use crate::{
-    exec::{Exec, ensure_env},
+    exec::{Exec, append_path_env, ensure_env},
     payload::{EncodedPayload, PAYLOAD_ENV_NAME},
 };

@@ -60,11 +60,15 @@ pub fn handle_exec(
     };

     if injectable {
-        ensure_env(
+        // Append (don't overwrite) so a user-provided DYLD_INSERT_LIBRARIES
+        // keeps working. fspy's shim goes last so user preloads that
+        // short-circuit a libc call stay invisible to fspy: what the OS
+        // actually executed is what we want to record.
+        append_path_env(
             &mut command.envs,
             DYLD_INSERT_LIBRARIES,
             encoded_payload.payload.preload_path.as_os_str().as_bytes(),
-        )?;
+        );
         ensure_env(&mut command.envs, PAYLOAD_ENV_NAME, &encoded_payload.encoded_string)?;
     } else {
         command.envs.retain(|(name, _)| {

crates/preload_test_lib/Cargo.toml

Lines changed: 16 additions & 0 deletions
@@ -0,0 +1,16 @@
+[package]
+name = "preload_test_lib"
+version = "0.0.0"
+edition.workspace = true
+publish = false
+
+[lib]
+crate-type = ["cdylib"]
+test = false
+doctest = false
+
+[target.'cfg(target_os = "linux")'.dependencies]
+libc = { workspace = true }
+
+[lints]
+workspace = true

crates/preload_test_lib/src/lib.rs

Lines changed: 161 additions & 0 deletions
@@ -0,0 +1,161 @@
+//! Test-only `LD_PRELOAD` library used by the `preexisting_ld_preload` e2e
+//! fixture. Intercepts `open`/`openat` (and their `64` variants) to exercise
+//! two behaviours fspy must tolerate when appended to a pre-existing
+//! `LD_PRELOAD` list:
+//!
+//! 1. For paths containing the marker `preload_test_short_circuit`, the
+//!    call is short-circuited with `ENOENT` *without* forwarding to the
+//!    next preloaded library. Because fspy is appended after this library
+//!    in the preload list, fspy never observes the call, exactly the
+//!    property we want to verify.
+//! 2. For every other path the call is forwarded via
+//!    `dlsym(RTLD_NEXT, …)`, so fspy still sees the real accesses and can
+//!    track them as cache inputs.
+#![cfg(target_os = "linux")]
+#![feature(c_variadic)]
+
+use std::{
+    ffi::{CStr, c_char, c_int},
+    sync::OnceLock,
+};
+
+const MARKER: &[u8] = b"preload_test_short_circuit";
+
+fn should_short_circuit(path: *const c_char) -> bool {
+    if path.is_null() {
+        return false;
+    }
+    // SAFETY: callers of `open`/`openat` pass a valid NUL-terminated C string
+    // (or NULL, handled above).
+    let bytes = unsafe { CStr::from_ptr(path) }.to_bytes();
+    bytes.windows(MARKER.len()).any(|w| w == MARKER)
+}
+
+fn fail_with_enoent() -> c_int {
+    // SAFETY: `__errno_location` is async-signal-safe and always returns a
+    // valid pointer to the per-thread errno.
+    unsafe { *libc::__errno_location() = libc::ENOENT };
+    -1
+}
+
+const fn has_mode_arg(flags: c_int) -> bool {
+    flags & libc::O_CREAT != 0 || flags & libc::O_TMPFILE == libc::O_TMPFILE
+}
+
+type OpenFn = unsafe extern "C" fn(*const c_char, c_int, ...) -> c_int;
+type OpenatFn = unsafe extern "C" fn(c_int, *const c_char, c_int, ...) -> c_int;
+
+fn load_next_fn<F: Copy>(name: &CStr) -> F {
+    // SAFETY: `dlsym` with `RTLD_NEXT` returns either NULL or a valid
+    // function pointer for a symbol that must exist in libc. The cast is
+    // valid because the caller supplies a `F` whose layout is a function
+    // pointer of the corresponding libc signature.
+    let ptr = unsafe { libc::dlsym(libc::RTLD_NEXT, name.as_ptr()) };
+    assert!(!ptr.is_null(), "dlsym RTLD_NEXT returned null");
+    // SAFETY: see above.
+    unsafe { std::mem::transmute_copy(&ptr) }
+}
+
+fn next_open() -> OpenFn {
+    static S: OnceLock<OpenFn> = OnceLock::new();
+    *S.get_or_init(|| load_next_fn(c"open"))
+}
+fn next_open64() -> OpenFn {
+    static S: OnceLock<OpenFn> = OnceLock::new();
+    *S.get_or_init(|| load_next_fn(c"open64"))
+}
+fn next_openat() -> OpenatFn {
+    static S: OnceLock<OpenatFn> = OnceLock::new();
+    *S.get_or_init(|| load_next_fn(c"openat"))
+}
+fn next_openat64() -> OpenatFn {
+    static S: OnceLock<OpenatFn> = OnceLock::new();
+    *S.get_or_init(|| load_next_fn(c"openat64"))
+}
+
+/// # Safety
+/// Interposer over libc `open(2)`; same contract as the real function. Must
+/// only be called by the dynamic loader after installation via `LD_PRELOAD`.
+#[unsafe(no_mangle)]
+pub unsafe extern "C" fn open(path: *const c_char, flags: c_int, mut args: ...) -> c_int {
+    if should_short_circuit(path) {
+        return fail_with_enoent();
+    }
+    if has_mode_arg(flags) {
+        // SAFETY: `O_CREAT`/`O_TMPFILE` guarantees a `mode_t` follows per
+        // the `open(2)` contract.
+        let mode: libc::mode_t = unsafe { args.arg() };
+        // SAFETY: forwarding the caller's arguments unchanged.
+        unsafe { next_open()(path, flags, mode) }
+    } else {
+        // SAFETY: forwarding the caller's arguments unchanged.
+        unsafe { next_open()(path, flags) }
+    }
+}
+
+/// # Safety
+/// Interposer over libc `open64(2)`; same contract as the real function.
+#[unsafe(no_mangle)]
+pub unsafe extern "C" fn open64(path: *const c_char, flags: c_int, mut args: ...) -> c_int {
+    if should_short_circuit(path) {
+        return fail_with_enoent();
+    }
+    if has_mode_arg(flags) {
+        // SAFETY: `O_CREAT`/`O_TMPFILE` guarantees a `mode_t` follows per
+        // the `open64(2)` contract.
+        let mode: libc::mode_t = unsafe { args.arg() };
+        // SAFETY: forwarding the caller's arguments unchanged.
+        unsafe { next_open64()(path, flags, mode) }
+    } else {
+        // SAFETY: forwarding the caller's arguments unchanged.
+        unsafe { next_open64()(path, flags) }
+    }
+}
+
+/// # Safety
+/// Interposer over libc `openat(2)`; same contract as the real function.
+#[unsafe(no_mangle)]
+pub unsafe extern "C" fn openat(
+    dirfd: c_int,
+    path: *const c_char,
+    flags: c_int,
+    mut args: ...
+) -> c_int {
+    if should_short_circuit(path) {
+        return fail_with_enoent();
+    }
+    if has_mode_arg(flags) {
+        // SAFETY: `O_CREAT`/`O_TMPFILE` guarantees a `mode_t` follows per
+        // the `openat(2)` contract.
+        let mode: libc::mode_t = unsafe { args.arg() };
+        // SAFETY: forwarding the caller's arguments unchanged.
+        unsafe { next_openat()(dirfd, path, flags, mode) }
+    } else {
+        // SAFETY: forwarding the caller's arguments unchanged.
+        unsafe { next_openat()(dirfd, path, flags) }
+    }
+}
+
+/// # Safety
+/// Interposer over libc `openat64(2)`; same contract as the real function.
+#[unsafe(no_mangle)]
+pub unsafe extern "C" fn openat64(
+    dirfd: c_int,
+    path: *const c_char,
+    flags: c_int,
+    mut args: ...
+) -> c_int {
+    if should_short_circuit(path) {
+        return fail_with_enoent();
+    }
+    if has_mode_arg(flags) {
+        // SAFETY: `O_CREAT`/`O_TMPFILE` guarantees a `mode_t` follows per
+        // the `openat64(2)` contract.
+        let mode: libc::mode_t = unsafe { args.arg() };
+        // SAFETY: forwarding the caller's arguments unchanged.
+        unsafe { next_openat64()(dirfd, path, flags, mode) }
+    } else {
+        // SAFETY: forwarding the caller's arguments unchanged.
+        unsafe { next_openat64()(dirfd, path, flags) }
+    }
+}
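One detail worth checking in any `has_mode_arg`-style helper: on Linux, `O_TMPFILE` is a private bit OR-ed with `O_DIRECTORY`, so testing for "any bit set" also matches a plain `O_DIRECTORY` open, where the caller passed no `mode_t`. The sketch below contrasts the two tests. The constants are the Linux x86_64 values, copied in as an assumption so the example needs no `libc` crate, and both function names are hypothetical.

```rust
// Assumed Linux x86_64 flag values (other architectures may differ).
pub const O_CREAT: i32 = 0o100;
pub const O_DIRECTORY: i32 = 0o200_000;
pub const O_TMPFILE: i32 = 0o20_000_000 | O_DIRECTORY;

/// Any-bit test: true as soon as a single bit of `O_TMPFILE` is set,
/// which includes a plain `O_DIRECTORY` open.
pub const fn has_mode_arg_any_bit(flags: i32) -> bool {
    flags & O_CREAT != 0 || flags & O_TMPFILE != 0
}

/// Full-mask test: true only when every bit of `O_TMPFILE` is set.
pub const fn has_mode_arg_full_mask(flags: i32) -> bool {
    flags & O_CREAT != 0 || flags & O_TMPFILE == O_TMPFILE
}
```

With these values, `has_mode_arg_any_bit(O_DIRECTORY)` is true while `has_mode_arg_full_mask(O_DIRECTORY)` is false. The full-mask form is the safer choice for a variadic interposer, because reading a `mode_t` the caller never passed is undefined behaviour.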
