Skip to content

Commit da67590

Browse files
committed
fix(#469 review): skip caching workspace_entry results for removed workspaces
Address rchiodo's verified review comment. The generation guard alone catches the race where configure() runs *during* a parse, but misses the case where configure() removes a workspace *before* workspace_entry snapshots the generation: T1: try_from snapshots state.workspaces = [W, X] T2: configure removes W, generation = N+1 T1: workspace_entry(W) misses cache, snapshots generation = N+1 T1: parses W, re-acquires lock, generation still matches, inserts W W now lives in state.parsed as an orphan until the next configure() clears it. Add a membership check before insert: if the workspace is no longer in state.workspaces, return the parsed value (so the in-flight caller can finish) without caching it. Impact of the prior behaviour was low (cache gets cleared on next configure and new callers never re-iterate the removed workspace), but the membership check keeps the cache invariant clean: parsed only contains entries for currently configured workspaces. Test: workspace_entry_does_not_cache_for_unconfigured_workspace pins the contract directly without needing threads.
1 parent c114864 commit da67590

1 file changed

Lines changed: 70 additions & 6 deletions

File tree

crates/pet-hatch/src/lib.rs

Lines changed: 70 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -141,12 +141,19 @@ impl Hatch {
141141
/// performed outside the state mutex so concurrent `try_from()` calls
142142
/// for *other* workspaces are not blocked by a slow filesystem.
143143
///
144-
/// Race handling: `configure()` may run between our cache-miss check
145-
/// and our re-acquire to insert, invalidating the cache while we were
146-
/// parsing. We snapshot the generation before the parse and only
147-
/// insert if the generation has not changed; otherwise we discard the
148-
/// stale parse and retry. The loop is bounded in practice because
149-
/// `configure()` is a client-driven, infrequent operation.
144+
/// Race handling:
145+
/// * `configure()` may run between our cache-miss check and our
146+
/// re-acquire to insert, invalidating the cache while we were
147+
/// parsing. We snapshot the generation before the parse and discard
148+
/// the result if the generation moved, then retry against the
149+
/// current state.
150+
/// * `configure()` may also have removed `workspace` from the
151+
/// configured set before we were called (the caller iterates a
152+
/// snapshot of `state.workspaces` taken before `workspace_entry`).
153+
/// In that case we still return the parsed value so the in-flight
154+
/// caller can complete, but we do not cache it — otherwise the
155+
/// cache would hold an orphan entry for a workspace that is no
156+
/// longer configured until the next `configure()` clears it.
150157
fn workspace_entry(&self, workspace: &Path) -> Arc<WorkspaceEntry> {
151158
loop {
152159
// Fast path: cache hit. Also snapshot the generation so we can
@@ -175,6 +182,15 @@ impl Hatch {
175182
// Drop it and retry against the current generation.
176183
continue;
177184
}
185+
if !state.workspaces.iter().any(|w| w == workspace) {
186+
// `workspace` is not in the current configured set (the
187+
// caller's snapshot was taken before a configure() that
188+
// removed it). Return the parsed result so the in-flight
189+
// caller can finish without re-reading disk, but don't
190+
// pollute `parsed` with an orphan entry that would
191+
// outlive the workspace until the next configure().
192+
return parsed;
193+
}
178194
// A concurrent caller for the same workspace may have already
179195
// inserted while we were parsing; prefer the existing entry
180196
// so every caller observes the same `Arc`. `or_insert_with`
@@ -1448,6 +1464,54 @@ mod tests {
14481464
);
14491465
}
14501466

1467+
#[test]
1468+
fn workspace_entry_does_not_cache_for_unconfigured_workspace() {
1469+
// Race scenario: try_from() / find() snapshots state.workspaces,
1470+
// then configure() removes a workspace before workspace_entry()
1471+
// re-acquires the lock to insert. Without a workspaces-membership
1472+
// check, the parsed cache would hold an orphan entry for a
1473+
// workspace that is no longer configured, persisting until the
1474+
// next configure(). The generation guard alone is not enough
1475+
// here because configure() can land *before* workspace_entry()
1476+
// snapshots the generation.
1477+
//
1478+
// Verify the contract directly: calling workspace_entry() for a
1479+
// workspace not in state.workspaces returns a usable parsed
1480+
// value but does NOT populate the cache.
1481+
let temp = TempDir::new().unwrap();
1482+
let configured = temp.path().join("configured");
1483+
let removed = temp.path().join("removed");
1484+
fs::create_dir_all(&configured).unwrap();
1485+
fs::create_dir_all(&removed).unwrap();
1486+
fs::write(
1487+
removed.join("pyproject.toml"),
1488+
b"[tool.hatch.dirs.env]\nvirtual = \".hatch\"\n",
1489+
)
1490+
.unwrap();
1491+
1492+
let locator = make_locator(None);
1493+
let config = Configuration {
1494+
workspace_directories: Some(vec![configured.clone()]),
1495+
..Configuration::default()
1496+
};
1497+
locator.configure(&config);
1498+
1499+
// `removed` is not in state.workspaces. Calling workspace_entry
1500+
// for it must still return a parsed entry (the in-flight caller
1501+
// needs a usable value) but must not pollute the cache.
1502+
let entry = locator.workspace_entry(&removed);
1503+
assert_eq!(entry.virtual_dirs, vec![norm_case(&removed.join(".hatch"))]);
1504+
let state = locator.state.lock().unwrap();
1505+
assert!(
1506+
!state.parsed.contains_key(&removed),
1507+
"parsed cache must not contain entries for unconfigured workspaces"
1508+
);
1509+
assert!(
1510+
!state.parsed.contains_key(&configured),
1511+
"configured workspace was never accessed; cache should be empty"
1512+
);
1513+
}
1514+
14511515
#[cfg(target_os = "linux")]
14521516
#[test]
14531517
fn data_dir_uses_xdg_data_home_when_set() {

0 commit comments

Comments
 (0)