Skip to content

Commit f85eeff

Browse files
committed
docs: formalize locator refresh state contract (Fixes #387)
1 parent 560df4c commit f85eeff

File tree

4 files changed

+139
-6
lines changed

4 files changed

+139
-6
lines changed

README.md

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -31,6 +31,8 @@ This project will be consumed by the [Python extension](https://marketplace.visu
3131

3232
Our approach prioritizes performance and efficiency by leveraging Rust. We minimize I/O operations by collecting all necessary environment information at once, which reduces repeated I/O and the need to spawn additional processes, significantly enhancing overall performance.
3333

34+
Locator refresh-state contracts are documented in [docs/LOCATOR_STATE.md](docs/LOCATOR_STATE.md).
35+
3436
## Contributing
3537

3638
This project welcomes contributions and suggestions. Most contributions require you to agree to a

crates/pet-core/src/lib.rs

Lines changed: 20 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -63,13 +63,24 @@ pub enum LocatorKind {
6363

6464
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
6565
pub enum RefreshStatePersistence {
66-
/// The locator keeps no mutable state across requests.
66+
/// The locator keeps no mutable state that survives a request.
6767
Stateless,
68-
/// The locator keeps configured inputs only; refresh must not copy them back.
68+
/// The locator keeps configured inputs only.
69+
///
70+
/// Refresh creates and configures transient locator instances for one request. A
71+
/// locator in this category must get its configuration from that request's
72+
/// configuration snapshot, not by copying anything back from the transient
73+
/// locator into the long-lived shared locator.
6974
ConfiguredOnly,
70-
/// The locator keeps cache-like state, but later requests can repopulate it on demand.
75+
/// The locator keeps cache-like state that later requests can repopulate on demand.
76+
///
77+
/// Refresh may populate this state on a transient locator, but correctness must
78+
/// not depend on syncing it back into the long-lived shared locator.
7179
SelfHydratingCache,
72-
/// The locator keeps refresh-discovered state that later requests depend on for correctness.
80+
/// The locator keeps refresh-discovered state that later requests depend on.
81+
///
82+
/// Locators in this category must override `sync_refresh_state_from()` and copy
83+
/// only correctness-critical discovery state for the provided sync scope.
7384
SyncedDiscoveryState,
7485
}
7586

@@ -121,8 +132,11 @@ pub trait Locator: Any + Send + Sync {
121132
}
122133
/// Describes what mutable state, if any, must survive a refresh boundary.
123134
///
124-
/// Refresh runs execute against transient locator graphs and then invoke
125-
/// `sync_refresh_state_from()` on the long-lived shared locators.
135+
/// Refresh requests run against transient locator graphs. After a refresh
136+
/// completes, the server invokes `sync_refresh_state_from()` on the long-lived
137+
/// shared locator graph while the starting configuration generation is still
138+
/// current. The returned classification is the contract the locator makes with
139+
/// that sync step.
126140
fn refresh_state(&self) -> RefreshStatePersistence {
127141
RefreshStatePersistence::Stateless
128142
}

crates/pet/src/jsonrpc.rs

Lines changed: 71 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1223,6 +1223,7 @@ mod tests {
12231223
use pet_conda::manager::CondaManager;
12241224
use pet_core::manager::EnvManager;
12251225
use pet_core::manager::EnvManagerType;
1226+
use pet_core::LocatorKind;
12261227
use pet_core::RefreshStatePersistence;
12271228
use std::path::PathBuf;
12281229
use std::sync::mpsc;
@@ -1699,6 +1700,76 @@ mod tests {
16991700
assert!(!synced);
17001701
}
17011702

1703+
#[test]
1704+
fn test_locator_graph_refresh_state_contracts_are_explicit() {
1705+
let environment = EnvironmentApi::new();
1706+
let conda_locator = Arc::new(Conda::from(&environment));
1707+
let poetry_locator = Arc::new(Poetry::from(&environment));
1708+
let locators = create_locators(conda_locator, poetry_locator, &environment);
1709+
1710+
let actual = locators
1711+
.iter()
1712+
.map(|locator| (locator.get_kind(), locator.refresh_state()))
1713+
.collect::<Vec<_>>();
1714+
1715+
let expected = vec![
1716+
#[cfg(windows)]
1717+
(
1718+
LocatorKind::WindowsStore,
1719+
RefreshStatePersistence::SyncedDiscoveryState,
1720+
),
1721+
#[cfg(windows)]
1722+
(
1723+
LocatorKind::WindowsRegistry,
1724+
RefreshStatePersistence::SyncedDiscoveryState,
1725+
),
1726+
#[cfg(windows)]
1727+
(LocatorKind::WinPython, RefreshStatePersistence::Stateless),
1728+
(
1729+
LocatorKind::PyEnv,
1730+
RefreshStatePersistence::SelfHydratingCache,
1731+
),
1732+
(LocatorKind::Pixi, RefreshStatePersistence::Stateless),
1733+
(
1734+
LocatorKind::Conda,
1735+
RefreshStatePersistence::SyncedDiscoveryState,
1736+
),
1737+
(LocatorKind::Uv, RefreshStatePersistence::ConfiguredOnly),
1738+
(
1739+
LocatorKind::Poetry,
1740+
RefreshStatePersistence::SyncedDiscoveryState,
1741+
),
1742+
(LocatorKind::PipEnv, RefreshStatePersistence::ConfiguredOnly),
1743+
(
1744+
LocatorKind::VirtualEnvWrapper,
1745+
RefreshStatePersistence::Stateless,
1746+
),
1747+
(LocatorKind::Venv, RefreshStatePersistence::Stateless),
1748+
(LocatorKind::VirtualEnv, RefreshStatePersistence::Stateless),
1749+
#[cfg(unix)]
1750+
(LocatorKind::Homebrew, RefreshStatePersistence::Stateless),
1751+
#[cfg(target_os = "macos")]
1752+
(LocatorKind::MacXCode, RefreshStatePersistence::Stateless),
1753+
#[cfg(target_os = "macos")]
1754+
(
1755+
LocatorKind::MacCommandLineTools,
1756+
RefreshStatePersistence::Stateless,
1757+
),
1758+
#[cfg(target_os = "macos")]
1759+
(
1760+
LocatorKind::MacPythonOrg,
1761+
RefreshStatePersistence::Stateless,
1762+
),
1763+
#[cfg(all(unix, not(target_os = "macos")))]
1764+
(
1765+
LocatorKind::LinuxGlobal,
1766+
RefreshStatePersistence::SelfHydratingCache,
1767+
),
1768+
];
1769+
1770+
assert_eq!(actual, expected);
1771+
}
1772+
17021773
#[test]
17031774
fn test_refresh_coordinator_does_not_join_different_generations() {
17041775
let coordinator = RefreshCoordinator::default();

docs/LOCATOR_STATE.md

Lines changed: 46 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,46 @@
1+
# Locator Refresh State
2+
3+
Refresh requests run on a transient locator graph. The server configures that graph from a single configuration snapshot, runs discovery, and then syncs selected refresh-discovered state back into the long-lived shared locator graph only if that configuration generation is still current.
4+
5+
The `Locator::refresh_state()` classification is the contract for that boundary. It keeps configured inputs, self-hydrating caches, and correctness-critical discovery state distinct.
6+
7+
## Classifications
8+
9+
| Classification | Meaning | Sync behavior |
10+
| --- | --- | --- |
11+
| `Stateless` | The locator keeps no mutable state that survives a request. | Nothing is copied back. |
12+
| `ConfiguredOnly` | The locator stores configured inputs such as executable paths or workspace directories. | Refresh must use the transient locator's request snapshot and must not copy this state back. |
13+
| `SelfHydratingCache` | The locator stores a cache that later requests can rebuild on demand. | Refresh may fill a transient cache, but correctness must not rely on syncing it. |
14+
| `SyncedDiscoveryState` | The locator stores refresh-discovered state that later requests need for correctness or fidelity. | The locator must override `sync_refresh_state_from()` and copy only state appropriate for the `RefreshStateSyncScope`. |
15+
16+
## Current Locator Inventory
17+
18+
| Locator | Mutable state | Classification | Notes |
19+
| --- | --- | --- | --- |
20+
| WindowsStore | Discovered Store environments | `SyncedDiscoveryState` | Full and matching global-kind refreshes replace the cache; workspace refreshes leave it alone. |
21+
| WindowsRegistry | Discovered registry managers and environments | `SyncedDiscoveryState` | Full and matching global-kind refreshes replace the cache; workspace refreshes leave it alone. |
22+
| WinPython | None | `Stateless` | Windows-only locator. |
23+
| PyEnv | Manager and versions-directory cache | `SelfHydratingCache` | `find()` clears the cache, and `try_from()` can rebuild it from the environment. |
24+
| Pixi | None | `Stateless` | Identification is derived from filesystem markers. |
25+
| Conda | Environment, manager, and mamba-manager discovery caches; configured executable | `SyncedDiscoveryState` | Discovery caches are synced. The configured executable remains configuration state and is not copied from refresh locators. |
26+
| Uv | Configured workspace directories; immutable uv install directory | `ConfiguredOnly` | Workspace directories come from the request configuration snapshot. |
27+
| Poetry | Configured workspace directories and executable; discovered search result | `SyncedDiscoveryState` | Search results are synced or merged by scope. Configured inputs are not copied back. |
28+
| PipEnv | Configured pipenv executable | `ConfiguredOnly` | The executable comes from the configuration snapshot. |
29+
| VirtualEnvWrapper | Environment variables captured at construction | `Stateless` | No refresh-discovered mutable state. |
30+
| Venv | None | `Stateless` | Identification is derived from `pyvenv.cfg` and filesystem layout. |
31+
| VirtualEnv | None | `Stateless` | Identification is derived from virtualenv markers. |
32+
| Homebrew | Environment variables captured at construction | `Stateless` | No refresh-discovered mutable state. |
33+
| MacXCode | None | `Stateless` | macOS-only locator. |
34+
| MacCommandLineTools | None | `Stateless` | macOS-only locator. |
35+
| MacPythonOrg | None | `Stateless` | macOS-only locator. |
36+
| LinuxGlobalPython | Reported executable cache | `SelfHydratingCache` | `try_from()` can repopulate the cache by scanning known global bin directories. |
37+
38+
## Updating The Contract
39+
40+
When adding mutable state to a locator, classify it before relying on it across refreshes:
41+
42+
1. If it is configured input, keep it under `ConfiguredOnly` and source it from `Configuration`.
43+
2. If it is only a performance cache, use `SelfHydratingCache` and make later requests able to rebuild it.
44+
3. If later requests need refresh-discovered state, use `SyncedDiscoveryState`, implement `sync_refresh_state_from()`, and cover full, workspace, and kind-filtered scopes with tests.
45+
46+
The locator graph has a regression test in `crates/pet/src/jsonrpc.rs` that pins the current classification of each locator created by `create_locators()`.

0 commit comments

Comments
 (0)