Skip to content

Commit c76ff33

Browse files
author
taosu
committed
fix(uninstall): prevent over-hashing of user files in manifest
trellis init walked managed dirs (.codex/, .claude/, .opencode/) and hashed every file into .template-hashes.json — including pre-existing user data (.codex/sessions/, .claude/projects/, user-owned AGENTS.md). trellis uninstall then unlinked every manifest entry, deleting that user data. Two real reports: GitHub Issue #221 (.codex/sessions/ wiped) and PR #271 (pre-existing AGENTS.md wiped via skip-existing path). Manifest is now derived from "what trellis actually wrote this run", not from walking the disk: - writeFile() instruments recordWrite() on actual disk writes only. byte- identical, skip-existing, and append paths do NOT record. initializeHashes consumes the recorded set instead of fs.readdirSync walks. - Root-level AGENTS.md only enters manifest when trellis writes it. - pruneOrphanManifestKeys() self-heals already-poisoned manifests at the top of both trellis update and trellis uninstall. Preserves .trellis/* (still walk-managed), every configured-platform collectTemplates() path, every path referenced by any migration manifest from/to, and AGENTS.md only when it still carries trellis managed-block markers. - Homedir guard: trellis init / uninstall refuse to run when cwd is exactly the user's home directory (realpathSync.native + Windows lowercase). Bypass via TRELLIS_ALLOW_HOMEDIR=1; --force does not bypass. .trellis/ files keep their existing walk-based hashing — trellis uninstall rm -rf's that subtree wholesale regardless of manifest content, so over- hashing there does not affect uninstall safety. Internal: - New utils/cwd-guard.ts and utils/manifest-prune.ts. - claude configurator excludes dev-only .ts from its template walk. - 27 new tests across recorder boundaries, prune semantics, homedir cases, and full init+uninstall reproductions for both reported scenarios. Spec: - .trellis/spec/cli/backend/migrations.md adds "Manifest ownership contract" with the recorder rules, prune preserve-set, homedir guard, wrong-vs- correct, and tests required for any future change in this area.
1 parent e000208 commit c76ff33

13 files changed

Lines changed: 1296 additions & 99 deletions

File tree

.trellis/spec/cli/backend/migrations.md

Lines changed: 78 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -193,6 +193,84 @@ update:
193193
- 初始化:`trellis init` 时自动创建
194194
- 更新:`trellis update` 后自动更新被覆盖文件的哈希
195195

196+
### Manifest ownership contract (CRITICAL — data loss prevention)
197+
198+
`.template-hashes.json` is the **single source of truth** for `trellis uninstall`. Every key listed there gets `fs.unlinkSync`'d at uninstall time. Therefore the manifest must contain **only files trellis actually wrote during init / update — never files that merely happen to exist under a managed directory**.
199+
200+
#### Why this matters
201+
202+
`.codex/`, `.claude/`, `.opencode/` etc. contain platform-specific user data:
203+
- `.codex/sessions/*.jsonl` — Codex chat history
204+
- `.codex/history` — Codex prompt history
205+
- `.claude/projects/<sanitized-cwd>/*.jsonl` — Claude Code conversation history
206+
- User-added `.codex/skills/<name>/`, `.claude/agents/<name>/`
207+
208+
If `initializeHashes` walks these dirs and hashes everything, uninstall faithfully deletes user data. Real reported incidents: GitHub Issue #221 (`.codex/sessions/` wiped), PR #271 (pre-existing `AGENTS.md` wiped).
209+
210+
#### Required contract
211+
212+
**`initializeHashes(cwd, opts)`** must derive the manifest set from `opts.trackedPaths` (a `Set<string>` of paths trellis **actually wrote this run**), NOT from `fs.readdirSync` walks of platform dirs.
213+
214+
The set is produced by `startRecordingWrites()` / `stopRecordingWrites()` instrumentation inside `utils/file-writer.ts`. `writeFile()` records `recordWrite(absPath)` ONLY when:
215+
216+
- The file did not exist → wrote (recorded)
217+
- The file existed and content differed → overwrote (recorded)
218+
219+
And **does NOT record** when:
220+
221+
- The file existed and content was byte-identical (no disk write)
222+
- `writeMode` was `skip-existing` and the file existed (skipped)
223+
- The call was append-mode (`appendFile`)
224+
225+
Every configurator that wants its writes tracked must funnel through `writeFile()` — direct `fs.writeFileSync` bypasses the recorder.
226+
227+
#### `.trellis/` walk exemption
228+
229+
`.trellis/` files are still hashed via recursive walk (existing `collectFiles` behavior + `EXCLUDE_FROM_HASH` filters). Rationale: `trellis uninstall` step 3 does `fs.rmSync('.trellis/', { recursive: true, force: true })` regardless of manifest content, so the walk's blast radius is contained. Over-hashing inside `.trellis/` only affects `trellis update` 3-way-merge accuracy, not uninstall safety.
230+
231+
#### Self-heal contract: `pruneOrphanManifestKeys`
232+
233+
Existing users may have poisoned manifests from older trellis versions. `pruneOrphanManifestKeys(cwd, configuredPlatforms, hashes, opts)` runs at the top of both `trellis update` AND `trellis uninstall` (before plan classification) and prunes orphan keys.
234+
235+
**Preserve set** (a key is kept iff one of):
236+
237+
1. Starts with `.trellis/` (walk-managed)
238+
2. Path appears in `PLATFORM_FUNCTIONS[id].collectTemplates()` output for ANY configured platform
239+
3. Path appears in `from` or `to` of ANY migration manifest entry (across all `migrations/manifests/*.json`, not just pending) — must preserve so rename/delete migration logic still has a hash to compare against
240+
4. Path is `AGENTS.md` AND the file on disk contains `TRELLIS_BLOCK_START` + `TRELLIS_BLOCK_END` markers, OR the file is missing. Files lacking markers are treated as user-owned and pruned.
241+
242+
`options.persist: false` for dry-run paths (e.g. `uninstall --dry-run`).
243+
244+
#### Wrong vs Correct
245+
246+
```typescript
247+
// Wrong — walk-based hashing pollutes manifest with user data
248+
for (const dir of ALL_MANAGED_DIRS) {
249+
for (const f of collectFiles(cwd, dir)) {
250+
hashes[f] = sha256(read(f)); // includes .codex/sessions/foo.jsonl
251+
}
252+
}
253+
254+
// Correct — manifest reflects exactly what trellis wrote this run
255+
startRecordingWrites();
256+
await runConfigurators(cwd); // each writeFile() calls recordWrite()
257+
const written = stopRecordingWrites();
258+
initializeHashes(cwd, { trackedPaths: written });
259+
```
260+
261+
#### Homedir guard (R2 — defense in depth)
262+
263+
Even with the manifest contract above, `trellis init` and `trellis uninstall` refuse to run when `process.cwd()` is exactly the user's home directory. Use `isCwdHomedir()` from `utils/cwd-guard.ts` — it compares via `fs.realpathSync.native()` on both sides, Windows-lowercases for case-insensitivity, try/catch defaults permissive on lookup failure. Override via `TRELLIS_ALLOW_HOMEDIR=1` only; `--force` does NOT bypass.
264+
265+
#### Tests required for any change in this area
266+
267+
- Integration: pre-populate user files under `.codex/`, `.claude/`, `.opencode/`, run init+uninstall, assert user data preserved.
268+
- Integration: pre-existing `AGENTS.md` (skip-existing path), uninstall preserves it.
269+
- Integration: poisoned manifest (manually injected key) → update OR uninstall prunes it, user file survives.
270+
- Unit: `recordWrite` instrumentation — new/overwrite recorded, identical/skip/append NOT recorded.
271+
- Unit: `pruneOrphanManifestKeys` — preserves all four classes above; rewrites manifest only when pruned.length > 0.
272+
- Unit: `isCwdHomedir` — symlinked home matches; subdirectory does NOT match; Windows case-insensitive.
273+
196274
### `workflow.md` whole-file update contract
197275

198276
`.trellis/workflow.md` is not only documentation. It is runtime input for

packages/cli/src/commands/init.ts

Lines changed: 85 additions & 49 deletions
Original file line numberDiff line numberDiff line change
@@ -23,6 +23,8 @@ import { VERSION } from "../constants/version.js";
2323
import { agentsMdContent } from "../templates/markdown/index.js";
2424
import {
2525
setWriteMode,
26+
startRecordingWrites,
27+
stopRecordingWrites,
2628
writeFile,
2729
type WriteMode,
2830
} from "../utils/file-writer.js";
@@ -35,6 +37,11 @@ import {
3537
type DetectedPackage,
3638
} from "../utils/project-detector.js";
3739
import { initializeHashes } from "../utils/template-hash.js";
40+
import {
41+
isCwdHomedir,
42+
homedirGuardMessage,
43+
homedirBypassEnabled,
44+
} from "../utils/cwd-guard.js";
3845
import {
3946
fetchTemplateIndex,
4047
probeRegistryIndex,
@@ -826,26 +833,35 @@ async function handleReinit(
826833
}
827834
}
828835

829-
for (const tool of platformsToAdd) {
830-
const platformId = resolveCliFlag(tool as CliFlag);
831-
if (platformId) {
832-
if (configuredPlatforms.has(platformId)) {
833-
console.log(
834-
chalk.gray(
835-
` ○ ${AI_TOOLS[platformId].name} already configured, skipping`,
836-
),
837-
);
838-
} else {
839-
console.log(
840-
chalk.blue(`📝 Configuring ${AI_TOOLS[platformId].name}...`),
841-
);
842-
await configurePlatform(platformId, cwd);
836+
const reinitWritten = startRecordingWrites(cwd);
837+
try {
838+
for (const tool of platformsToAdd) {
839+
const platformId = resolveCliFlag(tool as CliFlag);
840+
if (platformId) {
841+
if (configuredPlatforms.has(platformId)) {
842+
console.log(
843+
chalk.gray(
844+
` ○ ${AI_TOOLS[platformId].name} already configured, skipping`,
845+
),
846+
);
847+
} else {
848+
console.log(
849+
chalk.blue(`📝 Configuring ${AI_TOOLS[platformId].name}...`),
850+
);
851+
await configurePlatform(platformId, cwd);
852+
}
843853
}
844854
}
855+
} finally {
856+
stopRecordingWrites();
845857
}
846858

847-
// Update template hashes
848-
const hashedCount = initializeHashes(cwd);
859+
// Update template hashes. Merge mode: preserve previously-tracked
860+
// platforms' hashes, layer in the newly-added platform's writes.
861+
const hashedCount = initializeHashes(cwd, {
862+
trackedPaths: reinitWritten,
863+
merge: true,
864+
});
849865
if (hashedCount > 0) {
850866
console.log(
851867
chalk.gray(`📋 Tracking ${hashedCount} template files for updates`),
@@ -997,6 +1013,14 @@ interface InitAnswers {
9971013
}
9981014

9991015
export async function init(options: InitOptions): Promise<void> {
1016+
// Refuse to run in $HOME — running here would scoop platform runtime data
1017+
// (Claude/Codex/OpenCode session histories etc.) into the trellis hash
1018+
// manifest, and a subsequent `trellis uninstall` would wipe it.
1019+
if (isCwdHomedir() && !homedirBypassEnabled()) {
1020+
console.error(chalk.red(homedirGuardMessage("init")));
1021+
process.exit(1);
1022+
}
1023+
10001024
const cwd = process.cwd();
10011025
const isFirstInit = !fs.existsSync(path.join(cwd, DIR_NAMES.WORKFLOW));
10021026
// Captured here (before createWorkflowStructure + init_developer run) so
@@ -1727,47 +1751,59 @@ export async function init(options: InitOptions): Promise<void> {
17271751
// Create Workflow Structure
17281752
// ==========================================================================
17291753

1730-
// Create workflow structure with project type
1731-
console.log(chalk.blue("📁 Creating workflow structure..."));
1732-
await createWorkflowStructure(cwd, {
1733-
projectType,
1734-
skipSpecTemplates: useRemoteTemplate,
1735-
packages: monorepoPackages,
1736-
remoteSpecPackages,
1737-
});
1754+
// Record every successful write from here through createRootFiles. The
1755+
// captured set is the source of truth for `.template-hashes.json`'s
1756+
// platform/root entries — replacing the previous "walk every managed dir"
1757+
// approach that swept user-owned runtime files into the manifest
1758+
// (.codex/sessions/, .claude/projects/, pre-existing AGENTS.md).
1759+
const writtenPaths = startRecordingWrites(cwd);
1760+
try {
1761+
// Create workflow structure with project type
1762+
console.log(chalk.blue("📁 Creating workflow structure..."));
1763+
await createWorkflowStructure(cwd, {
1764+
projectType,
1765+
skipSpecTemplates: useRemoteTemplate,
1766+
packages: monorepoPackages,
1767+
remoteSpecPackages,
1768+
});
17381769

1739-
// Write monorepo packages to config.yaml (non-destructive patch)
1740-
if (monorepoPackages) {
1741-
writeMonorepoConfig(cwd, monorepoPackages);
1742-
console.log(chalk.blue("📦 Monorepo packages written to config.yaml"));
1743-
}
1770+
// Write monorepo packages to config.yaml (non-destructive patch)
1771+
if (monorepoPackages) {
1772+
writeMonorepoConfig(cwd, monorepoPackages);
1773+
console.log(chalk.blue("📦 Monorepo packages written to config.yaml"));
1774+
}
17441775

1745-
// Write version file for update tracking
1746-
const versionPath = path.join(cwd, DIR_NAMES.WORKFLOW, ".version");
1747-
fs.writeFileSync(versionPath, VERSION);
1776+
// Write version file for update tracking
1777+
const versionPath = path.join(cwd, DIR_NAMES.WORKFLOW, ".version");
1778+
fs.writeFileSync(versionPath, VERSION);
17481779

1749-
// Configure selected tools by copying entire directories (dogfooding)
1750-
for (const tool of tools) {
1751-
const platformId = resolveCliFlag(tool);
1752-
if (platformId) {
1753-
console.log(chalk.blue(`📝 Configuring ${AI_TOOLS[platformId].name}...`));
1754-
await configurePlatform(platformId, cwd);
1780+
// Configure selected tools by copying entire directories (dogfooding)
1781+
for (const tool of tools) {
1782+
const platformId = resolveCliFlag(tool);
1783+
if (platformId) {
1784+
console.log(
1785+
chalk.blue(`📝 Configuring ${AI_TOOLS[platformId].name}...`),
1786+
);
1787+
await configurePlatform(platformId, cwd);
1788+
}
17551789
}
1756-
}
17571790

1758-
const pythonPlatforms = getPlatformsWithPythonHooks();
1759-
const hasSelectedPythonPlatform = pythonPlatforms.some((id) =>
1760-
tools.includes(AI_TOOLS[id].cliFlag),
1761-
);
1762-
if (hasSelectedPythonPlatform) {
1763-
logPythonAdaptationNotice(pythonCmd);
1764-
}
1791+
const pythonPlatforms = getPlatformsWithPythonHooks();
1792+
const hasSelectedPythonPlatform = pythonPlatforms.some((id) =>
1793+
tools.includes(AI_TOOLS[id].cliFlag),
1794+
);
1795+
if (hasSelectedPythonPlatform) {
1796+
logPythonAdaptationNotice(pythonCmd);
1797+
}
17651798

1766-
// Create root files (skip if exists)
1767-
await createRootFiles(cwd);
1799+
// Create root files (skip if exists)
1800+
await createRootFiles(cwd);
1801+
} finally {
1802+
stopRecordingWrites();
1803+
}
17681804

17691805
// Initialize template hashes for modification tracking
1770-
const hashedCount = initializeHashes(cwd);
1806+
const hashedCount = initializeHashes(cwd, { trackedPaths: writtenPaths });
17711807
if (hashedCount > 0) {
17721808
console.log(
17731809
chalk.gray(`📋 Tracking ${hashedCount} template files for updates`),

packages/cli/src/commands/uninstall.ts

Lines changed: 46 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -26,7 +26,16 @@ import inquirer from "inquirer";
2626
import { DIR_NAMES } from "../constants/paths.js";
2727
import { loadHashes } from "../utils/template-hash.js";
2828
import { cleanupEmptyDirs } from "./update.js";
29-
import { ALL_MANAGED_DIRS } from "../configurators/index.js";
29+
import {
30+
ALL_MANAGED_DIRS,
31+
getConfiguredPlatforms,
32+
} from "../configurators/index.js";
33+
import { pruneOrphanManifestKeys } from "../utils/manifest-prune.js";
34+
import {
35+
isCwdHomedir,
36+
homedirGuardMessage,
37+
homedirBypassEnabled,
38+
} from "../utils/cwd-guard.js";
3039
import {
3140
scrubHooksJson,
3241
scrubOpencodePackageJson,
@@ -362,6 +371,14 @@ function executePlan(
362371
* Entry point.
363372
*/
364373
export async function uninstall(options: UninstallOptions = {}): Promise<void> {
374+
// Refuse to run in $HOME — same reasoning as init. A manifest poisoned by
375+
// a prior buggy init would otherwise unlink global platform runtime data
376+
// (chat history, session JSONLs).
377+
if (isCwdHomedir() && !homedirBypassEnabled()) {
378+
console.error(chalk.red(homedirGuardMessage("uninstall")));
379+
process.exit(1);
380+
}
381+
365382
const cwd = process.cwd();
366383
const trellisDir = path.join(cwd, DIR_NAMES.WORKFLOW);
367384

@@ -388,7 +405,34 @@ export async function uninstall(options: UninstallOptions = {}): Promise<void> {
388405
process.exit(1);
389406
}
390407

391-
const plan = buildPlan(cwd, hashes);
408+
// Self-heal poisoned manifests from buggy init versions: prune any manifest
409+
// entry that no current configurator owns. Runs BEFORE buildPlan so the
410+
// user-owned paths (.codex/sessions/, .claude/projects/, pre-existing
411+
// AGENTS.md, etc.) never reach the deletion list. See PRD R3.
412+
//
413+
// Dry-run: still compute the pruned hashes (so the plan reflects post-prune
414+
// reality) but pass `persist: false` so no disk write happens. The actual
415+
// disk write defers to executePlan time, where we'd be rewriting the
416+
// manifest only to delete the whole .trellis/ dir anyway — but the
417+
// computation must remain to keep the rendered plan honest.
418+
const configuredPlatforms = getConfiguredPlatforms(cwd);
419+
const { pruned, hashes: prunedHashes } = pruneOrphanManifestKeys(
420+
cwd,
421+
[...configuredPlatforms],
422+
hashes,
423+
{ persist: !options.dryRun },
424+
);
425+
if (pruned.length > 0) {
426+
// Surface counts only — listing every poisoned entry would alarm users
427+
// without giving them an actionable signal.
428+
console.log(
429+
chalk.gray(
430+
` Pruned ${pruned.length} orphan manifest entries (user-owned files trellis did not write).`,
431+
),
432+
);
433+
}
434+
435+
const plan = buildPlan(cwd, prunedHashes);
392436
renderPlan(cwd, plan);
393437

394438
if (options.dryRun) {

packages/cli/src/commands/update.ts

Lines changed: 29 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -52,6 +52,7 @@ import {
5252
isManagedRootDir,
5353
} from "../configurators/index.js";
5454
import { replacePythonCommandLiterals } from "../configurators/shared.js";
55+
import { pruneOrphanManifestKeys } from "../utils/manifest-prune.js";
5556

5657
export interface UpdateOptions {
5758
dryRun?: boolean;
@@ -1754,7 +1755,7 @@ export async function update(options: UpdateOptions): Promise<void> {
17541755
// Migration metadata is displayed at the end to prevent scrolling off screen
17551756

17561757
// Load template hashes for modification detection
1757-
const hashes = loadHashes(cwd);
1758+
let hashes = loadHashes(cwd);
17581759
const isFirstHashTracking = Object.keys(hashes).length === 0;
17591760

17601761
// Handle unknown version - skip regular migrations but safe-file-delete still runs
@@ -1772,6 +1773,10 @@ export async function update(options: UpdateOptions): Promise<void> {
17721773
}
17731774

17741775
// Detect legacy Codex (has .agents/skills/ tracked by Trellis but no .codex/)
1776+
// NOTE: this MUST happen before pruneOrphanManifestKeys below, since the
1777+
// detector reads the raw manifest looking for .agents/skills/ markers that
1778+
// the prune step would otherwise consider orphans (codex hasn't been added
1779+
// to configuredPlatforms yet at this point).
17751780
const codexUpgradeNeeded = needsCodexUpgrade(cwd);
17761781
if (codexUpgradeNeeded) {
17771782
console.log(
@@ -1781,6 +1786,29 @@ export async function update(options: UpdateOptions): Promise<void> {
17811786
);
17821787
}
17831788

1789+
// Self-heal poisoned manifests: prune entries that no current platform
1790+
// configurator owns. This silently removes user-owned paths that early
1791+
// buggy versions of `trellis init` over-hashed (e.g. .codex/sessions/*).
1792+
// Include codex in known-platforms when codexUpgradeNeeded so legacy Codex
1793+
// markers under .agents/skills/ survive into the upgrade flow.
1794+
{
1795+
const configuredPlatforms = new Set<AITool>(getConfiguredPlatforms(cwd));
1796+
if (codexUpgradeNeeded) configuredPlatforms.add("codex");
1797+
const prune = pruneOrphanManifestKeys(
1798+
cwd,
1799+
[...configuredPlatforms],
1800+
hashes,
1801+
);
1802+
if (prune.pruned.length > 0) {
1803+
console.log(
1804+
chalk.gray(
1805+
` Pruned ${prune.pruned.length} orphan manifest entries from .template-hashes.json`,
1806+
),
1807+
);
1808+
hashes = prune.hashes;
1809+
}
1810+
}
1811+
17841812
// For breaking releases with recommendMigrate + --migrate, bypass update.skip
17851813
// across the board (safe-file-delete, new file writes, template updates).
17861814
// Why: honoring skip here leaves users forever half-migrated — old deprecated

0 commit comments

Comments
 (0)