Commit 2a6d452

chapterjason and claude committed
Scope JetBrains cache per-workspace to fix Toolbox Agent collision
Concurrent workspaces shared ~/.cache/JetBrains/ via the per-owner persist volume, so the Toolbox Agent's UnixApplicationStartLock + IPC socket under Toolbox/ports/ collided across workspaces ("main instance is alive, cannot bind twice").

Split the JetBrains manifest: settings, plugins, and JetProfile state stay owner-scoped; .cache/JetBrains/ becomes workspace-scoped under /mnt/home-persist/.workspaces/<id>/. Adds a generic scope field to the home-persist manifest schema and a one-shot migration sweep that drops the orphaned owner-scoped cache after rollout.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent c07b881 commit 2a6d452

4 files changed

Lines changed: 182 additions & 17 deletions

docs/persistence.md

Lines changed: 96 additions & 13 deletions
@@ -54,6 +54,33 @@ opt in.
    the first create on an empty volume leaves `~/.claude` as a dangling
    symlink, and any consumer doing `mkdir -p ~/.claude` fails with EEXIST.
 
+   **Scope** (optional, default `"owner"`): which workspaces share the
+   persisted copy.
+
+   - `"owner"` — one copy under `/mnt/home-persist/<rel>`, visible to every
+     workspace the owner runs. Right for settings, credentials, anything
+     you want synced across workspaces.
+   - `"workspace"` — one copy per workspace under
+     `/mnt/home-persist/.workspaces/<CODER_WORKSPACE_ID>/<rel>`, private to
+     that workspace but surviving its stop/start. Required for paths with
+     single-writer semantics (lock files, unix sockets, per-project
+     indexes) — otherwise two concurrent workspaces race and one fails.
+
+   ```json
+   {
+     "source": "jetbrains-local",
+     "scope": "workspace",
+     "paths": [".cache/JetBrains/"]
+   }
+   ```
+
+   **Sibling rule**: owner-scoped and workspace-scoped paths must not nest
+   under each other. A path symlinked at the parent already points into the
+   shared volume; a child symlink would land inside that target and leak
+   per-workspace state into the shared store. Declare them as siblings at
+   the appropriate XDG roots (`.config/`, `.local/share/`, `.cache/`) — that
+   split is almost always the right line anyway.
+
 3. **The resolver.** `main.tf`'s `coder_script.lifecycle_init` invokes
    `/usr/local/bin/home-persist-resolve` at agent start, with
    `start_blocks_login = true` so IDEs don't connect before the symlinks are
@@ -77,12 +104,18 @@ opt in.
 │                                                                 │
 │  docker volume: coder-<owner>-home-persist ◄── one per owner    │
 │    │                                                            │
-│    ├─► workspace A at /mnt/home-persist                         │
-│    │     └─► symlinks from $HOME into it                        │
-│    ├─► workspace B at /mnt/home-persist                         │
-│    │     └─► symlinks from $HOME into it                        │
-│    └─► workspace C at /mnt/home-persist                         │
-│          └─► symlinks from $HOME into it                        │
+│    ├─ owner-scoped paths (shared)                               │
+│    │  .config/… .local/share/… .claude/ …                       │
+│    │                                                            │
+│    └─ .workspaces/<CODER_WORKSPACE_ID>/ workspace-scoped        │
+│       ├─ <id-A>/ .cache/JetBrains/ … (private to ws A)          │
+│       ├─ <id-B>/ .cache/JetBrains/ … (private to ws B)          │
+│       └─ <id-C>/ .cache/JetBrains/ … (private to ws C)          │
+│                                                                 │
+│  workspace A mounts /mnt/home-persist                           │
+│    └─► $HOME/.config/JetBrains → /mnt/home-persist/.config/...  │
+│    └─► $HOME/.cache/JetBrains  → /mnt/home-persist/.workspaces  │
+│                                  /<ws-A-id>/.cache/JetBrains    │
 └─────────────────────────────────────────────────────────────────┘
 ```
 
@@ -100,10 +133,11 @@ Properties that fall out:
 
 ## What's declared today
 
-| Source | Paths | Why |
-| ------------- | ----------------------------- | ------------------------------------ |
-| `claude-code` | `.claude/`, `.claude.json` | Login credentials, sessions, plugins |
-| `jetbrains` | `.cache/JetBrains/`, `.config/JetBrains/`, `.local/share/JetBrains/`, `.java/.userPrefs/jetbrains/` | The workspace is headless — Toolbox/Gateway runs on the user's local machine. Clicking the IDE button opens a `jetbrains-gateway://` URL; Gateway SSHes in and has `remote-dev-server.sh` download the IDE backend into `~/.cache/JetBrains/RemoteDev/dist/` on first connect (hundreds of MB per IDE). The three JetBrains roots keep the downloaded backend plus per-IDE `RemoteDev-<Code>/` settings, plugins, and project indexes. `.java/.userPrefs/jetbrains/` is the Java `Preferences` store the IDEs use for JetBrains Account / JetProfile login, license activation, and non-commercial-license acceptance — persisting it avoids re-login on every restart. |
+| Source | Scope | Paths | Why |
+| ----------------- | ----------- | ----------------------------- | ------------------------------------ |
+| `claude-code` | owner | `.claude/`, `.claude.json` | Login credentials, sessions, plugins |
+| `jetbrains` | owner | `.config/JetBrains/`, `.local/share/JetBrains/`, `.java/.userPrefs/jetbrains/` | Settings, plugins, and JetProfile state that should follow the user across workspaces. Keymaps, color schemes, installed plugins, license acceptance. |
+| `jetbrains-local` | workspace | `.cache/JetBrains/` | Per-workspace runtime: the SSH-deployed Toolbox Agent (`Toolbox-CLI-dist/`), its IPC lock and unix socket under `Toolbox/ports/`, the downloaded IDE backend (`RemoteDev/dist/`), and per-IDE system caches and project indexes. Must be per-workspace — concurrent workspaces that share `.cache/JetBrains/` race on the Toolbox Agent's `UnixApplicationStartLock` and fail to connect ("main instance is alive, cannot bind twice"). |
 
 Anything not declared is image-owned (or per-workspace-home-volume-owned)
 and resets on image rebuild — git config, SSH keys, bash history, caches.
@@ -154,6 +188,24 @@ EOF
 
 Trailing `/` for directories, no slash for files.
 
+If any of those paths must not be shared between concurrent workspaces (lock
+files, sockets, per-project indexes), split them into a second manifest with
+`"scope": "workspace"`:
+
+```bash
+cat > /etc/home-persist.d/my-tool-local.json <<'EOF'
+{
+  "source": "my-tool-local",
+  "scope": "workspace",
+  "paths": [".cache/my-tool/"]
+}
+EOF
+```
+
+Keep the owner and workspace paths as siblings (see the sibling rule in
+"The manifest" above) — don't declare a parent owner-scoped and then try to
+carve a child out as workspace-scoped.
+
 No install ordering required — `home-persist-resolve` runs at agent start,
 after the image is already built with every script's manifest in place.
 Listing the same path in two manifests is harmless: the second is logged
@@ -178,6 +230,37 @@ One volume per owner means:
 
 - One Claude Code login reused across every workspace the owner opens.
 - Two workspaces running simultaneously means two processes writing to the
-  same files in the volume. For credential files and config this is fine;
-  for anything with single-writer semantics, same caveats as any shared
-  home. Add the path only if you actually want it shared.
+  same owner-scoped files in the volume. For credential files and config
+  this is fine; for anything with single-writer semantics (lock files,
+  unix sockets, indexes), declare the path with `"scope": "workspace"` so
+  each workspace gets its own copy under `.workspaces/<id>/`.
+
+When a workspace is deleted, its `.workspaces/<id>/` subtree is orphaned —
+nothing sweeps it automatically. Clean up manually from any running
+workspace if disk usage grows:
+
+```bash
+ls /mnt/home-persist/.workspaces/
+rm -rf /mnt/home-persist/.workspaces/<stale-id>
+```
+
+## Migrating an owner-scoped path to workspace-scoped
+
+Flipping a path from `scope: "owner"` to `scope: "workspace"` leaves the old
+`/mnt/home-persist/<path>` dir behind — the resolver retargets the symlink
+but doesn't touch the previous target. Tens to hundreds of MB can accumulate
+(JetBrains caches, Docker-ish state, etc.).
+
+Add a `migration_sweep` line to `coder_script.lifecycle_init` in `main.tf`,
+keyed by a unique sentinel name:
+
+```bash
+migration_sweep <sentinel-name> <path-relative-to-/mnt/home-persist>
+# e.g.
+migration_sweep jetbrains-cache-owner-to-workspace .cache/JetBrains
+```
+
+The sweep runs once per owner volume (the sentinel
+`/mnt/home-persist/.workspaces/.migrated/<sentinel-name>` blocks reruns) and
+is a no-op if the orphan is already gone. Delete the line from `main.tf`
+once every workspace has cycled past it.
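
Reviewer sketch (not part of this commit): the sibling rule described in `docs/persistence.md` could be checked mechanically before a manifest ships. The helpers `is_nested` and `check_siblings` below are hypothetical names, and the path lists are passed in by hand rather than read from `/etc/home-persist.d/`; paths are assumed to contain no spaces.

```bash
#!/usr/bin/env bash
# Hypothetical lint for the sibling rule; nothing here exists in the repo.

# is_nested A B: succeeds if A and B are the same path or one contains the other.
# Both paths are normalized to end in "/" so ".cache/Jet" does not match
# ".cache/JetBrains".
is_nested() {
  local a="${1%/}/" b="${2%/}/"
  case "$a" in "$b"*) return 0 ;; esac
  case "$b" in "$a"*) return 0 ;; esac
  return 1
}

# check_siblings "<owner paths>" "<workspace paths>": space-separated lists.
# Prints each offending pair and returns non-zero if any pair nests.
check_siblings() {
  local o w status=0
  for o in $1; do
    for w in $2; do
      if is_nested "$o" "$w"; then
        echo "nested: owner '$o' vs workspace '$w'"
        status=1
      fi
    done
  done
  return "$status"
}

# The JetBrains split from this commit: siblings under different XDG roots.
check_siblings ".config/JetBrains/ .local/share/JetBrains/ .java/.userPrefs/jetbrains/" \
               ".cache/JetBrains/" && echo "ok: no nesting"
```

Declaring `.cache/` as owner-scoped alongside a workspace-scoped `.cache/JetBrains/` would make `check_siblings` report the pair and return non-zero, which is the parent/child case the docs warn against.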

main.tf

Lines changed: 26 additions & 0 deletions
@@ -99,6 +99,15 @@ resource "coder_agent" "main" {
   os = "linux"
   dir = data.coder_parameter.directory.value
 
+  # Workspace identity is exposed to shells and coder_script blocks so
+  # home-persist-resolve can scope per-workspace paths under
+  # /mnt/home-persist/.workspaces/$CODER_WORKSPACE_ID/. Name is informational
+  # (renameable); ID is the stable key.
+  env = {
+    CODER_WORKSPACE_NAME = data.coder_workspace.me.name
+    CODER_WORKSPACE_ID = data.coder_workspace.me.id
+  }
+
   startup_script = <<-EOT
     set -e
 
@@ -303,6 +312,23 @@ resource "coder_script" "lifecycle_init" {
       fi
     fi
 
+    # One-shot migration sweeps. Each entry removes an owner-scoped path that
+    # has since been moved to scope=workspace. Gated by a sentinel on the
+    # shared volume so only the first workspace to cycle after the switch
+    # pays the cost; subsequent workspaces see the sentinel and skip. Safe
+    # to delete a migration block once every owner has cycled past it.
+    migration_sweep() {
+      sentinel="/mnt/home-persist/.workspaces/.migrated/$1"
+      orphan="/mnt/home-persist/$2"
+      [ -f "$sentinel" ] && return
+      [ -e "$orphan" ] && rm -rf "$orphan"
+      mkdir -p "$(dirname "$sentinel")"
+      touch "$sentinel"
+    }
+    if [ -w /mnt/home-persist ]; then
+      migration_sweep jetbrains-cache-owner-to-workspace .cache/JetBrains
+    fi
+
     [ -x /usr/local/bin/home-persist-resolve ] && /usr/local/bin/home-persist-resolve
     [ -x /usr/local/share/context-mode/post-create.sh ] && /usr/local/share/context-mode/post-create.sh
     [ -x "$HOME/.local/share/rtk/post-create.sh" ] && "$HOME/.local/share/rtk/post-create.sh"
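
Reviewer sketch (not part of this commit): the sentinel gating in `migration_sweep` can be exercised outside a workspace by pointing it at a scratch directory. `PERSIST_ROOT` is a hypothetical variable introduced only for this demo; the committed function hard-codes `/mnt/home-persist`.

```bash
#!/usr/bin/env bash
# Same logic as the committed migration_sweep, with the volume root made
# configurable so it can run against a throwaway directory.
PERSIST_ROOT="$(mktemp -d)"   # assumption: stands in for /mnt/home-persist

migration_sweep() {
  local sentinel="$PERSIST_ROOT/.workspaces/.migrated/$1"
  local orphan="$PERSIST_ROOT/$2"
  [ -f "$sentinel" ] && return 0          # already swept on this volume
  [ -e "$orphan" ] && rm -rf "$orphan"    # drop the old owner-scoped copy
  mkdir -p "$(dirname "$sentinel")"
  touch "$sentinel"
}

# Simulate the orphaned owner-scoped cache left behind by the old manifest.
mkdir -p "$PERSIST_ROOT/.cache/JetBrains/RemoteDev"

# First call removes the orphan and writes the sentinel.
migration_sweep jetbrains-cache-owner-to-workspace .cache/JetBrains
first_pass_exists=no
[ -e "$PERSIST_ROOT/.cache/JetBrains" ] && first_pass_exists=yes

# A path recreated later is left alone: the sentinel short-circuits the sweep.
mkdir -p "$PERSIST_ROOT/.cache/JetBrains"
migration_sweep jetbrains-cache-owner-to-workspace .cache/JetBrains
second_pass_exists=no
[ -e "$PERSIST_ROOT/.cache/JetBrains" ] && second_pass_exists=yes

echo "after first sweep: $first_pass_exists, after second sweep: $second_pass_exists"
# prints: after first sweep: no, after second sweep: yes
```

The second call shows why each sweep needs a unique sentinel name: reusing a name for a different path would skip that path on every volume that has already recorded the sentinel.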

scripts/home-persist/resolve.sh

Lines changed: 35 additions & 2 deletions
@@ -7,7 +7,21 @@
 # volume wins. Idempotent.
 #
 # Manifest shape:
-#   { "source": "<label>", "paths": ["<rel-to-$HOME>", ...] }
+#   { "source": "<label>", "scope": "owner"|"workspace", "paths": [...] }
+#
+# scope (default "owner"):
+#   - "owner"     — target is $STATE/<rel>, shared across all the owner's
+#                   workspaces (existing behavior).
+#   - "workspace" — target is $STATE/.workspaces/$CODER_WORKSPACE_ID/<rel>,
+#                   private to this workspace. Requires CODER_WORKSPACE_ID;
+#                   the manifest is skipped with a warning otherwise.
+#                   Use for paths with single-writer semantics (lock files,
+#                   unix sockets, per-IDE indexes) that collide when two
+#                   workspaces share them.
+#
+# Owner and workspace paths must be siblings, not parent/child: a path
+# symlinked at the parent cannot have a child symlinked underneath it (the
+# child would land inside the parent's target and pollute the shared volume).
 #
 # Path convention:
 #   - Trailing slash ("/") means the path is a directory. The target is
@@ -50,6 +64,25 @@ for mf in "${manifests[@]}"; do
     continue
   fi
   source=$(jq -r '.source // "unknown"' "$mf")
+  scope=$(jq -r '.scope // "owner"' "$mf")
+
+  case "$scope" in
+    owner)
+      scope_root="$STATE"
+      ;;
+    workspace)
+      if [ -z "${CODER_WORKSPACE_ID:-}" ]; then
+        log "skipping $mf: scope=workspace but CODER_WORKSPACE_ID is unset"
+        continue
+      fi
+      scope_root="$STATE/.workspaces/$CODER_WORKSPACE_ID"
+      mkdir -p "$scope_root"
+      ;;
+    *)
+      log "skipping $mf: unknown scope '$scope' (expected owner|workspace)"
+      continue
+      ;;
+  esac
 
   while IFS= read -r raw; do
     [ -z "$raw" ] && continue
@@ -70,7 +103,7 @@ for mf in "${manifests[@]}"; do
     owner[$rel]="$source"
 
     link="$HOME/$rel"
-    target="$STATE/$rel"
+    target="$scope_root/$rel"
     mkdir -p "$(dirname "$target")" "$(dirname "$link")"
 
     if [ -e "$link" ] && [ ! -L "$link" ]; then
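
Reviewer sketch (not part of this commit): the two hunks above amount to a mapping from `(scope, rel)` to a symlink target. The function below isolates that mapping so it can be tried by hand. `resolve_target` is a hypothetical helper (the real script inlines this logic in its manifest loop), the workspace ID is a placeholder, and any trailing-slash handling the real script performs on `rel` is omitted.

```bash
#!/usr/bin/env bash
# resolve_target STATE SCOPE REL: print the target path for a manifest entry,
# or return 1 where resolve.sh would log a warning and skip the manifest.
resolve_target() {
  local state="$1" scope="$2" rel="$3"
  case "$scope" in
    owner)
      printf '%s\n' "$state/$rel"
      ;;
    workspace)
      # Without a workspace ID there is no private root to point at.
      [ -n "${CODER_WORKSPACE_ID:-}" ] || return 1
      printf '%s\n' "$state/.workspaces/$CODER_WORKSPACE_ID/$rel"
      ;;
    *)
      return 1   # unknown scope
      ;;
  esac
}

CODER_WORKSPACE_ID="ws-1234"   # placeholder value for the demo
resolve_target /mnt/home-persist owner     .config/JetBrains
resolve_target /mnt/home-persist workspace .cache/JetBrains
# prints:
#   /mnt/home-persist/.config/JetBrains
#   /mnt/home-persist/.workspaces/ws-1234/.cache/JetBrains
```

Keying on the workspace ID rather than the name matches the comment in `main.tf`: the name can change on rename, while the ID is the stable key.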

scripts/jetbrains/install.sh

Lines changed: 25 additions & 2 deletions
@@ -18,7 +18,9 @@
 #
 #   ~/.cache/JetBrains/        — RemoteDev/dist/<build>/ (downloaded backend),
 #                                RemoteDev-<Code>/ system caches, indexes,
-#                                LocalHistory, log
+#                                LocalHistory, log, Toolbox-CLI-dist/ (the
+#                                SSH-deployed Toolbox Agent), Toolbox/ports/
+#                                (agent IPC / lock)
 #   ~/.config/JetBrains/       — RemoteDev-<Code>/ IDE settings, keymaps,
 #                                schemes, options, workspace, .lock
 #   ~/.local/share/JetBrains/  — RemoteDev-<Code>/ installed plugins,
@@ -29,17 +31,38 @@
 # acceptance — skipping this forces re-login + re-accept every restart:
 #
 #   ~/.java/.userPrefs/jetbrains/
+#
+# .cache/JetBrains/ is scoped per-workspace because the Toolbox Agent's
+# UnixApplicationStartLock + IPC socket live under it and would collide
+# across concurrent workspaces ("main instance is alive, cannot bind twice").
+# Settings, plugins, and JetProfile state stay owner-scoped so they follow
+# the user across workspaces.
 set -e
 
 mkdir -p /etc/home-persist.d
+
+# Owner-scoped: things we want shared across all of the user's workspaces.
 tee /etc/home-persist.d/jetbrains.json >/dev/null <<'EOF'
 {
   "source": "jetbrains",
+  "scope": "owner",
   "paths": [
-    ".cache/JetBrains/",
     ".config/JetBrains/",
     ".local/share/JetBrains/",
     ".java/.userPrefs/jetbrains/"
   ]
 }
 EOF
+
+# Workspace-scoped: runtime IPC and per-project indexes. Must be private
+# per workspace, otherwise concurrent workspaces race on the Toolbox Agent
+# lock. Survives restarts of the same workspace.
+tee /etc/home-persist.d/jetbrains-local.json >/dev/null <<'EOF'
+{
+  "source": "jetbrains-local",
+  "scope": "workspace",
+  "paths": [
+    ".cache/JetBrains/"
+  ]
+}
+EOF
