Commit 4665f33

abrichr and claude authored
fix: use persistent storage for Docker data-root instead of ephemeral /mnt (#37)
The Azure ephemeral disk (/mnt) gets wiped on VM deallocate, causing Docker images to be lost and pool-resume to fail with WAA timeout. Move Docker data-root to /home/azureuser/docker (OS disk, persistent) and increase OS disk to 128GB to accommodate Docker images.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
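
The failure mode described above comes down to a path living under the ephemeral mount. A minimal sketch of a guard that could catch this before provisioning, assuming /mnt is the only ephemeral mount point (as the commit message states); `EPHEMERAL_MOUNTS` and `is_on_ephemeral_disk` are hypothetical names, not part of this repository:

```python
from pathlib import PurePosixPath

# Assumption: /mnt is the ephemeral mount point, per the commit message.
EPHEMERAL_MOUNTS = (PurePosixPath("/mnt"),)


def is_on_ephemeral_disk(path: str) -> bool:
    """Return True if `path` is an ephemeral mount or lives underneath one."""
    p = PurePosixPath(path)
    return any(p == mount or mount in p.parents for mount in EPHEMERAL_MOUNTS)


print(is_on_ephemeral_disk("/mnt/docker"))             # old data-root
print(is_on_ephemeral_disk("/home/azureuser/docker"))  # new data-root
```

Comparing path components rather than string prefixes avoids misclassifying a sibling directory such as /mntbackup.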
1 parent 64ae12d commit 4665f33

4 files changed

Lines changed: 15 additions & 11 deletions

.beads/issues.jsonl

Lines changed: 1 addition & 1 deletion
@@ -13,5 +13,5 @@
 {"id":"openadapt-evals-hvm","title":"VL model fix PR #18 ready to merge","notes":"2026-02-08: openadapt-ml PR #18 was already merged on 2026-01-29. VL model fix is done.","status":"closed","priority":0,"issue_type":"task","owner":"richard.abrich@gmail.com","created_at":"2026-01-29T16:17:03.491938-05:00","created_by":"Richard Abrich","updated_at":"2026-02-08T12:55:19.233249-05:00","closed_at":"2026-02-08T12:55:19.233249-05:00","close_reason":"PR #18 already merged 2026-01-29"}
 {"id":"openadapt-evals-mx8","title":"Analyze evaluation results and publish findings","description":"After demo-conditioned evaluation completes, analyze results: success rates, failure modes, demo impact. Create data-driven roadmap for improvements.","status":"open","priority":1,"issue_type":"task","owner":"richard.abrich@gmail.com","created_at":"2026-02-14T12:23:06.328838-05:00","created_by":"Richard Abrich","updated_at":"2026-02-14T12:23:06.328838-05:00"}
 {"id":"openadapt-evals-sz4","title":"RCA: Windows product key prompt recurring issue","status":"closed","priority":0,"issue_type":"task","owner":"richard.abrich@gmail.com","created_at":"2026-01-20T18:59:36.266286-05:00","created_by":"Richard Abrich","updated_at":"2026-01-20T20:32:06.493102-05:00","closed_at":"2026-01-20T20:32:06.493102-05:00","close_reason":"RCA complete - root cause is VERSION mismatch (CLI=11, Dockerfile=11e). Fix documented in RECURRING_ISSUES.md and WINDOWS_PRODUCT_KEY_RCA.md"}
-{"id":"openadapt-evals-vcb","title":"Run demo-conditioned WAA evaluation","description":"Once demos are recorded, run WAA evaluation with demo-conditioned agents (RetrievalAugmentedAgent with real demos). Target: measure improvement over zero-shot baseline. Requires real demos from recording task.","notes":"Pipeline complete. 3 annotated demos produced. Need Azure VM to run eval. Anthropic credits depleted — use OpenAI.","status":"open","priority":0,"issue_type":"task","owner":"richard.abrich@gmail.com","created_at":"2026-02-14T12:23:04.624305-05:00","created_by":"Richard Abrich","updated_at":"2026-02-18T00:03:34.77925-05:00"}
+{"id":"openadapt-evals-vcb","title":"Run demo-conditioned WAA evaluation","description":"Once demos are recorded, run WAA evaluation with demo-conditioned agents (RetrievalAugmentedAgent with real demos). Target: measure improvement over zero-shot baseline. Requires real demos from recording task.","notes":"PR #35 merged (v0.4.0): full pipeline implemented — record-waa (interactive WAA API recording via VNC), annotate (VLM annotation of screenshots), eval (delegates to eval-suite). 12 harder tasks defined (0/12 zero-shot). CI workflow added. PR #36 merged (v0.4.1): fixed PyPI README images. Next: spin up Azure VM, record demos for 12 harder tasks, annotate, run DC eval.","status":"open","priority":0,"issue_type":"task","owner":"richard.abrich@gmail.com","created_at":"2026-02-14T12:23:04.624305-05:00","created_by":"Richard Abrich","updated_at":"2026-02-24T02:00:07.491221-05:00"}
 {"id":"openadapt-evals-wis","title":"Add pre-flight check to detect Windows install issues","status":"closed","priority":1,"issue_type":"task","owner":"richard.abrich@gmail.com","created_at":"2026-01-20T18:59:36.865052-05:00","created_by":"Richard Abrich","updated_at":"2026-01-20T20:32:06.757261-05:00","closed_at":"2026-01-20T20:32:06.757261-05:00","close_reason":"Duplicate of openadapt-evals-0dt"}

openadapt_evals/benchmarks/vm_cli.py

Lines changed: 5 additions & 4 deletions
@@ -349,15 +349,16 @@ def cmd_create(args):
 sudo systemctl enable docker
 sudo usermod -aG docker $USER

-# Configure Docker to use /mnt (larger temp disk)
+# Configure Docker to use persistent storage (NOT /mnt which is ephemeral
+# and gets wiped on VM deallocate, breaking pool-resume)
 sudo systemctl stop docker
-sudo mkdir -p /mnt/docker
-sudo bash -c 'echo "{\\"data-root\\": \\"/mnt/docker\\"}" > /etc/docker/daemon.json'
+sudo mkdir -p /home/azureuser/docker
+sudo bash -c 'echo "{\\"data-root\\": \\"/home/azureuser/docker\\"}" > /etc/docker/daemon.json'
 sudo systemctl start docker

 # Verify
 docker --version
-df -h /mnt
+df -h /home
 """
 result = ssh_run(ip, docker_setup, stream=True, step="CREATE")
 if result.returncode != 0:
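
The hunk above writes daemon.json with a hand-escaped `echo`. One possible alternative, shown only as a sketch, is to let `json.dumps` produce the JSON and `shlex.quote` handle the shell quoting. The helper name `docker_data_root_script` is hypothetical, and the use of `sudo tee` in place of `sudo bash -c` is a substitution made for this example, not what the commit does:

```python
import json
import shlex


def docker_data_root_script(data_root: str) -> str:
    """Build shell commands that point Docker's data-root at `data_root`."""
    # json.dumps guarantees valid JSON without manual quote escaping.
    daemon_json = json.dumps({"data-root": data_root})
    return "\n".join(
        [
            "sudo systemctl stop docker",
            f"sudo mkdir -p {shlex.quote(data_root)}",
            # tee runs under sudo, so the write to /etc/docker succeeds.
            f"echo {shlex.quote(daemon_json)} | sudo tee /etc/docker/daemon.json > /dev/null",
            "sudo systemctl start docker",
        ]
    )


print(docker_data_root_script("/home/azureuser/docker"))
```

Because the same four commands appear three times in this commit, a shared builder along these lines would keep the copies from drifting apart.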

openadapt_evals/infrastructure/azure_vm.py

Lines changed: 1 addition & 0 deletions
@@ -665,6 +665,7 @@ def _sdk_create_vm(
 "image_reference": {"id": image_id} if image_id else _UBUNTU_2204_IMAGE,
 "os_disk": {
 "create_option": "FromImage",
+"disk_size_gb": 128,
 "managed_disk": {"storage_account_type": "Premium_LRS"},
 },
 },
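
The hunk above hard-codes the OS disk size. As a rough sketch of how the size could be made a parameter, the function below builds the same `os_disk` dictionary shown in the diff. `build_os_disk` is a hypothetical helper, and the positive-size check is an assumption added for illustration, not a rule taken from the Azure SDK:

```python
def build_os_disk(
    disk_size_gb: int = 128,
    storage_account_type: str = "Premium_LRS",
) -> dict:
    """Return an os_disk mapping shaped like the one in _sdk_create_vm."""
    if disk_size_gb <= 0:
        # Assumption for this sketch: reject obviously invalid sizes early.
        raise ValueError(f"disk_size_gb must be positive, got {disk_size_gb}")
    return {
        "create_option": "FromImage",
        "disk_size_gb": disk_size_gb,
        "managed_disk": {"storage_account_type": storage_account_type},
    }


print(build_os_disk())
```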

openadapt_evals/infrastructure/pool.py

Lines changed: 8 additions & 6 deletions
@@ -77,10 +77,11 @@ class PoolRunResult:
 sudo systemctl enable docker
 sudo usermod -aG docker $USER

-# Configure Docker to use /mnt (larger temp disk)
+# Configure Docker to use persistent storage (NOT /mnt which is ephemeral
+# and gets wiped on VM deallocate, breaking pool-resume)
 sudo systemctl stop docker
-sudo mkdir -p /mnt/docker
-sudo bash -c 'echo "{\\"data-root\\": \\"/mnt/docker\\"}" > /etc/docker/daemon.json'
+sudo mkdir -p /home/azureuser/docker
+sudo bash -c 'echo "{\\"data-root\\": \\"/home/azureuser/docker\\"}" > /etc/docker/daemon.json'
 sudo systemctl start docker

 # Pull base images (use sudo since usermod hasn't taken effect yet)
@@ -110,10 +111,11 @@ class PoolRunResult:
 sudo systemctl enable docker
 sudo usermod -aG docker $USER

-# Configure Docker to use /mnt (larger temp disk)
+# Configure Docker to use persistent storage (NOT /mnt which is ephemeral
+# and gets wiped on VM deallocate, breaking pool-resume)
 sudo systemctl stop docker
-sudo mkdir -p /mnt/docker
-sudo bash -c 'echo "{{\\"data-root\\": \\"/mnt/docker\\"}}" > /etc/docker/daemon.json'
+sudo mkdir -p /home/azureuser/docker
+sudo bash -c 'echo "{{\\"data-root\\": \\"/home/azureuser/docker\\"}}" > /etc/docker/daemon.json'
 sudo systemctl start docker

 # Pull pre-built image from ACR (faster than building)
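
The second pool.py hunk doubles the braces (`{{` and `}}`) while the first does not. The diff does not show why, but a likely reason is that the second script is a Python format template, where literal braces must be doubled. The sketch below illustrates that assumption; the `{root}` placeholder is hypothetical and does not appear in the source:

```python
# Plain string: single braces reach the shell unchanged.
plain = 'echo "{\\"data-root\\": \\"/home/azureuser/docker\\"}"'

# Format template: literal braces are doubled so str.format leaves them alone.
# {root} is a hypothetical placeholder added for this illustration.
template = 'echo "{{\\"data-root\\": \\"{root}\\"}}"'

rendered = template.format(root="/home/azureuser/docker")

# Both forms yield the same shell command text.
assert rendered == plain
print(rendered)
```

Forgetting to double the braces in a template makes `str.format` treat the JSON body as a replacement field, so the two variants cannot be copied between the plain and templated scripts without adjusting them.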
