
Commit 4e677c9

Document Claude Code skills and conventions in README
Adds a "Using Claude Code with xfer" section covering the seven shipped skills and the workflow invariants they enforce (workstation/cluster split, per-system paths, POSIX-first manifest build, load-aware transfer cluster selection, rebase-on-vantage-change, credential hygiene) so the same guidance applies to manual runs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent 3904b86 commit 4e677c9

1 file changed

Lines changed: 31 additions & 0 deletions

README.md

@@ -309,6 +309,37 @@ run/

---

## Using Claude Code with xfer

This repo ships a set of [Claude Code](https://claude.com/claude-code) skills under `.claude/skills/` that walk through each stage of a transfer. Each skill drives the corresponding `xfer` subcommand and encodes conventions we've found important on real clusters.

### Available skills

Invoke each by intent in a Claude Code session, or explicitly via `/<skill-name>`:

| Skill                    | Stage                                                                |
| ------------------------ | -------------------------------------------------------------------- |
| `xfer-rclone-config`     | Create `rclone.conf` and deploy it to each cluster                   |
| `xfer-manifest-build`    | Run `xfer manifest build` on a login node (POSIX source preferred)   |
| `xfer-manifest-analyze`  | File-size histogram → suggested rclone flags and shard count         |
| `xfer-manifest-shard`    | Byte-balanced split of the manifest into shards                      |
| `xfer-manifest-rebase`   | Remap source/dest roots when the transfer host's view differs        |
| `xfer-slurm-render`      | Render `worker.sh` / `sbatch_array.sh` / `config.resolved.json`      |
| `xfer-slurm-submit`      | Stage the run directory to the cluster and `sbatch`                  |

See `CLAUDE.md` for the cross-cutting context Claude loads in every session in this repo.
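
To make "byte-balanced" concrete, here is a minimal sketch of one common way to split a manifest so that each shard carries roughly the same number of bytes: sort entries largest-first and always hand the next one to the currently lightest shard. This is an illustration only. The function name `byte_balanced_shards` and the `(path, size)` tuple shape are hypothetical, and the algorithm `xfer manifest shard` actually uses may differ.

```python
import heapq
from typing import Iterable


def byte_balanced_shards(
    entries: Iterable[tuple[str, int]], num_shards: int
) -> list[list[str]]:
    """Greedily assign (path, size_in_bytes) entries to the lightest shard.

    Hypothetical helper for illustration; not part of xfer.
    """
    if num_shards < 1:
        raise ValueError("num_shards must be at least 1")
    shards: list[list[str]] = [[] for _ in range(num_shards)]
    # Min-heap of (total_bytes, shard_index): the lightest shard is on top.
    heap = [(0, i) for i in range(num_shards)]
    heapq.heapify(heap)
    # Placing the largest files first keeps the final totals close together.
    for path, size in sorted(entries, key=lambda e: e[1], reverse=True):
        total, idx = heapq.heappop(heap)
        shards[idx].append(path)
        heapq.heappush(heap, (total + size, idx))
    return shards
```

For example, four files of 100, 60, 50, and 10 bytes split across two shards end up as two shards of 110 bytes each.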
### Conventions

The skills (and `CLAUDE.md`) enforce a few invariants. These apply whether you drive xfer through Claude Code or by hand:

- **Workstation orchestrates, clusters execute.** Run xfer from a local checkout in a `uv` environment. SSH to Slurm login nodes for `manifest build` and `sbatch`; `analyze`, `shard`, `rebase`, and `render` run locally.
- **Paths are per-system.** `--rclone-config`, the xfer repo path, and the run directory all differ between workstation, build cluster, and transfer cluster. Always resolve the correct absolute path on whichever host the command runs on; do not assume a workstation path resolves identically on a cluster.
- **POSIX-first manifest build.** If any Slurm cluster has a POSIX mount of the source bucket, build the manifest there against the POSIX path. Listing is latency-bound, and POSIX beats S3 by a wide margin.
- **CPU-only, load-aware transfer.** Prefer CPU-only partitions for both build and transfer. Pick the transfer cluster by current `sinfo`/`squeue` load rather than by habit.
- **Vantage change ⇒ rebase.** When the host that will run the transfer has a different view of source or destination than the host that built the manifest, run `xfer manifest rebase` and re-shard before render. Skipping this makes every array task fail identically.
- **Credential hygiene.** Keep `rclone.conf` at mode `0600` on every host it lives on. Never commit it to the repo, and confirm before transmitting it over `scp`.
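
The `0600` rule can be checked mechanically before a config is deployed. Below is a minimal sketch using only the Python standard library, assuming a POSIX host. The helper names `is_private` and `make_private` are hypothetical and not part of xfer.

```python
import os
import stat


def is_private(path: str) -> bool:
    """Return True if `path` has exactly mode 0600 (owner read/write only)."""
    # S_IMODE strips the file-type bits and leaves only the permission bits.
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return mode == 0o600


def make_private(path: str) -> None:
    """Restrict `path` to owner read/write, equivalent to `chmod 600`."""
    os.chmod(path, 0o600)
```

A pre-deploy check could call `is_private` on the local `rclone.conf` and refuse to copy the file anywhere until it returns `True`.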
## Design notes

* **Manifest is immutable** → enables reproducibility and auditing
