Commit 3e77886

Clarify GCS bucket permission requirements in SFT and RL tutorials
Updates the SFT and RL multi-host tutorials to recommend either the broader Storage Admin role, or combining the restrictive Storage Object Admin role with Storage Legacy Bucket Reader. This ensures JAX/TensorStore can perform bucket-level operations (like storage.buckets.get) and prevents misleading 'bucket not found' errors.
1 parent 2e6cd11 commit 3e77886

2 files changed

Lines changed: 2 additions & 2 deletions

docs/tutorials/posttraining/rl_on_multi_host.md

Lines changed: 1 addition & 1 deletion
@@ -59,7 +59,7 @@ Before starting, ensure you have:
 - **IAM Roles** required:
 - **Kubernetes Engine Developer** (`roles/container.developer`) to submit and manage workloads on GKE.
 - **Artifact Registry Writer** (`roles/artifactregistry.writer`) to upload Docker images.
-- **Storage Object Admin** (`roles/storage.objectAdmin`) on your GCS bucket to read/write checkpoints and logs.
+- **Storage Admin** (`roles/storage.admin`) or **Storage Object Admin** (`roles/storage.objectAdmin`) combined with **Storage Legacy Bucket Reader** (`roles/storage.legacyBucketReader`) on your GCS bucket to read/write checkpoints and logs. (Note: A bucket-level read permission like `storage.buckets.get` is required by JAX/TensorStore to verify bucket existence and metadata; using `roles/storage.objectAdmin` alone will cause a misleading "bucket not found" error).
 - A Hugging Face account with an access token for downloading models.
 - Prerequisites for XPK installed (follow [official documentation](https://github.com/AI-Hypercomputer/xpk/blob/main/docs/installation.md#1-prerequisites)).
 - **Important:** Modern GKE clusters require the GKE auth plugin. If you encounter `gke-gcloud-auth-plugin not found` when running `kubectl` commands, you must install it locally (e.g., `sudo apt-get install google-cloud-sdk-gke-gcloud-auth-plugin` for `apt` installations, or `gcloud components install gke-gcloud-auth-plugin` for standalone archive installations).

docs/tutorials/posttraining/sft_on_multi_host.md

Lines changed: 1 addition & 1 deletion
@@ -32,7 +32,7 @@ Before starting, ensure you have:
 - **IAM Roles** required:
 - **Kubernetes Engine Developer** (`roles/container.developer`) to submit and manage workloads on GKE.
 - **Artifact Registry Writer** (`roles/artifactregistry.writer`) to upload Docker images.
-- **Storage Object Admin** (`roles/storage.objectAdmin`) on your GCS bucket to read/write checkpoints and logs.
+- **Storage Admin** (`roles/storage.admin`) or **Storage Object Admin** (`roles/storage.objectAdmin`) combined with **Storage Legacy Bucket Reader** (`roles/storage.legacyBucketReader`) on your GCS bucket to read/write checkpoints and logs. (Note: A bucket-level read permission like `storage.buckets.get` is required by JAX/TensorStore to verify bucket existence and metadata; using `roles/storage.objectAdmin` alone will cause a misleading "bucket not found" error).
 - A Hugging Face account with an access token for downloading models.
 - Prerequisites for XPK installed (follow [official documentation](https://github.com/AI-Hypercomputer/xpk/blob/main/docs/installation.md#1-prerequisites)).
 - **Important:** Modern GKE clusters require the GKE auth plugin. If you encounter `gke-gcloud-auth-plugin not found` when running `kubectl` commands, you must install it locally (e.g., `sudo apt-get install google-cloud-sdk-gke-gcloud-auth-plugin` for `apt` installations, or `gcloud components install gke-gcloud-auth-plugin` for standalone archive installations).
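
The narrower of the two role options in these tutorials (`roles/storage.objectAdmin` plus `roles/storage.legacyBucketReader`) needs two bucket-level IAM bindings. The sketch below is not part of the commit. It only assembles the `gcloud storage buckets add-iam-policy-binding` invocations as argument lists and prints them, so nothing is changed until you run the printed commands yourself. The helper name `grant_commands`, the bucket name, and the service account are hypothetical placeholders.

```python
import shlex

# Roles named in the tutorials as the alternative to roles/storage.admin.
BUCKET_ROLES = (
    "roles/storage.objectAdmin",
    "roles/storage.legacyBucketReader",
)


def grant_commands(bucket, member, roles=BUCKET_ROLES):
    """Return one gcloud argv list per role. Nothing is executed here."""
    # Accept either "my-bucket" or "gs://my-bucket".
    url = bucket if bucket.startswith("gs://") else f"gs://{bucket}"
    return [
        [
            "gcloud", "storage", "buckets", "add-iam-policy-binding", url,
            f"--member={member}",
            f"--role={role}",
        ]
        for role in roles
    ]


if __name__ == "__main__":
    # Hypothetical bucket and service account; substitute your own values.
    for argv in grant_commands(
        "my-checkpoint-bucket",
        "serviceAccount:trainer@my-project.iam.gserviceaccount.com",
    ):
        print(shlex.join(argv))
```

Printing instead of executing keeps the example safe to run anywhere and lets you review each binding before applying it. If you prefer the broader option, pass `roles=("roles/storage.admin",)`.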
