2 parents 37ca7e1 + 66e9ed9, commit 01d4b30
1 file changed
docs/how-to/prepare-tokenized-data.md
@@ -121,8 +121,9 @@ local directory.
### Cache the tokenizer first
-HuggingFace won't reach out to the hub from a compute node that
-doesn't have internet. Cache on the login node once:
+Hugging Face cannot access the hub from compute nodes with no or limited internet connectivity.
+Since compute nodes also have much slower bandwidth (~1 Gbps vs. ~100 Gbps on login nodes),
+cache the tokenizer once on the login node:
```bash
python -c "from transformers import AutoTokenizer; AutoTokenizer.from_pretrained('gpt2')"