You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Copy file name to clipboardExpand all lines: tests/end_to_end/tpu/test_convergence_1b_params.sh
+1-1Lines changed: 1 addition & 1 deletion
Original file line number
Diff line number
Diff line change
@@ -18,7 +18,7 @@ export LOSS_THRESHOLD=100.0 # Set to large value so test is guaranteed to pass.
18
18
export STEPS=20400 # Run for 20B tokens for a 1B sized mode for "chinchilla" scaling https://arxiv.org/abs/2203.15556
19
19
export EVAL_STEPS=160
20
20
export EVAL_INTERVAL=100
21
-
export DATASET_TYPE=tfds
21
+
export DATASET_TYPE=grain
22
22
export MTP_NUM_LAYERS=0 # Disable MTP by default
23
23
export PER_DEVICE_BATCH_SIZE=8.0 # With the default learning rate (3e-4) this should have global batch of 512, with 2k sequence length (1M global batch in tokens)
0 commit comments