1 parent 87f556c commit 8a3e073
1 file changed
src/maxtext/configs/base.yml
@@ -315,7 +315,7 @@ attention_out: 'remat'

 optimizer_memory_host_offload: False
 parameter_memory_host_offload: False
-scan_layers: False # We recommend setting this to false when using pipeline parallelism, instead scanning the PP iterations.
+scan_layers: True # We recommend setting this to false when using pipeline parallelism, instead scanning the PP iterations.
 param_scan_axis: 1

 # The attention parameter dictates the specific algorithm/methodology used to compute the attention scores
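
With the default now `scan_layers: True`, a pipeline-parallel run that follows the recommendation in the comment would need to set the value back explicitly. A minimal sketch of such an override, assuming a `base_config` inheritance key is available and using a hypothetical file name; neither is part of this commit:

```yaml
# pp_override.yml (hypothetical file name, not part of this commit)
# Assumption: `base_config` pulls in the defaults from base.yml.
base_config: "base.yml"

# Follow the recommendation in base.yml: turn layer scanning off
# when using pipeline parallelism.
scan_layers: False
```

Only the key being changed is listed here; every other setting is assumed to fall through to the values in base.yml.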