docs/guides/data_input_pipeline/data_input_grain.md (2 changes: 1 addition & 1 deletion)

@@ -112,7 +112,7 @@ Note that `FILE_PATH` is optional; when provided, the script runs `ls -R` for pr
 bash tools/setup/setup_gcsfuse.sh \
 DATASET_GCS_BUCKET=maxtext-dataset \
 MOUNT_PATH=/tmp/gcsfuse && \
-python3 -m maxtext.trainers.pre_train.train src/maxtext/configs/base.yml \
+python3 -m maxtext.trainers.pre_train.train \
 run_name=<RUN_NAME> base_output_directory=gs://<MY_BUCKET> \
 dataset_type=grain \
 grain_file_type=arrayrecord # or parquet \
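
The visible portion of the command ends mid-continuation, and note that in `bash` the inline `# or parquet` comment also comments out its trailing `\`, which breaks the continuation. A minimal sketch of the parquet variant with the comment dropped (run and bucket names are illustrative; the full command in the guide carries additional Grain flags beyond those shown in this hunk):

```bash
bash tools/setup/setup_gcsfuse.sh \
  DATASET_GCS_BUCKET=maxtext-dataset \
  MOUNT_PATH=/tmp/gcsfuse && \
python3 -m maxtext.trainers.pre_train.train \
  run_name=my-grain-run base_output_directory=gs://my-bucket \
  dataset_type=grain \
  grain_file_type=parquet
```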

---

@@ -56,7 +56,7 @@ After installing the dependencies listed above, you are ready to compile ahead o

 ```sh
 # Run the below on a single machine, e.g. a CPU
-python3 -m maxtext.trainers.pre_train.train_compile src/maxtext/configs/base.yml compile_topology=v5e-256 compile_topology_num_slices=2 \
+python3 -m maxtext.trainers.pre_train.train_compile compile_topology=v5e-256 compile_topology_num_slices=2 \
 global_parameter_scale=16 per_device_batch_size=4
 ```

@@ -71,7 +71,7 @@ Here is an example that saves then loads the compiled `train_step`, starting wit
 ```sh
 # Run the below on a single machine, e.g. a CPU
 export LIBTPU_INIT_ARGS="--xla_enable_async_all_gather=true"
-python3 -m maxtext.trainers.pre_train.train_compile src/maxtext/configs/base.yml compile_topology=v5e-256 \
+python3 -m maxtext.trainers.pre_train.train_compile compile_topology=v5e-256 \
 compile_topology_num_slices=2 \
 compiled_trainstep_file=my_compiled_train.pickle global_parameter_scale=16 \
 per_device_batch_size=4 steps=10000 learning_rate=1e-3

@@ -84,7 +84,7 @@ To load the compiled train_step, you just need to pass `compiled_trainstep_file=
 ```sh
 # Run the below on each host of the target hardware, e.g. each host on 2 slices of v5e-256
 export LIBTPU_INIT_ARGS="--xla_enable_async_all_gather=true"
-python3 -m maxtext.trainers.pre_train.train src/maxtext/configs/base.yml run_name=example_load_compile \
+python3 -m maxtext.trainers.pre_train.train run_name=example_load_compile \
 compiled_trainstep_file=my_compiled_train.pickle \
 global_parameter_scale=16 per_device_batch_size=4 steps=10000 learning_rate=1e-3 \
 base_output_directory=gs://my-output-bucket dataset_path=gs://my-dataset-bucket
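
Because the load step runs on every host, `my_compiled_train.pickle` has to be present on each of them. A minimal sketch for distributing it, assuming the hosts share access to a GCS bucket (paths are illustrative):

```sh
# On the machine that ran train_compile:
gsutil cp my_compiled_train.pickle gs://my-output-bucket/compiled/

# On each target host, before launching train:
gsutil cp gs://my-output-bucket/compiled/my_compiled_train.pickle .
```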

@@ -109,7 +109,7 @@ This example illustrates the flags to use for a multihost GPU compilation target
 ```sh
 # Run the below on a single A3 machine
 export XLA_FLAGS="--xla_gpu_enable_async_collectives=true"
-python3 -m maxtext.trainers.pre_train.train_compile src/maxtext/configs/base.yml compile_topology=a3 \
+python3 -m maxtext.trainers.pre_train.train_compile compile_topology=a3 \
 compile_topology_num_slices=4 \
 compiled_trainstep_file=my_compiled_train.pickle global_parameter_scale=16 \
 attention=dot_product per_device_batch_size=4 steps=10000 learning_rate=1e-3

@@ -122,7 +122,7 @@ To load the compiled `train_step`, you just need to pass `compiled_trainstep_fil
 ```sh
 # Run the below on each of the 4 target A3 hosts.
 export XLA_FLAGS="--xla_gpu_enable_async_collectives=true"
-python3 -m maxtext.trainers.pre_train.train src/maxtext/configs/base.yml run_name=example_load_compile \
+python3 -m maxtext.trainers.pre_train.train run_name=example_load_compile \
 compiled_trainstep_file=my_compiled_train.pickle \
 attention=dot_product global_parameter_scale=16 per_device_batch_size=4 steps=10000 learning_rate=1e-3 \
 base_output_directory=gs://my-output-bucket dataset_path=gs://my-dataset-bucket
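
The compile machine and the target hosts must agree on the XLA flags, as the matching `export XLA_FLAGS` lines above show. One way to keep them in sync is to source a single snippet in both phases; a minimal sketch (the file name is illustrative):

```sh
# xla_env.sh -- source this on the compile machine and on every A3 host
export XLA_FLAGS="--xla_gpu_enable_async_collectives=true"
```

Run `source xla_env.sh` before both the `train_compile` and `train` commands.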

---

@@ -35,7 +35,7 @@ MaxText has integrated the ML Diagnostics [SDK](https://github.com/AI-Hypercompu
 1. Enable ML Diagnostics to just capture Maxtext metrics and configs
 
 ```
-python3 -m maxtext.trainers.pre_train.train src/maxtext/configs/base.yml \
+python3 -m maxtext.trainers.pre_train.train \
 run_name=${USER}-tpu-job \
 base_output_directory="gs://your-output-bucket/" \
 dataset_path="gs://your-dataset-bucket/" \

@@ -47,7 +47,7 @@ MaxText has integrated the ML Diagnostics [SDK](https://github.com/AI-Hypercompu
 2. Enable ML Diagnostics to capture Maxtext metrics, configs and singlehost profiles (on the first TPU device)
 
 ```
-python3 -m maxtext.trainers.pre_train.train src/maxtext/configs/base.yml \
+python3 -m maxtext.trainers.pre_train.train \
 run_name=${USER}-tpu-job \
 base_output_directory="gs://your-output-bucket/" \
 dataset_path="gs://your-dataset-bucket/" \

@@ -60,7 +60,7 @@ MaxText has integrated the ML Diagnostics [SDK](https://github.com/AI-Hypercompu
 3. Enable ML Diagnostics to capture Maxtext metrics, configs and multihost profiles (on all TPU devices)
 
 ```
-python3 -m maxtext.trainers.pre_train.train src/maxtext/configs/base.yml \
+python3 -m maxtext.trainers.pre_train.train \
 run_name=${USER}-tpu-job \
 base_output_directory="gs://your-output-bucket/" \
 dataset_path="gs://your-dataset-bucket/" \

docs/guides/monitoring_and_debugging/monitor_goodput.md (8 changes: 4 additions & 4 deletions)

@@ -89,7 +89,7 @@ Please use a unique workload name, unless you intend to monitor cumulative Goodp
 MaxText enables Goodput recording and monitoring by default with `enable_goodput_recording=True` and `monitor_goodput=True`. You can configure the goodput upload frequency by setting `goodput_upload_interval_seconds`.
 
 ```bash
-python3 -m maxtext.trainers.pre_train.train src/maxtext/configs/base.yml base_output_directory=${OUTPUT_PATH?} \
+python3 -m maxtext.trainers.pre_train.train base_output_directory=${OUTPUT_PATH?} \
 dataset_path=${DATA_PATH?} run_name=goodput-test-run steps=200 goodput_upload_interval_seconds=30
 ```

@@ -98,7 +98,7 @@ python3 -m maxtext.trainers.pre_train.train src/maxtext/configs/base.yml base_ou
 MaxText enables step time deviation monitoring by default with `monitor_step_time_deviation=True`. You can configure the upload frequency by setting `step_deviation_interval_seconds`.
 
 ```bash
-python3 -m maxtext.trainers.pre_train.train src/maxtext/configs/base.yml base_output_directory=${OUTPUT_PATH?} \
+python3 -m maxtext.trainers.pre_train.train base_output_directory=${OUTPUT_PATH?} \
 dataset_path=${DATA_PATH?} run_name=goodput-test-run steps=200 step_deviation_interval_seconds=30
 ```

@@ -111,7 +111,7 @@ Enabling `enable_pathways_goodput` turns on Goodput measurement for Pathways wor
 ```
 
 ```bash
-python3 -m maxtext.trainers.pre_train.train src/maxtext/configs/base.yml base_output_directory=${OUTPUT_PATH?} dataset_path=${DATA_PATH?} \
+python3 -m maxtext.trainers.pre_train.train base_output_directory=${OUTPUT_PATH?} dataset_path=${DATA_PATH?} \
 run_name=goodput-test-run steps=200 goodput_upload_interval_seconds=30 enable_pathways_goodput=True
 ```

@@ -168,7 +168,7 @@ and `enable_gcp_step_deviation_metrics` to `False` for disabling step deviation
 metrics.
 
 ```bash
-python3 -m maxtext.trainers.pre_train.train src/maxtext/configs/base.yml base_output_directory=${OUTPUT_PATH?} dataset_path=${DATA_PATH?} \
+python3 -m maxtext.trainers.pre_train.train base_output_directory=${OUTPUT_PATH?} dataset_path=${DATA_PATH?} \
 run_name=goodput-test-run steps=200 goodput_upload_interval_seconds=30 enable_gcp_goodput_metrics=False \
 enable_gcp_step_deviation_metrics=False
 ```
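
The flags above only stop uploads to GCP; per the defaults noted earlier (`enable_goodput_recording=True`, `monitor_goodput=True`), recording itself is a separate switch. A minimal sketch for turning the Goodput machinery off entirely, assuming those two flags accept `False`:

```bash
python3 -m maxtext.trainers.pre_train.train base_output_directory=${OUTPUT_PATH?} \
dataset_path=${DATA_PATH?} run_name=goodput-test-run steps=200 \
monitor_goodput=False enable_goodput_recording=False
```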

---

@@ -23,7 +23,7 @@ When you run a training job, MaxText produces detailed output logs. This guide s
 To start, run a simple pretraining job on a single-host TPU. For instance, we can run the following command on TPU v5p-8. The resulting log is used as an example throughout this guide.
 
 ```bash
-python3 -m maxtext.trainers.pre_train.train src/maxtext/configs/base.yml \
+python3 -m maxtext.trainers.pre_train.train \
 base_output_directory=gs://runner-maxtext-logs run_name=demo \
 model_name=deepseek2-16b \
 per_device_batch_size=24 max_target_length=2048 steps=10 dataset_type=synthetic enable_checkpointing=false
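
To study the resulting log rather than scrolling the terminal, the same command can be piped through `tee`. A minimal sketch, assuming MaxText's per-step lines contain the phrase `completed step` (worth confirming against your build's output):

```bash
python3 -m maxtext.trainers.pre_train.train \
base_output_directory=gs://runner-maxtext-logs run_name=demo \
model_name=deepseek2-16b \
per_device_batch_size=24 max_target_length=2048 steps=10 dataset_type=synthetic \
enable_checkpointing=false 2>&1 | tee /tmp/demo_train.log

# Pull out the per-step timing/loss lines afterwards:
grep "completed step" /tmp/demo_train.log
```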

@@ -123,7 +123,7 @@ To generate all optional artifacts in one run, you can set the corresponding fla
 This command enables tensorboard, profiler, text metrics, config saving, and checkpointing:
 
 ```bash
-python3 -m maxtext.trainers.pre_train.train src/maxtext/configs/base.yml \
+python3 -m maxtext.trainers.pre_train.train \
 base_output_directory=gs://runner-maxtext-logs run_name=demo2 \
 model_name=deepseek2-16b \
 per_device_batch_size=24 max_target_length=2048 steps=10 dataset_type=synthetic \

docs/reference/core_concepts/quantization.md (4 changes: 2 additions & 2 deletions)

@@ -87,7 +87,7 @@ Common options for the `quantization` flag when using Qwix include:
 Here is an example of how to run a training job with int8 quantization enabled via Qwix:
 
 ```bash
-python3 -m maxtext.trainers.pre_train.train src/maxtext/configs/base.yml run_name=${YOUR_JOB_NAME?} base_output_directory=gs://<my-bucket> dataset_type=synthetic use_qwix_quantization=true quantization='int8'
+python3 -m maxtext.trainers.pre_train.train run_name=${YOUR_JOB_NAME?} base_output_directory=gs://<my-bucket> dataset_type=synthetic use_qwix_quantization=true quantization='int8'
 ```
 
 #### The Qwix Interception API

@@ -142,7 +142,7 @@ When using AQT, you can pass one of the following values to the `quantization` f
 #### Example command for AQT
 
 ```bash
-python3 -m maxtext.trainers.pre_train.train src/maxtext/configs/base.yml run_name=${YOUR_JOB_NAME?} base_output_directory=gs://<my-bucket> dataset_type=synthetic use_qwix_quantization=false quantization='int8'
+python3 -m maxtext.trainers.pre_train.train run_name=${YOUR_JOB_NAME?} base_output_directory=gs://<my-bucket> dataset_type=synthetic use_qwix_quantization=false quantization='int8'
 ```
 
 Note that `use_qwix_quantization` is not set to `True`.
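
Both paths take the precision string from the same `quantization` flag, so swapping precisions is a one-token change. A hedged sketch of an fp8 run via Qwix, assuming `'fp8'` appears in the list of supported values referenced above (the list itself is cut off in this hunk):

```bash
python3 -m maxtext.trainers.pre_train.train run_name=${YOUR_JOB_NAME?} base_output_directory=gs://<my-bucket> dataset_type=synthetic use_qwix_quantization=true quantization='fp8'
```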

docs/tutorials/inference.md (4 changes: 2 additions & 2 deletions)

@@ -63,7 +63,7 @@ We include a script for convenient offline inference of MaxText models in `src/m
 An example of how to run this script can be found below:
 
 ```bash
-python3 -m maxtext.inference.vllm_decode src/maxtext/configs/base.yml \
+python3 -m maxtext.inference.vllm_decode \
 model_name=qwen3-30b-a3b \
 tokenizer_path=Qwen/Qwen3-30B-A3B \
 load_parameters_path=$CHECKPOINT_PATH \

@@ -133,7 +133,7 @@ curl http://localhost:8000/v1/completions \
 To use a MaxText model architecture for samplers in reinforcement learning algorithms like GRPO, we can override the vLLM model architecture and pass in MaxText specific config arguments similar to the [online inference](online-inference) use-case. An example of an RL command using the MaxText model for samplers can be found below:
 
 ```bash
-python3 -m src.maxtext.trainers.post_train.rl.train_rl src/maxtext/configs/post_train/rl.yml \
+python3 -m src.maxtext.trainers.post_train.rl.train_rl \
 model_name=qwen3-0.6b \
 tokenizer_path=Qwen/Qwen3-0.6B \
 run_name=$WORKLOAD \

docs/tutorials/posttraining/multimodal.md (2 changes: 0 additions & 2 deletions)

@@ -73,7 +73,6 @@ To run a forward pass and verify the model's output, use the following command:
 ```shell
 # Gemma3 decode
 python -m maxtext.inference.decode \
-maxtext/configs/base.yml \
 model_name=gemma3-4b \
 hf_access_token=${HF_ACCESS_TOKEN?} \
 tokenizer_path=src/maxtext/assets/tokenizers/tokenizer.gemma3 \

@@ -109,7 +108,6 @@ export TARGET_LENGTH=... # Adjust to fit expected output length
 export PREDICT_LENGTH=... # Adjust to fit image tokens + text prompt
 
 python -m maxtext.inference.decode \
-maxtext/configs/base.yml \
 model_name=gemma3-4b \
 ... \
 max_prefill_predict_length=${PREDICT_LENGTH?} # Adjust to fit image tokens + text prompt \
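
A minimal sketch of how those two lengths might be budgeted, assuming a fixed per-image token count for Gemma3 (256 is an assumption; check the model config) and a short text prompt:

```shell
export IMAGE_TOKENS=256                        # assumed per-image token budget for Gemma3
export PROMPT_TOKENS=64                        # rough estimate for the text prompt
export PREDICT_LENGTH=$((IMAGE_TOKENS + PROMPT_TOKENS))
export TARGET_LENGTH=$((PREDICT_LENGTH + 128)) # headroom for the generated output
```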