From 8dad16cdbb66ee4256580912618050cf6a6cb9d0 Mon Sep 17 00:00:00 2001
From: Branden Vandermoon
Date: Thu, 23 Apr 2026 23:47:08 +0000
Subject: [PATCH] Update tutorials with specific MaxText installation options

---
 docs/tutorials/first_run.md           | 4 ++--
 docs/tutorials/inference.md           | 2 +-
 docs/tutorials/post_training_index.md | 4 ++++
 docs/tutorials/pretraining.md         | 4 ++++
 4 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/docs/tutorials/first_run.md b/docs/tutorials/first_run.md
index 58b71f80ac..f04c6acb9c 100644
--- a/docs/tutorials/first_run.md
+++ b/docs/tutorials/first_run.md
@@ -36,7 +36,7 @@
 Local development is a convenient way to run MaxText on a single host. It doesn't scale to
 multiple hosts but is a good way to learn about MaxText.
 
 1. [Create and SSH to the single host VM of your choice](https://cloud.google.com/tpu/docs/managing-tpus-tpu-vm). You can use any available single host TPU, such as `v5litepod-8`, `v5p-8`, or `v4-8`.
-2. For instructions on installing MaxText on your VM, please refer to the [official documentation](https://maxtext.readthedocs.io/en/latest/install_maxtext.html).
+2. For instructions on installing MaxText on your VM, please refer to the [official documentation](https://maxtext.readthedocs.io/en/latest/install_maxtext.html). For this tutorial on TPUs, install `maxtext[tpu]`.
 3. After installation completes, run training on synthetic data with the following command:
 
    ```sh

@@ -70,7 +70,7 @@
 You can use [demo_decoding.ipynb](https://github.com/AI-Hypercomputer/maxtext/bl
 
 ### Run MaxText on NVIDIA GPUs
 
-1. For instructions on installing MaxText on your VM, please refer to the [official documentation](https://maxtext.readthedocs.io/en/latest/install_maxtext.html).
+1. For instructions on installing MaxText on your VM, please refer to the [official documentation](https://maxtext.readthedocs.io/en/latest/install_maxtext.html). For this tutorial on GPUs, install `maxtext[cuda12]`.
 2. After installation is complete, run training with the following command on synthetic data:
 
    ```sh

diff --git a/docs/tutorials/inference.md b/docs/tutorials/inference.md
index f4e7a8d495..049cdd83be 100644
--- a/docs/tutorials/inference.md
+++ b/docs/tutorials/inference.md
@@ -25,7 +25,7 @@
 We support inference of MaxText models on vLLM via an [out-of-tree](https://gith
 
 # Installation
 
-Follow the instructions in [install maxtext](https://maxtext.readthedocs.io/en/latest/install_maxtext.html) to install MaxText with post-training dependencies. We recommend installing from PyPI to ensure you have the latest stable versionset of dependencies.
+Follow the instructions in [install maxtext](https://maxtext.readthedocs.io/en/latest/install_maxtext.html) to install MaxText. For this inference tutorial on TPU (which uses vLLM), you must install `maxtext[tpu-post-train]`, as it includes the required adapter plugin. We recommend installing from PyPI to ensure you have the latest stable version of dependencies.
 
 After finishing the installation, ensure that the MaxText on vLLM adapter plugin has been installed. To do so, run the following command:

diff --git a/docs/tutorials/post_training_index.md b/docs/tutorials/post_training_index.md
index 0ed355e16b..401aebb6a6 100644
--- a/docs/tutorials/post_training_index.md
+++ b/docs/tutorials/post_training_index.md
@@ -1,5 +1,9 @@
 # Post-training
 
+```{note}
+Post-training workflows on TPU require specific dependencies. Please ensure you have installed MaxText with `maxtext[tpu-post-train]` as described in the [official documentation](https://maxtext.readthedocs.io/en/latest/install_maxtext.html).
+```
+
 ## What is MaxText post-training?
 
 MaxText provides performant and scalable LLM and VLM post-training across a variety of techniques like SFT and GRPO.
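The inference change above asks readers to verify that the MaxText-on-vLLM adapter plugin is installed, but the exact verification command is not shown in this excerpt. As a hedged, generic sketch of that kind of check, a distribution's presence can be tested with Python's standard `importlib.metadata`; note that `some-adapter-plugin` below is a placeholder, not the plugin's real package name:

```python
# Hedged sketch: check whether a Python distribution is installed
# without importing it. "some-adapter-plugin" is a placeholder name,
# NOT the actual MaxText-on-vLLM adapter plugin.
from importlib.metadata import version, PackageNotFoundError


def dist_installed(name: str) -> bool:
    """Return True if a distribution with this name is installed."""
    try:
        version(name)
        return True
    except PackageNotFoundError:
        return False


print(dist_installed("pip"))                  # usually True in a working env
print(dist_installed("some-adapter-plugin"))  # placeholder: expected False
```

Prefer the verification command given in the installation docs once you have it; this only illustrates the general installed-package check.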
diff --git a/docs/tutorials/pretraining.md b/docs/tutorials/pretraining.md
index c39c5a561b..a1ae985db0 100644
--- a/docs/tutorials/pretraining.md
+++ b/docs/tutorials/pretraining.md
@@ -20,6 +20,10 @@
 In this tutorial, we introduce how to run pretraining with real datasets. While synthetic data is commonly used for benchmarking, we rely on real datasets to obtain meaningful weights. Currently, MaxText supports three dataset input pipelines: HuggingFace, Grain, and TensorFlow Datasets (TFDS). We will walk you through: setting up the dataset, modifying the [dataset configs](https://github.com/AI-Hypercomputer/maxtext/blob/f11f5507c987fdb57272c090ebd2cbdbbadbd36c/src/maxtext/configs/base.yml#L631-L675) and [tokenizer configs](https://github.com/AI-Hypercomputer/maxtext/blob/f11f5507c987fdb57272c090ebd2cbdbbadbd36c/src/maxtext/configs/base.yml#L566) for training, and optionally enabling evaluation.
 
+```{note}
+Before starting this tutorial, ensure you have installed MaxText following the [official documentation](https://maxtext.readthedocs.io/en/latest/install_maxtext.html). For pre-training, install `maxtext[tpu]` for TPUs or `maxtext[cuda12]` for GPUs.
+```
+
 To start with, we focus on HuggingFace datasets for convenience.
 - Later on, we will give brief examples for Grain and TFDS. For a comprehensive guide, see the [Data Input Pipeline](../guides/data_input_pipeline.md) topic.
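Taken together, the patch points readers at three installation extras. A minimal summary sketch, assuming the extras are published on PyPI exactly as the updated docs name them (`tpu`, `cuda12`, `tpu-post-train`):

```shell
# Pick the MaxText extra matching your hardware and workflow
# (extra names taken from the docs updated in this patch):
#   tpu            - TPU pre-training / first run
#   cuda12         - NVIDIA GPU (CUDA 12) training
#   tpu-post-train - TPU post-training and vLLM inference
EXTRA="tpu"
echo pip install "maxtext[${EXTRA}]"
```

Drop the `echo` to actually run the install; the quoting around `maxtext[...]` keeps shells with glob expansion (e.g. zsh) from mangling the bracket expression.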