Commit 6464162

Update getting started doc
1 parent b2153a3 commit 6464162

5 files changed

Lines changed: 66 additions & 114 deletions

File tree

README.md

Lines changed: 1 addition & 1 deletion
@@ -28,7 +28,7 @@ MaxText achieves high Model FLOPs Utilization (MFU) and tokens/second from singl
2828

2929
MaxText is the launching point for ambitious LLM projects both in research and production. We encourage you to start by experimenting with MaxText out of the box and then fork and modify MaxText to meet your needs.
3030

31-
Check out our [Read The Docs site](https://maxtext.readthedocs.io/en/latest/) or directly [Get Started](https://maxtext.readthedocs.io/en/latest/tutorials/first_run.html) with your first MaxText run. If you’re interested in Diffusion models (Wan 2.1, Flux, etc), see the [MaxDiffusion](https://github.com/AI-Hypercomputer/maxdiffusion) repository in our AI Hypercomputer GitHub organization.
31+
Check out our [Read The Docs site](https://maxtext.readthedocs.io/en/latest/) or directly [Get Started](https://maxtext.readthedocs.io/en/latest/getting_started.html) with your first MaxText run. If you’re interested in Diffusion models (Wan 2.1, Flux, etc), see the [MaxDiffusion](https://github.com/AI-Hypercomputer/maxdiffusion) repository in our AI Hypercomputer GitHub organization.
3232

3333
## Installation
3434

docs/getting_started.md

Lines changed: 64 additions & 0 deletions
@@ -0,0 +1,64 @@
1+
<!--
2+
Copyright 2023-2026 Google LLC
3+
4+
Licensed under the Apache License, Version 2.0 (the "License");
5+
you may not use this file except in compliance with the License.
6+
You may obtain a copy of the License at
7+
8+
https://www.apache.org/licenses/LICENSE-2.0
9+
10+
Unless required by applicable law or agreed to in writing, software
11+
distributed under the License is distributed on an "AS IS" BASIS,
12+
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
13+
See the License for the specific language governing permissions and
14+
limitations under the License.
15+
-->
16+
17+
(getting-started)=
18+
19+
# Getting Started
20+
21+
Welcome to MaxText! This guide helps you run your first MaxText workloads, whether you are working on a single host or scaling up to a multihost environment using Cloud TPUs or NVIDIA GPUs. Follow the steps below to install MaxText, train your first model, and run inference.
22+
23+
## Prerequisites
24+
25+
1. To store logs and checkpoints, [create a Cloud Storage bucket](https://cloud.google.com/storage/docs/creating-buckets) in your project. The TPU or GPU VMs that run MaxText need read/write access to the bucket. Grant this access through a role on the VMs' service account, such as Storage Admin (`roles/storage.admin`).
26+
27+
2. MaxText reads its configuration from a YAML file. We recommend reviewing the configurable options in `configs/base.yml`, which defines a decoder-only model of ~1B parameters. Any option can be overridden from the command line. For instance, you can change `steps` or `log_period` either by editing `configs/base.yml` or by passing `steps` and `log_period` as additional arguments to the `train.py` call. Set `base_output_directory` to a folder in the bucket you just created.
28+
29+
3. **Checkpoint Conversion**: To run MaxText with Hugging Face checkpoints, you must first convert them to the MaxText/Orbax format. For detailed instructions, see the [Checkpoint Conversion Guide](guides/checkpointing_solutions/convert_checkpoint.md).
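
The command-line override behavior described in step 2 can be pictured with a small sketch. This is not MaxText's actual config loader; it only illustrates the idea of defaults from a YAML file being replaced by extra arguments, assuming those arguments take a `key=value` form. The function names, the default values, and the bucket path `gs://my-bucket/maxtext-output` are hypothetical.

```python
# Illustrative only: NOT MaxText's real configuration code.
from typing import Any


def parse_value(raw: str) -> Any:
    """Best-effort conversion of a command-line string to bool, int, float, or str."""
    lowered = raw.lower()
    if lowered in ("true", "false"):
        return lowered == "true"
    for cast in (int, float):
        try:
            return cast(raw)
        except ValueError:
            continue
    return raw


def apply_overrides(defaults: dict[str, Any], argv: list[str]) -> dict[str, Any]:
    """Return a copy of `defaults` with each `key=value` argument applied."""
    config = dict(defaults)  # leave the caller's defaults untouched
    for arg in argv:
        key, sep, raw = arg.partition("=")
        if not sep:
            raise ValueError(f"expected key=value, got {arg!r}")
        if key not in config:
            # Rejecting unknown keys catches typos such as `step=10`.
            raise KeyError(f"unknown option {key!r}")
        config[key] = parse_value(raw)
    return config


# Hypothetical defaults standing in for values read from configs/base.yml.
defaults = {"steps": 150000, "log_period": 100, "base_output_directory": ""}

# Hypothetical extra arguments appended to the training command.
config = apply_overrides(
    defaults,
    ["steps=10", "log_period=5", "base_output_directory=gs://my-bucket/maxtext-output"],
)
```

In this sketch the file supplies every default and the command line changes only the options named explicitly, so a short trial run can lower `steps` without editing the shared YAML file.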
30+
31+
## Running MaxText on a Single Host
32+
33+
This procedure describes how to run MaxText on a single GPU or TPU host.
34+
35+
### 1. Installation
36+
37+
Before running MaxText, you must install it on your VM.
38+
* For detailed installation instructions, see the [Installation Guide](install_maxtext.md).
39+
* For TPU VMs, install `maxtext[tpu]` for pre-training, or `maxtext[tpu-post-train]` for post-training.
40+
* For GPU VMs, ensure you install `maxtext[cuda12]`.
41+
42+
### 2. Running Pre-training
43+
44+
To get started with training your first model, refer to the [Pre-training Tutorial](tutorials/pretraining.md).
45+
46+
### 3. Running Post-training
47+
48+
To fine-tune your model or apply post-training techniques (such as SFT or RL), refer to the [Post-training Tutorial](tutorials/post_training_index.md). This guide covers various post-training workflows.
49+
50+
### 4. Running Inference
51+
52+
To run inference (decoding) using MaxText models, refer to the [Inference Tutorial](tutorials/inference.md). This guide covers offline and online inference, as well as integration with vLLM.
53+
54+
## Running MaxText on Multiple Hosts
55+
56+
Google Kubernetes Engine (GKE) is the recommended way to run MaxText on multiple hosts. It provides a managed environment for deploying and scaling containerized applications, including those that require TPUs or GPUs. See [Running MaxText with XPK](run_maxtext/run_maxtext_via_xpk.md) for details.
57+
58+
## Running MaxText in Notebooks
59+
60+
You can run MaxText interactively using Jupyter notebooks, Google Colab, or Visual Studio Code. Refer to the [Notebook Guide](guides/run_python_notebook.md) for instructions on setting up your notebook environment on TPUs.
61+
62+
## Next steps: preflight optimizations
63+
64+
After you get workloads running, there are optimizations you can apply to improve performance. For more information, see [Optimization Tips](guides/optimization.md).

docs/index.md

Lines changed: 1 addition & 0 deletions
@@ -42,6 +42,7 @@ end-before: <!-- NEWS END -->
4242
maxdepth: 2
4343
hidden:
4444
---
45+
getting_started
4546
install_maxtext
4647
tutorials
4748
run_maxtext

docs/tutorials.md

Lines changed: 0 additions & 8 deletions
@@ -22,13 +22,6 @@ Explore our tutorials to learn how to use MaxText, from your first run to advanc
2222
---
2323
gutter: 2
2424
---
25-
```{grid-item-card} 🚀 Getting Started
26-
:link: tutorials/first_run
27-
:link-type: doc
28-
29-
Installation, prerequisites, verification, and your first training run.
30-
```
31-
3225
```{grid-item-card} 🛠️ Build and upload MaxText Docker Images
3326
---
3427
link: tutorials/build_maxtext
@@ -64,7 +57,6 @@ Step-by-step guides for running inference of MaxText models on vLLM.
6457
hidden:
6558
maxdepth: 1
6659
---
67-
tutorials/first_run.md
6860
tutorials/build_maxtext.md
6961
tutorials/pretraining.md
7062
tutorials/post_training_index.md

docs/tutorials/first_run.md

Lines changed: 0 additions & 105 deletions
This file was deleted.
