# Optimization 2: NUMA binding (applies only to v4 and v5p)
@@ -22,14 +22,14 @@ For GCE,
[preflight.sh](https://github.com/google/maxtext/blob/main/preflight.sh) will help you install `numactl` dependency, so you can use it directly, here is an example:
`numactl` should be built into your Docker image from [maxtext_tpu_dependencies.Dockerfile](https://github.com/google/maxtext/blob/main/src/dependencies/dockerfiles/maxtext_tpu_dependencies.Dockerfile), so you can use it directly if you built the MaxText Docker image. Here is an example:
1. `numactl`: This is the command-line tool used for controlling NUMA policy for processes or shared memory. It's particularly useful on multi-socket systems where memory locality can impact performance.
@@ -221,7 +221,7 @@ To extend conversion support to a new model architecture, you must define its sp
- In [`utils/param_mapping.py`](https://github.com/AI-Hypercomputer/maxtext/blob/main/src/MaxText/checkpoint_conversion/utils/param_mapping.py), add the `hook_fn` logic (`def {MODEL}_MAXTEXT_TO_HF_PARAM_HOOK_FN`). This is the transformation needed per layer.

2. **Add Hugging Face weights shape**: In [`utils/hf_shape.py`](https://github.com/AI-Hypercomputer/maxtext/blob/main/src/MaxText/checkpoint_conversion/utils/hf_shape.py), define the tensor shapes in Hugging Face format (`def {MODEL}_HF_WEIGHTS_TO_SHAPE`). This is used to verify that tensor shapes match after `to_huggingface` conversion.
-3. **Register model key**: In [`utils/utils.py`](https://github.com/AI-Hypercomputer/maxtext/blob/main/src/MaxText/checkpoint_conversion/utils/utils.py), add the new model key in `HF_IDS`.
+3. **Register model key**: In [`utils/utils.py`](https://github.com/AI-Hypercomputer/maxtext/blob/main/src/maxtext/utils/globals.py), add the new model key in `HF_IDS`.
4. **Add transformer config**: In [`utils/hf_model_configs.py`](https://github.com/AI-Hypercomputer/maxtext/blob/main/src/MaxText/checkpoint_conversion/utils/hf_model_configs.py), add the `transformers.Config` object, describing the Hugging Face model configuration (defined in [`src/maxtext/configs/models`](https://github.com/AI-Hypercomputer/maxtext/tree/main/src/maxtext/configs/models)). **Note**: This configuration must precisely match the MaxText model's architecture.

Here is an example [PR to add support for gemma3 multi-modal model](https://github.com/AI-Hypercomputer/maxtext/pull/1983)
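
The shape table described in step 2 exists so that converted tensors can be checked against the expected Hugging Face shapes. The following is a minimal sketch of that idea only, not MaxText's implementation: `example_hf_weights_to_shape`, `find_shape_mismatches`, the config keys, and the parameter names are all hypothetical and chosen for illustration.

```python
from typing import Dict, List, Optional, Tuple

Shape = Tuple[int, ...]


def example_hf_weights_to_shape(config: Dict[str, int]) -> Dict[str, Shape]:
    """Hypothetical shape table: Hugging Face parameter name -> expected shape."""
    hidden = config["hidden_size"]
    vocab = config["vocab_size"]
    shapes: Dict[str, Shape] = {"model.embed_tokens.weight": (vocab, hidden)}
    for layer in range(config["num_hidden_layers"]):
        shapes[f"model.layers.{layer}.input_layernorm.weight"] = (hidden,)
    return shapes


def find_shape_mismatches(
    converted: Dict[str, Shape], expected: Dict[str, Shape]
) -> List[Tuple[str, Optional[Shape], Shape]]:
    """Return (name, actual, expected) for each missing or mis-shaped tensor."""
    problems = []
    for name, want in expected.items():
        got = converted.get(name)  # None when the tensor is missing
        if got != want:
            problems.append((name, got, want))
    return problems


# Illustrative values only; a real config comes from the model definition.
config = {"hidden_size": 8, "vocab_size": 32, "num_hidden_layers": 2}
expected = example_hf_weights_to_shape(config)
converted = {
    "model.embed_tokens.weight": (32, 8),
    "model.layers.0.input_layernorm.weight": (8,),
    "model.layers.1.input_layernorm.weight": (4,),  # deliberately wrong
}
mismatches = find_shape_mismatches(converted, expected)
print(mismatches)
```

Reporting every mismatch at once, rather than failing on the first, tends to make a new per-layer mapping easier to debug.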
docs/guides/run_python_notebook.md: 2 additions & 2 deletions
@@ -103,7 +103,7 @@ To install, click the `Extensions` icon on the left sidebar (or press `Ctrl+Shif
### Step 4: Install MaxText and Dependencies

-To execute post-training notebooks on your TPU-VM, follow the official [MaxText installation guides](https://maxtext.readthedocs.io/en/latest/tutorials/posttraining/rl.html#create-virtual-environment-and-install-maxtext-dependencies) to install MaxText and its dependencies inside a dedicated virtual environment.
+To execute post-training notebooks on your TPU-VM, follow the official [MaxText installation guides](https://maxtext.readthedocs.io/en/latest/install_maxtext.html#from-source) and specifically follow `Option 3: Installing [tpu-post-train]`. This will ensure all post-training dependencies are installed inside your virtual environment.
### Step 5: Install the necessary library for Jupyter
@@ -162,7 +162,7 @@ pip3 install jupyterlab
### Step 4: Install MaxText and Dependencies

-To execute post-training notebooks on your TPU-VM, follow the official [MaxText installation guides](https://maxtext.readthedocs.io/en/latest/tutorials/posttraining/rl.html#create-virtual-environment-and-install-maxtext-dependencies) to install MaxText and its dependencies inside a dedicated virtual environment.
+To execute post-training notebooks on your TPU-VM, follow the official [MaxText installation guides](https://maxtext.readthedocs.io/en/latest/install_maxtext.html#from-source) and specifically follow `Option 3: Installing [tpu-post-train]`. This will ensure all post-training dependencies are installed inside your virtual environment.
### Step 5: Register virtual environment as a Jupyter Kernel
> **Note:** The `install_maxtext_tpu_github_deps`, `install_maxtext_cuda12_github_dep`, and
-`install_maxtext_tpu_post_train_extra_deps` commands are temporarily required to install dependencies directly from GitHub
-that are not yet available on PyPI. As shown above, choose the one that corresponds to your use case.
+> `install_maxtext_tpu_post_train_extra_deps` commands are temporarily required to install dependencies directly from GitHub
+> that are not yet available on PyPI. As shown above, choose the one that corresponds to your use case.
> **Note:** The maxtext package contains a comprehensive list of all direct and transitive dependencies, with lower bounds, generated by [seed-env](https://github.com/google-ml-infra/actions/tree/main/python_seed_env). We highly recommend the `--resolution=lowest` flag. It instructs `uv` to install the specific, tested versions of dependencies defined by MaxText, rather than the latest available ones. This ensures a consistent and reproducible environment, which is critical for stable performance and for running benchmarks.
## From Source
+
If you plan to contribute to MaxText or need the latest unreleased features, install from source.
```bash
@@ -98,11 +101,11 @@ Please keep dependencies updated throughout development. This will allow each co
To update dependencies, you will follow these general steps:

-1. **Modify Base Requirements**: Update the desired dependencies in `base_requirements/requirements.txt` or the hardware-specific files (`base_requirements/tpu-base-requirements.txt`, `base_requirements/gpu-base-requirements.txt`).
-2. **Generate New Files**: Run the `seed-env` CLI tool to generate new, fully-pinned requirements files based on your changes.
-3. **Update Project Files**: Copy the newly generated files into the `generated_requirements/` directory.
-4. **Handle GitHub Dependencies**: Move any dependencies that are installed directly from GitHub from the generated files to `src/install_maxtext_extra_deps/extra_deps_from_github.txt`.
-5. **Verify**: Test the new dependencies to ensure the project installs and runs correctly.
+1. **Modify Base Requirements**: Update the desired dependencies in `base_requirements/requirements.txt` or the hardware-specific files (`base_requirements/tpu-base-requirements.txt`, `base_requirements/gpu-base-requirements.txt`).
+2. **Generate New Files**: Run the `seed-env` CLI tool to generate new, fully-pinned requirements files based on your changes.
+3. **Update Project Files**: Copy the newly generated files into the `generated_requirements/` directory.
+4. **Handle GitHub Dependencies**: Move any dependencies that are installed directly from GitHub from the generated files to `src/install_maxtext_extra_deps/extra_deps_from_github.txt`.
+5. **Verify**: Test the new dependencies to ensure the project installs and runs correctly.
The following sections provide detailed instructions for each step.
@@ -154,25 +157,26 @@ seed-env \
After generating the new requirements, you need to update the files in the MaxText repository.

-1. **Copy the generated files:**
-   - Move `generated_tpu_artifacts/tpu-requirements.txt` to `generated_requirements/tpu-requirements.txt`.
-   - Move `generated_gpu_artifacts/cuda12-requirements.txt` to `generated_requirements/cuda12-requirements.txt`.
+1. **Copy the generated files:**
+
+   - Move `generated_tpu_artifacts/tpu-requirements.txt` to `generated_requirements/tpu-requirements.txt`.
+   - Move `generated_gpu_artifacts/cuda12-requirements.txt` to `generated_requirements/cuda12-requirements.txt`.
Currently, MaxText uses a few dependencies, such as `mlperf-logging` and `google-jetstream`, that are installed directly from GitHub source. These are defined in `base_requirements/requirements.txt`, and the `seed-env` tool will carry them over to the generated requirements files.
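
Because the GitHub-sourced entries are carried over into the generated files, they have to be separated out before being moved. A minimal sketch of that split follows. It assumes a GitHub-sourced requirement is any non-comment line containing `github.com`; the helper name `split_github_requirements` and the sample package names are hypothetical and not part of MaxText or `seed-env`.

```python
from typing import Iterable, List, Tuple


def split_github_requirements(lines: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Split requirement lines into (other_lines, github_lines).

    Assumption: a GitHub-sourced requirement is any non-comment line whose
    text contains "github.com", e.g. `pkg @ git+https://github.com/org/pkg.git`.
    """
    other_lines: List[str] = []
    github_lines: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # drop blank lines
        if not line.startswith("#") and "github.com" in line:
            github_lines.append(line)
        else:
            other_lines.append(line)
    return other_lines, github_lines


# Illustrative input; real input would be read from a generated requirements file.
example = [
    "jax>=0.4.30",
    "# generated by a pinning tool",
    "example-pkg @ git+https://github.com/example-org/example-pkg.git",
    "",
    "numpy>=1.26",
]
pypi, github = split_github_requirements(example)
print(pypi)
print(github)
```

In practice the second list would be written to the extra-dependencies file and the first written back to the generated requirements file; review the result by hand, since the `github.com` heuristic may not match every entry format.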
## Step 5: Verify the New Dependencies
Finally, test that the new dependencies install correctly and that MaxText runs as expected.

-1. **Create a clean environment:** It's best to start with a fresh Python virtual environment.
+1. **Create a clean environment:** It's best to start with a fresh Python virtual environment.

```bash
uv venv --python 3.12 --seed maxtext_venv
source maxtext_venv/bin/activate
```

-2. **Run the setup script:** Execute `bash setup.sh` to install the new dependencies.
+2. **Run the setup script:** Execute `bash setup.sh` to install the new dependencies.
After the installation is complete, run a short training job using synthetic data to confirm everything is working correctly. This command trains a model for just 10 steps. Remember to replace `$YOUR_JOB_NAME` with a unique name for your run and `gs://<my-bucket>` with the path to the GCS bucket you configured in the prerequisites.
We tell `multihost_job` to target the `reserved` pool by including `--reserved` as an extra argument to the CQR request. You may instead target the `on-demand` pool by removing the `--CQR_EXTRA_ARGS` flag (on-demand is the default), or the preemptible pool with `--CQR_EXTRA_ARGS="--best-effort"`, which may be necessary if your reservation is full.