Add simplified APIs for model obtaining maxtext models by A9isha · Pull Request #3450 · AI-Hypercomputer/maxtext

A9isha · 2026-03-18T21:45:13Z

Description

This PR adds a simplified from_pretrained() API. To create all the necessary configs and meshes for RL, use setup_configs_and_devices() and create_models_and_meshes(). Minimal usage looks like this:

import maxtext as mt

config = mt.pyconfig(model_name="llama3.1-8b-Instruct")
model, mesh = mt.from_pretrained(config)

# OR
config = mt.pyconfig(model_name="llama3.1-8b-Instruct")
# your own mesh
model = mt.from_pretrained(config, mesh)


# OR for train_rl.py
trainer_config, sampler_config, trainer_devices, sampler_devices = mt.setup_configs_and_devices(model_name="llama3.1-8b-Instruct")

reference_model, reference_mesh, actor_model, actor_mesh, rollout_mesh = mt.create_models_and_meshes(
    trainer_config, sampler_config, trainer_devices, sampler_devices
)

# OR regular invocation of train_rl.py

run_name=maz-8b-$RANDOM
python3 -m src.maxtext.trainers.post_train.rl.train_rl \
  model_name=llama3.1-8b-Instruct run_name=$run_name \
  steps=4 rollout_tensor_parallelism=-1 \
  rollout_data_parallelism=1 rollout_expert_parallelism=1 \
  test_batch_start_index=10 num_test_batches=15 \
  base_output_directory=/path/to/say/gcs/bucket

# OR from a standalone script (excerpt; the enclosing `if` is not shown)

    MAXTEXT_CONFIGS_DIR = "src/maxtext/configs"
    import maxtext as mt

    config = mt.pyconfig(
        model_name=MAXTEXT_MODEL_VERSION,
        hf_access_token=args.hf_token,
        base_output_directory="gs://path/to/save/artifacts",
        base_config=f"{MAXTEXT_CONFIGS_DIR}/post_train/rl.yml",
    )
    qwen2_actor, _ = mt.from_pretrained(config)  # ref_mesh and train_mesh are the same for us
  else:
    qwen2_actor = params_lib.create_model_from_safe_tensors(
        MODEL_PATH, config, trainer_mesh, dtype=MODEL_DTYPE
    )
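
The two from_pretrained() call forms above differ in what they return: a (model, mesh) pair when no mesh is passed, and just the model when the caller supplies one. The sketch below only illustrates that calling convention. ToyMesh, ToyModel, and toy_from_pretrained are hypothetical stand-ins invented for this example and are not MaxText's implementation.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class ToyMesh:
    """Hypothetical stand-in for a device mesh."""
    axis_names: Tuple[str, ...] = ("data", "model")


@dataclass
class ToyModel:
    """Hypothetical stand-in for a loaded model."""
    name: str
    mesh: ToyMesh


def toy_from_pretrained(model_name: str, mesh: Optional[ToyMesh] = None):
    """Mimic the two call forms from the description (assumed behavior)."""
    if mesh is None:
        # No mesh supplied: build a default one and return it with the model.
        mesh = ToyMesh()
        return ToyModel(model_name, mesh), mesh
    # Caller-supplied mesh: return only the model, bound to that mesh.
    return ToyModel(model_name, mesh)


# Form 1: let the helper create the mesh.
model, mesh = toy_from_pretrained("llama3.1-8b-Instruct")

# Form 2: bring your own mesh.
own_mesh = ToyMesh(axis_names=("fsdp",))
model_on_own_mesh = toy_from_pretrained("llama3.1-8b-Instruct", own_mesh)
```

A return shape that depends on the arguments is convenient at the call site, but callers have to remember to unpack only in the first form.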

FIXES: b/492376313

Tests

Ran locally

Checklist

Before submitting this PR, please make sure (put X in square brackets):

I have performed a self-review of my code. For an optional AI review, add the gemini-review label.
I have necessary comments in my code, particularly in hard-to-understand areas.
I have run end-to-end tests and provided workload links above if applicable.
I have made or will make corresponding changes to the doc if needed, including adding new documentation pages to the relevant Table of Contents (toctree directive) as explained in our documentation.

codecov · 2026-03-18T21:51:01Z

Codecov Report

❌ Patch coverage is 47.41379% with 61 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
src/maxtext/utils/model_creation_utils.py	48.11%	46 Missing and 9 partials ⚠️
src/maxtext/trainers/post_train/rl/train_rl.py	33.33%	6 Missing ⚠️
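
The percentages and missing-line counts in the report are enough to recover the patch size. The sketch below assumes the patch percentage is covered lines divided by all patch lines, with partials counted as not covered. That is an assumption about the formula, although it is consistent with the numbers reported here. implied_totals is a hypothetical helper written for this example.

```python
def implied_totals(missing: int, coverage_pct: float) -> tuple[int, int]:
    """Return (total_lines, covered_lines) implied by a missing-line
    count and a coverage percentage, rounding to whole lines."""
    total = round(missing / (1 - coverage_pct / 100))
    return total, total - missing


# Whole patch: 61 lines missing at 47.41379% coverage.
print(implied_totals(61, 47.41379))  # (116, 55)

# model_creation_utils.py: 46 missing + 9 partials at 48.11%.
print(implied_totals(46 + 9, 48.11))  # (106, 51)

# train_rl.py: 6 missing at 33.33%.
print(implied_totals(6, 33.33))  # (9, 3)
```

The two listed files account for 115 of the 116 implied patch lines. The table only lists files with missing lines, so the remaining line presumably sits in a file that is fully covered.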

wang2yn84

Thank you very much!

richjames0

lgtm

A9isha changed the title ~~Anisha from pretrained3~~ Add simplified APIs for model obtaining maxtext models Mar 18, 2026

A9isha force-pushed the anisha-from-pretrained3 branch from 1e13fdc to ca7b4be Compare March 23, 2026 22:29

A9isha requested a review from abhinavclemson as a code owner April 9, 2026 23:33

A9isha closed this Apr 10, 2026

A9isha force-pushed the anisha-from-pretrained3 branch from 00de2a3 to 41a4e9d Compare April 10, 2026 06:16

A9isha reopened this Apr 10, 2026

A9isha force-pushed the anisha-from-pretrained3 branch 4 times, most recently from d47d478 to 5f6f563 Compare April 10, 2026 07:08

wang2yn84 approved these changes Apr 10, 2026

Comment thread src/maxtext/configs/pyconfig.py

Comment thread src/maxtext/utils/model_creation_utils.py

NicoGrande reviewed Apr 10, 2026

A9isha force-pushed the anisha-from-pretrained3 branch 2 times, most recently from c4097a8 to ff01066 Compare April 15, 2026 06:49

igorts-git reviewed Apr 15, 2026

Comment thread src/maxtext/configs/pyconfig.py Outdated

Comment thread src/maxtext/configs/pyconfig.py Outdated

Comment thread src/maxtext/utils/model_creation_utils.py

A9isha force-pushed the anisha-from-pretrained3 branch from ff01066 to 8195563 Compare April 15, 2026 20:20

add from_pretrained as simple API

ef03866

A9isha force-pushed the anisha-from-pretrained3 branch from 8195563 to ef03866 Compare April 15, 2026 20:46

igorts-git approved these changes Apr 15, 2026

NicoGrande approved these changes Apr 15, 2026

richjames0 approved these changes Apr 16, 2026

A9isha added the pull ready label Apr 16, 2026

copybara-service Bot merged commit 5182e3b into main Apr 20, 2026
49 of 50 checks passed

copybara-service Bot deleted the anisha-from-pretrained3 branch April 20, 2026 22:14

NuojCheng mentioned this pull request Apr 22, 2026

Update vllm rule for better EP performance #3718

Merged

4 tasks

copybara-service[bot] merged 1 commit into main from anisha-from-pretrained3

5 participants
