[NNX] NNX migration prep (8/N): NNX native lora grpo by ecnal-cienet · Pull Request #3824 · AI-Hypercomputer/maxtext

ecnal-cienet · 2026-05-06T14:33:19Z

NNX Migration Route Map

1. ✅ Add NNX scaffolding: pure_nnx flag, init_state_fn, TrainStateNNX, NNX utils. Linen workflow unchanged. (PR NNX migration prep (1/N): pure_nnx flag and init_state_fn scaffolding #3427)
2. ✅ NNX sharding utilities: get_abstract_state_nnx, get_named_sharding_nnx, set_named_sharding_nnx, get_partition_spec_nnx, get_mesh_from_config. (PR NNX migration prep (2/N): NNX utils and sharding utilities #3470)
3. ✅ NNX fully supported end-to-end: TrainStateNNX, model creation, gradient accumulation, checkpointing, and training loop dispatch. (PR NNX migration prep (3/N): TrainState, model creation, and end-to-end training loop #3500)
4. ✅ Sharding diagnostics on NNX, plus post-training bugfixes that surfaced once the NNX path got exercised end-to-end. (PR [NNX] NNX migration prep (4/N): sharding tools and post-training fixes #3652)
4.5. ✅ Linen↔NNX checkpoint converter. (PR [NNX] NNX migration prep (4.5/N): Linen<->NNX checkpoint converter #3843)
4.6. ❌ Linen↔NNX checkpoint comparator (sibling branch on PR4.5).
5. ✅ NNX correctness fixes, feature enablements, and vocab tiling on NNX.
6. ✅ NNX-native DPO.
7. ✅ NNX-native MaxEngine inference. (PR [NNX] NNX migration prep (7/N): NNX-native MaxEngine inference #3821)
8. 🔄 [This PR] NNX-native LoRA + GRPO. NNX-native serving / decode-checkpoint LoRA via apply_lora_on_base_params_nnx / unapply_lora_from_base_params_nnx / get_lora_abstract_state_nnx (the maxengine pure_nnx + LoRA carve-out from PR7 is cleared); NNX-native GRPO trainer via grpo_loss_fn_nnx + compute_log_probs_nnx + NNX setup_train_loop/train_step/eval_step paths. Stacks on PR7.
9. ❌ NNX-aware QK-Clip + remaining checkpoint utilities.
9.5. ❌ NNX + AQT in MaxEngine + serve-mode reload + gpt3 prefill fix.
10. ❌ Vocab tiling custom_vjp for NNX.
11. ❌ Set NNX defaults to True; regenerate sharding goldens; flip back integration-test pure_nnx=False annotations.
12. ❌ Delete Linen-specific code paths and NNX compatibility flags.

Description

This PR implements NNX-native LoRA serving and NNX-native GRPO by adding NNX-shape walkers and step helpers alongside the existing Linen ones, then dispatching on config.pure_nnx. Every NNX modification is gated by if config.pure_nnx:, preserving the Linen path byte-for-byte. The diff spans +551 / −84 across 5 source files, plus 2 new test files (515 lines).

Part 1: NNX-shape LoRA Walkers

New helpers in src/maxtext/utils/lora_utils.py operating on nnx.State pure trees (no {"params": ...} outer wrap):

apply_lora_on_base_params_nnx mutates base_params in place: W += B @ A * scale at target attention paths
unapply_lora_from_base_params_nnx is the symmetric inverse
get_lora_abstract_state_nnx walks the abstract state.model substate and emits a parallel tree with lora_a.kernel/lora_b.kernel leaves at target attention paths and None elsewhere
_nnx_param_subtree drops the outer TrainStateNNX wrapping

The base model stays pristine; "apply" merges the delta into the kernel, "unapply" reverses. No nnx.LoRA wrapper, no model surgery. The on-disk format (HuggingFace PEFT-style lora_a.kernel / lora_b.kernel) round-trips between Linen and NNX consumers unchanged.
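For readers who want the merge arithmetic spelled out, here is a minimal sketch of the apply/unapply round trip. It is not the code in lora_utils.py: the flat dict layout, the path strings, the 2-D shapes, and the names apply_lora / unapply_lora are assumptions made for illustration (the real walkers operate on nnx.State pure trees). Only the update rule W += B @ A * scale and the "None at non-target paths" convention come from the description above.

```python
import jax
import jax.numpy as jnp


def apply_lora(base_params, lora_params, scale):
  """Merge W += (B @ A) * scale in place at paths that carry an adapter."""
  for path, adapter in lora_params.items():
    if adapter is None:  # non-target path: nothing to merge
      continue
    delta = adapter["lora_b.kernel"] @ adapter["lora_a.kernel"]
    base_params[path] = base_params[path] + delta * scale


def unapply_lora(base_params, lora_params, scale):
  """Symmetric inverse: subtract the same delta."""
  apply_lora(base_params, lora_params, -scale)


k_w, k_m, k_a, k_b = jax.random.split(jax.random.PRNGKey(0), 4)
base_params = {
    "attention.query.kernel": jax.random.normal(k_w, (8, 4)),
    "mlp.wi.kernel": jax.random.normal(k_m, (8, 16)),
}
lora_params = {
    "attention.query.kernel": {
        "lora_a.kernel": jax.random.normal(k_a, (2, 4)),  # A: [rank, 4]
        "lora_b.kernel": jax.random.normal(k_b, (8, 2)),  # B: [8, rank]
    },
    "mlp.wi.kernel": None,  # not a LoRA target
}
original = dict(base_params)  # shallow copy; JAX arrays are immutable

apply_lora(base_params, lora_params, scale=0.5)
merged = dict(base_params)
unapply_lora(base_params, lora_params, scale=0.5)
```

After the round trip the target kernel matches the original up to floating-point error, and the non-target kernel is never touched.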

Part 2: LoRA Dispatch in `setup_initial_lora_state` and `load_adapter`

Both top-level entry points in lora_utils.py branch on config.pure_nnx:

NNX init builds the abstract base via model_creation_utils.create_nnx_abstract_model + TrainStateNNX(model, optimizer)
Linen branch is the original init_initial_state + get_lora_abstract_state path, untouched
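The dispatch shape used by both entry points can be sketched as follows. Config, the two private helpers, and their return values are hypothetical placeholders; the only thing taken from the PR is the pattern of branching early on config.pure_nnx and leaving the Linen path as the fall-through.

```python
from dataclasses import dataclass


@dataclass
class Config:  # hypothetical stand-in for the MaxText config object
  pure_nnx: bool = False


def _setup_lora_state_linen(config):
  return "linen"  # placeholder for the unchanged Linen path


def _setup_lora_state_nnx(config):
  return "nnx"  # placeholder for the NNX abstract-model path


def setup_lora_state(config):
  """Early dispatch: the Linen branch runs as before when the flag is off."""
  if config.pure_nnx:
    return _setup_lora_state_nnx(config)
  return _setup_lora_state_linen(config)
```

Because pure_nnx defaults to False in this sketch, existing callers that never set the flag keep getting the Linen result.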

Part 3: MaxEngine LoRA Carve-out Cleared

src/maxtext/inference/maxengine/maxengine.py:

load_single_adapter no longer raises NotImplementedError on pure_nnx
apply_adapter / unapply_adapter branch on config.pure_nnx to call the _nnx siblings

Part 4: GRPO Loss and Step Helpers

src/maxtext/experimental/rl/grpo_trainer.py:

grpo_loss_fn_nnx(policy_model, config, data, dropout_rng, params, reference_model, is_train). Signature matches Linen grpo_loss_fn so callers dispatch on the same shape. dropout_rng and params are unused on NNX; reference_model is a frozen nnx.Module and the reference forward is wrapped in stop_gradient. Returns (loss, LossAux), same dataclass as Linen.
_train_step_nnx: nnx.merge(graphdef, state) to reconstruct TrainStateNNX, value_and_grad over policy params, state.apply_gradients(grads), return nnx.state(new_state, nnx.Not(nnx.Intermediate)).
_eval_step_nnx: same merge + loss-fn call, no state update.
train_step / eval_step dispatch early on config.pure_nnx; the Linen branches are kept verbatim.
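To make the loss-side behavior concrete, below is a rough sketch of a GRPO-style per-token objective with a frozen reference. It is an illustration under assumptions, not grpo_loss_fn_nnx itself: the function name, the argument layout (precomputed log-probs rather than models), the omission of ratio clipping and masking, and the specific KL estimator (the exp(d) - d - 1 form commonly used with GRPO) are all assumed. What it does mirror from the description is the stop_gradient on the reference and the beta = 0 case reporting no KL.

```python
import jax
import jax.numpy as jnp


def grpo_token_loss(policy_logp, reference_logp, advantages, beta):
  """Per-token GRPO-style objective; returns (loss, avg_kl)."""
  reference_logp = jax.lax.stop_gradient(reference_logp)  # frozen reference
  # Ratio against a detached copy of the policy: its value is 1, but it
  # still carries the policy gradient.
  ratio = jnp.exp(policy_logp - jax.lax.stop_gradient(policy_logp))
  objective = ratio * advantages[:, None]
  avg_kl = None
  if beta != 0.0:  # beta is a plain Python float here, not a traced value
    diff = reference_logp - policy_logp
    kl = jnp.exp(diff) - diff - 1.0  # non-negative KL estimator
    objective = objective - beta * kl
    avg_kl = jnp.mean(kl)
  return -jnp.mean(objective), avg_kl


logp = jnp.log(jnp.full((2, 3), 0.25))  # [B, S-1] per-token log-probs
adv = jnp.array([1.0, -1.0])            # one advantage per sequence
grad_fn = jax.value_and_grad(grpo_token_loss, has_aux=True)
(loss, avg_kl), grads = grad_fn(logp, logp, adv, 0.1)
(_, no_kl), _ = grad_fn(logp, logp, adv, 0.0)
```

With identical policy and reference log-probs the KL term is zero, which is the same property the identical-policy/reference unit test checks on the real loss function.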

Part 5: GRPO setup_train_loop on NNX

grpo_trainer.py::setup_train_loop:

Builds training and inference models via mt.from_config(rngs=create_nnx_rngs(...))
Initializes state via create_nnx_abstract_model + TrainStateNNX(model, optimizer, reference_model=...)
Reference uses the same init seed as policy and is never updated by apply_gradients (sibling field on TrainStateNNX, not embedded in params)
The "WARNING: GRPO RL trainer does not yet support pure_nnx natively" log is removed
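The "reference is a sibling field" point can be illustrated with a small plain-JAX sketch. TrainState here is a hypothetical NamedTuple standing in for TrainStateNNX, and loss_fn is a toy regression loss rather than the GRPO loss; the sketch only shows that when the reference lives next to the policy params instead of inside them, apply_gradients has no way to update it.

```python
from typing import Any, NamedTuple

import jax
import jax.numpy as jnp


class TrainState(NamedTuple):  # hypothetical stand-in for TrainStateNNX
  params: Any            # policy parameters, updated every step
  reference_params: Any  # sibling field, never touched by apply_gradients

  def apply_gradients(self, grads, lr=0.1):
    new_params = jax.tree_util.tree_map(
        lambda p, g: p - lr * g, self.params, grads)
    return self._replace(params=new_params)


def loss_fn(params, reference_params, x):
  # Toy loss: push the policy output one unit away from the frozen reference.
  ref_out = jax.lax.stop_gradient(x @ reference_params["w"])
  return jnp.mean((x @ params["w"] - ref_out - 1.0) ** 2)


def train_step(state, x):
  loss, grads = jax.value_and_grad(loss_fn)(
      state.params, state.reference_params, x)
  return state.apply_gradients(grads), loss


# Policy and reference start from the same initialization (same seed).
init = {"w": jax.random.normal(jax.random.PRNGKey(0), (4, 2))}
state = TrainState(params=init, reference_params=init)
new_state, loss = train_step(state, jnp.ones((3, 4)))
```

After one step the policy weights have moved while reference_params still holds the initial values.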

Part 6: GRPO train_loop NNX Branches

grpo_trainer.py::train_loop — three Linen-coupled spots branched on pure_nnx:

Initial reference seeding is skipped on NNX (already set up by init_state_fn)
metric_logger.write_setup_info_to_tensorboard receives a flat nnx.Param state on NNX
Checkpoint save passes the whole TrainStateNNX on NNX; the Linen _split_grpo_state(state)[0] strip is bypassed

The reshard call routes to pathways_reshard_nnx when pure_nnx is set. New helpers in grpo_utils.py:

compute_log_probs_nnx: NNX model is called directly; intermediates pulled via nnx.state(model, nnx.Intermediate).to_pure_dict()
pathways_reshard_nnx: splits state.model to a flat nnx.Param state, reshards onto the inference mesh, calls inference_engine.update_params(...)
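As a shape-level illustration of the log-prob computation, the sketch below turns logits into per-token log-probabilities of the next token, which is why a [B, S] batch yields [B, S-1] values. token_log_probs is a hypothetical helper that takes logits directly; the real compute_log_probs_nnx calls the NNX model to obtain them and also handles intermediates, which this sketch leaves out.

```python
import jax
import jax.numpy as jnp


def token_log_probs(logits, tokens):
  """logits: [B, S, V], tokens: [B, S] -> log p(next token): [B, S-1]."""
  # Position t predicts token t+1, so drop the last logit and the first token.
  log_probs = jax.nn.log_softmax(logits[:, :-1, :], axis=-1)
  targets = tokens[:, 1:]
  picked = jnp.take_along_axis(log_probs, targets[..., None], axis=-1)
  return picked[..., 0]


batch, seq, vocab = 2, 5, 7
logits = jnp.zeros((batch, seq, vocab))  # uniform distribution over vocab
tokens = jnp.arange(batch * seq).reshape(batch, seq) % vocab
out = token_log_probs(logits, tokens)
```

With all-zero logits every token has probability 1/7, so each entry of out equals -log(7).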

Part 7: Carve-outs (NotImplementedError Sites)

| Feature | Tracked In |
| --- | --- |
| GRPO + `gradient_accumulation_steps > 1` | Follow-up |
| GRPO + `scan_layers=False` | Follow-up (needs an NNX-aware unscan helper) |
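A guard for these carve-outs might look like the sketch below. The Config dataclass, the function name check_grpo_nnx_supported, the error messages, and the assumption that the checks apply only when pure_nnx is set are all illustrative; the PR only states that these two combinations raise NotImplementedError.

```python
from dataclasses import dataclass


@dataclass
class Config:  # hypothetical stand-in for the MaxText config
  pure_nnx: bool = True
  gradient_accumulation_steps: int = 1
  scan_layers: bool = True


def check_grpo_nnx_supported(config):
  """Raise early for combinations the NNX GRPO path does not cover yet."""
  if not config.pure_nnx:
    return  # assumed: the Linen path is not affected by these carve-outs
  if config.gradient_accumulation_steps > 1:
    raise NotImplementedError(
        "GRPO + gradient_accumulation_steps > 1 is not supported on NNX yet.")
  if not config.scan_layers:
    raise NotImplementedError(
        "GRPO + scan_layers=False is not supported on NNX yet.")
```

Failing at setup time keeps an unsupported configuration from running partway into training before it breaks.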

Tests

New unit tests (tests/unit/lora_utils_nnx_test.py, 10 tests):

5 on get_lora_abstract_state_nnx: q/k/v/o shape derivation, target-vs-non-target masking, sharding propagation, leaf type validation, error paths
3 on apply_lora_on_base_params_nnx: apply→unapply identity, target-only mutation, numerical parity vs Linen apply_lora_on_base_params on the same random inputs
2 Linen regression smoke tests on apply_lora_on_base_params and unapply_lora_from_base_params (no existing unit test for these helpers in the tree)

New unit tests (tests/unit/grpo_nnx_test.py, 8 tests):

5 on grpo_loss_fn_nnx: LossAux shape parity, signature compatibility, identical-policy/reference → zero KL, grpo_beta=0 → aux.avg_kl=None, finite policy grads
1 on compute_log_probs_nnx: shape [B, S] → [B, S-1]
2 Linen regression smoke tests on grpo_loss_fn and compute_log_probs (the existing Linen integration test is TPU-only and currently @pytest.mark.skip)

Modified test: tests/unit/maxengine_test.py swaps test_lora_raises_for_nnx (asserted NotImplementedError) for test_lora_load_single_adapter_reaches_loader_on_nnx (asserts FileNotFoundError from the loader).

Existing Linen tests: untouched and still pass; pure_nnx=False stays default.

Test results: 198 passed, 1 skipped (pre-existing CPU-only skip) across the broader NNX regression sweep — maxengine_test, dpo_nnx_test, train_nnx_test, lora_utils_nnx_test, grpo_nnx_test, train_state_nnx_test, train_utils_nnx_test, gradient_accumulation_nnx_test, linen_nnx_converter_test, compare_linen_nnx_checkpoint_test.

Linting: bash lint.sh — pyink + pylint 10.00/10.

Checklist

Before submitting this PR, please make sure (put X in square brackets):

I have performed a self-review of my code. For an optional AI review, add the gemini-review label.
I have necessary comments in my code, particularly in hard-to-understand areas.
I have run end-to-end tests and provided workload links above if applicable.
I have made or will make corresponding changes to the doc if needed, including adding new documentation pages to the relevant Table of Contents (toctree directive) as explained in our documentation.

codecov · 2026-05-06T17:49:34Z

Codecov Report

❌ Patch coverage is 60.90909% with 43 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| src/maxtext/utils/lora_utils.py | 60.90% | 36 Missing and 7 partials ⚠️ |


github-actions · 2026-06-02T00:43:47Z

🤖 Hi @ecnal-cienet, I've received your request, and I'm working on it now! You can track my progress in the logs for more details.

github-actions

## 📋 Review Summary

The Pull Request successfully implements NNX-native support for LoRA serving and GRPO training, which is a key milestone in the NNX migration for MaxText. The changes are comprehensive, covering utilities, trainer logic, and inference engine integration, while maintaining parity with the existing Linen implementation. The addition of thorough unit tests for both GRPO and LoRA in NNX ensures numerical correctness and structural integrity.

🔍 General Feedback

Preservation of Logic: The migration faithfully reproduces the Linen logic, including specialized LoRA update walkers and GRPO loss functions, which minimizes the risk of regression.
Variable Naming: As noted in the inline comments, some variable naming in the LoRA utilities is inherited from a confusing pattern in the Linen path. While correct, cleaning this up in the NNX implementation would improve long-term maintainability.
Modularity: The separation of NNX-specific logic into *_nnx functions and branches is well-handled and keeps the codebase clean during this transition period.
Testing: Excellent test coverage with parity checks against Linen is a major highlight of this PR.


ecnal-cienet changed the title ~~Feat/nnx native lora grpo~~ [NNX] NNX migration prep (8/N): Feat/nnx native lora grpo May 6, 2026

ecnal-cienet changed the title ~~[NNX] NNX migration prep (8/N): Feat/nnx native lora grpo~~ [NNX] NNX migration prep (8/N): native lora grpo May 6, 2026

ecnal-cienet changed the title ~~[NNX] NNX migration prep (8/N): native lora grpo~~ [NNX] NNX migration prep (8/N): NNX native lora grpo May 6, 2026

ecnal-cienet force-pushed the feat/nnx-native-lora-grpo branch 6 times, most recently from 03c43e2 to 626ce66 Compare May 7, 2026 21:48

ecnal-cienet force-pushed the feat/nnx-native-lora-grpo branch 12 times, most recently from 78049f9 to 6c65652 Compare May 14, 2026 22:51

ecnal-cienet mentioned this pull request May 15, 2026

[NNX] NNX migration (11/N): set pure_nnx / enable_nnx / pure_nnx_decoder defaults to True #3526

Merged

4 tasks

ecnal-cienet force-pushed the feat/nnx-native-lora-grpo branch 4 times, most recently from b47ad17 to 82af9cb Compare May 20, 2026 00:53

ecnal-cienet requested review from NuojCheng, SurbhiJainUSC, abhinavclemson, aireenmei, bvandermoon, dipannita08, gobbleturk, hengtaoguo, igorts-git, jesselu-google and jiangjy1982 as code owners May 26, 2026 21:29

ecnal-cienet force-pushed the feat/nnx-native-lora-grpo branch 7 times, most recently from 47d28ee to 2b3f99f Compare May 29, 2026 22:10

cgarciae reviewed Jun 1, 2026


Comment thread src/maxtext/experimental/rl/grpo_trainer.py Outdated

github-actions Bot reviewed Jun 2, 2026


Comment thread src/maxtext/utils/lora_utils.py Outdated

Comment thread src/maxtext/utils/lora_utils.py Outdated

Comment thread src/maxtext/experimental/rl/grpo_trainer.py

bvandermoon approved these changes Jun 2, 2026


Comment thread src/maxtext/utils/lora_utils.py

NNX: native LoRA + GRPO (drop maxengine LoRA carve-out, drop GRPO pure_nnx warning)

da02ec8

cgarciae approved these changes Jun 2, 2026


ecnal-cienet mentioned this pull request Jun 8, 2026

[NNX] NNX migration (12/N): delete Linen code paths, classes, and NNX compatibility flags #4038

Draft

4 tasks

copybara-service[bot] merged 1 commit into main from feat/nnx-native-lora-grpo

3 participants
