Feat/Decoder migration: DeepSeek/Gemma3/Gemma4/Llama4 by hsuan-lun-chiang · Pull Request #3114 · AI-Hypercomputer/maxtext

hsuan-lun-chiang · 2026-02-09T07:45:44Z

Description

Implement and update the following models in the NNX decoder that were not supported in the previous PR 2831:

DeepSeek
Gemma 3, 4
Llama4

Tests

Tested with different models and compared against Linen training. Details are in the GDoc file.
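
As a rough illustration of what a Linen/NNX parity check of this kind could look like, the sketch below compares two per-step loss curves within a relative tolerance. The function name `losses_match`, the tolerance, and the sample values are hypothetical; they are not taken from this PR, the GDoc, or MaxText.

```python
import math


def losses_match(linen_losses, nnx_losses, rel_tol=1e-3):
    """Compare two per-step loss sequences.

    Returns (ok, mismatches), where mismatches is a list of
    (step, linen_loss, nnx_loss) tuples that differ by more than rel_tol.
    """
    if len(linen_losses) != len(nnx_losses):
        raise ValueError("loss curves must cover the same number of steps")
    mismatches = [
        (step, a, b)
        for step, (a, b) in enumerate(zip(linen_losses, nnx_losses))
        if not math.isclose(a, b, rel_tol=rel_tol)
    ]
    return (not mismatches, mismatches)


# Made-up values for illustration only (not real training logs).
linen = [10.82, 9.47, 8.13]
nnx = [10.82, 9.47, 8.50]
ok, diffs = losses_match(linen, nnx)
```

In practice the tolerance would depend on dtype, hardware, and how deterministic the two runs are, so the value above is only a placeholder.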

Checklist

Before submitting this PR, please make sure (put X in square brackets):

I have performed a self-review of my code. For an optional AI review, add the gemini-review label.
I have necessary comments in my code, particularly in hard-to-understand areas.
I have run end-to-end tests and provided workload links above if applicable.
I have made or will make corresponding changes to the doc if needed, including adding new documentation pages to the relevant Table of Contents (toctree directive) as explained in our documentation.

codecov · 2026-02-09T08:33:49Z

Codecov Report

❌ Patch coverage is 31.34328% with 138 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
src/maxtext/layers/nnx_decoders.py	29.53%	119 Missing and 17 partials ⚠️
src/maxtext/layers/initializers.py	71.42%	1 Missing and 1 partial ⚠️

RissyRan · 2026-04-15T19:47:07Z

Thanks for the change! Before the review, could you ensure those are tested? Thank you!

cc @bvandermoon @entrpn

hsuan-lun-chiang · 2026-04-16T06:25:32Z

> Thanks for the change! Before the review, could you ensure those are tested? Thank you!
>
> cc @bvandermoon @entrpn

Hi @RissyRan, the Linen/NNX comparison logs for Llama4/DeepSeek/Gemma4 are in the GDoc file. The PR passes all unit tests with the NNX flags (enable_nnx=True and pure_nnx_decoder=True), except for a few cases that require further discussion, which we’ve documented here. Would you mind taking a look? Thank you!

hsuan-lun-chiang force-pushed the feat/Migrate-Decoder-And-Tests-to-NNX branch from 7faa14d to edbbf29 Compare February 9, 2026 08:15

hsuan-lun-chiang force-pushed the feat/Migrate-Decoder-And-Tests-to-NNX branch 9 times, most recently from 8a3e073 to 2f30ac1 Compare February 12, 2026 11:19

hsuan-lun-chiang force-pushed the feat/Migrate-Decoder-And-Tests-to-NNX branch 6 times, most recently from 1a5740b to 5725403 Compare February 26, 2026 07:27

charlesli640 requested changes Feb 26, 2026

Comment thread tests/unit/multi_token_prediction_test.py Outdated

hsuan-lun-chiang force-pushed the feat/Migrate-Decoder-And-Tests-to-NNX branch 12 times, most recently from de4ec11 to 29a9b74 Compare March 6, 2026 09:31

hsuan-lun-chiang force-pushed the feat/Migrate-Decoder-And-Tests-to-NNX branch 4 times, most recently from 047b91e to 1d3cc0c Compare March 16, 2026 08:03

hsuan-lun-chiang force-pushed the feat/Migrate-Decoder-And-Tests-to-NNX branch 6 times, most recently from 13cfedf to d00508e Compare March 23, 2026 08:52

hsuan-lun-chiang closed this Mar 23, 2026

hsuan-lun-chiang force-pushed the feat/Migrate-Decoder-And-Tests-to-NNX branch from 40f33b8 to c4b5e64 Compare March 23, 2026 09:40

hsuan-lun-chiang reopened this Mar 23, 2026

hsuan-lun-chiang force-pushed the feat/Migrate-Decoder-And-Tests-to-NNX branch 4 times, most recently from 80ebfcb to f0ddf63 Compare March 25, 2026 10:00

charlesli640 reviewed Mar 26, 2026

Comment thread tests/unit/nnx_decoder_test.py Outdated

hsuan-lun-chiang force-pushed the feat/Migrate-Decoder-And-Tests-to-NNX branch 3 times, most recently from e1bc3f2 to 21fd4f5 Compare April 1, 2026 07:18

hsuan-lun-chiang force-pushed the feat/Migrate-Decoder-And-Tests-to-NNX branch 6 times, most recently from 06657a2 to 53f3052 Compare April 9, 2026 09:11

Commit 4f46176: Implement and update the following models in NNX decoder: DeepSeek/Gemma3/Llama4

hsuan-lun-chiang wants to merge 1 commit into AI-Hypercomputer:main from CIeNET-International:feat/Migrate-Decoder-And-Tests-to-NNX


3 participants
