Update README with recent highlights, benchmarks, and support matrix fixes (#1508)

trvachov · web-flow · commit 85d145f5c914 · 2026-03-19T04:29:34.000Z

Add new Recent News entries for Qwen3, ESM2 low-precision (NVFP4/MXFP8), Mixtral MoE, ESM2 PEFT, and Llama3 context parallelism. Add benchmark figures for ESM2 low-precision on B300 and Llama3 70B CP on GB300. Update support matrix with new model/recipe rows and fix stale WIP statuses. Fix typos ("bionemo2" → "BioNeMo Framework") and remove stray character in amplify README.

Signed-off-by: Timur Rvachov <trvachov@nvidia.com>
diff --git a/README.md b/README.md
@@ -43,6 +43,11 @@ cd bionemo-framework/bionemo-recipes/recipes/esm2_native_te/
 
 ## Recent News
 
+- 03/09/2026 [Qwen2.5 / Qwen3 model](bionemo-recipes/models/qwen/) with TE acceleration, FP8/MXFP8, KV-cache inference, and bidirectional HF checkpoint conversion.
+- 03/05/2026 [ESM2 NVFP4 and MXFP8](bionemo-recipes/recipes/esm2_native_te/README.md#low-precision-performance-benchmarks) low-precision training — up to **2,367 TFLOPS/GPU** on NVIDIA B300 at 15B scale with per-layer precision control.
+- 02/23/2026 [Mixtral MoE model](bionemo-recipes/models/mixtral/) with TE `GroupedLinear` for efficient parallel expert computation, FP8/FP4 support, and HF conversion.
+- 02/13/2026 [ESM2 PEFT recipe](bionemo-recipes/recipes/esm2_peft_te/) for LoRA fine-tuning with sequence packing support.
+- 01/14/2026 [Llama3 Context Parallelism](bionemo-recipes/recipes/llama3_native_te/README.md#performance-benchmarks) — scaling Llama 3 70B to 144K context on 36x GB300 NVL36 with ~65% MFU.
 - 10/27/2025 [CodonFM recipe](https://github.com/NVIDIA/bionemo-framework/tree/main/bionemo-recipes/recipes/codonfm_ptl_te) released! This is an accelerated version of the original [research codebase](https://github.com/NVIDIA-Digital-Bio/CodonFM) with [scientific preprint](https://research.nvidia.com/labs/dbr/assets/data/manuscripts/nv-codonfm-preprint.pdf).
 - 09/30/2025 Megatron/NeMo 5D parallel BioNeMo Framework image v2.7 [released on NGC](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/clara/containers/bionemo-framework) for both x86 and ARM CPUs.
 - 09/01/2025 [bionemo-recipes](https://github.com/NVIDIA/bionemo-framework/tree/main/bionemo-recipes) goes live! Lightweight and portable examples with state-of-the-art training performance you can riff on to meet your needs.
@@ -61,13 +66,18 @@ A core use-case of the BioNeMo Framework is to help digital biology scientists a
 | ---------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------- | -------------- | ----------- | ------------- | ------ | ---------------- | ------ | ------------------- |
 | `models/`<br>`amplify`                         | TE accelerated protein BERT, pushed to HuggingFace                                                                            | ✅ Active      | ❌          | ✅            | ✅     | 🚧 WIP           | ✅     | 🚧 WIP              |
 | `models/`<br>`esm2`                            | TE accelerated protein BERT, pushed to HuggingFace                                                                            | ✅ Active      | ❌          | ✅            | ✅     | ✅               | ✅     | ✅                  |
-| `models/`<br>`llama3`                          | TE accelerated Llama 3                                                                                                        | ✅ Active      | ❌          | 🚧 WIP        | ✅     | ✅               | 🚧 WIP | 🚧 WIP              |
+| `models/`<br>`llama3`                          | TE accelerated Llama 3                                                                                                        | ✅ Active      | ❌          | 🚧 WIP        | ✅     | ✅               | ✅     | ✅                  |
 | `models/`<br>`geneformer`                      | TE accelerated single-cell BERT                                                                                               | 🚧 WIP         | ❌          | ✅            | 🚧 WIP | 🚧 WIP           | 🚧 WIP | 🚧 WIP              |
 | `recipes/`<br>`codonfm_ptl_te`                 | Recipe for [CodonFM](https://research.nvidia.com/labs/dbr/assets/data/manuscripts/nv-codonfm-preprint.pdf)'s Encodon using TE | ✅ Active      | ❌          | 🚧 WIP        | ✅     | ✅               | 🚧 WIP | 🚧 WIP              |
 | `recipes/`<br>`esm2_accelerate_te`             | Recipe for ESM2 TE + HF Accelerate                                                                                            | ✅ Active      | ❌          | 🚧 WIP        | ✅     | ❌               | ✅     | 🚧 WIP              |
-| `recipes/`<br>`esm2_native_te`                 | Recipe for ESM2 TE + native PyTorch                                                                                           | ✅ Active      | ❌          | ✅            | ✅     | ✅               | ✅     | 🚧 WIP              |
+| `recipes/`<br>`esm2_native_te`                 | Recipe for ESM2 TE + native PyTorch                                                                                           | ✅ Active      | ❌          | ✅            | ✅     | ✅               | ✅     | ✅                  |
 | `recipes/`<br>`geneformer_native_te_mfsdp_fp8` | Recipe for Geneformer HF model                                                                                                | 🚧 WIP         | ❌          | ✅            | ✅     | ❌               | ✅     | 🚧 WIP              |
-| `recipes/`<br>`llama3_native_te`               | Recipe for Llama 3 TE + native PyTorch                                                                                        | ✅ Active      | ❌          | 🚧 WIP        | ✅     | ✅               | 🚧 WIP | 🚧 WIP              |
+| `recipes/`<br>`llama3_native_te`               | Recipe for Llama 3 TE + native PyTorch                                                                                        | ✅ Active      | ❌          | 🚧 WIP        | ✅     | ✅               | ✅     | ✅                  |
+| `models/`<br>`mixtral`                         | TE accelerated MoE model                                                                                                      | ✅ Active      | ❌          | 🚧 WIP        | ✅     | ✅               | ✅     | 🚧 WIP              |
+| `models/`<br>`qwen`                            | TE accelerated Qwen2.5/Qwen3                                                                                                  | ✅ Active      | ❌          | 🚧 WIP        | ✅     | ✅               | ✅     | 🚧 WIP              |
+| `recipes/`<br>`esm2_peft_te`                   | Recipe for ESM2 LoRA fine-tuning                                                                                              | ✅ Active      | ❌          | ❌            | ✅     | ✅               | 🚧 WIP | ❌                  |
+| `recipes/`<br>`evo2_megatron`                  | Recipe for Evo2 via Megatron Bridge                                                                                           | 🚧 WIP         | ❌          | ❌            | ✅     | ❌               | ✅     | ❌                  |
+| `recipes/`<br>`fp8_analysis`                   | FP8 training analyzer & heatmap tool                                                                                          | ✅ Active      | N/A         | N/A           | N/A    | N/A              | N/A    | N/A                 |
 | `recipes/`<br>`vit`                            | Recipe for Vision Transformer                                                                                                 | 🚧 WIP         | ❌          | ✅            | ✅     | ❌               | ✅     | 🚧 WIP              |
 
 </small>
@@ -113,7 +123,7 @@ BioNeMo Framework is part of a larger ecosystem of NVIDIA Biopharma products. Ge
 
 ## Documentation Resources
 
-- **Official Documentation:** Contents of `sub-packages` including user guides, API references, and troubleshooting, are documented on our [official documentation](https://docs.nvidia.com/bionemo-framework/latest/). Nightly builds of this documentation is available on [BioNeMo Framework GitHub Pages](https://nvidia.github.io/bionemo-framework/)
+- **Official Documentation:** Documentation for sub-packages, including user guides, API references, and troubleshooting, is available on our [official documentation](https://docs.nvidia.com/bionemo-framework/latest/). Nightly builds of this documentation is available on [BioNeMo Framework GitHub Pages](https://nvidia.github.io/bionemo-framework/)
 
 - **🚧 In-Progress Documentation 🚧:** `bionemo-recipes` documentation is currently work in progress, however the recipes are meant to be self-documented and easy to understand—we suggest you throw them into your favorite genai code assistant!
 
@@ -136,8 +146,8 @@ docker run --rm -it \
 
 #### Initializing 3rd-party dependencies as git submodules
 
-The NeMo and Megatron-LM dependencies are included as git submodules in bionemo2. The pinned commits for these submodules represent the "last-known-good" versions of these packages
-that are confirmed to be working with bionemo2 (and those that are tested in CI).
+The NeMo and Megatron-LM dependencies are included as git submodules in BioNeMo Framework. The pinned commits for these submodules represent the "last-known-good" versions of these packages
+that are confirmed to be working with BioNeMo Framework (and those that are tested in CI).
 
 To initialize these sub-modules when cloning the repo, add the `--recursive` flag to the git clone command:
 
diff --git a/bionemo-recipes/models/amplify/README.md b/bionemo-recipes/models/amplify/README.md
@@ -117,5 +117,3 @@ Or, upload all models at once with:
 ```bash
 for dir in *; do huggingface-cli upload nvidia/$(basename "$dir") "$dir/"; done
 ```
-
-z
