NVIDIA-BioNeMo
diff --git a/bionemo-recipes/README.md b/bionemo-recipes/README.md
Lines changed: 32 additions & 26 deletions
diff --git a/bionemo-recipes/models/README.md b/bionemo-recipes/models/README.md
Lines changed: 5 additions & 5 deletions
diff --git a/bionemo-recipes/models/amplify/README.md b/bionemo-recipes/models/amplify/README.md
Lines changed: 7 additions & 6 deletions
diff --git a/bionemo-recipes/models/esm2/README.md b/bionemo-recipes/models/esm2/README.md
Lines changed: 7 additions & 8 deletions
@@ -1,15 +1,15 @@
 # BioNeMo Recipes
 
-BioNeMo Recipes provides an easy path for the biological foundation model training community to scale up transformer-based models efficiently. Rather than offering a batteries-included training framework, we provide **model checkpoints** with TransformerEngine (TE) layers and **training recipes** that demonstrate how to achieve maximum throughput with popular open-source frameworks and fully sharded data parallel (FSDP) scale-out.
+BioNeMo Recipes provides an easy path for the biological foundation model training community to scale up transformer-based models efficiently. Rather than offering a batteries-included training framework, BioNeMo Recipes provides **model checkpoints** with TransformerEngine (TE) layers and **training recipes** that demonstrate how to achieve maximum throughput with popular open-source frameworks and fully sharded data parallel (FSDP) scale-out.
 
 ## Overview
 
-The biological AI community is actively prototyping model architectures and needs tooling that prioritizes extensibility, interoperability, and ease-of-use alongside performance. BioNeMo Recipes addresses this by offering:
+The biological AI community actively prototypes model architectures and needs tooling that prioritizes extensibility, interoperability, and ease-of-use, alongside performance. BioNeMo Recipes addresses this by offering:
 
-- **Flexible scaling**: Scale from single-GPU prototyping to multi-node training without complex parallelism configurations
+- **Flexible scaling**: Scales from single-GPU prototyping to multi-node training without complex parallelism configurations
 - **Framework compatibility**: Works with popular frameworks like HuggingFace Accelerate, PyTorch Lightning, and vanilla PyTorch
 - **Performance optimization**: Leverages TransformerEngine and megatron-FSDP for state-of-the-art training efficiency
-- **Research-friendly**: Hackable, readable code that researchers can easily adapt for their experiments
+- **Research-friendly**: Contains hackable and readable code that researchers can easily adapt for their experiments
 
 ### Performance Benchmarks
 
@@ -21,6 +21,8 @@ The biological AI community is actively prototyping model architectures and need
 
 ### Use Cases
 
+The use cases of BioNeMo Recipes include:
+
 - **Foundation Model Developers**: AI researchers and ML engineers developing novel biological foundation models who need to scale up prototypes efficiently
 - **Foundation Model Customizers**: Domain scientists looking to fine-tune existing models with proprietary data for drug discovery and biological research
 
@@ -48,9 +50,9 @@ Abbreviations:
 - BF16: [brain-float 16](https://en.wikipedia.org/wiki/Bfloat16_floating-point_format), a common 16 bit float format for deep learning.
 - FP8<sup>[1]</sup>: [8-bit floating point](https://docs.nvidia.com/deeplearning/transformer-engine/user-guide/examples/fp8_primer.html), a compact format for weights allowing for faster training and inference.
 - MXFP8<sup>[2]</sup>: [Multi Scale 8-bit floating point](https://docs.nvidia.com/deeplearning/transformer-engine/user-guide/examples/fp8_primer.html), as compact as FP8 but with better numerical precision.
-- NVFP4<sup>[2]</sup>: [NVIDIA 4-bit floating point](https://docs.nvidia.com/deeplearning/transformer-engine/user-guide/examples/fp8_primer.html#Beyond-FP8---training-with-NVFP4), faster than FP8, retaining accuracy via multi-scale.
-- THD: **T**otal **H**eads **D**imension, also known as ["sequence packing"](https://docs.nvidia.com/nemo-framework/user-guide/24.07/nemotoolkit/features/optimizations/sequence_packing.html#sequence-packing-for-sft-peft). A way to construct a batch with sequences of different length so there are no pads, therefore no compute is wasted on computing attention for padding tokens. This is in contrast to **B**atch **S**equence **H**ead **D**imension (BSHD) format, which uses pads to create a rectangular batch.
-- CP: Context parallel, also known as sequence parallel. A way to distribute the memory required to process long sequences across multiple GPUs. For more information please see [context parallel](./recipes/context_parallel.md)
+- NVFP4<sup>[2]</sup>: [NVIDIA 4-bit floating point](https://docs.nvidia.com/deeplearning/transformer-engine/user-guide/examples/fp8_primer.html#Beyond-FP8---training-with-NVFP4), faster than FP8, retaining accuracy using multi-scale.
+- THD: **T**otal **H**eads **D**imension, also known as ["sequence packing"](https://docs.nvidia.com/nemo-framework/user-guide/24.07/nemotoolkit/features/optimizations/sequence_packing.html#sequence-packing-for-sft-peft). A way to construct a batch with sequences of different lengths so there are no pads, which results in no compute wasted on computing attention for padding tokens. This is in contrast to **B**atch **S**equence **H**ead **D**imension (BSHD) format, which uses pads to create a rectangular batch.
+- CP: Context parallel, also known as sequence parallel. A way to distribute the memory required to process long sequences across multiple GPUs. For more information, refer to [context parallel](./recipes/context_parallel.md).
 
 \[1\]: Requires [compute capability](https://developer.nvidia.com/cuda-gpus) 9.0 and above (Hopper+) <br/>
 \[2\]: Requires [compute capability](https://developer.nvidia.com/cuda-gpus) 10.0 and 10.3 (Blackwell), 12.0 support pending <br/>
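
Illustrative sketch (not part of this diff): the THD entry above contrasts packed batches with padded BSHD batches. The pure-Python example below shows the difference on toy token lists. The helper names `pad_bshd` and `pack_thd` are hypothetical, and `cu_seqlens` follows a common naming convention for cumulative sequence lengths; none of these are taken from the BioNeMo Recipes code.

```python
from itertools import accumulate


def pad_bshd(seqs, pad_id=0):
    """Pad variable-length token lists into a rectangular [batch, max_len] batch."""
    max_len = max(len(s) for s in seqs)
    return [s + [pad_id] * (max_len - len(s)) for s in seqs]


def pack_thd(seqs):
    """Concatenate sequences into one flat token list plus cumulative boundaries."""
    tokens = [t for s in seqs for t in s]
    # cu_seqlens[i] is the offset where sequence i starts; the last entry is the total.
    cu_seqlens = [0] + list(accumulate(len(s) for s in seqs))
    return tokens, cu_seqlens


seqs = [[5, 6, 7], [8], [9, 10]]

padded = pad_bshd(seqs)
tokens, cu_seqlens = pack_thd(seqs)

padded_slots = sum(len(row) for row in padded)  # positions attention would process
packed_slots = len(tokens)                      # positions holding real tokens only
```

In this toy batch the padded layout carries three pad positions that the packed layout avoids; the boundaries in `cu_seqlens` are what would let an attention kernel keep the packed sequences from attending to each other.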
@@ -63,7 +65,7 @@ This repository contains two types of components:
 
 Huggingface-compatible `PreTrainedModel` classes that use TransformerEngine layers internally. These are designed to be:
 
-- **Distributed via Hugging Face Hub**: Pre-converted checkpoints available at [huggingface.co/nvidia](https://huggingface.co/nvidia)
+- **Distributed through Hugging Face Hub**: Pre-converted checkpoints available at [huggingface.co/nvidia](https://huggingface.co/nvidia)
 - **Drop-in replacements**: Compatible with `AutoModel.from_pretrained()` without additional dependencies
 - **Performance optimized**: Leverage TransformerEngine features like FP8 training and context parallelism
 
@@ -82,7 +84,11 @@ Recipes are **not pip-installable packages** but serve as reference implementati
 
 ## Quick Start
 
-### Using Models
+This section describes how you can get started with BioNeMo Recipes.
+
+### Loading Models
+
+Run the following code to load a BioNeMo model.
 
 ```python
 from transformers import AutoModel, AutoTokenizer
@@ -94,6 +100,8 @@ tokenizer = AutoTokenizer.from_pretrained("nvidia/AMPLIFY_120M")
 
 ### Running Recipes
 
+Build and run a recipe with the following commands.
+
 ```bash
 # Navigate to a recipe
 cd recipes/esm2_native_te_mfsdp
@@ -103,13 +111,9 @@ docker build -t esm2_recipe .
 docker run --rm -it --gpus all esm2_recipe python train.py
 ```
 
-______________________________________________________________________
+## Setting Up the Development Environment
 
-## Developer Guide
-
-### Setting Up Development Environment
-
-1. **Install pre-commit hooks:**
+1. Install pre-commit hooks:
 
    ```bash
    pre-commit install
@@ -130,9 +134,9 @@ ______________________________________________________________________
    docker run --rm -it --gpus all my_tag pytest -v .
    ```
 
-### Coding Guidelines
+## Coding Guidelines
 
-We prioritize **readability and simplicity** over comprehensive feature coverage:
+BioNeMo Recipes prioritizes **readability and simplicity** over comprehensive feature coverage:
 
 - **KISS (Keep It Simple) over DRY (Don't Repeat Yourself)**: It's better to have clear, duplicated code than complex
   abstractions
@@ -141,7 +145,7 @@ We prioritize **readability and simplicity** over comprehensive feature coverage
 
 ### Testing Strategy
 
-We use a three-tier testing approach:
+BioNeMo Recipes uses a three-tier testing approach:
 
 #### L0 Tests (Pre-merge)
 
@@ -166,9 +170,11 @@ We use a three-tier testing approach:
 
 ### Adding New Components
 
+With BioNeMo Recipes, you can add new components, including models and recipes.
+
 #### Adding a New Model
 
-Models should be pip-installable packages that can export checkpoints to Hugging Face. See the
+Models should be pip-installable packages that can export checkpoints to Hugging Face. Refer to the
 [models README](models/README.md) for detailed guidelines on:
 
 - Package structure and conventions
@@ -178,7 +184,7 @@ Models should be pip-installable packages that can export checkpoints to Hugging
 
 #### Adding a New Recipe
 
-Recipes should be self-contained Docker environments demonstrating specific training patterns. See
+Recipes should be self-contained Docker environments demonstrating specific training patterns. Refer to
 the [recipes README](recipes/README.md) for guidance on:
 
 - Directory structure and naming
@@ -209,14 +215,14 @@ We aim to provide the fastest available training implementations for biological
 
 ## Contributing
 
-We welcome contributions that advance the state of biological foundation model training. Please ensure your contributions:
+We welcome contributions that advance the state of biological foundation model training. Ensure your contributions:
 
-1. Follow our coding guidelines emphasizing clarity
-2. Include appropriate tests (L0 minimum, L1/L2 as applicable)
-3. Provide clear documentation and examples
-4. Maintain compatibility with our supported frameworks
+- Follow our coding guidelines emphasizing clarity
+- Include appropriate tests (L0 minimum, L1/L2 as applicable)
+- Provide clear documentation and examples
+- Maintain compatibility with our supported frameworks
 
-For detailed contribution guidelines, see our individual component READMEs:
+For detailed contribution guidelines, refer to our individual component READMEs:
 
 - [Models Development Guide](models/README.md)
 - [Recipes Development Guide](recipes/README.md)
 
@@ -1,14 +1,14 @@
 # Models Directory
 
-This directory contains HuggingFace-compatible model implementations that use TransformerEngine layers internally. These models are designed to be distributed via the Hugging Face Hub and serve as drop-in replacements for standard transformer models with enhanced performance.
+This directory contains HuggingFace-compatible model implementations that use TransformerEngine layers internally. These models are designed to be distributed through the Hugging Face Hub and serve as drop-in replacements for standard transformer models with enhanced performance.
 
 ## Overview
 
 Models in this directory are **not intended to be pip-installed directly**. Instead, they serve as:
 
-1. **Reference implementations** of biological foundation models using TransformerEngine
-2. **Conversion utilities** for transforming existing model checkpoints to TE-compatible format
-3. **Export tools** for preparing model releases on the Hugging Face Hub
+- **Reference implementations** of biological foundation models using TransformerEngine
+- **Conversion utilities** for transforming existing model checkpoints to TE-compatible format
+- **Export tools** for preparing model releases on the Hugging Face Hub
 
 Users will typically interact with these models by loading pre-converted checkpoints directly from the Hugging Face Hub using standard transformers APIs.
 
@@ -33,7 +33,7 @@ To add a new model to this directory, you must provide:
 #### 3. Checkpoint Export Script
 
 - **`export.py`**: Script that packages all necessary files for Hugging Face Hub upload
-- **Complete asset bundling**: Must include all required files (see [Export Requirements](#export-requirements))
+- **Complete asset bundling**: Must include all required files. For details, refer to [Export Requirements](#export-requirements).
 - **Automated process**: Should be runnable with minimal manual intervention
 
 #### 4. Open Source License
 
@@ -1,9 +1,8 @@
 # AMPLIFY Optimized with NVIDIA TransformerEngine
 
 This folder contains source code and tests for an AMPLIFY model that inherits from the transformers `PreTrainedModel`
-class and uses TransformerEngine layers. Users don't need to install this package directly, but can load the
-model directly from HuggingFace Hub using the standard transformers API (see [Inference Examples](#inference-examples)
-below).
+class and uses TransformerEngine layers. Users do not need to install this package directly, but can load the
+model directly from HuggingFace Hub using the standard transformers API. For more information, refer to [Inference Examples](#inference-examples).
 
 ## Feature support
 
@@ -18,7 +17,7 @@ The AMPLIFY implementation natively supports the following TransformerEngine-pro
 | **Import from HuggingFace checkpoints** | ✅ Supported               |
 | **Export to HuggingFace checkpoints**   | 🚧 Under development       |
 
-See [BioNeMo Recipes](../../recipes/README.md) for more details on how to use these features to accelerate model
+Refer to [BioNeMo Recipes](../../recipes/README.md) for more details on how to use these features to accelerate model
 training and inference.
 
 ## Links to HF checkpoints
@@ -34,7 +33,7 @@ Pre-trained AMPLIFY models are available on HuggingFace as part of the NVIDIA
 ## Runtime Requirements
 
 We recommend using the latest [NVIDIA PyTorch container](https://catalog.ngc.nvidia.com/orgs/nvidia/containers/pytorch)
-for optimal performance and compatibility. See the provided Dockerfile for details.
+for optimal performance and compatibility. Refer to the provided Dockerfile for details.
 
 ## Inference Examples
 
@@ -61,7 +60,7 @@ output = model(**inputs)
 ## Recipe Links
 
 Training recipes are available in the `bionemo-recipes/recipes/` directory. AMPLIFY can be trained using the same
-recipes as ESM-2, simply by switching the model_tag to reference the AMPLIFY model, e.g. `nvidia/AMPLIFY_120M`, and
+recipes as ESM-2, simply by switching the model_tag to reference the AMPLIFY model, such as `nvidia/AMPLIFY_120M`, and
 changing the dataset as appropriate.
 
 - **[esm2_native_te](../../recipes/esm2_native_te/)** - Demonstrates training with a simple native PyTorch training
@@ -118,3 +117,4 @@ Or, upload all models at once with:
 ```bash
 for dir in *; do huggingface-cli upload nvidia/$(basename "$dir") "$dir/"; done
 ```
+
@@ -2,8 +2,7 @@
 
 This folder contains source code and tests for an ESM-2 model that inherits from the transformers `PreTrainedModel`
 class and uses TransformerEngine layers. Users don't need to install this package directly, but can load the
-model directly from HuggingFace Hub using the standard transformers API (see [Inference Examples](#inference-examples)
-below).
+model directly from HuggingFace Hub using the standard transformers API. For more information, refer to [Inference Examples](#inference-examples).
 
 ## Feature support
 
@@ -18,7 +17,7 @@ The ESM-2 implementation natively supports the following TransformerEngine-provi
 | **Import from HuggingFace checkpoints** | ✅ Supported                                                                     |
 | **Export to HuggingFace checkpoints**   | ✅ Supported                                                                     |
 
-See [BioNemo Recipes](../../recipes/README.md) for more details on how to use these features to accelerate model
+Refer to [BioNeMo Recipes](../../recipes/README.md) for more details on how to use these features to accelerate model
 training and inference.
 
 ## Links to HF checkpoints
@@ -38,7 +37,7 @@ Pre-trained ESM-2 models converted from the original Facebook weights are availa
 ## Runtime Requirements
 
 We recommend using the latest [NVIDIA PyTorch container](https://catalog.ngc.nvidia.com/orgs/nvidia/containers/pytorch)
-for optimal performance and compatibility. See the provided Dockerfile for details.
+for optimal performance and compatibility. Refer to the provided Dockerfile for details.
 
 ## Inference Examples
 
@@ -101,7 +100,7 @@ hf_model = convert_esm_te_to_hf(te_model)
 hf_model.save_pretrained("/path/to/hf_checkpoint")
 ```
 
-Load and Test the Exported Model
+### Loading and Testing the Exported Model
 
 Load the exported model and perform validation:
 
@@ -114,8 +113,8 @@ tokenizer = AutoTokenizer.from_pretrained("facebook/esm2_t6_8M_UR50D")
 
 ### Validating Converted Models
 
-See the commands in [Inference Examples](#inference-examples) above to load and test both the original and converted
-models to ensure loss and logit values are similar. See also the golden value tests in
+To validate the converted models, refer to the commands in [Inference Examples](#inference-examples) above to load and test both the original and converted
+models to ensure loss and logit values are similar. Additionally, refer to the golden value tests in
 [test_modeling_esm_te.py](tests/test_modeling_esm_te.py) and [test_convert.py](tests/test_convert.py).
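
Illustrative sketch (not part of this diff): the validation step above asks that loss and logit values from the original and converted models be similar. One way to make "similar" concrete is a maximum-absolute-difference check against a tolerance. The function names, the sample values, and the `atol` default below are all assumptions chosen for illustration; they are not taken from the golden value tests in this repository.

```python
def max_abs_diff(reference, converted):
    """Return the largest element-wise absolute difference between two flat lists."""
    if len(reference) != len(converted):
        raise ValueError("outputs must have the same number of elements")
    return max(abs(r - c) for r, c in zip(reference, converted))


def outputs_match(reference, converted, atol=1e-3):
    """True when every element of `converted` is within `atol` of `reference`."""
    return max_abs_diff(reference, converted) <= atol


# Made-up logits standing in for the original and converted model outputs.
reference_logits = [0.12, -1.5, 2.0]
converted_logits = [0.1203, -1.5001, 1.9998]

loose_ok = outputs_match(reference_logits, converted_logits)              # atol=1e-3
strict_ok = outputs_match(reference_logits, converted_logits, atol=1e-4)  # tighter bound
```

With real models, the same comparison would be applied to flattened logits from both checkpoints on identical inputs; an appropriate tolerance depends on the precision in use, for example BF16 versus FP32.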
 
 ## Developer Guide
@@ -153,7 +152,7 @@ Now deploy the converted checkpoints to the HuggingFace Hub by running the follo
 huggingface-cli upload nvidia/${MODEL_NAME} $PWD/checkpoint_export/${MODEL_NAME}
 ```
 
-Or, upload all models at once with:
+You can also upload all models at once with:
 
 ```bash
 cd checkpoint_export