Commit f01a9d2

lvojtku, Linette Tang, and jwilber authored
Doc edits to BioNeMo recipes (#1389)
### Description

<!-- Provide a detailed description of the changes in this PR -->

#### Usage

<!--- How does a user interact with the changed code -->

```python
TODO: Add code snippet
```

### Type of changes

<!-- Mark the relevant option with an [x] -->

- [ ] Bug fix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Refactor
- [ ] Documentation update
- [ ] Other (please describe):

### CI Pipeline Configuration

Configure CI behavior by applying the relevant labels. By default, only basic unit tests are run.

- [ciflow:skip](https://github.com/NVIDIA/bionemo-framework/blob/main/docs/docs/main/contributing/contributing.md#ciflow:skip) - Skip all CI tests for this PR
- [ciflow:notebooks](https://github.com/NVIDIA/bionemo-framework/blob/main/docs/docs/main/contributing/contributing.md#ciflow:notebooks) - Run Jupyter notebooks execution tests for bionemo2
- [ciflow:slow](https://github.com/NVIDIA/bionemo-framework/blob/main/docs/docs/main/contributing/contributing.md#ciflow:slow) - Run slow single GPU integration tests marked as @pytest.mark.slow for bionemo2
- [ciflow:all](https://github.com/NVIDIA/bionemo-framework/blob/main/docs/docs/main/contributing/contributing.md#ciflow:all) - Run all tests (unit tests, slow tests, and notebooks) for bionemo2. This label can be used to enforce running tests for all bionemo2.
- [ciflow:all-recipes](https://github.com/NVIDIA/bionemo-framework/blob/main/docs/docs/main/contributing/contributing.md#ciflow:all-recipes) - Run tests for all recipes (under bionemo-recipes). This label can be used to enforce running tests for all recipes.

Unit tests marked as `@pytest.mark.multi_gpu` or `@pytest.mark.distributed` are not run in the PR pipeline.

For more details, see [CONTRIBUTING](CONTRIBUTING.md)

> [!NOTE]
> By default, only basic unit tests are run. Add appropriate labels to enable an additional test coverage.

#### Authorizing CI Runs

We use [copy-pr-bot](https://docs.gha-runners.nvidia.com/apps/copy-pr-bot/#automation) to manage authorization of CI runs on NVIDIA's compute resources.

- If a pull request is opened by a trusted user and contains only trusted changes, the pull request's code will automatically be copied to a pull-request/ prefixed branch in the source repository (e.g. pull-request/123)
- If a pull request is opened by an untrusted user or contains untrusted changes, an NVIDIA org member must leave an `/ok to test` comment on the pull request to trigger CI. This will need to be done for each new commit.

### Pre-submit Checklist

<!--- Ensure all items are completed before submitting -->

- [x] I have tested these changes locally
- [x] I have updated the documentation accordingly
- [ ] I have added/updated tests as needed
- [ ] All existing tests pass successfully

---------

Signed-off-by: Linette Tang <lvojktu@nvidia.com>
Co-authored-by: Linette Tang <lvojktu@nvidia.com>
Co-authored-by: Jared Wilber <jwilber@nvidia.com>
1 parent 40fae3e commit f01a9d2

11 files changed

Lines changed: 133 additions & 142 deletions

bionemo-recipes/README.md

Lines changed: 32 additions & 26 deletions
Original file line number | Diff line number | Diff line change
@@ -1,15 +1,15 @@
11
# BioNeMo Recipes
22

3-
BioNeMo Recipes provides an easy path for the biological foundation model training community to scale up transformer-based models efficiently. Rather than offering a batteries-included training framework, we provide **model checkpoints** with TransformerEngine (TE) layers and **training recipes** that demonstrate how to achieve maximum throughput with popular open-source frameworks and fully sharded data parallel (FSDP) scale-out.
3+
BioNeMo Recipes provides an easy path for the biological foundation model training community to scale up transformer-based models efficiently. Rather than offering a batteries-included training framework, BioNeMo Recipes provide **model checkpoints** with TransformerEngine (TE) layers and **training recipes** that demonstrate how to achieve maximum throughput with popular open-source frameworks and fully sharded data parallel (FSDP) scale-out.
44

55
## Overview
66

7-
The biological AI community is actively prototyping model architectures and needs tooling that prioritizes extensibility, interoperability, and ease-of-use alongside performance. BioNeMo Recipes addresses this by offering:
7+
The biological AI community actively prototypes model architectures and needs tooling that prioritizes extensibility, interoperability, and ease-of-use, alongside performance. BioNeMo Recipes addresses this by offering:
88

9-
- **Flexible scaling**: Scale from single-GPU prototyping to multi-node training without complex parallelism configurations
9+
- **Flexible scaling**: Scales from single-GPU prototyping to multi-node training without complex parallelism configurations
1010
- **Framework compatibility**: Works with popular frameworks like HuggingFace Accelerate, PyTorch Lightning, and vanilla PyTorch
1111
- **Performance optimization**: Leverages TransformerEngine and megatron-FSDP for state-of-the-art training efficiency
12-
- **Research-friendly**: Hackable, readable code that researchers can easily adapt for their experiments
12+
- **Research-friendly**: Contains hackable and readable code that researchers can easily adapt for their experiments
1313

1414
### Performance Benchmarks
1515

@@ -21,6 +21,8 @@ The biological AI community is actively prototyping model architectures and need
2121

2222
### Use Cases
2323

24+
The use cases of BioNeMo Recipes include:
25+
2426
- **Foundation Model Developers**: AI researchers and ML engineers developing novel biological foundation models who need to scale up prototypes efficiently
2527
- **Foundation Model Customizers**: Domain scientists looking to fine-tune existing models with proprietary data for drug discovery and biological research
2628

@@ -48,9 +50,9 @@ Abbreviations:
4850
- BF16: [brain-float 16](https://en.wikipedia.org/wiki/Bfloat16_floating-point_format), a common 16 bit float format for deep learning.
4951
- FP8<sup>[1]</sup>: [8-bit floating point](https://docs.nvidia.com/deeplearning/transformer-engine/user-guide/examples/fp8_primer.html), a compact format for weights allowing for faster training and inference.
5052
- MXFP8<sup>[2]</sup>: [Multi Scale 8-bit floating point](https://docs.nvidia.com/deeplearning/transformer-engine/user-guide/examples/fp8_primer.html), as compact as FP8 but with better numerical precision.
51-
- NVFP4<sup>[2]</sup>: [NVIDIA 4-bit floating point](https://docs.nvidia.com/deeplearning/transformer-engine/user-guide/examples/fp8_primer.html#Beyond-FP8---training-with-NVFP4), faster than FP8, retaining accuracy via multi-scale.
52-
- THD: **T**otal **H**eads **D**imension, also known as ["sequence packing"](https://docs.nvidia.com/nemo-framework/user-guide/24.07/nemotoolkit/features/optimizations/sequence_packing.html#sequence-packing-for-sft-peft). A way to construct a batch with sequences of different length so there are no pads, therefore no compute is wasted on computing attention for padding tokens. This is in contrast to **B**atch **S**equence **H**ead **D**imension (BSHD) format, which uses pads to create a rectangular batch.
53-
- CP: Context parallel, also known as sequence parallel. A way to distribute the memory required to process long sequences across multiple GPUs. For more information please see [context parallel](./recipes/context_parallel.md)
53+
- NVFP4<sup>[2]</sup>: [NVIDIA 4-bit floating point](https://docs.nvidia.com/deeplearning/transformer-engine/user-guide/examples/fp8_primer.html#Beyond-FP8---training-with-NVFP4), faster than FP8, retaining accuracy using multi-scale.
54+
- THD: **T**otal **H**eads **D**imension, also known as ["sequence packing"](https://docs.nvidia.com/nemo-framework/user-guide/24.07/nemotoolkit/features/optimizations/sequence_packing.html#sequence-packing-for-sft-peft). A way to construct a batch with sequences of different lengths so there are no pads, which results in no compute wasted on computing attention for padding tokens. This is in contrast to **B**atch **S**equence **H**ead **D**imension (BSHD) format, which uses pads to create a rectangular batch.
55+
- CP: Context parallel, also known as sequence parallel. A way to distribute the memory required to process long sequences across multiple GPUs. For more information, refer to [context parallel](./recipes/context_parallel.md)
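
Reviewer note (not part of the commit): the THD versus BSHD distinction in the abbreviations above can be illustrated with a minimal sketch in plain Python. The helper names `pad_bshd` and `pack_thd`, and the `cu_seqlens` offsets list, are illustrative assumptions and not taken from the BioNeMo Recipes code.

```python
from itertools import accumulate


def pad_bshd(seqs, pad_id=0):
    """Pad every sequence to the longest length (rectangular, BSHD-style batch)."""
    max_len = max(len(s) for s in seqs)
    return [s + [pad_id] * (max_len - len(s)) for s in seqs]


def pack_thd(seqs):
    """Concatenate sequences with no pads (THD-style) and record boundaries.

    cu_seqlens holds cumulative sequence lengths, so sequence i occupies
    tokens[cu_seqlens[i]:cu_seqlens[i + 1]].
    """
    tokens = [t for s in seqs for t in s]
    cu_seqlens = [0] + list(accumulate(len(s) for s in seqs))
    return tokens, cu_seqlens


# Hypothetical token IDs for three sequences of different lengths.
seqs = [[5, 6, 7], [8], [9, 10]]
padded = pad_bshd(seqs)
tokens, cu_seqlens = pack_thd(seqs)

# Padded slots are where a BSHD batch spends compute on pad tokens.
wasted_slots = sum(len(row) for row in padded) - len(tokens)
print(padded, tokens, cu_seqlens, wasted_slots)
```

In this toy batch the padded layout holds nine token slots while the packed layout holds six, so three slots would be spent on padding in the BSHD case.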
5456

5557
\[1\]: Requires [compute capability](https://developer.nvidia.com/cuda-gpus) 9.0 and above (Hopper+) <br/>
5658
\[2\]: Requires [compute capability](https://developer.nvidia.com/cuda-gpus) 10.0 and 10.3 (Blackwell), 12.0 support pending <br/>
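
Reviewer note (not part of the commit): footnotes \[1\] and \[2\] can be read as a small lookup from compute capability to the low-precision formats listed above. The sketch below encodes only what the footnotes state; the function name `supported_low_precision_formats` is hypothetical, and 12.0 is left out of the MXFP8/NVFP4 set because the footnote describes that support as pending.

```python
def supported_low_precision_formats(major, minor):
    """Return the low-precision formats the footnotes allow for a compute capability."""
    cc = (major, minor)
    formats = []
    # Footnote [1]: FP8 requires compute capability 9.0 and above (Hopper+).
    if cc >= (9, 0):
        formats.append("FP8")
    # Footnote [2]: MXFP8 and NVFP4 require 10.0 or 10.3 (Blackwell).
    if cc in {(10, 0), (10, 3)}:
        formats.extend(["MXFP8", "NVFP4"])
    return formats


print(supported_low_precision_formats(9, 0))
print(supported_low_precision_formats(10, 0))
```

On a machine with PyTorch and a CUDA device, the `(major, minor)` pair could be obtained from `torch.cuda.get_device_capability()` and passed to this function.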
@@ -63,7 +65,7 @@ This repository contains two types of components:
6365

6466
Huggingface-compatible `PreTrainedModel` classes that use TransformerEngine layers internally. These are designed to be:
6567

66-
- **Distributed via Hugging Face Hub**: Pre-converted checkpoints available at [huggingface.co/nvidia](https://huggingface.co/nvidia)
68+
- **Distributed through Hugging Face Hub**: Pre-converted checkpoints available at [huggingface.co/nvidia](https://huggingface.co/nvidia)
6769
- **Drop-in replacements**: Compatible with `AutoModel.from_pretrained()` without additional dependencies
6870
- **Performance optimized**: Leverage TransformerEngine features like FP8 training and context parallelism
6971

@@ -82,7 +84,11 @@ Recipes are **not pip-installable packages** but serve as reference implementati
8284

8385
## Quick Start
8486

85-
### Using Models
87+
This section describes how you can get started with BioNeMo Recipes.
88+
89+
### Loading Models
90+
91+
Run the following to load the BioNeMo model.
8692

8793
```python
8894
from transformers import AutoModel, AutoTokenizer
@@ -94,6 +100,8 @@ tokenizer = AutoTokenizer.from_pretrained("nvidia/AMPLIFY_120M")
94100

95101
### Running Recipes
96102

103+
Build and run recipes with the following.
104+
97105
```bash
98106
# Navigate to a recipe
99107
cd recipes/esm2_native_te_mfsdp
@@ -103,13 +111,9 @@ docker build -t esm2_recipe .
103111
docker run --rm -it --gpus all esm2_recipe python train.py
104112
```
105113

106-
______________________________________________________________________
114+
## Setting Up the Development Environment
107115

108-
## Developer Guide
109-
110-
### Setting Up Development Environment
111-
112-
1. **Install pre-commit hooks:**
116+
1. Install pre-commit hooks:
113117

114118
```bash
115119
pre-commit install
@@ -130,9 +134,9 @@ ______________________________________________________________________
130134
docker run --rm -it --gpus all my_tag pytest -v .
131135
```
132136

133-
### Coding Guidelines
137+
## Coding Guidelines
134138

135-
We prioritize **readability and simplicity** over comprehensive feature coverage:
139+
BioNeMo Recipes prioritize **readability and simplicity** over comprehensive feature coverage:
136140

137141
- **KISS (Keep It Simple) over DRY (Don't Repeat Yourself)**: It's better to have clear, duplicated code than complex
138142
abstractions
@@ -141,7 +145,7 @@ We prioritize **readability and simplicity** over comprehensive feature coverage
141145

142146
### Testing Strategy
143147

144-
We use a three-tier testing approach:
148+
BioNeMo Recipes use a three-tier testing approach:
145149

146150
#### L0 Tests (Pre-merge)
147151

@@ -166,9 +170,11 @@ We use a three-tier testing approach:
166170

167171
### Adding New Components
168172

173+
With BioNeMo Recipes, you can add new components including models and recipes.
174+
169175
#### Adding a New Model
170176

171-
Models should be pip-installable packages that can export checkpoints to Hugging Face. See the
177+
Models should be pip-installable packages that can export checkpoints to Hugging Face. Refer to the
172178
[models README](models/README.md) for detailed guidelines on:
173179

174180
- Package structure and conventions
@@ -178,7 +184,7 @@ Models should be pip-installable packages that can export checkpoints to Hugging
178184

179185
#### Adding a New Recipe
180186

181-
Recipes should be self-contained Docker environments demonstrating specific training patterns. See
187+
Recipes should be self-contained Docker environments demonstrating specific training patterns. Refer to
182188
the [recipes README](recipes/README.md) for guidance on:
183189

184190
- Directory structure and naming
@@ -209,14 +215,14 @@ We aim to provide the fastest available training implementations for biological
209215

210216
## Contributing
211217

212-
We welcome contributions that advance the state of biological foundation model training. Please ensure your contributions:
218+
We welcome contributions that advance the state of biological foundation model training. Ensure your contributions:
213219

214-
1. Follow our coding guidelines emphasizing clarity
215-
2. Include appropriate tests (L0 minimum, L1/L2 as applicable)
216-
3. Provide clear documentation and examples
217-
4. Maintain compatibility with our supported frameworks
220+
- Follow our coding guidelines emphasizing clarity
221+
- Include appropriate tests (L0 minimum, L1/L2 as applicable)
222+
- Provide clear documentation and examples
223+
- Maintain compatibility with our supported frameworks
218224

219-
For detailed contribution guidelines, see our individual component READMEs:
225+
For detailed contribution guidelines, refer to our individual component READMEs:
220226

221227
- [Models Development Guide](models/README.md)
222228
- [Recipes Development Guide](recipes/README.md)

bionemo-recipes/models/README.md

Lines changed: 5 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -1,14 +1,14 @@
11
# Models Directory
22

3-
This directory contains HuggingFace-compatible model implementations that use TransformerEngine layers internally. These models are designed to be distributed via the Hugging Face Hub and serve as drop-in replacements for standard transformer models with enhanced performance.
3+
This directory contains HuggingFace-compatible model implementations that use TransformerEngine layers internally. These models are designed to be distributed through the Hugging Face Hub and serve as drop-in replacements for standard transformer models with enhanced performance.
44

55
## Overview
66

77
Models in this directory are **not intended to be pip-installed directly**. Instead, they serve as:
88

9-
1. **Reference implementations** of biological foundation models using TransformerEngine
10-
2. **Conversion utilities** for transforming existing model checkpoints to TE-compatible format
11-
3. **Export tools** for preparing model releases on the Hugging Face Hub
9+
- **Reference implementations** of biological foundation models using TransformerEngine
10+
- **Conversion utilities** for transforming existing model checkpoints to TE-compatible format
11+
- **Export tools** for preparing model releases on the Hugging Face Hub
1212

1313
Users will typically interact with these models by loading pre-converted checkpoints directly from the Hugging Face Hub using standard transformers APIs.
1414

@@ -33,7 +33,7 @@ To add a new model to this directory, you must provide:
3333
#### 3. Checkpoint Export Script
3434

3535
- **`export.py`**: Script that packages all necessary files for Hugging Face Hub upload
36-
- **Complete asset bundling**: Must include all required files (see [Export Requirements](#export-requirements))
36+
- **Complete asset bundling**: Must include all required files, refer to [Export Requirements](#export-requirements)
3737
- **Automated process**: Should be runnable with minimal manual intervention
3838

3939
#### 4. Open Source License

bionemo-recipes/models/amplify/README.md

Lines changed: 7 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -1,9 +1,8 @@
11
# AMPLIFY Optimized with NVIDIA TransformerEngine
22

33
This folder contains source code and tests for an AMPLIFY model that inherits from the transformers `PreTrainedModel`
4-
class and uses TransformerEngine layers. Users don't need to install this package directly, but can load the
5-
model directly from HuggingFace Hub using the standard transformers API (see [Inference Examples](#inference-examples)
6-
below).
4+
class and uses TransformerEngine layers. Users do not need to install this package directly, but can load the
5+
model directly from HuggingFace Hub using the standard transformers API. For more information, refer to [Inference Examples](#inference-examples).
76

87
## Feature support
98

@@ -18,7 +17,7 @@ The AMPLIFY implementation natively supports the following TransformerEngine-pro
1817
| **Import from HuggingFace checkpoints** | ✅ Supported |
1918
| **Export to HuggingFace checkpoints** | 🚧 Under development |
2019

21-
See [BioNeMo Recipes](../../recipes/README.md) for more details on how to use these features to accelerate model
20+
Refer to [BioNeMo Recipes](../../recipes/README.md) for more details on how to use these features to accelerate model
2221
training and inference.
2322

2423
## Links to HF checkpoints
@@ -34,7 +33,7 @@ Pre-trained AMPLIFY models are available on HuggingFace as part of the NVIDIA
3433
## Runtime Requirements
3534

3635
We recommend using the latest [NVIDIA PyTorch container](https://catalog.ngc.nvidia.com/orgs/nvidia/containers/pytorch)
37-
for optimal performance and compatibility. See the provided Dockerfile for details.
36+
for optimal performance and compatibility. Refer to the provided Dockerfile for details.
3837

3938
## Inference Examples
4039

@@ -61,7 +60,7 @@ output = model(**inputs)
6160
## Recipe Links
6261

6362
Training recipes are available in the `bionemo-recipes/recipes/` directory. AMPLIFY can be trained using the same
64-
recipes as ESM-2, simply by switching the model_tag to reference the AMPLIFY model, e.g. `nvidia/AMPLIFY_120M`, and
63+
recipes as ESM-2, simply by switching the model_tag to reference the AMPLIFY model, such as `nvidia/AMPLIFY_120M`, and
6564
changing the dataset as appropriate.
6665

6766
- **[esm2_native_te](../../recipes/esm2_native_te/)** - Demonstrates training with a simple native PyTorch training
@@ -118,3 +117,5 @@ Or, upload all models at once with:
118117
```bash
119118
for dir in *; do huggingface-cli upload nvidia/$(basename "$dir") "$dir/"; done
120119
```
120+
121+
z

bionemo-recipes/models/esm2/README.md

Lines changed: 7 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -2,8 +2,7 @@
22

33
This folder contains source code and tests for an ESM-2 model that inherits from the transformers `PreTrainedModel`
44
class and uses TransformerEngine layers. Users don't need to install this package directly, but can load the
5-
model directly from HuggingFace Hub using the standard transformers API (see [Inference Examples](#inference-examples)
6-
below).
5+
model directly from HuggingFace Hub using the standard transformers API. For more information, refer to [Inference Examples](#inference-examples).
76

87
## Feature support
98

@@ -18,7 +17,7 @@ The ESM-2 implementation natively supports the following TransformerEngine-provi
1817
| **Import from HuggingFace checkpoints** | ✅ Supported |
1918
| **Export to HuggingFace checkpoints** | ✅ Supported |
2019

21-
See [BioNemo Recipes](../../recipes/README.md) for more details on how to use these features to accelerate model
20+
Refer to [BioNemo Recipes](../../recipes/README.md) for more details on how to use these features to accelerate model
2221
training and inference.
2322

2423
## Links to HF checkpoints
@@ -38,7 +37,7 @@ Pre-trained ESM-2 models converted from the original Facebook weights are availa
3837
## Runtime Requirements
3938

4039
We recommend using the latest [NVIDIA PyTorch container](https://catalog.ngc.nvidia.com/orgs/nvidia/containers/pytorch)
41-
for optimal performance and compatibility. See the provided Dockerfile for details.
40+
for optimal performance and compatibility. Refer to the provided Dockerfile for details.
4241

4342
## Inference Examples
4443

@@ -101,7 +100,7 @@ hf_model = convert_esm_te_to_hf(te_model)
101100
hf_model.save_pretrained("/path/to/hf_checkpoint")
102101
```
103102

104-
Load and Test the Exported Model
103+
### Loading and Testing the Exported Model
105104

106105
Load the exported model and perform validation:
107106

@@ -114,8 +113,8 @@ tokenizer = AutoTokenizer.from_pretrained("facebook/esm2_t6_8M_UR50D")
114113

115114
### Validating Converted Models
116115

117-
See the commands in [Inference Examples](#inference-examples) above to load and test both the original and converted
118-
models to ensure loss and logit values are similar. See also the golden value tests in
116+
To validate the converted models, refer to the commands in [Inference Examples](#inference-examples) above to load and test both the original and converted
117+
models to ensure loss and logit values are similar. Additionally, refer to the golden value tests in
119118
[test_modeling_esm_te.py](tests/test_modeling_esm_te.py) and [test_convert.py](tests/test_convert.py).
120119

121120
## Developer Guide
@@ -153,7 +152,7 @@ Now deploy the converted checkpoints to the HuggingFace Hub by running the follo
153152
huggingface-cli upload nvidia/${MODEL_NAME} $PWD/checkpoint_export/${MODEL_NAME}
154153
```
155154

156-
Or, upload all models at once with:
155+
You can also upload all models at once with:
157156

158157
```bash
159158
cd checkpoint_export
