Releases: NVIDIA-BioNeMo/bionemo-recipes

v3.0.0 - BioNeMo Recipes

@pstjohn released this 24 Jun 18:26
v3.0.0
66c150f

This is the first release of BioNeMo Recipes, a lighter-weight version of the BioNeMo model training framework that demonstrates NVIDIA-optimized reference architectures for biological foundation models. This release drops support for the monolithic bionemo-framework container in favor of per-folder Dockerfiles that show how to extend the NGC PyTorch container with the packages needed for biological foundation model training. BioNeMo Recipes is tested continuously with nightly CI, and all further releases will occur on a rolling basis; the v3.0.0 tag simply marks a reference point.

The recipes are hermetic and simple by design. The best way to use them is as reference files for your coding agent: tell your agent to run them, or use them as a reference when building new models yourself.

Legacy bionemo-framework code can be found in the bionemo2 branch.

NVIDIA BioNeMo Framework v2.7

@trvachov released this 01 Oct 23:46
8d88bd8

Updates & Improvements

  • Evo2 model improvements:

    • Context, tensor and data parallelism support in the prediction endpoint as well as support for context lengths over 8192 #1123. Fixes #910 and #1048.

    • LoRA fine-tuning by @gabenavarro: #980. Note: internal CI coverage of LoRA convergence is still a work in progress; therefore, we cannot guarantee convergence.

    • Fix a 2x memory-usage issue during Evo2 generation: NVIDIA-NeMo/Speech#14515

    • Add flash-decode support in inference: #1000

    • Update Rotary Embedding and sequence-length defaults to address incorrect checkpoint conversion: NVIDIA-NeMo/Speech#14514

    • Improvements to tag masking in the Evo2 loss: #1008

    • Support for Spike-no-more to improve training stability: #1011

  • Added a header to SCDL archives, providing improved provenance tracking and supporting future releases. It also adds tracking of AnnData API coverage in SCDL tests.
    This header stores metadata about the archive and its composite arrays, including a version; the array lengths and data types; and information about the RowFeatureIndexes. This adds the features necessary to fix #999 as well as to implement simple bit-packing of the rowptr, colptr, and data arrays. It should also make SCDL more secure, enable strict compatibility checking, and open the door to further performance improvements: #1030

  • bionemo-geometric has been deprecated and removed. The molecular-featurization tooling in this package has moved to cuik-molmaker.
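
To make the idea of a versioned archive header (the SCDL item above) concrete, here is a minimal Python sketch. It is not the SCDL on-disk format: the magic bytes, the field layout, and the names ArrayInfo, pack_header, and unpack_header are all hypothetical. It only illustrates how a version plus per-array lengths and data types can be serialized and then checked strictly on load.

```python
import struct
from dataclasses import dataclass

MAGIC = b"HDR0"  # hypothetical magic bytes, not the real SCDL marker
VERSION = 1      # hypothetical format version


@dataclass(frozen=True)
class ArrayInfo:
    name: str    # e.g. "rowptr", "colptr", "data"
    dtype: str   # e.g. "uint64", "float32"
    length: int  # number of elements in the array


def pack_header(arrays):
    """Serialize a version and per-array metadata into bytes."""
    out = [MAGIC, struct.pack("<HH", VERSION, len(arrays))]
    for info in arrays:
        for text in (info.name, info.dtype):
            raw = text.encode("utf-8")
            out.append(struct.pack("<B", len(raw)) + raw)  # length-prefixed string
        out.append(struct.pack("<Q", info.length))
    return b"".join(out)


def unpack_header(buf):
    """Parse bytes written by pack_header; reject unknown magic or versions."""
    if buf[:4] != MAGIC:
        raise ValueError("not a recognized archive header")
    version, count = struct.unpack_from("<HH", buf, 4)
    if version != VERSION:
        raise ValueError(f"unsupported header version {version}")
    offset = 8
    arrays = []
    for _ in range(count):
        fields = []
        for _ in range(2):  # name, then dtype
            (n,) = struct.unpack_from("<B", buf, offset)
            offset += 1
            fields.append(buf[offset:offset + n].decode("utf-8"))
            offset += n
        (length,) = struct.unpack_from("<Q", buf, offset)
        offset += 8
        arrays.append(ArrayInfo(fields[0], fields[1], length))
    return version, arrays
```

Storing the version first lets a reader refuse archives it does not understand before touching the arrays, which is the kind of strict compatibility check the release note refers to.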

Known Issues

  • We have removed libtiff from the container due to a known vulnerability, CVE-2025-9900. libtiff isn't directly used in any BioNeMo code; however, users might face issues with e.g. Pillow or other common image-manipulation libraries inside this container.
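
A quick way to see whether this affects you is to ask Pillow whether it was built with libtiff. The sketch below assumes Pillow may or may not be installed in your environment; the helper name tiff_support_status is hypothetical. PIL.features.check("libtiff") reports only whether Pillow's libtiff-backed codec is available, so treat a False result as a hint that some TIFF compression schemes may fail, not as a full diagnosis.

```python
def tiff_support_status():
    """Return True/False for Pillow's libtiff support, or None if Pillow is absent."""
    try:
        from PIL import features  # Pillow is optional here
    except ImportError:
        return None
    return bool(features.check("libtiff"))


if __name__ == "__main__":
    status = tiff_support_status()
    if status is None:
        print("Pillow is not installed")
    elif status:
        print("Pillow reports libtiff support")
    else:
        print("Pillow reports no libtiff support; some TIFF files may not load")
```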

NVIDIA BioNeMo Framework v2.6.3

@trvachov released this 31 Jul 21:09

Updates & Improvements

  • Fixes several issues with the Evo2 model:
    1. Inference/Generation issues resolved. #890
    2. FP8 training resumption issues resolved. #973
    3. Bug in inference script that concerns checkpoint loading is fixed. #950
  • ESM2 LoRA model inference issue resolved. #996
  • Added experimental evo2-mamba model. #888
  • Updated base Docker image to nvidia-pytorch 25.06-py3
  • NCCL issue in ESM2 pretraining resolved. #970

Full Changelog: v2.6.2...v2.6.3

NVIDIA BioNeMo Framework v2.6.2

@trvachov released this 02 Jul 23:18

Updates & Improvements

  • Fixes several ESM2 model issues:
    1. Finetuning metric for token classification is fixed. #946
    2. Losses for finetuning were fixed for data and model parallelism. #959
    3. Bug in inference script that concerns checkpoint loading is fixed. #950
  • Updated base Docker image to nvidia-pytorch 25.04-py3

Known Issues

  • Evo2 generation is broken (i.e. bionemo-evo2/src/bionemo/evo2/run/infer.py). See issue #890. A workaround exists on branch #949 and we are working to fix this issue for the July release.
  • There is an NCCL communication issue on certain A100 multi-node environments. In our internal testing, we were not able to reproduce the issue reliably across environments. If you see the following error, please report it in issue #970:
[rank9]: torch.distributed.DistBackendError: NCCL error in: /opt/pytorch/pytorch/torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp:3356, internal error - please report this issue to the NCCL developers, NCCL version 2.26.3
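
If you collect training logs, a small filter can surface this error for reporting. The following is a minimal sketch, not part of BioNeMo: the function name find_nccl_errors and the regular expression are assumptions based only on the error text quoted above.

```python
import re

# Matches the error text quoted in the known issue above and captures
# the reporting rank and the NCCL version.
NCCL_ERROR = re.compile(
    r"\[rank(?P<rank>\d+)\]: torch\.distributed\.DistBackendError: NCCL error"
    r".*NCCL version (?P<version>\d+(?:\.\d+)*)"
)


def find_nccl_errors(log_lines):
    """Return a (rank, nccl_version) tuple for each matching log line."""
    hits = []
    for line in log_lines:
        match = NCCL_ERROR.search(line)
        if match:
            hits.append((int(match.group("rank")), match.group("version")))
    return hits
```

Including the rank and NCCL version that this returns in a report on #970 makes it easier to compare failures across environments.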

Full Changelog: v2.6.1...v2.6.2

NVIDIA BioNeMo Framework v2.6.1

@trvachov released this 02 Jun 18:38

Updates & Improvements

  • Fixes around ESM2 pretraining and fine-tuning checkpoints.
  • Added sanity dataset for AMPLIFY testing.
  • Tested against A100 brev instances.
  • Update tornado package to >6.5.0 to fix container CVEs.

Full Changelog: v2.6...v2.6.1

NVIDIA BioNeMo Framework v2.6

@trvachov released this 30 Apr 20:15

New Features

  • Adds support for AMPLIFY (doi:10.1101/2024.09.23.614603) pre-training and inference, offering a 70% speedup over the xformers-based attention backend with similar final perplexity values at 1M pre-training steps (4.23 for 120M, 3.05 for 350M). The model is fully compatible with existing weights on HuggingFace.
  • Adds alpha support for LoRA fine-tuning for ESM2 models. Inference and fine-tuning are enabled, along with resumption from a checkpoint.

Updates & Improvements

  • Blackwell support, tested on B200 systems.
  • Fixed Grace CPU support and released an ARM-compatible container.

New Contributors

  • @ShevaNguyen made their first contribution in #831

Full Changelog: v2.5...v2.6

NVIDIA BioNeMo Framework v2.5

@dorotat-nv released this 17 Mar 13:40

New Features

  • Adding the Evo2 model training workflow, including data preprocessing, pre-training, fine-tuning and inference with bf16 and fp8 support.

Updates & Improvements

  • Supporting/upgrading federated learning examples of BioNeMo in NVFlare
  • Upgrade bionemo-moco to v0.0.2
  • Brev.dev launchable tutorials

What's Changed

  • Bump 3rdparty/Megatron-LM from 2a9793d to a0365bc by @dependabot in #692
  • Bump 3rdparty/NeMo from 48f10af to ee28bc5 by @dependabot in #693
  • add announcement README.md by @ntadimeti in #695
  • Adjust ESM2 fine-tuning to allow NVFlare usecases by @farhadrgh in #689
  • Upgrade bionemo-moco to v0.0.2 by @nvdreidenbach in #688
  • disable metric when model parallel by @sichu2023 in #701
  • bump NeMo by @farhadrgh in #703
  • split trufflehog scan into two actions, run on entire repo on scheduled event by @pstjohn in #696
  • cve vulnerability on main by @dorotat-nv in #709
  • move trufflehog scan to new action by @pstjohn in #721
  • Pstjohn/trufflehog move action 2 by @pstjohn in #722
  • Evo2 by @jstjohn in #694
  • Trigger and skip trufflehog scan in merge group by @pstjohn in #728
  • remove zstandard to address nvbug 5149698 by @pstjohn in #726
  • JET for evo2: 1b model training by @dorotat-nv in #727
  • If desired, training can be stopped on a specific step without impacting the LR curve. by @jstjohn in #739
  • Cleanup any new files made by notebook tests by @jstjohn in #748
  • Adding bf16 fine-tuned variant of evo2 1b checkpoint by @jstjohn in #747
  • Bump nemo version to have the 1b checkpoint fix by @jstjohn in #729
  • GTC Evo2 Demo Notebooks by @jwilber in #724
  • [cye/subpack-ci] Add sub-package build, test, and publish to OSS. (WORK IN PROGRESS - PENDING MORE SUB-PACKAGE COVERAGE) by @cspades in #725
  • Disable notebook and slow tests from running in merge queue by @pstjohn in #754
  • fix: removes BIONEMO_HOME from repository [JIRA-BIONEMO-482] by @jomitchellnv in #742
  • Update brev.dev badges to launchable built off main branch by @jwilber in #752
  • Evo2 modelcard by @jstjohn in #746
  • [cye/fix-subpack-ci] Fix bug where workflow dispatch collected packages are not passed to the next job. by @cspades in #753
  • reduced mem to 12gb by @nvdreidenbach in #730
  • Initial commits prepping for nv-gha-runners by @pstjohn in #733
  • xfail evo2 long context train test by @dorotat-nv in #732
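
One entry above (#739) lets training stop at a specific step without affecting the learning-rate curve. A minimal sketch of that idea follows, assuming a cosine schedule with linear warmup; the names cosine_lr, lr_trace, max_steps, and stop_step are hypothetical and are not the framework's options. The schedule depends only on the planned max_steps, so stopping early truncates the curve instead of compressing it.

```python
import math


def cosine_lr(step, max_steps, peak_lr=1e-3, min_lr=1e-5, warmup_steps=10):
    """Learning rate at `step` for a schedule planned over `max_steps`."""
    if step < warmup_steps:
        return peak_lr * (step + 1) / warmup_steps  # linear warmup
    progress = (step - warmup_steps) / max(1, max_steps - warmup_steps)
    progress = min(1.0, progress)
    return min_lr + 0.5 * (peak_lr - min_lr) * (1 + math.cos(math.pi * progress))


def lr_trace(max_steps, stop_step=None):
    """Learning rates actually used when training halts at `stop_step`."""
    last = max_steps if stop_step is None else min(stop_step, max_steps)
    # The schedule is always evaluated against max_steps, never stop_step.
    return [cosine_lr(step, max_steps) for step in range(last)]
```

By contrast, shortening max_steps itself (for example lr_trace(40) instead of lr_trace(100, stop_step=40)) would produce a different, faster-decaying curve.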

Full Changelog: v2.4.1...v2.5

NVIDIA BioNeMo Framework v2.4.1

@skothenhill-nv released this 28 Feb 19:39
0ce6166

What's Changed

Fixes ESM2 metric logging, which raised NotImplementedError when using model parallelism.

Full Changelog: v2.4...v2.4.1

NVIDIA BioNeMo Framework v2.4

@skothenhill-nv released this 25 Feb 17:27

New Features

  • Draft implementation of Evo2 with support for Hyena operators
  • bionemo-moco v0.0.1 released for building diffusion-like generative models.

Full Changelog: v2.3...v2.4

NVIDIA BioNeMo Framework v2.3

@trvachov released this 28 Jan 15:27

New Features

  • Distributed Inference Support for ESM2 and Geneformer

Updates & Improvements

  • Fixed a prior accuracy regression in Geneformer inference on H100.
  • Base image updated to nvcr.io/nvidia/pytorch:24.12-py3; Python updated to 3.12, among other core dependency upgrades (see the base container release notes).

Full Changelog: v2.2...v2.3