
Commit f9a946b

svc-bionemo and pstjohn authored
docs: fix 81 broken links in documentation (NVIDIA-BioNeMo#1596)
## Summary

Fix 81 broken links identified in a docs.nvidia.com link scan (2026-05-14). Addresses [NVBug 6179235](https://nvbugspro.nvidia.com/bug/6179235).

### Changes (31 files, pure link replacements)

**Org migration:**

- `nvidia.github.io/bionemo-framework` → `nvidia-bionemo.github.io/bionemo-framework` (GitHub Pages URL after org move)
- `github.com/NVIDIA/bionemo-framework` → `github.com/NVIDIA-BioNeMo/bionemo-framework` in documentation links

**Recipe README relative-link fixes:**

- Converted broken relative links to support files (`.py`, `.yaml`, `Dockerfile`, etc.) into absolute GitHub `blob`/`tree` URLs
- Affected recipes: vit, esm2_native_te, esm2_accelerate_te, geneformer_native_te_mfsdp_fp8, codonfm_ptl_te, evo2_megatron
- Fixed renamed files: `AI_DOCUMENTATION.md` → `AGENT_DOCUMENTATION.md`, `hydra_config/model/4b.yaml` → `hydra_config/4b.yaml`, `gitingest.txt` → `gitingest.sh`

**Other doc fixes:**

- Fixed megatron-core link (api-guide → user-guide/features)
- Fixed ESM-2 model page dataset links
- Fixed contributing.md workflow link
- Removed/archived stale links to deleted pages

### Not addressed (separate issues)

- 888 broken links from API reference pages linking to removed sub-packages (`bionemo-amplify`, `bionemo-esm2`, `bionemo-fw`); these will self-fix once `docs.nvidia.com/bionemo-framework/latest/` is re-deployed from current `main`
- 56 broken links in archived versions (`/1.10/`, `/1.10.1/`): frozen content, needs NVIDIA docs infra intervention

### Testing

- Pre-commit passes (mdformat, check-copied-files, etc.)
- All target files verified to exist in the repo

---------

Signed-off-by: svc-bionemo <267129667+svc-bionemo@users.noreply.github.com>
Co-authored-by: svc-bionemo <267129667+svc-bionemo@users.noreply.github.com>
Co-authored-by: Peter St. John <pstjohn@nvidia.com>
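The org-migration replacements listed in the commit message can be sketched as a plain prefix rewrite. This is a minimal illustration, not the tooling used for this commit: the old and new prefixes come from the message, while the `REPLACEMENTS` table and the `migrate_links` helper are hypothetical names introduced here.

```python
# Old -> new URL prefixes, taken from the "Org migration" bullets above.
# The table and function names are illustrative assumptions.
REPLACEMENTS = {
    "nvidia.github.io/bionemo-framework": "nvidia-bionemo.github.io/bionemo-framework",
    "github.com/NVIDIA/bionemo-framework": "github.com/NVIDIA-BioNeMo/bionemo-framework",
}


def migrate_links(text: str) -> str:
    """Rewrite pre-migration URL prefixes to their post-migration form."""
    for old, new in REPLACEMENTS.items():
        text = text.replace(old, new)
    return text


if __name__ == "__main__":
    print(migrate_links("https://nvidia.github.io/bionemo-framework/"))
```

Neither new prefix contains its old counterpart as a substring, so running the rewrite twice leaves already-migrated text unchanged. URLs on other hosts (for example the shields.io and codecov badge URLs that this commit leaves alone) do not contain either old prefix and pass through untouched.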
1 parent aa2692e commit f9a946b

32 files changed

Lines changed: 544 additions & 112 deletions


.github/workflows/gh-docs-deploy.yml

Lines changed: 4 additions & 1 deletion
@@ -28,7 +28,10 @@ jobs:
           python -m pip install --upgrade pip
           pip install -r docs/requirements.txt
       - name: Build site
-        run: mkdocs build
+        run: mkdocs build --strict
+        working-directory: docs
+      - name: Check internal links
+        run: python scripts/check_internal_links.py site
         working-directory: docs
       - name: Configure Git Credentials
         if: github.event_name == 'push'
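The new "Check internal links" step runs `scripts/check_internal_links.py` against the built `site` directory. That script is not part of this excerpt, so the following is only a hypothetical sketch of what such a check could look like, using the standard library alone. The function name `find_broken_internal_links` and the resolution rules (a directory target resolves to its `index.html`; a leading `/` means the site root) are assumptions, not the repository's implementation.

```python
from html.parser import HTMLParser
from pathlib import Path
from urllib.parse import unquote, urlsplit


class _LinkCollector(HTMLParser):
    """Collect every href/src attribute value found in an HTML page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append(value)


def find_broken_internal_links(site_dir):
    """Return (page, link) pairs whose internal target is missing on disk."""
    site = Path(site_dir).resolve()
    broken = []
    for page in sorted(site.rglob("*.html")):
        collector = _LinkCollector()
        collector.feed(page.read_text(encoding="utf-8", errors="replace"))
        for link in collector.links:
            parts = urlsplit(link)
            # Skip external URLs, protocol-relative URLs, and same-page anchors.
            if parts.scheme or parts.netloc or not parts.path:
                continue
            path = unquote(parts.path)
            if path.startswith("/"):
                target = site / path.lstrip("/")  # assumed: "/" is the site root
            else:
                target = page.parent / path
            if target.is_dir():
                target = target / "index.html"  # assumed directory-URL convention
            if not target.exists():
                broken.append((str(page.relative_to(site)), link))
    return broken


if __name__ == "__main__":
    import sys

    problems = find_broken_internal_links(sys.argv[1] if len(sys.argv) > 1 else "site")
    for page, link in problems:
        print(f"{page}: broken link {link}")
    sys.exit(1 if problems else 0)
```

A non-zero exit code is what would make a CI step like the one above fail the build. This sketch checks only that target files exist and does not validate `#fragment` anchors.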

README.md

Lines changed: 5 additions & 5 deletions
@@ -6,8 +6,8 @@
 <div align="left">

 [![Click here to deploy.](https://uohmivykqgnnbiouffke.supabase.co/storage/v1/object/public/landingpage/brevdeploynavy.svg)](https://console.brev.dev/launchable/deploy/now?launchableID=env-2pPDA4sJyTuFf3KsCv5KWRbuVlU)
-[![Docs Build](https://img.shields.io/github/actions/workflow/status/NVIDIA/bionemo-framework/pages/pages-build-deployment?label=docs-build)](https://nvidia.github.io/bionemo-framework)
-[![Test Status](https://github.com/NVIDIA/bionemo-framework/actions/workflows/unit-tests.yml/badge.svg)](https://github.com/NVIDIA/bionemo-framework/actions/workflows/unit-tests.yml)
+[![Docs Build](https://img.shields.io/github/actions/workflow/status/NVIDIA/bionemo-framework/pages/pages-build-deployment?label=docs-build)](https://nvidia-bionemo.github.io/bionemo-framework)
+[![Test Status](https://github.com/NVIDIA-BioNeMo/bionemo-framework/actions/workflows/unit-tests.yml/badge.svg)](https://github.com/NVIDIA-BioNeMo/bionemo-framework/actions/workflows/unit-tests.yml)
 [![Latest Tag](https://img.shields.io/github/v/tag/NVIDIA/bionemo-framework?label=latest-version)](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/clara/containers/bionemo-framework/tags)
 [![codecov](https://codecov.io/gh/NVIDIA/bionemo-framework/branch/main/graph/badge.svg?token=XqhegdZRqB)](https://codecov.io/gh/NVIDIA/bionemo-framework)

@@ -55,8 +55,8 @@ cd bionemo-framework/bionemo-recipes/recipes/esm2_native_te/
 - 02/23/2026 [Mixtral MoE model](bionemo-recipes/models/mixtral/) with TE `GroupedLinear` for efficient parallel expert computation, FP8/FP4 support, and HF conversion.
 - 02/13/2026 [ESM2 PEFT recipe](bionemo-recipes/recipes/esm2_peft_te/) for LoRA fine-tuning with sequence packing support.
 - 01/14/2026 [Llama3 Context Parallelism](bionemo-recipes/recipes/llama3_native_te/README.md#performance-benchmarks) — scaling Llama 3 70B to 144K context on 36x GB300 NVL36 with ~65% MFU.
-- 10/27/2025 [CodonFM recipe](https://github.com/NVIDIA/bionemo-framework/tree/main/bionemo-recipes/recipes/codonfm_ptl_te) released! This is an accelerated version of the original [research codebase](https://github.com/NVIDIA-Digital-Bio/CodonFM) with [scientific preprint](https://research.nvidia.com/labs/dbr/assets/data/manuscripts/nv-codonfm-preprint.pdf).
-- 09/01/2025 [bionemo-recipes](https://github.com/NVIDIA/bionemo-framework/tree/main/bionemo-recipes) goes live! Lightweight and portable examples with state-of-the-art training performance you can riff on to meet your needs.
+- 10/27/2025 [CodonFM recipe](https://github.com/NVIDIA-BioNeMo/bionemo-framework/tree/main/bionemo-recipes/recipes/codonfm_ptl_te) released! This is an accelerated version of the original [research codebase](https://github.com/NVIDIA-Digital-Bio/CodonFM) with [scientific preprint](https://research.nvidia.com/labs/dbr/assets/data/manuscripts/nv-codonfm-preprint.pdf).
+- 09/01/2025 [bionemo-recipes](https://github.com/NVIDIA-BioNeMo/bionemo-framework/tree/main/bionemo-recipes) goes live! Lightweight and portable examples with state-of-the-art training performance you can riff on to meet your needs.

 ## Code Overview

@@ -114,7 +114,7 @@ BioNeMo Framework is part of a larger ecosystem of NVIDIA Biopharma products. Ge

 ## Documentation Resources

-- **Official Documentation:** Guides, API references, and troubleshooting for the framework are documented on our [official documentation](https://docs.nvidia.com/bionemo-framework/latest/). Nightly builds of this documentation are available on [BioNeMo Framework GitHub Pages](https://nvidia.github.io/bionemo-framework/)
+- **Official Documentation:** Guides, API references, and troubleshooting for the framework are documented on our [official documentation](https://docs.nvidia.com/bionemo-framework/latest/). Nightly builds of this documentation are available on [BioNeMo Framework GitHub Pages](https://nvidia-bionemo.github.io/bionemo-framework/)

 - **🚧 In-Progress Documentation 🚧:** `bionemo-recipes` documentation is currently work in progress, however the recipes are meant to be self-documented and easy to understand—we suggest you throw them into your favorite genai code assistant!

bionemo-recipes/README.md

Lines changed: 1 addition & 1 deletion
@@ -14,7 +14,7 @@ The biological AI community actively prototypes model architectures and needs to
 ### Performance Benchmarks

 <p align="center">
-<img src="https://raw.githubusercontent.com/NVIDIA/bionemo-framework/main/docs/docs/assets/images/esm2/esm2_native_te_benchmarks.svg" width="600">
+<img src="../docs/docs/assets/images/esm2/esm2_native_te_benchmarks.svg" width="600">
 <br>
 <em> Training benchmarks for ESM-2 using the <code>esm2_native_te</code> recipe.</em>
 </p>

bionemo-recipes/recipes/codonfm_native_te/README.md

Lines changed: 1 addition & 1 deletion
@@ -195,4 +195,4 @@ e.g., `python train_fsdp2.py fp8_config.enabled=true`. For verbose logging, use

 ## License

-Refer to [LICENSE](../../LICENSE).
+Refer to the [bionemo-recipes LICENSE](https://github.com/NVIDIA-BioNeMo/bionemo-framework/blob/main/bionemo-recipes/LICENSE).
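The change above is one instance of the relative-to-absolute conversion described in the commit message. A minimal sketch of that conversion, assuming links are resolved against the README's directory and that `main` is the target branch (as in the URLs in this diff), could look like the following. The helper name `to_blob_url` is hypothetical.

```python
import posixpath

# Base URL as it appears in the rewritten links in this commit.
REPO_BLOB = "https://github.com/NVIDIA-BioNeMo/bionemo-framework/blob/main"


def to_blob_url(readme_path: str, relative_link: str) -> str:
    """Resolve a README-relative link to an absolute GitHub blob URL.

    Both arguments use forward slashes; readme_path is relative to the repo root.
    """
    base_dir = posixpath.dirname(readme_path)
    target = posixpath.normpath(posixpath.join(base_dir, relative_link))
    # Reject links that climb above the repository root.
    if target == ".." or target.startswith("../"):
        raise ValueError(f"{relative_link!r} escapes the repository root")
    return f"{REPO_BLOB}/{target}"


if __name__ == "__main__":
    print(to_blob_url("bionemo-recipes/recipes/codonfm_native_te/README.md", "../../LICENSE"))
```

For the `codonfm_native_te` README, `../../LICENSE` resolves to `bionemo-recipes/LICENSE`, which gives the same absolute URL as the replacement line in the hunk above. Directory targets would need `tree` instead of `blob`; this sketch does not handle that case.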

bionemo-recipes/recipes/context_parallel.md

Lines changed: 1 addition & 1 deletion
@@ -18,7 +18,7 @@ The core idea behind CP is to partition the data into various chunks, with each

 In BioNeMo, we've created some abstractions to partition the data for you. There exists a [ContextParallelDataLoaderWrapper](esm2_native_te/collator.py) that will shard the CP data for you and send it to each device. This dataloader operates on Sequence Packed (THD) data [link](https://docs.nvidia.com/nemo-framework/user-guide/24.12/nemotoolkit/features/optimizations/sequence_packing.html). This `ContextParallelDataLoaderWrapper` will take as arguments your CP group and local CP rank. This dataloader wrapper will call its underlying dataloader to generate a unique piece of data and then shard those unique sequences across your CP groups. This is beneficial because you won't need to maintain a deterministic data pipeline because unique data is only being generated across the non CP groups, and it is replicated across the CP groups. More details below.

-Alternatively, one could utilize any DataLoader such as the canonical [PyTorch DataLoader](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader), however, you would have to ensure that your dataset is synchronized across CP ranks. In some cases, if you have a non-deterministic data pipeline, even if you attempt to get the same data from a dataloader it may be different due to non-deterministic preprocessing stages such as masking. For more information on preserving determinism in your datasets, please see [MegatronLMDataModule](https://nvidia.github.io/bionemo-framework/main/about/background/megatron_datasets/).
+Alternatively, one could utilize any DataLoader such as the canonical [PyTorch DataLoader](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader), however, you would have to ensure that your dataset is synchronized across CP ranks. In some cases, if you have a non-deterministic data pipeline, even if you attempt to get the same data from a dataloader it may be different due to non-deterministic preprocessing stages such as masking. For more information on preserving determinism in your datasets, please see [MegatronLMDataModule](../../docs/docs/main/about/background/megatron_datasets.md).

 ### Context Parallelism Sharding Example


bionemo-recipes/recipes/esm2_accelerate_te/README.md

Lines changed: 1 addition & 1 deletion
@@ -6,7 +6,7 @@ This folder demonstrates how to train TE-accelerated ESM-2 using the [Hugging Fa

 This folder contains an independent, minimal training example. It does not depend on any other code in the top-level
 bionemo-framework repository. You can download a zipped directory of this folder alone by clicking
-[here](https://download-directory.github.io?url=https://github.com/NVIDIA/bionemo-framework/tree/main/bionemo-recipes/recipes/esm2_accelerate_te&filename=esm2-accelerate-te).
+[here](https://download-directory.github.io?url=https://github.com/NVIDIA-BioNeMo/bionemo-framework/tree/main/bionemo-recipes/recipes/esm2_accelerate_te&filename=esm2-accelerate-te).

 ### How to deploy this recipe on cloud providers


bionemo-recipes/recipes/esm2_native_te/README.md

Lines changed: 1 addition & 1 deletion
@@ -8,7 +8,7 @@ training.

 This folder contains an independent, minimal training example. It does not depend on any other code in the top-level
 bionemo-framework repository. You can download a zipped directory of this folder alone by clicking
-[here](https://download-directory.github.io?url=https://github.com/NVIDIA/bionemo-framework/tree/main/bionemo-recipes/recipes/esm2_native_te&filename=esm2-native-te).
+[here](https://download-directory.github.io?url=https://github.com/NVIDIA-BioNeMo/bionemo-framework/tree/main/bionemo-recipes/recipes/esm2_native_te&filename=esm2-native-te).

 ### How to deploy this recipe on cloud providers


bionemo-recipes/recipes/fp8_analysis/README.md

Lines changed: 1 addition & 1 deletion
@@ -30,7 +30,7 @@ and training scripts.
 | ESM2 ||||
 | LLAMA3 ||||

-To gather FP8 statistics for analysis, refer to the model-specific documentation (e.g., [ESM2 FP8 Debugging](../esm2_native_te/README.md#fp8-debugging)) or add these arguments to your training command:
+To gather FP8 statistics for analysis, refer to the model-specific documentation (e.g., [ESM2 quantized training](../esm2_native_te/README.md#quantized-training-fp8-mxfp8-nvfp4)) or add these arguments to your training command:

 ```python
 python train_fsdp2.py \

bionemo-recipes/recipes/geneformer_native_te_mfsdp_fp8/README.md

Lines changed: 2 additions & 2 deletions
@@ -2,8 +2,8 @@

 # ⚠️ IMPORTANT FOR AI AGENTS ⚠️

-**DO NOT proceed without reading [AI_DOCUMENTATION.md](AI_DOCUMENTATION.md) first.**
-This file contains comprehensive documentation specifically designed for AI agents. Please see [gitingest.txt](./internal/gitingest.txt) for the complete codebase.
+**DO NOT proceed without reading [AGENT_DOCUMENTATION.md](AGENT_DOCUMENTATION.md) first.**
+This file contains comprehensive documentation specifically designed for AI agents. Please see [gitingest.sh](./internal/gitingest.sh) for the complete codebase.

 # Geneformer Pretraining with mfsdp and a custom pytorch training loop.


bionemo-recipes/recipes/llama3_native_te/README.md

Lines changed: 2 additions & 2 deletions
@@ -8,7 +8,7 @@ training. This recipe is configured for genomic sequences using a custom nucleot

 This folder contains an independent, minimal training example. It does not depend on any other code in the top-level
 bionemo-framework repository. You can download a zipped directory of this folder alone by clicking
-[here](https://download-directory.github.io?url=https://github.com/NVIDIA/bionemo-framework/tree/main/bionemo-recipes/recipes/llama3_native_te&filename=llama3-native-te).
+[here](https://download-directory.github.io?url=https://github.com/NVIDIA-BioNeMo/bionemo-framework/tree/main/bionemo-recipes/recipes/llama3_native_te&filename=llama3-native-te).

 ### How to deploy this recipe on cloud providers


@@ -145,7 +145,7 @@ We compared the convergence of this Llama3 recipe (with FSDP2) against NeMo 2.0
 implementation on the DCLM Baseline 1.0 dataset. See [Training on Natural Language Data (Lingua
 Reproduction)](#lingua-reproduction) for more details. The figure above shows similar loss convergence and step time to
 the NeMo 2.0 training example, and the following table shows downstream performance on various tasks using the
-[lm-eval](github.com/eleutherai/lm-evaluation-harness) library. The variation in training step time every 10,000 steps
+[lm-eval](https://github.com/eleutherai/lm-evaluation-harness) library. The variation in training step time every 10,000 steps
 are due checkpointing, further work will be done to improve training step time stability.

 | name | arc_challenge | arc_easy | boolq | copa | hella_swag | piqa | winogrande |
