### Description
<!-- Provide a detailed description of the changes in this PR -->
#### Usage
<!--- How does a user interact with the changed code -->
```python
# TODO: Add code snippet
```
### Type of changes
<!-- Mark the relevant option with an [x] -->
- [ ] Bug fix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Refactor
- [ ] Documentation update
- [ ] Other (please describe):
### CI Pipeline Configuration
Configure CI behavior by applying the relevant labels. By default, only
basic unit tests are run.
- [ciflow:skip](https://github.com/NVIDIA/bionemo-framework/blob/main/docs/docs/main/contributing/contributing.md#ciflow:skip) - Skip all CI tests for this PR
- [ciflow:notebooks](https://github.com/NVIDIA/bionemo-framework/blob/main/docs/docs/main/contributing/contributing.md#ciflow:notebooks) - Run Jupyter notebooks execution tests
- [ciflow:slow](https://github.com/NVIDIA/bionemo-framework/blob/main/docs/docs/main/contributing/contributing.md#ciflow:slow) - Run slow single GPU integration tests marked as `@pytest.mark.slow`
- [ciflow:all](https://github.com/NVIDIA/bionemo-framework/blob/main/docs/docs/main/contributing/contributing.md#ciflow:all) - Run all tests (unit tests, slow tests, and notebooks). This label can be used to enforce running all framework tests.
- [ciflow:all-recipes](https://github.com/NVIDIA/bionemo-framework/blob/main/docs/docs/main/contributing/contributing.md#ciflow:all-recipes) - Run tests for all recipes (under bionemo-recipes). This label can be used to enforce running tests for all recipes.
Unit tests marked as `@pytest.mark.multi_gpu` or
`@pytest.mark.distributed` are not run in the PR pipeline.
For more details, see [CONTRIBUTING](CONTRIBUTING.md).
> [!NOTE]
> By default, only basic unit tests are run. Add the appropriate labels to
> enable additional test coverage.
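
The label behavior described above can be sketched as a small lookup. This is only an illustration: the label names come from the list above, but the suite names, the `suites_for_labels` helper, and the assumptions that labels add to the default unit tests and that `ciflow:skip` overrides every other label are hypothetical, not taken from the real CI configuration.

```python
# Hypothetical sketch: label names are from the list above; suite names and
# this helper are illustrative and do not mirror the actual CI workflow.
DEFAULT_SUITES = {"unit"}

LABEL_SUITES = {
    "ciflow:notebooks": {"notebooks"},
    "ciflow:slow": {"slow"},
    "ciflow:all": {"unit", "slow", "notebooks"},
    "ciflow:all-recipes": {"recipes"},
}


def suites_for_labels(labels):
    """Return the set of test suites a PR with these labels would run."""
    if "ciflow:skip" in labels:
        # Assumption: ciflow:skip wins over any other label on the PR.
        return set()
    # Assumption: labels add coverage on top of the default unit tests.
    suites = set(DEFAULT_SUITES)
    for label in labels:
        suites |= LABEL_SUITES.get(label, set())
    return suites


print(sorted(suites_for_labels(["ciflow:slow", "ciflow:notebooks"])))
```

Tests marked `multi_gpu` or `distributed` are deliberately absent from the mapping, since no label enables them in the PR pipeline.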
#### Authorizing CI Runs
We use [copy-pr-bot](https://docs.gha-runners.nvidia.com/apps/copy-pr-bot/#automation) to manage authorization of CI runs on NVIDIA's compute resources.
- If a pull request is opened by a trusted user and contains only trusted changes, the pull request's code will automatically be copied to a `pull-request/`-prefixed branch in the source repository (e.g. `pull-request/123`).
- If a pull request is opened by an untrusted user or contains untrusted changes, an NVIDIA org member must leave an `/ok to test` comment on the pull request to trigger CI. This will need to be done for each new commit.
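
The two rules above amount to a per-commit authorization check. The sketch below is a hypothetical model of that decision, not copy-pr-bot's code: the `PullRequest` fields and the `ci_authorized` helper are invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class PullRequest:
    """Hypothetical model of the facts the authorization rules depend on."""
    number: int
    head_sha: str
    trusted_author: bool
    trusted_changes: bool
    # Commit SHAs for which an org member has commented `/ok to test`.
    approved_shas: set = field(default_factory=set)


def ci_authorized(pr):
    """Return True if CI may run for the PR's current head commit."""
    if pr.trusted_author and pr.trusted_changes:
        # Trusted user with only trusted changes: no manual approval needed.
        return True
    # Otherwise approval is tied to a specific commit, so a new push
    # (a new head_sha) requires a fresh `/ok to test` comment.
    return pr.head_sha in pr.approved_shas


pr = PullRequest(number=123, head_sha="abc", trusted_author=False, trusted_changes=True)
pr.approved_shas.add("abc")
print(ci_authorized(pr))  # approved for the current commit
pr.head_sha = "def"       # a new commit is pushed
print(ci_authorized(pr))  # approval no longer applies
```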
#### Triggering CodeRabbit AI Review
To trigger a code review from CodeRabbit, comment on a pull request with one of these commands:
- `@coderabbitai review` - Triggers a standard review
- `@coderabbitai full review` - Triggers a comprehensive review
See https://docs.coderabbit.ai/reference/review-commands for a full list of commands.
### Pre-submit Checklist
<!--- Ensure all items are completed before submitting -->
- [ ] I have tested these changes locally
- [ ] I have updated the documentation accordingly
- [ ] I have added/updated tests as needed
- [ ] All existing tests pass successfully
---------
Signed-off-by: Bruno Alvisio <balvisio@nvidia.com>
`bionemo-recipes/recipes/evo2_megatron/README.md` (+163 lines changed: 163 additions & 0 deletions)
@@ -266,6 +266,169 @@ Options:
 - `--mixed-precision-recipe` — precision recipe (default: `bf16_mixed`). NOTE for checkpoints sensitive to FP8 and Hopper you need to run with `--mixed-precision-recipe bf16-mixed` and also supply the `--vortex-style-fp8` option for prediction/inference, you should not use the fp8 recipe for those models, as they are sensitive to the exact FP8 configuration they were trained with in savanna, see the [table under the section on available nvidia checkpoints for download from NGC](#available-models-in-ngc-currently-nemo-format-so-first-convert-to-mbridge).
 - `--verbose` / `-v` — enable debug logging.

+## LoRA Fine-tuning
+
+`Evo2LoRA` is a LoRA variant built on top of the Megatron Bridge PEFT stack. It
+freezes the entire base model and attaches low-rank adapter matrices to the
+modules you specify, with an optional escape hatch to keep selected modules
+fully trainable.
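
For readers new to LoRA, the idea of a frozen base weight plus a low-rank adapter can be illustrated in a few lines of plain Python. This is a generic sketch of the standard LoRA formulation, y = W x + (alpha / r) B (A x), and not the `Evo2LoRA` or Megatron Bridge implementation; the helper names and toy matrices are made up for illustration.

```python
# Generic LoRA illustration in plain Python; not the Evo2LoRA implementation.

def matvec(m, v):
    """Multiply matrix m (a list of rows) by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def lora_forward(w, a, b, x, alpha):
    """Compute W x + (alpha / r) * B (A x), where W stays frozen.

    w: frozen base weight, shape (out, in)
    a: trainable down-projection, shape (r, in)
    b: trainable up-projection, shape (out, r)
    """
    rank = len(a)
    base = matvec(w, x)               # output of the frozen base layer
    update = matvec(b, matvec(a, x))  # low-rank adapter path
    scale = alpha / rank
    return [y + scale * u for y, u in zip(base, update)]


w = [[1.0, 0.0], [0.0, 1.0]]  # frozen 2x2 base weight
a = [[1.0, 1.0]]              # rank-1 down-projection, shape (1, 2)
b_zero = [[0.0], [0.0]]       # B initialised to zero: adapter is a no-op
x = [2.0, 3.0]

print(lora_forward(w, a, b_zero, x, alpha=1.0))  # equals the base output W x
```

Only `a` and `b` would receive gradient updates during fine-tuning, which is why the trainable parameter count stays small relative to the base model.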
+
+### Basic usage
+
+Add `--lora-finetune` to any `train_evo2` command alongside a checkpoint:
+| `False` | `word_embeddings` only | Embedding weight is fully trainable. Output projection is frozen unless also listed. |
+| `False` | `output_layer` only | Output projection weight is fully trainable. Embedding is frozen unless also listed. |
+| `False` | both | Both weights are fully trainable. |
+| `True` | `word_embeddings` only | **Error.** Listing only one side of a tied pair breaks the weight-tying invariant. Both must be listed together. |
+| `True` | `output_layer` only | **Error.** Listing only one side of a tied pair breaks the weight-tying invariant. Both must be listed together. |
+| `True` | both | Accepted. The shared weight (owned by `word_embeddings`) is unfrozen, so both the embedding lookup and the output projection train via the same tensor. **Note:** because `output_layer` allocates no weight of its own, gradient flow through the output projection path back to the shared tensor is a TODO item and may not be fully wired in all pipeline-parallel configurations. |
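
The rows above reduce to one rule: with tied weights, the two vocabulary modules must be listed together or not at all. A minimal sketch of such a check follows. The function name and its arguments are hypothetical and are not the validation code added in this PR; only the module names and `share_embeddings_and_output_weights` come from the table.

```python
VOCAB_MODULES = {"word_embeddings", "output_layer"}


def check_trainable_vocab_modules(modules, tied):
    """Validate fully trainable vocabulary modules against weight tying.

    modules: names of modules requested as fully trainable (hypothetical input)
    tied: the value of share_embeddings_and_output_weights
    Returns the vocabulary modules that end up unfrozen.
    """
    listed = VOCAB_MODULES & set(modules)
    if tied and len(listed) == 1:
        # Unfreezing one side of a tied pair would break weight tying.
        raise ValueError(
            "share_embeddings_and_output_weights=True: list both "
            "word_embeddings and output_layer, or neither"
        )
    return listed


# Untied model: either module may be listed on its own.
print(sorted(check_trainable_vocab_modules(["output_layer"], tied=False)))

# Tied model: listing only one side is rejected.
try:
    check_trainable_vocab_modules(["word_embeddings"], tied=True)
except ValueError as err:
    print("rejected:", err)
```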
+
+#### Recommendations
+
+- **Default (vocabulary weights frozen, LoRA on inner layers):** omit both
+  embedding/output modules from both flags. The default `--lora-target-modules`
+  does not touch either layer.
+- **Apply LoRA to the output projection (untied models only):** list
+  `output_layer` in `--lora-target-modules` and set
+  `share_embeddings_and_output_weights=False` in the model config.
+- **Fully fine-tune the vocabulary weight alongside LoRA on inner layers:**
+  list **both** `word_embeddings` and `output_layer` in