Commit e2150f8
authored
Adding update that solves one logger issue for Evo2 training (#1331)
### Description
Use nemo training loop to take advantage of their one-logger
configuration support. Also remove the private callback that broke
checkpointing.
### Type of changes
<!-- Mark the relevant option with an [x] -->
- [x] Bug fix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Refactor
- [ ] Documentation update
- [ ] Other (please describe):
### CI Pipeline Configuration
Configure CI behavior by applying the relevant labels. By default, only
basic unit tests are run.
-
[ciflow:skip](https://github.com/NVIDIA/bionemo-framework/blob/main/docs/docs/main/contributing/contributing.md#ciflow:skip)
- Skip all CI tests for this PR
-
[ciflow:notebooks](https://github.com/NVIDIA/bionemo-framework/blob/main/docs/docs/main/contributing/contributing.md#ciflow:notebooks)
- Run Jupyter notebooks execution tests for bionemo2
-
[ciflow:slow](https://github.com/NVIDIA/bionemo-framework/blob/main/docs/docs/main/contributing/contributing.md#ciflow:slow)
- Run slow single GPU integration tests marked as @pytest.mark.slow for
bionemo2
-
[ciflow:all](https://github.com/NVIDIA/bionemo-framework/blob/main/docs/docs/main/contributing/contributing.md#ciflow:all)
- Run all tests (unit tests, slow tests, and notebooks) for bionemo2.
This label can be used to enforce running tests for all bionemo2.
-
[ciflow:all-recipes](https://github.com/NVIDIA/bionemo-framework/blob/main/docs/docs/main/contributing/contributing.md#ciflow:all-recipes)
- Run tests for all recipes (under bionemo-recipes). This label can be
used to enforce running tests for all recipes.
Unit tests marked as `@pytest.mark.multi_gpu` or
`@pytest.mark.distributed` are not run in the PR pipeline.
For more details, see [CONTRIBUTING](CONTRIBUTING.md)
> [!NOTE]
> By default, only basic unit tests are run. Add appropriate labels to
enable an additional test coverage.
#### Authorizing CI Runs
We use
[copy-pr-bot](https://docs.gha-runners.nvidia.com/apps/copy-pr-bot/#automation)
to manage authorization of CI
runs on NVIDIA's compute resources.
- If a pull request is opened by a trusted user and contains only
trusted changes, the pull request's code will
automatically be copied to a pull-request/ prefixed branch in the source
repository (e.g. pull-request/123)
- If a pull request is opened by an untrusted user or contains untrusted
changes, an NVIDIA org member must leave an
`/ok to test` comment on the pull request to trigger CI. This will need
to be done for each new commit.
### Pre-submit Checklist
- [x] I have tested these changes locally
- [ ] I have updated the documentation accordingly
- [ ] I have added/updated tests as needed
- [ ] All existing tests pass successfully
Signed-off-by: John St John <jstjohn@nvidia.com>1 parent abd0ecb commit e2150f8
2 files changed
Lines changed: 27 additions & 37 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
24 | 24 | | |
25 | 25 | | |
26 | 26 | | |
27 | | - | |
| 27 | + | |
28 | 28 | | |
29 | 29 | | |
30 | 30 | | |
| |||
53 | 53 | | |
54 | 54 | | |
55 | 55 | | |
56 | | - | |
| 56 | + | |
57 | 57 | | |
58 | 58 | | |
59 | 59 | | |
| |||
853 | 853 | | |
854 | 854 | | |
855 | 855 | | |
856 | | - | |
857 | | - | |
858 | | - | |
859 | | - | |
860 | | - | |
861 | | - | |
862 | | - | |
863 | | - | |
864 | | - | |
865 | | - | |
866 | | - | |
867 | | - | |
868 | | - | |
869 | | - | |
870 | | - | |
871 | | - | |
872 | | - | |
873 | | - | |
874 | | - | |
875 | | - | |
876 | | - | |
877 | 856 | | |
878 | 857 | | |
879 | 858 | | |
| |||
1103 | 1082 | | |
1104 | 1083 | | |
1105 | 1084 | | |
1106 | | - | |
1107 | | - | |
1108 | | - | |
1109 | | - | |
1110 | | - | |
1111 | | - | |
1112 | | - | |
1113 | | - | |
1114 | | - | |
1115 | 1085 | | |
1116 | 1086 | | |
1117 | 1087 | | |
| |||
1139 | 1109 | | |
1140 | 1110 | | |
1141 | 1111 | | |
1142 | | - | |
1143 | | - | |
1144 | | - | |
| 1112 | + | |
1145 | 1113 | | |
1146 | | - | |
1147 | | - | |
1148 | 1114 | | |
1149 | 1115 | | |
1150 | 1116 | | |
| |||
Lines changed: 24 additions & 0 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
14 | 14 | | |
15 | 15 | | |
16 | 16 | | |
| 17 | + | |
17 | 18 | | |
18 | 19 | | |
19 | 20 | | |
20 | 21 | | |
21 | 22 | | |
| 23 | + | |
| 24 | + | |
| 25 | + | |
| 26 | + | |
| 27 | + | |
| 28 | + | |
| 29 | + | |
| 30 | + | |
| 31 | + | |
| 32 | + | |
| 33 | + | |
| 34 | + | |
| 35 | + | |
| 36 | + | |
| 37 | + | |
| 38 | + | |
| 39 | + | |
| 40 | + | |
| 41 | + | |
| 42 | + | |
| 43 | + | |
| 44 | + | |
| 45 | + | |
22 | 46 | | |
23 | 47 | | |
24 | 48 | | |
| |||
0 commit comments