This repository was archived by the owner on Nov 19, 2025. It is now read-only.

Commit 2c3c245

Add DeepSeek entry in Changelog and bump Nemo commit
1 parent 16873d0 commit 2c3c245

2 files changed

Lines changed: 8 additions & 1 deletion

CHANGELOG.md

Lines changed: 7 additions & 0 deletions
@@ -62,6 +62,13 @@ for more details on running `prepare_packed_ft_dataset.py` and on running SFT wi
 ```
 - Add code and instructions for replicating Reward Modeling training in HelpSteer2 and HelpSteer2-Preference
 - Implement REINFORCE algorithm.
+- Added support for DeepSeek-V3. Training from a DeepSeek-v3 NeMo 2.0 checkpoint requires adding these additional parameters to the training script:
+```
+++model.transformer_engine=True \
+++model.dist_ckpt_load_strictness=log_all \
+++model.name=decoder_block_gpt \
+++model.moe_layer_freq=[0,0,0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1]
+```
 
 ### Breaking Changes
 - Upgrade TRTLLM dependency from v0.10.0 to v0.12.0 and migrate from `GPTSession` cpp runtime to `ModelRunner` python runtime. Please use the latest Dockerfile.
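
The `++model.moe_layer_freq` override added in the CHANGELOG hunk above is a long hand-written list, which is easy to mistype. As a minimal sketch, the value could be generated instead of typed. The helper name `moe_layer_freq_override` is hypothetical and not part of NeMo-Aligner. The layer counts are assumptions: 61 decoder layers in total, taken from the published DeepSeek-V3 configuration, with the first 3 treated as dense (`0`) and the rest as MoE (`1`), which matches the three leading zeros in the changelog entry.

```python
# Hypothetical helper for illustration only; not part of NeMo-Aligner.
NUM_LAYERS = 61       # assumption: DeepSeek-V3 decoder depth from its published config
NUM_DENSE_LAYERS = 3  # assumption: leading dense layers, matching the three 0s in the changelog


def moe_layer_freq_override(num_layers: int, num_dense: int) -> str:
    """Build the override string with 0 for dense layers and 1 for MoE layers."""
    if not 0 <= num_dense <= num_layers:
        raise ValueError("num_dense must be between 0 and num_layers")
    pattern = [0] * num_dense + [1] * (num_layers - num_dense)
    return "++model.moe_layer_freq=[" + ",".join(str(flag) for flag in pattern) + "]"


override = moe_layer_freq_override(NUM_LAYERS, NUM_DENSE_LAYERS)
print(override)
```

The printed string can be pasted into the training command in place of the hand-written list. Some shells treat square brackets as glob patterns, so quoting the argument may be necessary.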

Dockerfile

Lines changed: 1 addition & 1 deletion
@@ -13,7 +13,7 @@ ARG MAX_JOBS=8
 # Git refs for dependencies
 ARG TE_TAG=7d576ed25266a17a7b651f2c12e8498f67e0baea
 ARG PYTRITON_VERSION=0.5.10
-ARG NEMO_TAG=633cb602777bffefbe12066b0c915c87e7b469e9 # On: v2.1.0
+ARG NEMO_TAG=6a07e88e1ecabcc10b05f73ae8bdd102fb734f0d # On: main
 ARG MLM_TAG=d15cec53beb283e7127b7d594e1c46b8a0719b6d # On: core_r0.10.0
 ARG ALIGNER_COMMIT=main
 ARG TRTLLM_VERSION=v0.13.0
