
Commit fa74e73

Merge pull request #452 from denys-fridman/dfridman/moe_aux_loss_weight
Add moe_aux_loss_coeff constant and compliance checks for DeepSeek V3
2 parents 9228c49 + 54fdd19 commit fa74e73

3 files changed

Lines changed: 10 additions & 0 deletions


mlperf_logging/compliance_checker/training_6.0.0/closed_deepseekv3_671b.yaml

Lines changed: 5 additions & 0 deletions
@@ -69,6 +69,11 @@
     REQ: EXACTLY_ONE
     CHECK: " v['value'] == 1.0 "

+- KEY:
+    NAME: moe_aux_loss_coeff
+    REQ: EXACTLY_ONE
+    CHECK: " v['value'] == 0.01 "
+
 - KEY:
     NAME: gradient_accumulation_steps
     REQ: EXACTLY_ONE

mlperf_logging/compliance_checker/training_6.0.0/open_deepseekv3_671b.yaml

Lines changed: 4 additions & 0 deletions
@@ -55,6 +55,10 @@
     NAME: opt_gradient_clip_norm
     REQ: EXACTLY_ONE

+- KEY:
+    NAME: moe_aux_loss_coeff
+    REQ: EXACTLY_ONE
+
 - KEY:
     NAME: gradient_accumulation_steps
     REQ: EXACTLY_ONE
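
The closed and open rules for `moe_aux_loss_coeff` differ only in the `CHECK` line: the closed division pins the value to 0.01, while the open division only requires the key to be logged exactly once. The sketch below illustrates that difference. It is not the compliance checker's actual code: `rule_passes`, `CLOSED_RULE`, and `OPEN_RULE` are hypothetical names, and it assumes that `CHECK` is a Python expression evaluated with `v` bound to the matching log entry.

```python
# Hypothetical in-memory copies of the two rules added in this commit.
CLOSED_RULE = {
    "NAME": "moe_aux_loss_coeff",
    "REQ": "EXACTLY_ONE",
    "CHECK": " v['value'] == 0.01 ",
}
OPEN_RULE = {
    "NAME": "moe_aux_loss_coeff",
    "REQ": "EXACTLY_ONE",
}


def rule_passes(rule, entries):
    """Return True if the parsed log entries satisfy the rule (illustrative only)."""
    matches = [e for e in entries if e["key"] == rule["NAME"]]

    # EXACTLY_ONE: the key must be logged once and only once.
    if rule["REQ"] == "EXACTLY_ONE" and len(matches) != 1:
        return False

    check = rule.get("CHECK")
    if check is None:
        # No CHECK (open division): any logged value is accepted.
        return True

    # Assumption: CHECK is a Python expression over `v`, the log entry.
    return all(
        bool(eval(check.strip(), {"__builtins__": {}}, {"v": v}))
        for v in matches
    )


if __name__ == "__main__":
    log = [{"key": "moe_aux_loss_coeff", "value": 0.001}]
    print("closed:", rule_passes(CLOSED_RULE, log))
    print("open:  ", rule_passes(OPEN_RULE, log))
```

Under these assumptions, a run that logs a coefficient other than 0.01 would fail the closed rule but still satisfy the open one, and a run that omits the key or logs it twice would fail both.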

mlperf_logging/mllog/constants.py

Lines changed: 1 addition & 0 deletions
@@ -177,6 +177,7 @@
 START_WARMUP_STEP = "start_warmup_step"
 INIT_CHECKPOINT_STEP = "init_checkpoint_step"
 LORA_ALPHA = "lora_alpha"
+MOE_AUX_LOSS_COEFF = "moe_aux_loss_coeff"
 # Log keys - misc.
 BBOX = "bbox"
 SEGM = "segm"
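
For context, the new constant is the key string that a training script would attach to its log event so that the compliance rules can find it. The sketch below is standard-library only and does not use the real `mllog` logger: `format_event` is a hypothetical helper, and the `:::MLLOG` line layout and field names are an assumption modelled on typical MLPerf log lines.

```python
import json
import time

# Mirrors the constant added to mlperf_logging/mllog/constants.py.
MOE_AUX_LOSS_COEFF = "moe_aux_loss_coeff"


def format_event(key, value):
    """Build one log line for a point-in-time event (assumed layout)."""
    record = {
        "namespace": "",
        "time_ms": int(time.time() * 1000),
        "event_type": "POINT_IN_TIME",
        "key": key,
        "value": value,
        "metadata": {},
    }
    return ":::MLLOG " + json.dumps(record)


if __name__ == "__main__":
    # 0.01 is the value the closed-division rule requires.
    print(format_event(MOE_AUX_LOSS_COEFF, 0.01))
```

In a real submission the event would presumably go through the package's own logger, referencing `mllog.constants.MOE_AUX_LOSS_COEFF` instead of a hand-written string, so that the key stays in sync with the checker.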
