Commit 3b66875
docs:report [run_20260530_165216](~791 tok/s) (Eamon2009#60)
Includes metrics for generalization gap, throughput (~791 tok/s), and gradient norms.
Parameters: 6.68M | lr: 1e-3 | batch: 16 | steps: 6000 - Achieved best validation loss of 4.1319 at step 3900
1 parent 6519631 commit 3b66875
1 file changed