fix(pt): typo in epoch training#5410
Conversation
There was a problem hiding this comment.
Pull request overview
Fixes PyTorch trainer configuration parsing so epoch-based training works when configs are normalized via deepmd.utils.argcheck.normalize().
Changes:
- Read
training.numb_epoch(canonical argcheck key) instead oftraining.num_epochin the PyTorchTrainerinitialization.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Repository UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
📝 WalkthroughWalkthroughTrainer initialization now reads epoch count from the Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes 🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## master #5410 +/- ##
=======================================
Coverage 80.47% 80.47%
=======================================
Files 820 820
Lines 86005 86005
Branches 4139 4140 +1
=======================================
+ Hits 69209 69210 +1
Misses 15521 15521
+ Partials 1275 1274 -1 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
njzjz
left a comment
There was a problem hiding this comment.
I searched num_epoch and found many other places using num_epoch
https://github.com/search?q=repo%3Adeepmodeling%2Fdeepmd-kit%20num_epoch&type=code
Thanks for the catch! I went through every num_epoch hit in that search. The only other occurrence that had the same bug was in the Paddle backend, which I've now fixed in the latest commit: deepmd/pd/train/training.py: training_params.get("num_epoch") → training_params.get("numb_epoch") (same root cause — the canonical key after normalize() is numb_epoch, while num_epoch is only an alias). deepmd/tf/entrypoints/train.py and change_bias.py: num_epoch is just a local variable name; the .get(...) call already uses the canonical "numb_epoch". |
Summary by CodeRabbit