add multi-gpu COMET judge#1456
Conversation
Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>
Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>
Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>
Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>
Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>
Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>
Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>
Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>
📝 WalkthroughWalkthroughThis PR extends COMET evaluation with precision control and distributed execution support. The COMET evaluator gains PyTorch 2.6+ compatibility, a precision-to-dtype mapping, and distributed-aware output writing. A new ChangesCOMET Evaluator Precision and Distributed Execution
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes 🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (1)
nemo_skills/evaluation/evaluator/comet.py (1)
124-125: 💤 Low valueUnused loop variable
sample.The loop unpacks
samplebut only usesidxto index intodata. Use_to indicate the variable is intentionally unused.Suggested fix
- for idx, sample in enumerate(data): + for idx, _ in enumerate(data): data[idx]["comet"] = comet_scores[idx]🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@nemo_skills/evaluation/evaluator/comet.py` around lines 124 - 125, The for-loop in comet.py currently unpacks an unused variable `sample` (for idx, sample in enumerate(data))—replace `sample` with `_` so it reads `for idx, _ in enumerate(data):` to signal the variable is intentionally unused; keep the rest of the body that assigns `data[idx]["comet"] = comet_scores[idx]` unchanged and ensure variable names `idx`, `data`, and `comet_scores` are left intact.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@nemo_skills/evaluation/evaluator/comet.py`:
- Around line 124-125: The for-loop in comet.py currently unpacks an unused
variable `sample` (for idx, sample in enumerate(data))—replace `sample` with `_`
so it reads `for idx, _ in enumerate(data):` to signal the variable is
intentionally unused; keep the rest of the body that assigns `data[idx]["comet"]
= comet_scores[idx]` unchanged and ensure variable names `idx`, `data`, and
`comet_scores` are left intact.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 11b5cbb3-ba72-41bb-8b71-adef82956d7c
📒 Files selected for processing (3)
nemo_skills/evaluation/evaluator/comet.pynemo_skills/pipeline/eval.pynemo_skills/pipeline/judges/comet_judge.py
This PR introduces multi-gpu inference of COMET models.
It additionally sets default COMET judge scoring to bfloat16 instead of float32