Skip to content

cp: [perf] fix: guard cuda_graph_scope validation against None (3249) into r0.4.0#3262

Merged
ko3n1g merged 1 commit intor0.4.0from
cherry-pick-3249-r0.4.0
Apr 16, 2026
Merged

cp: [perf] fix: guard cuda_graph_scope validation against None (3249) into r0.4.0#3262
ko3n1g merged 1 commit intor0.4.0from
cherry-pick-3249-r0.4.0

Conversation

@svcnvidia-nemo-ci
Copy link
Copy Markdown
Contributor

@svcnvidia-nemo-ci svcnvidia-nemo-ci commented Apr 10, 2026

beep boop [🤖]: Hi @yaoyu-33 👋,

we've cherry picked #3249 into  for you! 🚀

Please review and approve this cherry pick by your convenience!

Summary by CodeRabbit

  • Bug Fixes
    • Improved CUDA graph scope validation to reliably detect invalid configurations in all scenarios, including when using default values, preventing configuration errors from going undetected during model setup.

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: NeMo Bot <nemo-bot@nvidia.com>
@svcnvidia-nemo-ci
Copy link
Copy Markdown
Contributor Author

/ok to test 0d4dddc

@copy-pr-bot
Copy link
Copy Markdown

copy-pr-bot Bot commented Apr 10, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Apr 10, 2026

📝 Walkthrough

Walkthrough

Moved CUDA graph scope validation logic from conditional CLI-argument handling to post-override validation phase, ensuring invalid scope values in the effective recipe configuration are caught regardless of whether CLI arguments are provided.

Changes

Cohort / File(s) Summary
CUDA Graph Scope Validation
scripts/performance/utils/overrides.py
Relocated scope validation for transformer_engine CUDA graph implementation from conditional CLI argument block to post-override validation; now validates effective recipe values instead of raw CLI values, ensuring comprehensive validation after all overrides are applied.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

🚥 Pre-merge checks | ✅ 4
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title is a cherry-pick notation that mentions the fix (guard cuda_graph_scope validation against None) and target branch, but obscures the main change with the cherry-pick prefix 'cp:' and issue reference, making it less clear as a standalone change description.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Test Results For Major Changes ✅ Passed PR contains minor bug fix with +9/-6 line changes to validation logic, not a new feature or breaking change. According to custom check instructions, test results required only for major changes; minor changes may pass without explicit test documentation.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch cherry-pick-3249-r0.4.0

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@scripts/performance/utils/overrides.py`:
- Around line 119-126: The TE scope validation in the overrides.py block uses
effective_scope = getattr(recipe.model, "cuda_graph_scope", None) but doesn't
normalize it, causing strings or enum values to be iterated per-character;
update the validation to normalize effective_scope into a list of simple strings
before checking against valid_te_scopes: if effective_scope is a str convert to
[effective_scope], if items are enum-like objects convert each to its .value or
str representation, then assert the normalized list is non-empty and that every
element is in valid_te_scopes (use the existing valid_te_scopes list and the
same assert message).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 2c36def4-8a59-40dd-8a29-16b4081fe969

📥 Commits

Reviewing files that changed from the base of the PR and between a44580a and 0d4dddc.

📒 Files selected for processing (1)
  • scripts/performance/utils/overrides.py

Comment on lines +119 to +126
# Validate post-override state so we check the effective recipe value,
# not the raw CLI input (which may be None when the recipe default is used).
if recipe.model.cuda_graph_impl == "transformer_engine":
valid_te_scopes = ["attn", "mlp", "moe", "moe_router", "moe_preprocess", "mamba"]
effective_scope = getattr(recipe.model, "cuda_graph_scope", None)
assert effective_scope is not None and all(scope in valid_te_scopes for scope in effective_scope), (
f"Invalid cuda graph scope: {effective_scope}. Valid options are: {valid_te_scopes}"
)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Normalize effective_scope before TE scope validation to avoid false failures.

At Line 124, validation assumes effective_scope is a list of plain strings. But cuda_graph_scope is consumed elsewhere as str | list | single value and may contain enum-like entries, so valid configs can fail here (e.g., string input iterates per-character).

💡 Proposed fix
     if recipe.model.cuda_graph_impl == "transformer_engine":
         valid_te_scopes = ["attn", "mlp", "moe", "moe_router", "moe_preprocess", "mamba"]
         effective_scope = getattr(recipe.model, "cuda_graph_scope", None)
-        assert effective_scope is not None and all(scope in valid_te_scopes for scope in effective_scope), (
-            f"Invalid cuda graph scope: {effective_scope}. Valid options are: {valid_te_scopes}"
-        )
+        assert effective_scope is not None, "cuda_graph_scope must not be None when cuda_graph_impl=transformer_engine"
+        if isinstance(effective_scope, str):
+            effective_scope = effective_scope.split(",") if effective_scope else []
+        elif not isinstance(effective_scope, list):
+            effective_scope = [effective_scope]
+
+        normalized_scope = []
+        for scope in effective_scope:
+            scope_name = getattr(scope, "value", str(scope))
+            if scope_name.startswith("CudaGraphScope."):
+                scope_name = scope_name.split(".", 1)[1]
+            normalized_scope.append(scope_name)
+
+        assert all(scope in valid_te_scopes for scope in normalized_scope), (
+            f"Invalid cuda graph scope: {effective_scope}. Valid options are: {valid_te_scopes}"
+        )
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
# Validate post-override state so we check the effective recipe value,
# not the raw CLI input (which may be None when the recipe default is used).
if recipe.model.cuda_graph_impl == "transformer_engine":
valid_te_scopes = ["attn", "mlp", "moe", "moe_router", "moe_preprocess", "mamba"]
effective_scope = getattr(recipe.model, "cuda_graph_scope", None)
assert effective_scope is not None and all(scope in valid_te_scopes for scope in effective_scope), (
f"Invalid cuda graph scope: {effective_scope}. Valid options are: {valid_te_scopes}"
)
# Validate post-override state so we check the effective recipe value,
# not the raw CLI input (which may be None when the recipe default is used).
if recipe.model.cuda_graph_impl == "transformer_engine":
valid_te_scopes = ["attn", "mlp", "moe", "moe_router", "moe_preprocess", "mamba"]
effective_scope = getattr(recipe.model, "cuda_graph_scope", None)
assert effective_scope is not None, "cuda_graph_scope must not be None when cuda_graph_impl=transformer_engine"
if isinstance(effective_scope, str):
effective_scope = effective_scope.split(",") if effective_scope else []
elif not isinstance(effective_scope, list):
effective_scope = [effective_scope]
normalized_scope = []
for scope in effective_scope:
scope_name = getattr(scope, "value", str(scope))
if scope_name.startswith("CudaGraphScope."):
scope_name = scope_name.split(".", 1)[1]
normalized_scope.append(scope_name)
assert all(scope in valid_te_scopes for scope in normalized_scope), (
f"Invalid cuda graph scope: {effective_scope}. Valid options are: {valid_te_scopes}"
)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/performance/utils/overrides.py` around lines 119 - 126, The TE scope
validation in the overrides.py block uses effective_scope =
getattr(recipe.model, "cuda_graph_scope", None) but doesn't normalize it,
causing strings or enum values to be iterated per-character; update the
validation to normalize effective_scope into a list of simple strings before
checking against valid_te_scopes: if effective_scope is a str convert to
[effective_scope], if items are enum-like objects convert each to its .value or
str representation, then assert the normalized list is non-empty and that every
element is in valid_te_scopes (use the existing valid_te_scopes list and the
same assert message).

@ko3n1g ko3n1g marked this pull request as draft April 13, 2026 08:45
@ko3n1g
Copy link
Copy Markdown
Contributor

ko3n1g commented Apr 13, 2026

will merge for the patch

@yaoyu-33 yaoyu-33 added area:perf Performance optimizations and benchmarking needs-author Author action is required before review or merge can continue labels Apr 13, 2026
@ko3n1g ko3n1g marked this pull request as ready for review April 16, 2026 23:12
@ko3n1g ko3n1g merged commit dba8b1e into r0.4.0 Apr 16, 2026
53 checks passed
@ko3n1g ko3n1g deleted the cherry-pick-3249-r0.4.0 branch April 16, 2026 23:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area:perf Performance optimizations and benchmarking cherry-pick needs-author Author action is required before review or merge can continue Run CICD

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants