[Test] Stabilize RTX MDL shader-warmup flake in rendering correctness tests by hujc7 · Pull Request #5687 · isaac-sim/IsaacLab

hujc7 · 2026-05-19T07:04:57Z

Description

The rendering-correctness tests intermittently fail on cold-cache CI runners with "Camera output is all zeros or all inf" for simple_shading_*_mdl and simple_shading_constant_diffuse variants — the GPU returns a still-zero framebuffer because the RTX MDL material has not finished compiling by the time the test reads the camera tensor.

Add a poll-until-non-zero warmup that exits as soon as every camera-output tensor has a non-zero max:

test_shadow_hand_vision_presets.py::test_camera_renders_not_empty — polls via env.step() with a 60-step cap. (This test has no flaky retry, so a single-step fixture is the actual failure surface.)
rendering_test_utils.warmup_render_until_nonzero — iterates every sensor in env.scene.sensors and polls via sim.render() + scene.update() with a 30-pass cap. Called from the four rendering helpers in rendering_test_utils.py (rendering_test_shadow_hand / _cartpole / _dexsuite_kuka) and from test_rendering_registered_tasks.py. Mirrors the pattern already in DirectRLEnvCfg.num_rerenders_on_reset / wait_for_textures; physics state is not advanced, so the existing goldens at the post-init "first non-zero frame" stay valid.

An earlier attempt at a fixed-pass warmup over-rendered in the goldenfile helpers and broke dexsuite_kuka goldens at 91% pixel diff — early-exit on first-non-zero avoids that by landing at the same frame the goldens were captured at.

Type of change

Bug fix (non-breaking change which fixes an issue)

Checklist

I have run the `pre-commit` checks with `./isaaclab.sh --format`
I have made corresponding changes to the documentation
My changes generate no new warnings
I have added tests that prove my fix is effective or that my feature works
I have updated the changelog and the corresponding version in the extension's `config/extension.toml` file
I have added my name to the `CONTRIBUTORS.md` or my name already exists there

… tests `test_shadow_hand_vision_presets.py::test_camera_renders_not_empty` and the three helper-driven rendering tests in `rendering_test_utils.py` (`rendering_test_shadow_hand`/`_cartpole`/`_dexsuite_kuka`) intermittently fail with "Camera output 'simple_shading_*_mdl' is all zeros or all inf after stepping" on cold-cache CI runners. Root cause is RTX MDL shader compile timing: a single `env.step()` returns a still-zero framebuffer for shader variants that haven't finished compiling. The variant that flakes (constant_diffuse / diffuse_mdl / full_mdl) shifts between runs depending on which MDL pipeline finishes first. Step the env 10 frames before reading the camera tensor — empirically enough across the MDL variants that flake. Adds ~1-2 s per parametrize variant; cheap relative to the cost of a retry. Affected jobs: - `isaaclab_tasks [2/3]` (test_camera_renders_not_empty) - `rendering-correctness-kitless` (test_rendering_*_kitless[*-simple_shading_*])

isaaclab-review-bot

Summary

This PR correctly addresses an RTX MDL shader-warmup race condition in the rendering correctness tests. The root cause analysis is solid: asynchronous shader compilation can leave the framebuffer zeroed when tests read camera tensors too early after environment creation.

Key Observations

✅ Strengths

Well-designed helper function: The _warmup_render() abstraction is clean and consolidates the warmup logic in a single place, making future tuning straightforward.
Good documentation: Both the docstring and inline comments explain why 10 steps are needed, including the specific MDL shader variants affected. This context will help future maintainers.
Appropriate placement: Calling _warmup_render() after maybe_save_stage() but before validate_camera_outputs() is the right sequencing.
Minimal scope: The change is surgical — it only touches the test infrastructure that was flaking, with no production code impact.

💭 Observations (not blockers)

Magic number 10: The warmup step count is empirically determined. Consider adding a brief comment in the fixture (render_correctness_env) referencing _warmup_render so readers know both sites use the same rationale.
No explicit warmup validation: The helper assumes stepping 10 times is sufficient but does not verify shaders actually compiled. This is likely fine for a test stabilization fix, but if flakes persist, a future enhancement could add a pixel-value sanity check after warmup.
Changelog fragment: Well-written and follows the project format.

Verdict

LGTM 👍 — This is a pragmatic fix for a well-understood CI flake. The tradeoff of ~1-2s additional wall time per variant is absolutely worth the CI stability gain.

Update (08cd6c9): Excellent refinement! The switch from env.step() to sim.render() + scene.update() is a meaningful improvement:

✅ No physics advancement: The warmup no longer steps physics state, so golden reference images captured at the post-init state remain valid without recapture
✅ Better semantic fit: The new approach mirrors DirectRLEnvCfg.num_rerenders_on_reset / wait_for_textures patterns already established in the codebase
✅ Clearer naming: num_passes better describes what's happening vs num_steps
✅ Improved documentation: The expanded docstring now explicitly calls out the physics-preserving design and references the relevant config attributes

The changelog update accurately reflects the new approach, differentiating the render-only warmup (for golden-image tests) from the step-based warmup (for non-zero pixel checks where physics drift is acceptable).

LGTM 👍 — Ready to merge.

Update (c359111): Significant scope reduction — the PR now takes a simpler approach:

❌ Removed _warmup_render() helper entirely
❌ Removed warmup calls from rendering_test_shadow_hand, rendering_test_cartpole, and rendering_test_dexsuite_kuka
✅ Retained only the step-based warmup for test_camera_renders_not_empty (which has no golden comparison)
📝 Updated changelog to reflect the reduced scope, noting the golden-image tests already use @flaky(max_runs=3)

Assessment:

This is a reasonable decision. The tradeoffs:

Approach	Pros	Cons
New (flaky retries)	Zero additional wall time; existing mechanism; simpler code	May still fail 3x in rare cold-cache scenarios
Previous (warmup)	Deterministic fix	~1-2s added per test; more invasive

The author is betting that @flaky(max_runs=3) provides sufficient coverage for the golden-image tests, since shader compilation is a one-time cold-cache event. If any retry succeeds (after shaders compile), the test passes. This is pragmatic.

LGTM 👍 — The reduced scope is justified. The only test that truly needed explicit warmup was test_camera_renders_not_empty (which lacked any retry mechanism), and that fix is retained.

Update (39e6bb7): Nice iteration! The warmup now uses adaptive polling instead of a fixed step count:

✅ Polling with early exit: Instead of always stepping 10 times, the fixture now polls up to _MAX_WARMUP_STEPS = 60 but exits as soon as all camera outputs have max() > 0.2 (excluding inf values)
✅ Handles all output types: The loop checks env._tiled_camera.data.output.values() so all data-type tensors (RGB, depth, etc.) are validated
✅ Better worst-case coverage: 60 steps provides headroom for pathological cold-cache scenarios while the early-exit keeps typical runs fast
✅ Improved changelog: Updated to describe the polling approach with the 60-step cap

Minor observations (not blockers):

Threshold 0.2: The pixel value threshold is empirically chosen. It's reasonable but could be documented (e.g., why not 0.0 or 0.1?). For MDL shaders, fully black pixels return 0.0, so 0.2 is safely above that.
Direct camera access: env._tiled_camera is a private attribute. If the API changes, this will break. For a test fixture this is acceptable since tests can be updated alongside implementation changes.

LGTM 👍 — The adaptive approach is strictly better than the fixed count: faster in the common case, more robust in edge cases. Ready to merge.

Update (3de115a): Great consolidation! The warmup logic is now properly refactored:

Changes:

New helper _warmup_render_until_nonzero() in rendering_test_utils.py:
- Extracted from the fixture into a reusable function
- Uses sim.render() + scene.update(dt) instead of env.step() — physics state is NOT advanced
- 30-pass cap (vs 60 in the fixture version)
- Same 0.2 threshold logic with inf filtering
- Flexible camera detection: checks _tiled_camera first, falls back to scene.sensors.get('base_camera')
- Excellent docstring explaining the MDL async compilation race and why render-only passes preserve golden validity
Applied to all three golden-image test helpers:
- rendering_test_shadow_hand()
- rendering_test_cartpole()
- rendering_test_dexsuite_kuka()
Expanded changelog:
- Now documents both approaches: step-based for test_camera_renders_not_empty (60-step cap) and render-only for golden tests (30-pass cap)
- Clearly explains why physics-preserving warmup is needed for golden comparisons

Assessment:

This is the right design:

Test Type	Warmup Method	Physics Impact	Why
`test_camera_renders_not_empty`	`env.step()` (60 cap)	Yes	Only checks non-zero; goldens don't matter
Golden-image tests	`sim.render() + scene.update()` (30 cap)	No	Must match existing reference frames

The dual approach respects the invariants each test type requires. Render-only warmup for golden tests means no golden recapture is needed.

LGTM 👍 — Clean, well-documented, and production-ready. Ready to merge.

Update (eb23b0c): Excellent final polish! Key improvements:

Public API exposure: _warmup_render_until_nonzero → warmup_render_until_nonzero — the underscore removal signals this is now a supported test utility
Multi-sensor support: The helper now iterates over ALL sensors in env.scene.sensors instead of hardcoded _tiled_camera / base_camera lookups. This is more robust and future-proof for envs with multiple cameras
Wrapped env handling: Added getattr(env, "unwrapped", env) to correctly access scene/sim on wrapped environments
Extended coverage: Now also applied to test_rendering_registered_tasks.py, which was previously missing warmup and could flake on cold-cache runs
Changelog accuracy: Updated to reflect the broader scope (four rendering helpers + registered tasks test)

The nested loop structure properly handles early-exit semantics and the defensive getattr chains ensure graceful no-ops for envs without standard sensor structures.

LGTM 👍 — This is the complete, polished solution. Ready to merge.

hujc7 · 2026-05-19T08:13:21Z

does not seem working. need to change golden file.

hujc7 · 2026-05-19T08:53:41Z

Reopening. Going back at this with the warmup + golden regeneration approach, validated locally before pushing.

The previous warmup used env.step() which advances physics and breaks golden-image comparison in the rendering correctness tests. Switch to sim.render() + scene.update() so shader compile finishes without moving the scene state, mirroring num_rerenders_on_reset in the env classes.

The 10-pass sim.render warmup in the three goldenfile helpers (rendering_test_shadow_hand / _cartpole / _dexsuite_kuka) caused test_rendering_dexsuite_kuka and *_kitless variants to fail with large pixel diffs vs the existing goldens — the warmup exposes a different post-loading scene state for the dexsuite_kuka asset set. Keep only the test_camera_renders_not_empty change (no golden compare, env.step warmup is safe there). The goldenfile helpers already use flaky(max_runs=3) for occasional MDL hiccups.

10 frames was not enough — CI still flaked on test_camera_renders_not_empty[physx-isaacsim_rtx-simple_shading_diffuse_mdl]. Poll until every camera output tensor has a non-zero max with a 60-step cap, exiting early once all outputs are ready.

greptile-apps · 2026-05-19T11:41:31Z

Greptile Summary

This PR adds a poll-until-non-zero MDL shader warmup to the rendering correctness tests to eliminate intermittent "Camera output is all zeros or all inf" failures on cold-cache CI runners, where the GPU returns a still-zero framebuffer before RTX MDL materials finish compiling.

warmup_render_until_nonzero (new helper in rendering_test_utils.py): drives up to 30 sim.render() + scene.update() passes — no physics advance — until every sensor tensor's finite max exceeds 0.2, then returns early so golden-compare state stays at the same frame the reference images were captured at.
render_correctness_env fixture (test_shadow_hand_vision_presets.py): replaces the single env.step() with a loop of up to 60 env.step() calls with the same non-zero check, exiting early once ready.
test_rendering_registered_tasks.py: calls warmup_render_until_nonzero before validate_camera_outputs.

Confidence Score: 5/5

Safe to merge; the warmup logic is correct, early-exits cleanly, and does not advance physics state, so existing golden images remain valid.

The change is narrowly scoped to test infrastructure: it adds a polling loop around render calls with a bounded cap, uses the same threshold already in the test assertions, and all three rendering helpers are updated consistently. No production code is touched.

No files require special attention.

Important Files Changed

Filename	Overview
source/isaaclab_tasks/test/rendering_test_utils.py	Adds `warmup_render_until_nonzero` helper that polls via `sim.render()` + `scene.update()` up to 30 passes until every camera tensor's finite max exceeds 0.2; integrated into all three rendering test helpers. Logic is correct; no diagnostic on cap exhaustion.
source/isaaclab_tasks/test/test_shadow_hand_vision_presets.py	Replaces single `env.step()` in the `render_correctness_env` fixture with a poll loop of up to 60 steps using `env.step()` + tensor max check; also contains a leftover `# import pdb; pdb.set_trace()` comment.
source/isaaclab_tasks/test/test_rendering_registered_tasks.py	Imports and calls `warmup_render_until_nonzero` before camera output validation; straightforward integration.
source/isaaclab_tasks/changelog.d/jichuanh-stabilize-rendering-mdl-warmup.rst	New changelog fragment accurately describing the MDL shader warmup fix.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[Env constructed] --> B{warmup_render_until_nonzero}
    B --> C[Check all sensor tensors\nfinite.max > 0.2?]
    C -- All ready --> D[Return immediately\nstate at first-non-zero frame]
    C -- Not ready --> E[sim.render + scene.update\nno physics advance]
    E --> F{Pass count < 30?}
    F -- Yes --> C
    F -- No cap hit --> G[Return silently\nno warning emitted]
    D --> H[validate_camera_outputs\nor golden compare]
    G --> H

_{Reviews (3): Last reviewed commit: "Apply warmup to registered-tasks renderi..." | Re-trigger Greptile}

greptile-apps · 2026-05-19T11:41:40Z

+    _MAX_WARMUP_STEPS = 60
+    for _ in range(_MAX_WARMUP_STEPS):
+        env.step(actions)
+        outputs_ready = True
+        for output in env._tiled_camera.data.output.values():
+            tensor = output.torch
+            finite = torch.where(torch.isinf(tensor), torch.zeros_like(tensor), tensor)
+            if finite.max() <= 0.2:
+                outputs_ready = False
+                break
+        if outputs_ready:
+            break


No diagnostic when warmup cap is reached

If all 60 steps complete without any output reaching the threshold, the fixture silently yields and the test fails with the standard "Camera output is all zeros or all inf" assertion error — with no indication that the warmup cap was hit. Adding a warnings.warn or print on cap exhaustion would make it immediately clear in CI logs that the problem is slow shader compilation rather than a deeper rendering bug.

The three rendering_test_utils helpers (rendering_test_shadow_hand / _cartpole / _dexsuite_kuka) now drive sim.render() + scene.update() passes until every camera-output tensor has a non-zero max, with a 30-pass cap and early-exit. Physics is not advanced (no env.step) so the existing goldens at the post-init "first non-zero frame" stay valid. A previous fixed 10-pass over-rendered and broke dexsuite_kuka goldens at 91% pixel diff.

Extend warmup_render_until_nonzero to iterate every sensor in env.scene.sensors (instead of hardcoded _tiled_camera / base_camera) and call it from test_rendering_registered_tasks.py. Also rename the helper from _warmup_render_until_nonzero (module-private) to warmup_render_until_nonzero now that it has cross-module callers.

github-actions Bot added the isaac-lab Related to Isaac Lab team label May 19, 2026

isaaclab-review-bot Bot reviewed May 19, 2026

View reviewed changes

hujc7 closed this May 19, 2026

hujc7 reopened this May 19, 2026

hujc7 added 3 commits May 19, 2026 09:04

hujc7 marked this pull request as ready for review May 19, 2026 11:39

greptile-apps Bot reviewed May 19, 2026

View reviewed changes

hujc7 marked this pull request as draft May 19, 2026 11:42

hujc7 marked this pull request as ready for review May 19, 2026 13:16

hujc7 marked this pull request as draft May 19, 2026 13:38

hujc7 marked this pull request as ready for review May 19, 2026 15:57

hujc7 marked this pull request as draft May 19, 2026 16:12

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Test] Stabilize RTX MDL shader-warmup flake in rendering correctness tests#5687

[Test] Stabilize RTX MDL shader-warmup flake in rendering correctness tests#5687
hujc7 wants to merge 6 commits into
isaac-sim:developfrom
hujc7:jichuanh/stabilize-rendering-mdl-warmup

hujc7 commented May 19, 2026 •

edited

Loading

Uh oh!

isaaclab-review-bot Bot left a comment •

edited

Loading

Uh oh!

hujc7 commented May 19, 2026

Uh oh!

hujc7 commented May 19, 2026

Uh oh!

greptile-apps Bot commented May 19, 2026 •

edited

Loading

Uh oh!

greptile-apps Bot May 19, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

hujc7 commented May 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Type of change

Checklist

Uh oh!

isaaclab-review-bot Bot left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Summary

Key Observations

✅ Strengths

💭 Observations (not blockers)

Verdict

Uh oh!

hujc7 commented May 19, 2026

Uh oh!

hujc7 commented May 19, 2026

Uh oh!

greptile-apps Bot commented May 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Greptile Summary

Confidence Score: 5/5

Important Files Changed

Flowchart

Uh oh!

greptile-apps Bot May 19, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

hujc7 commented May 19, 2026 •

edited

Loading

isaaclab-review-bot Bot left a comment •

edited

Loading

greptile-apps Bot commented May 19, 2026 •

edited

Loading