
Fixes benchmark scripts when tensorboard logs are missing #5564

Open
kellyguo11 wants to merge 2 commits into isaac-sim:develop from kellyguo11:fix-benchmark-scripts

Conversation

@kellyguo11
Contributor

@kellyguo11 kellyguo11 commented May 10, 2026

Description

When the benchmark scripts are executed with num_iterations set below the threshold for reward logging, the run can finish without writing any reward data. However, the scripts are hardcoded to always parse rewards from tensorboard, even when those entries do not exist, which causes a crash. This change patches the RL benchmarking scripts to process reward logs only if they were actually written to tensorboard.
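
For illustration, here is a minimal sketch of the guarded pattern this change introduces. parse_tf_logs, log_rl_policy_rewards, and log_rl_policy_episode_lengths are the scripts' actual helpers (per the reviews below), but their call signatures and the stubbed data here are assumptions, not the exact diff:

# Minimal sketch, not the exact diff. The stubbed log_data keeps the snippet
# self-contained; in the scripts it comes from parse_tf_logs.
log_data = {"rewards/iter": [0.1, 0.5, 1.2]}       # "episode_lengths/iter" absent

rewards = log_data.get("rewards/iter")             # None instead of KeyError when missing
episode_lengths = log_data.get("episode_lengths/iter")

if rewards:                                        # the PR's truthiness guard
    print(f"logging rewards, max={max(rewards)}")  # stands in for log_rl_policy_rewards
else:
    print("[WARNING]: TensorBoard reward logs are missing, skipping reward logging.")

if episode_lengths:
    print("logging episode lengths")               # stands in for log_rl_policy_episode_lengths
else:
    print("[WARNING]: TensorBoard episode length logs are missing, skipping.")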

Type of change

  • Bug fix (non-breaking change which fixes an issue)

Checklist

  • I have read and understood the contribution guidelines
  • I have run the pre-commit checks with ./isaaclab.sh --format
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • I have updated the changelog and the corresponding version in the extension's config/extension.toml file
  • I have added my name to the CONTRIBUTORS.md or my name already exists there

@kellyguo11 kellyguo11 requested a review from ooctipus as a code owner May 10, 2026 06:58
@github-actions github-actions Bot added the bug (Something isn't working) and isaac-lab (Related to Isaac Lab team) labels May 10, 2026
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented May 10, 2026

Greptile Summary

This PR guards the rewards and episode_lengths TensorBoard log lookups in both RL benchmarking scripts with .get() and conditional logging, preventing KeyError crashes when training runs fewer iterations than the reward-logging threshold.

  • benchmark_rlgames.py: log_data["rewards/iter"] and log_data["episode_lengths/iter"] replaced with .get() calls; log_rl_policy_rewards, log_rl_policy_episode_lengths, and log_convergence are now skipped with a printed warning when the keys are absent.
  • benchmark_rsl_rl.py: Identical pattern applied to Train/mean_reward and Train/mean_episode_length, and convergence checking is similarly guarded.
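
As a purely illustrative consolidation of the two bullets above (in the PR each script keeps its own copy of the guard; this helper and its signature are assumptions, not code from the diff), the same pattern covers both workflows' tag names:

# Illustrative only: the tag names come from the summary above; the helper is
# not a function in the PR.
TAGS = {
    "rl_games": ("rewards/iter", "episode_lengths/iter"),
    "rsl_rl": ("Train/mean_reward", "Train/mean_episode_length"),
}

def get_reward_logs(log_data: dict, workflow: str):
    """Return (rewards, episode_lengths); each is None when its tag is absent."""
    reward_tag, length_tag = TAGS[workflow]
    return log_data.get(reward_tag), log_data.get(length_tag)

# Example: an rsl_rl run that stopped before the reward-logging threshold.
rewards, lengths = get_reward_logs({}, "rsl_rl")
print(rewards, lengths)  # None None -> both logging steps are skipped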

Confidence Score: 4/5

Safe to merge; the fix correctly targets the crash path and leaves the always-present performance timing metrics untouched.

The change is small and well-scoped. The only concern is the use of if rewards: (a truthiness check on a list) instead of the more explicit if rewards is not None:, which could produce a misleading warning if parse_tf_logs ever returns an empty list for a key that was registered but had no events written — a subtle but unlikely edge case.

Both changed files carry the same if rewards: pattern; no other files need attention.
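
To make the edge case concrete, a standalone snippet (not from the PR) showing how the two guards differ:

# None models a tag absent from log_data; [] models a tag that was registered
# but never received scalar events.
for rewards in (None, [], [0.3, 0.7]):
    truthy = bool(rewards)            # "if rewards:" rejects both None and []
    explicit = rewards is not None    # "if rewards is not None:" rejects only None
    print(f"{rewards!r:>12}  truthy={truthy}  is-not-None={explicit}")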

Important Files Changed

scripts/benchmarks/benchmark_rlgames.py: Switches reward and episode-length log lookups to .get() with truthiness guards; minor concern that if rewards: (falsy check) should be if rewards is not None: to correctly distinguish a missing key from an empty data list.

scripts/benchmarks/benchmark_rsl_rl.py: Same defensive .get() pattern applied to Train/mean_reward and Train/mean_episode_length; the same if rewards: vs if rewards is not None: nit applies here as well.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[parse_tf_logs] --> B{rewards key present?}
    B -- "None (missing)" --> C[Print WARNING: missing rewards]
    B -- "list of values" --> D[log_rl_policy_rewards]
    D --> E[log_convergence]
    C --> F{check_convergence flag?}
    F -- yes --> G[Print WARNING: cannot check convergence]
    F -- no --> H[skip]
    E --> I[log_success / finalize]
    G --> I
    H --> I
    A --> J{episode_lengths key present?}
    J -- "None (missing)" --> K[Print WARNING: missing episode lengths]
    J -- "list of values" --> L[log_rl_policy_episode_lengths]
    L --> I
    K --> I

Comments Outside Diff (4)

  1. scripts/benchmarks/benchmark_rlgames.py, line 292-301 (link)

    P2 Using a plain truthiness check (if rewards:) here conflates two different situations: the key being absent from log_data (returns None) and the key being present but having no events yet (returns []). Both produce the same "TensorBoard log is missing" warning message even though the second case means the key was registered but no data was emitted. Using is not None makes the intent explicit and keeps the two cases distinguishable for future debugging; see the sketch after this list.

  2. scripts/benchmarks/benchmark_rlgames.py, line 306-317 (link)

    P2 Same truthiness-vs-None concern as above: the guard on log_convergence should match the guards above so the two branches stay semantically consistent.

  3. scripts/benchmarks/benchmark_rsl_rl.py, line 291-300 (link)

    P2 Same if rewards: vs if rewards is not None: concern as in benchmark_rlgames.py — applies to both the rewards and episode-lengths guards here.

  4. scripts/benchmarks/benchmark_rsl_rl.py, line 306-317 (link)

    P2 Same consistency concern for the convergence guard in benchmark_rsl_rl.py.
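
A sketch of the guard these four comments are asking for, with the two cases split explicitly (the warning texts and the stand-in print are illustrative, not the PR's exact wording):

# Stub input: tag registered but no events written, the case the comments
# say the current warning message conflates with a missing key.
log_data = {"rewards/iter": []}

rewards = log_data.get("rewards/iter")
if rewards is None:
    print("[WARNING]: rewards tag missing from the TensorBoard logs, skipping.")
elif not rewards:
    print("[WARNING]: rewards tag was registered but no events were written.")
else:
    print(f"logging {len(rewards)} reward points")  # stands in for log_rl_policy_rewards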

Reviews (1): Last reviewed commit: "add CI tests"


@isaaclab-review-bot isaaclab-review-bot Bot left a comment


🤖 Isaac Lab Review Bot

Summary

Fixes a KeyError crash in benchmark scripts when num_iterations is below the reward logging threshold. The fix replaces direct dict access (log_data["key"]) with .get() + conditional processing and appropriate warnings. Clean, well-scoped defensive fix applied consistently to both benchmark_rlgames.py and benchmark_rsl_rl.py.

Design Assessment

Design is sound. The fix correctly converts hard crashes into graceful degradation with user-visible warnings. The warning pattern (print("[WARNING]...")) is consistent with existing conventions in the benchmark scripts and utils.py.

Findings

🔵 Suggestion: Consider if rewards is not None: for consistency (scripts/benchmarks/benchmark_rsl_rl.py:291)

The code uses if rewards: (truthiness) but the adjacent success_rates check uses if success_rates is not None:. Both protect against missing keys, but the truthiness check also treats empty lists as "missing." In practice this is fine — TensorBoard won't produce empty lists for registered tags, and log_rl_policy_rewards() would crash on max([]) anyway — but is not None would be more semantically precise and consistent with the existing pattern. Not a bug, just a consistency nit.
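
The max([]) point is easy to verify in isolation (standalone snippet, not from the PR):

# An empty-but-present reward list would crash in the summary step regardless
# of which guard is used upstream.
try:
    max([])
except ValueError as err:
    print(f"summarizing an empty reward list fails: {err}")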

Test Coverage

  • Tests not practical — standalone GPU benchmark scripts lack unit-test infrastructure
  • The fix is mechanical (.get() + guard) with very low regression risk
  • Acceptable without additional tests

CI Status

  • Check for Broken Links: ✅ pass
  • Most other checks: ⏳ pending

Verdict

Ship it

Clean, correct, and well-scoped bug fix. The if rewards: truthiness check is safe for the actual data patterns produced by parse_tf_logs(). The consistency suggestion is a minor polish, not a blocker.

Inline diff comment on scripts/benchmarks/benchmark_rsl_rl.py:

episode_length_tag="Train/mean_episode_length",
task=args_cli.task,
workflow="rsl_rl",
should_check_convergence=args_cli.check_convergence,


🔵 Suggestion: For consistency with the success_rates is not None check a few lines below, consider using if rewards is not None: here (and for episode_lengths). Both work correctly in practice — parse_tf_logs returns None for missing keys — but is not None makes the intent clearer and matches the existing pattern in this file.

Non-blocking — the truthiness check is also safe since an empty list from TensorBoard is practically impossible, and log_rl_policy_rewards would crash on max([]) anyway.

