Skip to content

[CI] Revert eval_runner tests#559

Merged
xyao-nv merged 16 commits intomainfrom
xyao/exp/revert_eval_runner_tests
Apr 11, 2026
Merged

[CI] Revert eval_runner tests#559
xyao-nv merged 16 commits intomainfrom
xyao/exp/revert_eval_runner_tests

Conversation

@xyao-nv
Copy link
Copy Markdown
Collaborator

@xyao-nv xyao-nv commented Apr 8, 2026

Summary

As CI seems to run smoothly agin, bring back previously disabled tests.

@xyao-nv xyao-nv force-pushed the xyao/exp/ci_stall branch 2 times, most recently from bc9bb9d to e80dca2 Compare April 8, 2026 23:12
Copy link
Copy Markdown

@kellyguo11 kellyguo11 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review: PR #559 — revert test

Summary: This PR re-enables 5 eval runner tests that were previously skipped with @pytest.mark.skip(reason="Skipping because of CI stalling") by replacing those skip markers with @pytest.mark.with_subprocess.

Context understood: The base branch xyao/exp/ci_stall added infrastructure to separate subprocess-spawning tests from persistent SimulationApp tests (via the with_subprocess marker and CI workflow changes). This PR is the natural follow-up that actually reverts the skips and uses the new marker.

✅ What looks good

  • The change is straightforward: 5 mechanical replacements of skipwith_subprocess.
  • The with_subprocess marker is properly registered in pytest.ini and the CI workflow already has a dedicated step that runs -m with_subprocess tests with ISAACLAB_ARENA_SUBPROCESS_TIMEOUT=900.
  • The marker semantics are correct — these tests all use run_eval_runner_and_check_no_failures(), which spawns eval_runner.py via subprocess.run().

⚠️ Potential concern: missing timeout / process group isolation in run_eval_runner_and_check_no_failures

The utility run_subprocess() in isaaclab_arena/tests/utils/subprocess.py was carefully written with:

  • start_new_session=True to isolate GPU child processes
  • timeout=_SUBPROCESS_TIMEOUT_SEC to prevent hangs

However, run_eval_runner_and_check_no_failures() in this test file uses raw subprocess.run(args, capture_output=True, text=True, check=True)no timeout, no start_new_session=True. If the original CI stalling was caused by subprocess hangs or orphaned GPU processes, these tests could still stall even with the marker separation, since the subprocess itself has no timeout.

Suggestion: Consider either:

  1. Refactoring run_eval_runner_and_check_no_failures() to use run_subprocess() from the utils module, or
  2. At minimum, adding timeout=_SUBPROCESS_TIMEOUT_SEC and start_new_session=True to the subprocess.run() call.

This isn't a blocker if the CI workflow-level timeout-minutes: 60 is sufficient, but it would be more robust to have per-subprocess timeouts too.

PR description

The PR description still has the default template text. A brief note about reverting the skips now that the CI stalling fix is in place would help future readers.

Overall: The change itself is correct and minimal. Approving since the marker infrastructure is in place and CI will validate.

@xyao-nv xyao-nv force-pushed the xyao/exp/ci_stall branch from 53466ba to 867391f Compare April 10, 2026 04:58
@xyao-nv xyao-nv changed the title revert test [CI] Revert eval_runner tests Apr 10, 2026
@xyao-nv xyao-nv force-pushed the xyao/exp/revert_eval_runner_tests branch from 7bbe406 to 419d4ab Compare April 10, 2026 06:34
@xyao-nv xyao-nv changed the base branch from xyao/exp/ci_stall to main April 10, 2026 17:50
@xyao-nv xyao-nv marked this pull request as ready for review April 10, 2026 21:18
@greptile-apps
Copy link
Copy Markdown

greptile-apps bot commented Apr 10, 2026

Greptile Summary

This PR re-enables five eval_runner tests (previously disabled while CI was unstable) in isaaclab_arena/tests/test_eval_runner.py. Each test is marked @pytest.mark.with_subprocess, writes a temporary jobs config, and invokes eval_runner.py as a child process, then parses the output table for failed status entries.

Confidence Score: 5/5

Safe to merge — this is a clean test re-enablement with no logic changes to production code.

All findings are P2 style suggestions. The test logic, subprocess invocation, marker usage, and config construction are correct. The only note is a defensive guard against an unlikely silent-pass edge case.

No files require special attention.

Important Files Changed

Filename Overview
isaaclab_arena/tests/test_eval_runner.py Re-enables five subprocess-based eval_runner tests; logic, markers, and config construction look correct with one minor note on silent-pass edge case if output table is absent.

Sequence Diagram

sequenceDiagram
    participant PT as pytest
    participant TF as test_eval_runner.py
    participant FS as tmp_path (filesystem)
    participant SP as subprocess (eval_runner.py)
    participant JM as JobManager

    PT->>TF: run test function
    TF->>FS: write_jobs_config_to_file(jobs, path)
    TF->>SP: subprocess.run([python, eval_runner.py, --eval_jobs_config, path], check=True)
    SP->>JM: load and execute jobs
    JM-->>SP: job statuses (completed / failed)
    SP-->>TF: stdout + stderr (table output)
    TF->>TF: regex parse status table
    TF->>TF: assert no "failed" entries
    TF-->>PT: pass / AssertionError
Loading

Comments Outside Diff (1)

  1. isaaclab_arena/tests/test_eval_runner.py, line 44-60 (link)

    P2 Silent pass when output table is absent

    If eval_runner.py exits with code 0 but produces no table output (e.g. early log format change or quiet mode), matches will be empty, job_statuses will be [], and the assertion is never reached — the test passes silently despite no jobs having been verified. Adding a guard ensures at least one status was parsed:

    # after job_statuses is built
    if not job_statuses:
        print("\n" + output)
        raise AssertionError("No job statuses found in eval_runner output — unable to verify results")

Reviews (1): Last reviewed commit: "Merge branch 'main' into xyao/exp/revert..." | Re-trigger Greptile

@xyao-nv xyao-nv enabled auto-merge (squash) April 11, 2026 03:59
@xyao-nv xyao-nv merged commit 9a382b5 into main Apr 11, 2026
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants