
fix tuning test #3118

Open
yzhou103 wants to merge 2 commits into ROCm:main from yzhou103:fix_tuning_test

Conversation

@yzhou103
Contributor

Motivation

Technical Details

Test Plan

Test Result

Submission Checklist

@yzhou103 yzhou103 requested review from a team and Copilot May 11, 2026 06:17
@github-actions
Contributor

🏷️ CI Guide

Runs automatically on every PR:

  • ✅ Pre-checks (submodule verification, code formatting)
  • ✅ Aiter op tests (gfx942 + gfx950)
  • ✅ Triton tests on MI35X (only when aiter/ops/triton/** or related paths are changed)

Extended tests (opt-in via labels):

| Label | Tests |
| --- | --- |
| ci:triton-300x | Run an additional Triton test job on MI300X in PRs; main branch always runs both MI35X and MI300X |
| ci:sglang | SGLang integration tests: DeepSeek-R1-MXFP4 accuracy, Qwen 3.5 accuracy |
| ci:atom | ATOM benchmark: DeepSeek-R1-0528, GPT-OSS-120B |
| ci:atom_full | ATOM accuracy suite for PR and main models from ATOM models_accuracy.json |
| ci:vllm | vLLM benchmark: GPT-OSS-120B, DeepSeek-R1-0528, Kimi-K2.5 |
| ci:all | All standard extended tests (excludes ci:atom_full) |

Only add ci:atom_full for FlyDSL or Triton upgrades.
Add labels via the sidebar or `gh pr edit 3118 --add-label <label>`

Contributor

Copilot AI left a comment


Pull request overview

This PR updates the tuning test suite to align with current config-merge behavior, reduce e2e tuning pipeline flakiness, and clarify how to run individual tuner tests.

Changes:

  • Update update_config_files unit test to expect CSVs with differing columns to merge (with missing columns filled) instead of raising.
  • Add a pre/post tuner-run cleanup helper for stale JIT lock files and simplify the --compare --update_improved e2e test invocation/shape set.
  • Expand tuning test README with guidance for running specific tuner tests (mp=1 vs mp=default).
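The merge behavior described in the first bullet can be sketched in isolation. This is a hypothetical stand-in, not the actual `update_config_files` implementation: it merges two CSVs whose headers differ, taking the union of columns and filling the gaps with empty values instead of raising.

```python
import csv
import io

def merge_csvs(csv_a, csv_b):
    """Merge two CSV strings with differing columns (illustrative helper).

    Columns are unioned in first-seen order; rows missing a column get ''.
    """
    rows_a = list(csv.DictReader(io.StringIO(csv_a)))
    rows_b = list(csv.DictReader(io.StringIO(csv_b)))
    # Union of headers, preserving first-seen order.
    fields = list(dict.fromkeys(list(rows_a[0].keys()) + list(rows_b[0].keys())))
    out = io.StringIO()
    # restval="" fills columns a row does not have.
    writer = csv.DictWriter(out, fieldnames=fields, restval="")
    writer.writeheader()
    writer.writerows(rows_a + rows_b)
    return out.getvalue()

merged = merge_csvs("M,N\n1,1024\n", "M,K\n16,7168\n")
# Header becomes M,N,K; each row leaves its missing column empty.
```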

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| op_tests/tuning_tests/test_tuner_infra.py | Updates config-merge unit test expectations for mismatched CSV columns. |
| op_tests/tuning_tests/test_tune_pipeline.py | Adds lock-file cleanup, consolidates compare pipeline test, adjusts args/timeouts/shapes. |
| op_tests/tuning_tests/README.md | Documents running individual tuner tests and clarifies pipeline coverage. |


Comment on lines +56 to +72

```python
def _cleanup_stale_lock_files():
    """Remove stale FileBaton lock files left by killed subprocesses."""
    build_dir = os.path.join(AITER_ROOT, "aiter", "jit", "build")
    if not os.path.isdir(build_dir):
        return
    lock_patterns = [
        os.path.join(build_dir, "lock_*"),
        os.path.join(build_dir, "*", "build", "lock"),
        os.path.join(build_dir, "lock_3rdparty_*"),
    ]
    for pattern in lock_patterns:
        for lock_file in glob.glob(pattern):
            try:
                os.remove(lock_file)
                print(f"Cleaned up stale lock file: {lock_file}", flush=True)
            except OSError:
                pass
```
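A self-contained sketch of the same pattern-based cleanup, runnable against a throwaway directory. The `root` parameter is added here only for the demo; the test file itself uses the module-level `AITER_ROOT`.

```python
import glob
import os
import tempfile

def cleanup_stale_lock_files(root):
    """Remove stale lock files under <root>/aiter/jit/build (demo version)."""
    build_dir = os.path.join(root, "aiter", "jit", "build")
    if not os.path.isdir(build_dir):
        return
    for pattern in (
        os.path.join(build_dir, "lock_*"),
        os.path.join(build_dir, "*", "build", "lock"),
        os.path.join(build_dir, "lock_3rdparty_*"),
    ):
        for lock_file in glob.glob(pattern):
            try:
                os.remove(lock_file)
            except OSError:
                pass  # already gone, or removable only by its owner

# Demo: fabricate a build tree with stale locks, then scrub it.
root = tempfile.mkdtemp()
module_build = os.path.join(root, "aiter", "jit", "build", "module_x", "build")
os.makedirs(module_build)
stale = [
    os.path.join(root, "aiter", "jit", "build", "lock_module_x"),
    os.path.join(module_build, "lock"),
]
for path in stale:
    open(path, "w").close()
cleanup_stale_lock_files(root)
```

Running it before and after each pipeline test keeps a killed tuner subprocess from wedging the next JIT build on a dead lock.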
Comment on lines +58 to +65

```python
    build_dir = os.path.join(AITER_ROOT, "aiter", "jit", "build")
    if not os.path.isdir(build_dir):
        return
    lock_patterns = [
        os.path.join(build_dir, "lock_*"),
        os.path.join(build_dir, "*", "build", "lock"),
        os.path.join(build_dir, "lock_3rdparty_*"),
    ]
```
```diff
-        "shapes": [(1, 1024, 512), (16, 1536, 7168)],
+        "shapes": [(1, 1024, 512)],
         "keys": ["cu_num", "M", "N", "K"],
         "timeout": 3600,
```
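Piecing together the fields the tests reference (`script`, `header`, `shapes`, `keys`, `timeout`), a `CONFIGS` entry plausibly has the shape below. The concrete `script` and `header` values are placeholders, not taken from the repository.

```python
# Hypothetical reconstruction of one CONFIGS entry; the field names come from
# the test code in this PR, but the script path and header are illustrative.
CONFIGS = {
    "a8w8_blockscale": {
        "script": "path/to/tune_script.py",   # placeholder path
        "header": ["cu_num", "M", "N", "K"],  # placeholder CSV header
        "shapes": [(1, 1024, 512)],
        "keys": ["cu_num", "M", "N", "K"],
        "timeout": 3600,
    },
}
```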
Comment on lines +549 to +573

```python
    def test_compare_and_update(self):
        """--compare --update_improved: tune, compare, update tuned CSV."""
        cfg = self.CONFIGS["a8w8_blockscale"]
        timeout = cfg.get("timeout", 900)
        tmp = tempfile.mkdtemp()
        untuned = os.path.join(tmp, "untuned.csv")
        tuned = os.path.join(tmp, "tuned.csv")
        _write_csv(untuned, cfg["header"], cfg["shapes"])

        result = _run_tuner(
            cfg["script"],
            untuned,
            tuned,
            extra_args=[
                "--compare",
                "--update_improved",
                "--libtype",
                "ck",
                "--batch",
                "1",
            ],
            timeout=timeout,
            mp=1,
        )
```

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants