[MIOpen] Dapper validation phase#8879

Closed
randyspauldingamd wants to merge 63 commits into ROCm:develop from randyspauldingamd:u/rjs/dpr_validate

Conversation

randyspauldingamd commented Jun 26, 2026

Contributor

Motivation

MIOpen has adapted the Dependency Parser (Dapper) to run in CI. This will reduce CI test time dramatically for small PRs. Dapper uses a naive mapping from changed files to test executables, so the time reduction depends on how many test files consume the modified files. Larger PRs, and PRs that touch core files, will see a smaller reduction.

Technical Details

This PR enables Dapper selection and filter generation in MIOpen CI, but the generated filter is not yet used. Instead, the PR adds a ctest, miopen_gtest_sharded_dapper, that interrogates the test results from all shards to determine Dapper's efficacy. It also includes a validation stage which ensures that Dapper would have caught any test failures caused by the changes. If any expected tests did not run, miopen_gtest_sharded_dapper fails. Note that Dapper operates in a subtractive-only manner: it ignores tests that either were not enabled by the user or were disabled via the base gtest_filter.
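As a rough illustration of the subtractive-only behavior and the "expected tests did not run" check described above, here is a minimal sketch using plain set operations. It is not taken from the PR: `TestSet`, `SubtractiveSelection`, and `MissingTests` are hypothetical names, and the real analysis in miopen_gtest_sharded_dapper may work quite differently.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <set>
#include <string>

// Hypothetical type: a set of fully qualified gtest names.
using TestSet = std::set<std::string>;

// Subtractive-only selection (assumed semantics): Dapper can only narrow the
// set already enabled by the base gtest_filter; it never adds tests to it.
inline TestSet SubtractiveSelection(const TestSet& enabled_by_base,
                                    const TestSet& dapper_selected)
{
    TestSet out;
    std::set_intersection(enabled_by_base.begin(), enabled_by_base.end(),
                          dapper_selected.begin(), dapper_selected.end(),
                          std::inserter(out, out.end()));
    // The result can never be larger than the base set.
    assert(out.size() <= enabled_by_base.size());
    return out;
}

// Expected tests that were never executed. In this sketch, a non-empty
// result corresponds to the validation check failing.
inline TestSet MissingTests(const TestSet& expected, const TestSet& executed)
{
    TestSet out;
    std::set_difference(expected.begin(), expected.end(),
                        executed.begin(), executed.end(),
                        std::inserter(out, out.end()));
    return out;
}
```

In this sketch, a test that Dapper selects but the base filter disabled simply drops out of the intersection, which mirrors the "ignores tests that were not enabled" wording.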

While this PR contains some groundwork for TheRock CI, the intent is to have no effect on TheRock at this time.

Test Plan

Ran a full MIOpen-CI run on gfx90A and gfx950.

Test Result

PASS on gfx90A: this run had 3 changes:

  • 1 injected failure that caused test failures. It was caught in Overall Test Result and Dapper Test Result, denied Minimal Compliance, and caused Dapper Compliance to FAIL.
  • 1 no-op change that did not cause a failure (no output in the summary).
  • 1 injected failure that was intentionally left out of the filter and was therefore not executed. This denied Minimal Compliance and caused Dapper Compliance to FAIL.

The test itself PASSED since all failures that were tested were caught.

25: ========== Dapper Gtest Sharded Analysis ========================
25: Total Test Time                            : 26715.087s
25: Dapper Time                                : 337.585s
25: Time Dapper would have saved               : 26377.502s (98.736%)
25: Dapper fixtures not in category filter     : 1
25: Dapper fixtures negated by category filter : 0
25: Overall Test Result                        : FAIL
25: Dapper Test Result                         : FAIL
25: Covered dapper patterns (forward)          : 9
25: Covered dapper patterns (reverse)          : 9
25: Minimal Compliance Achieved?               : False
25: Dapper Compliance                          : FAIL
25: Validation Result                          : VALID
25/25 Test #25: miopen_gtest_sharded_dapper ......   Passed    1.56 sec
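The "Time Dapper would have saved" line in the summary above is consistent with total test time minus Dapper time, expressed as a fraction of the total. A minimal sketch of that arithmetic follows; `DapperSavings` and `ComputeSavings` are hypothetical names and are not part of the PR.

```cpp
#include <cassert>

// Hypothetical helper type: seconds saved and the same value as a percentage.
struct DapperSavings
{
    double seconds;
    double percent;
};

// Savings = total time minus the time the Dapper-selected tests took,
// reported both in seconds and as a percentage of the total.
inline DapperSavings ComputeSavings(double total_s, double dapper_s)
{
    assert(total_s > 0.0 && dapper_s >= 0.0);
    const double saved = total_s - dapper_s;
    return {saved, 100.0 * saved / total_s};
}
```

Calling `ComputeSavings(26715.087, 337.585)` reproduces the 26377.502 s and roughly 98.736% reported for the gfx90A run.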

gfx950: TBD

Submission Checklist

randyspauldingamd and others added 30 commits May 21, 2026 05:40
CpuActivationPackedMultiThread sized per-thread work with two stacked
ceilings (ceil(num_items / 16M), then ceil(num_jobs / num_threads) * 16M),
so chunk_size * num_threads overshot num_items by whole chunks. Trailing
threads received offsets and ends past the buffer -> heap OOB read ->
SIGSEGV. The overshoot only occurs when the thread count does not divide
the work evenly, so it was host core-count dependent and reproduced on
some machines but not others.

Recompute the per-thread item counts with a proportional split: thread t
processes [num_items * t / num_threads, num_items * (t + 1) / num_threads).
Consecutive threads reuse the same boundary expression, so the ranges are
contiguous, non-overlapping, and the final thread ends exactly at
num_items. This fixes the item-count calculation directly and removes the
separate remainder branch and every clamp -- there is no std::min and no
special last-thread case. num_threads <= num_jobs <= ceil(num_items / 16M)
guarantees num_items >= num_threads, so every launched thread has a
non-empty range.

Return std::size_t from CpuActivationGetNumThreads as well: std::min
already yields std::size_t, so the previous unsigned return type silently
narrowed the value. The single caller deduces the type with auto and uses
it only in std::size_t arithmetic, so widening the return type removes the
narrowing without any other change.

Test-only; library and kernel code are untouched; no performance impact.
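The difference between the stacked-ceiling sizing and the proportional split described in the commit message above can be sketched as follows. This is a reconstruction from the message text, not the actual MIOpen code: `StackedCeilingSpan` and `ThreadRange` are hypothetical names, the 16M constant is replaced by a `granule` parameter, and the sketch assumes `num_items * num_threads` does not overflow `std::size_t`.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Old scheme as the commit message describes it: two stacked ceilings.
// Returns chunk_size * num_threads, the total span the threads would cover.
inline std::size_t StackedCeilingSpan(std::size_t num_items,
                                      std::size_t num_threads,
                                      std::size_t granule)
{
    assert(num_threads > 0 && granule > 0);
    const std::size_t num_jobs   = (num_items + granule - 1) / granule;
    const std::size_t chunk_size = (num_jobs + num_threads - 1) / num_threads * granule;
    return chunk_size * num_threads;
}

// Proportional split: thread t gets the half-open range
// [num_items * t / num_threads, num_items * (t + 1) / num_threads).
// Neighbouring threads share the same boundary expression, so the ranges
// are contiguous and the last one ends exactly at num_items.
inline std::pair<std::size_t, std::size_t>
ThreadRange(std::size_t num_items, std::size_t num_threads, std::size_t t)
{
    assert(num_threads > 0 && t < num_threads);
    const std::size_t begin = num_items * t / num_threads;
    const std::size_t end   = num_items * (t + 1) / num_threads;
    return {begin, end};
}
```

For example, with 50 items, a granule of 10, and 4 threads, the stacked ceilings cover 80 items (an overshoot of 30), while `ThreadRange` yields [0, 12), [12, 25), [25, 37), and [37, 50).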

therock-pr-bot Bot commented Jun 26, 2026


❌ PR Check — Action Required

| Check | Status | Details |
| --- | --- | --- |
| 🌿 Branch Name | ❌ Fail | Branch name does not match allowed patterns.<br>Branch: `u/rjs/dpr_validate`<br>Allowed patterns:<br>- `^users\/[A-Za-z0-9][A-Za-z0-9\-]*\/.+`<br>- `^shared\/.+`<br>- `^[A-Za-z0-9][A-Za-z0-9\-_]*$`<br>- `^dependabot\/.+`<br>- `^revert-[0-9]+-.+` |
| 📝 PR Title/Description | ❌ Fail | Error: Title does not follow Conventional Commits style.<br>Expected: start with a valid type (feat, fix, docs, …).<br>Desired format: `type(optional-scope): short description` |
| Forbidden Files | ✅ Pass | |
| 🧪 Unit Test | ✅ Pass | |
| 🔎 pre-commit | ✅ Pass | |
| 🚫 Draft PR | 🔜 To Be Enabled | |
| 🚩 Feature Flag | 🔜 To Be Enabled | |
| 📊 Code Coverage | 🔜 To Be Enabled | |

⚠️ 2 policy check(s) failed. Please address the issues above before this PR can be reviewed.

🚫 Please fix the failed policies

  • ❌ Branch Name
  • ❌ PR Title/Description

The Not ready to Review label was added to this PR. Once all policies pass, the label is removed automatically.

📖 Need help? See the Policy FAQ for details on every check and how to fix failures.

@therock-pr-bot


🚫 Please fix the failed policies before requesting reviews.

The following policy checks failed:

  • ❌ Branch Name
  • ❌ PR Title/Description

The Not ready to Review label has been added to this PR.
Once all policies pass, the label will be removed automatically.
