Commit d43a48b
Fix ASan OOM in QDQ Gemm transformer tests (#28797)
# PR: Fix ASan OOM in QDQ Gemm transformer tests
## Description
PR #28131 ("Reject QDQ Gemm→QGemm fusion when alpha != 1 with bias") added `alpha_not_one`
coverage to the QDQ Gemm fusion tests. This multiplied the number of `TransformerTester`
session builds inside the already-large `Gemm_U8U8U8` test matrix and pushed the
AddressSanitizer (ASan) build of `onnxruntime_test_all` over its allocator limit, causing the
`windows_x64_asan` CI to fail with `AddressSanitizer: Out of memory. The process has exhausted
8192MB for size class 8192`. This PR isolates the `alpha != 1` coverage into small, dedicated
tests so the peak memory of any single test is reduced.
## Summary of Changes
| File | Change |
|------|--------|
| `onnxruntime/test/optimizer/qdq_transformer_test.cc` | Added an `opset_version` parameter to `QDQTransformerGemmTests` (default `0` = run opsets 12/18/19); replaced the three hardcoded `TransformerTester` calls with a loop over the selected opset(s); removed the inline `alpha_not_one` block from the templated `QDQTransformerGemmTests()`; added a dedicated `TEST(QDQTransformerTests, Gemm_AlphaNotOne_U8U8U8)` that runs only uint8/uint8/uint8 at opset 19. |
| `onnxruntime/test/optimizer/qdq_transformer_fastmath_test.cc` | Same refactor for the fastmath variant; added `TEST(QDQTransformerTests, Gemm_AlphaNotOne_U8U8U8_FastMath)` running only uint8/uint8/uint8 at opset 19. |
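The opset-selection refactor described in the table can be sketched roughly as follows. This is a minimal illustration rather than the actual test code: `SessionLog`, `SelectOpsets`, and `RunGemmCases` are hypothetical names, and the `push_back` stands in for a real `TransformerTester` call, whose arguments are not shown in this description.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in that records which opsets would trigger a
// TransformerTester session build.
struct SessionLog {
  std::vector<int> opsets_built;
};

// 0 means "run opsets 12, 18 and 19"; any other value selects that opset only.
inline std::vector<int> SelectOpsets(int opset_version) {
  if (opset_version == 0) return {12, 18, 19};
  return {opset_version};
}

// Sketch of the refactored helper shape: one loop replaces three hardcoded
// calls. The template parameters mirror the input/weight/output quantization
// types and are unused in this illustration.
template <typename InputT, typename WeightT, typename OutputT>
void RunGemmCases(SessionLog& log, int opset_version = 0) {
  for (int opset : SelectOpsets(opset_version)) {
    assert(opset > 0);
    log.opsets_built.push_back(opset);  // stands in for one session build
  }
}
```

Under these assumptions, a dedicated test would call `RunGemmCases<std::uint8_t, std::uint8_t, std::uint8_t>(log, 19)` and build a single session per case, while existing callers that omit the argument keep covering all three opsets.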
The net effect is that the incremental `alpha_not_one` work added by #28131 drops from 24
session builds (4 alpha variants × 3 opsets, in each of the regular and fastmath files) to 8
(4 alpha variants × 1 opset, in each file). That work is also no longer part of the large
`Gemm_U8U8U8` matrix test, which directly lowers the peak memory consumed in a single test.
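The session-build arithmetic can be checked at compile time. The counts below come from the description above; the constant and function names are only illustrative.

```cpp
#include <cassert>

// Counts taken from the description: 4 alpha variants and two test files
// (regular and fastmath).
constexpr int kAlphaVariants = 4;
constexpr int kFiles = 2;

// Alpha-related session builds for a given number of opsets per file.
constexpr int AlphaSessionBuilds(int opsets_per_file) {
  return kAlphaVariants * opsets_per_file * kFiles;
}

static_assert(AlphaSessionBuilds(3) == 24, "before: 4 x 3 opsets x 2 files");
static_assert(AlphaSessionBuilds(1) == 8, "after: 4 x 1 opset x 2 files");
```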
## Testing
- Build with `--enable_address_sanitizer` (the `windows_x64_asan` configuration) and run
  `onnxruntime_test_all`; confirm `QDQTransformerTests.Gemm_AlphaNotOne_U8U8U8`,
  `QDQTransformerTests.Gemm_AlphaNotOne_U8U8U8_FastMath`, and `QDQTransformerTests.Gemm_U8U8U8`
  pass and the suite no longer hits the ASan OOM.
- Fusion behavior is unchanged: the same `alpha != 1` rejection logic is still exercised, just
  with a narrower datatype/opset footprint.
## Motivation and Context
The ASan failure is the sanitizer's internal allocator size-class limit (8 GB for
`size class 8192`), not a runner RAM cap that can simply be raised. Loosening it via
`ASAN_OPTIONS` quarantine tuning would weaken the sanitizer's bug-detection guarantees, so the
fix targets the test's memory footprint instead.
### Options considered
1. **`--test_parallel` (reduce CTest concurrency).** Lower the parallelism in the ASan workflow
   (e.g., `--test_parallel 1`) so fewer test binaries run concurrently. This only addresses
   cumulative/overlapping process memory; it does **not** reduce the peak memory of a single
   test, and it slows the CI down for every run. Rejected as a blunt, non-durable workaround.
2. **Shard the ASan tests.** Split `onnxruntime_test_all` into N gtest shards
   (`GTEST_TOTAL_SHARDS` / `GTEST_SHARD_INDEX`) so the ASan allocator resets between shards.
   This helps with cumulative growth across the whole binary, but it still does **not** reduce
   the peak memory of any individual test: if one test alone approaches the limit, sharding
   the binary will not help. Rejected for the same root-cause reason.
3. **Break the test into smaller tests (chosen).** Isolate the `alpha != 1` coverage into
   dedicated tests that run a single datatype (uint8/uint8/uint8) at a single opset (19), and
   remove the alpha cases from the large `Gemm_U8U8U8` matrix. This reduces the work done in
   the heaviest single test and addresses the peak-memory problem at its source while keeping
   the same fusion behavior under test.
Reference: PR #28131 (merge commit `585273033e`).
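The reasoning behind rejecting sharding can be illustrated with a toy model. Everything here is an assumption made for illustration: the helper names are hypothetical, per-test peaks are arbitrary units rather than measurements, and the round-robin split is not a claim about how GoogleTest assigns tests to shards.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Largest single-test peak in a set of tests (arbitrary units).
inline int MaxSingleTestPeak(const std::vector<int>& peaks) {
  return peaks.empty() ? 0 : *std::max_element(peaks.begin(), peaks.end());
}

// Illustrative round-robin split of tests across shards.
inline std::vector<std::vector<int>> Shard(const std::vector<int>& peaks,
                                           std::size_t shards) {
  assert(shards > 0);
  std::vector<std::vector<int>> out(shards);
  for (std::size_t i = 0; i < peaks.size(); ++i) {
    out[i % shards].push_back(peaks[i]);
  }
  return out;
}

// Worst single-test peak seen by any shard: each test still runs whole
// inside exactly one shard, so sharding cannot shrink it.
inline int WorstPeakAcrossShards(const std::vector<std::vector<int>>& shards) {
  int worst = 0;
  for (const auto& s : shards) worst = std::max(worst, MaxSingleTestPeak(s));
  return worst;
}
```

In this model, changing the shard count never changes the result of `WorstPeakAcrossShards`, whereas replacing one heavy test with several smaller ones does lower it, which is the effect option 3 relies on.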
## Checklist
- [x] Tests added/updated
- [ ] Documentation updated (if applicable)
- [x] No breaking changes (test-only change)
- [ ] CI passes

1 parent b3e1a9e commit d43a48b
2 files changed
Lines changed: 43 additions & 67 deletions
Lines changed: 22 additions & 35 deletions