[Issue] Windows: rocBLAS full-suite tests crash (SEH divide-by-zero/access violation) and end with heap corruption (0xc0000374) #7149

@chiranjeevipattigidi

Summary

The rocBLAS client full-suite tests crash on Windows in the Multi-Arch CI run for PR #5072. The log shows repeated SEH divide-by-zero exceptions in the AXPY batched tests, followed by TRSV strided-batched failures (an access violation plus a HIP invalid-value error). The test process then terminates with an exit code that indicates heap corruption.

Failing CI link

Tests Failed

  1. Multiple AXPY batched/strided-batched parameterized tests crash with:
  • unknown file: error: SEH exception with code 0xc0000094 thrown in the test body.

Examples from the log:

  • _ /axpy_batched_ex ... nightly_blas1_batched_with_alpha_..._400000_...
  • _ /axpy_strided_batched_ex ... nightly_blas1_strided_batched_with_alpha_..._300001_...
  • also some pre_checkin ... blas1_graph_check ... variants

(SEH 0xc0000094 = integer divide-by-zero on Windows.)

  2. TRSV strided-batched tests hit access violations, followed by a rocBLAS/HIP error sequence:
  • unknown file: error: SEH exception with code 0xc0000005 thrown in the test body.
  • .../rocblas/clients/common/gtest_helpers.cpp(217): error: Failed
  • Received uncaught exception: rocblas_status_internal_error
  • hipGetLastError at end of test function call: ...
  • rocBLAS error from hip error code: 'hipErrorInvalidValue':1

(SEH 0xc0000005 = access violation on Windows.)

  3. The overall CTest target fails with process exit code:
  • rocblas-test_full_suite (Exit code 0xc0000374)

(0xc0000374 = STATUS_HEAP_CORRUPTION on Windows, i.e. the process was terminated after a corrupted heap was detected.)
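
The three codes above are standard Windows NTSTATUS values. As a triage aid, a minimal Python sketch that maps them to their symbolic names might look like the following. The `describe_status` helper and its lookup table are illustrative only (they are not part of rocBLAS or TheRock), and the table covers just the codes seen in this log.

```python
# Illustrative triage helper; not part of rocBLAS or TheRock.
# Only the NTSTATUS codes that appear in this issue are listed.
NTSTATUS_NAMES = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",
    0xC0000094: "STATUS_INTEGER_DIVIDE_BY_ZERO",
    0xC0000374: "STATUS_HEAP_CORRUPTION",
}


def describe_status(code):
    """Return 'hex = NAME' for an exit/exception code given as int or hex string."""
    value = int(code, 16) if isinstance(code, str) else code
    # Some tools report the exit code as a signed 32-bit integer,
    # so normalize it to its unsigned 32-bit form first.
    value &= 0xFFFFFFFF
    name = NTSTATUS_NAMES.get(value, "unknown status")
    return f"0x{value:08x} = {name}"


if __name__ == "__main__":
    for code in ("0xc0000094", "0xc0000005", "0xc0000374"):
        print(describe_status(code))
```

The masking step matters when a CI wrapper prints the exit code as a negative decimal number; the helper then still resolves it to the same name.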

Environment

  • Repo: ROCm/TheRock
  • Workflow: "Multi-Arch CI"
  • Target Archs: gfx110x-all
  • Platform: Windows runner (SEH + Windows-style paths in log)
  • Commit/ref: bb544cdc1731adb5a1d16afc42027242bf1cd9eb

Log excerpts

  • unknown file: error: SEH exception with code 0xc0000094 thrown in the test body.
  • unknown file: error: SEH exception with code 0xc0000005 thrown in the test body.
  • .../rocblas/clients/common/gtest_helpers.cpp(217): error: Failed
  • Received uncaught exception: rocblas_status_internal_error
  • rocBLAS error from hip error code: 'hipErrorInvalidValue':1
  • rocblas-test_full_suite (Exit code 0xc0000374)
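
To see how the failures are distributed across exception types, log lines like the ones above can be tallied by SEH code. The sketch below assumes the log uses the GoogleTest message format quoted in this issue; `count_seh_codes` is a hypothetical helper, and the sample lines are copied from the excerpts rather than taken from the full CI log.

```python
import re
from collections import Counter

# Matches the GoogleTest message quoted in this issue, e.g.
# "SEH exception with code 0xc0000094 thrown in the test body."
SEH_PATTERN = re.compile(r"SEH exception with code (0x[0-9a-fA-F]+)")


def count_seh_codes(lines):
    """Count occurrences of each SEH exception code in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        match = SEH_PATTERN.search(line)
        if match:
            counts[match.group(1).lower()] += 1
    return counts


if __name__ == "__main__":
    # Sample lines copied from the excerpts in this issue, not a full log.
    sample = [
        "unknown file: error: SEH exception with code 0xc0000094 thrown in the test body.",
        "unknown file: error: SEH exception with code 0xc0000005 thrown in the test body.",
        "Received uncaught exception: rocblas_status_internal_error",
    ]
    for code, n in count_seh_codes(sample).most_common():
        print(code, n)
```

Run against the full log, a tally like this would show whether the divide-by-zero crashes are confined to the AXPY variants or appear elsewhere too.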

Impact:

Blocks promotion of rocm-systems (ROCm/TheRock#5072) and rocm-libraries (ROCm/TheRock#5073).
