Add missing public APIs and replace miopen::deref() in driver#7255

Draft
BradPepersAMD wants to merge 2 commits into users/brpepers/fix_symbol_leaks from
users/brpepers/fix_symbol_leaks_2

Conversation

@BradPepersAMD
Contributor

Summary

Add 8 new public API functions to miopen.h and replace 227 of 264 miopen::deref() calls in the driver with public API equivalents. Create shared tensor utility wrappers in common_utils/tensor_utils.hpp.

Base branch: users/brpepers/fix_symbol_leaks (PR #7254)

New public API functions

Tensor descriptor

  • miopenGetTensorLayout() — get tensor layout enum
  • miopenGetTensorElementSpace() — get total element space including stride gaps
  • miopenIsTensorPacked() — query if tensor is contiguous in memory
  • miopenGetTensorVectorLength() — get vector length for vectorized layouts
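To make the difference between element count and element space concrete, here is a minimal sketch in plain C++ that does not call MIOpen. The helper names (`ElementSize`, `ElementSpace`, `IsPacked`) are illustrative stand-ins, not the functions added by this PR. The formula assumes every length is at least 1 and ignores vectorized layouts, so treat it as an approximation of what the new getters report rather than their implementation.

```cpp
#include <cstddef>
#include <functional>
#include <numeric>
#include <vector>

// Number of logical elements: the product of all dimension lengths.
std::size_t ElementSize(const std::vector<std::size_t>& lens)
{
    return std::accumulate(
        lens.begin(), lens.end(), std::size_t{1}, std::multiplies<std::size_t>());
}

// Element space: offset of the last addressable element plus one.
// Stride gaps (padding between rows, planes, ...) make this larger than
// ElementSize. Assumes lens.size() == strides.size() and every length >= 1.
std::size_t ElementSpace(const std::vector<std::size_t>& lens,
                         const std::vector<std::size_t>& strides)
{
    std::size_t last_offset = 0;
    for(std::size_t i = 0; i < lens.size(); ++i)
        last_offset += (lens[i] - 1) * strides[i];
    return last_offset + 1;
}

// A tensor is treated as packed here when it has no stride gaps, that is,
// when the element space equals the element count.
bool IsPacked(const std::vector<std::size_t>& lens,
              const std::vector<std::size_t>& strides)
{
    return ElementSize(lens) == ElementSpace(lens, strides);
}
```

For a 2x3x4x5 tensor with strides {60, 20, 5, 1} both quantities are 120 and the tensor is packed. Widening the outermost stride to 64 raises the element space to 124 while the element count stays at 120, so the tensor is no longer packed.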

Convolution/Pooling

  • miopenGetConvolutionPaddingMode() — padding mode getter (was missing)
  • miopenGetPoolingPaddingMode() — padding mode getter (was missing)

Debug control

  • miopenSetDebugFlag() / miopenGetDebugFlag() with miopenDebugFlag_t enum
    • miopenDebugLoggingQuiet, miopenDebugFindEnforceDisable, miopenDebugIsWarmupOngoing, miopenDebugAlwaysEnableConvDirectNaive
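The general shape of an enum-keyed setter/getter pair that fronts internal debug globals can be sketched as follows. This is a self-contained mock and not the code in this PR: `DebugFlag`, `Status`, `SetDebugFlag`, and `GetDebugFlag` are hypothetical names, the enumerator values are assumptions, and the real signatures in miopen.h may differ.

```cpp
#include <array>
#include <cstddef>

// Hypothetical stand-in for miopenDebugFlag_t. The enumerator names mirror
// the PR description; the numeric values and the Count sentinel are assumptions.
enum class DebugFlag : std::size_t
{
    LoggingQuiet = 0,
    FindEnforceDisable,
    IsWarmupOngoing,
    AlwaysEnableConvDirectNaive,
    Count // sentinel, not a real flag
};

// Hypothetical status code, loosely modelled on a C-style API return value.
enum class Status
{
    Success,
    BadParm
};

namespace detail {

constexpr std::size_t kFlagCount = static_cast<std::size_t>(DebugFlag::Count);

// Storage that would otherwise be library-internal globals.
// Not thread-safe; a real implementation may need atomics.
inline std::array<bool, kFlagCount>& Flags()
{
    static std::array<bool, kFlagCount> flags{};
    return flags;
}

inline bool IsValid(DebugFlag flag)
{
    return static_cast<std::size_t>(flag) < kFlagCount;
}

} // namespace detail

// Sets one flag; rejects out-of-range enum values.
inline Status SetDebugFlag(DebugFlag flag, bool value)
{
    if(!detail::IsValid(flag))
        return Status::BadParm;
    detail::Flags()[static_cast<std::size_t>(flag)] = value;
    return Status::Success;
}

// Reads one flag through an out-parameter; rejects null and out-of-range input.
inline Status GetDebugFlag(DebugFlag flag, bool* value)
{
    if(value == nullptr || !detail::IsValid(flag))
        return Status::BadParm;
    *value = detail::Flags()[static_cast<std::size_t>(flag)];
    return Status::Success;
}
```

The point of the pattern is that callers such as the driver only see the enum and the two functions, so the globals behind them no longer have to be exported symbols.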

Shared convenience wrappers

common_utils/tensor_utils.hpp provides tensor_utils:: namespace wrappers:

  • GetLengths, GetStrides, GetType, GetNumDims, GetElementSize, GetLayout, GetElementSpace, IsPacked, GetVectorLength
  • GetTypeSize, Tien<N>, GetNCDHW, LogRange

Used by driver, miopen_utils, and tests.

Driver internal dependency reduction

| Metric | Before | After |
| --- | --- | --- |
| miopen::deref() calls | 264 | 37 |
| Internal miopen/ includes | 59 | 34 |

Remaining 37 deref calls (cannot fix without deeper refactoring)

  • 19 in get_inner_expanded_tv — needs tensor_view refactoring
  • 13 in TensorDescriptor value-type patterns (mloPoolingHost, mloConvHost)
  • 3 in .desc member assignments (CBAInferFusion)
  • 2 in CallGemm — no public GEMM API exists

Test plan

  • Build MIOpen library
  • Build MIOpenDriver
  • Build and run tests
  • Run driver smoke test: MIOpenDriver conv -n 1 -c 1 -H 32 -W 32 -k 1 -y 3 -x 3 -V 1

🤖 Generated with Claude Code

@BradPepersAMD BradPepersAMD requested a review from a team as a code owner May 9, 2026 16:07
@BradPepersAMD BradPepersAMD marked this pull request as draft May 9, 2026 16:09
@BradPepersAMD BradPepersAMD force-pushed the users/brpepers/fix_symbol_leaks branch from 1864233 to c567a50 Compare May 11, 2026 00:43
BradPepersAMD and others added 2 commits May 10, 2026 19:18
Add 8 new public API functions to miopen.h:
- miopenGetTensorLayout, miopenGetTensorElementSpace,
  miopenIsTensorPacked, miopenGetTensorVectorLength
- miopenGetConvolutionPaddingMode, miopenGetPoolingPaddingMode
- miopenSetDebugFlag / miopenGetDebugFlag with miopenDebugFlag_t enum

Create common_utils/tensor_utils.hpp with convenience wrappers around
the public tensor descriptor API (GetLengths, GetStrides, GetType,
GetNumDims, Tien, GetNCDHW, LogRange, GetTypeSize, etc.).

Replace 227 of 264 miopen::deref() calls in driver with public API:
- Tensor descriptor queries → tensor_utils:: wrappers
- Convolution descriptor → miopenGetConvolutionNdDescriptor etc.
- Pooling descriptor → miopenSetNdPoolingDescriptor etc.
- Dropout descriptor → miopenGetDropoutDescriptor etc.
- Debug globals → miopenSetDebugFlag / miopenGetDebugFlag
- Handle::Finish() → hipStreamSynchronize(GetStream())
- HipEventPtr → local RAII wrapper in timer.hpp
- GetImage3dMaxWidth → hipDeviceGetAttribute

Remaining 37 miopen::deref calls require deeper refactoring:
- 19 in get_inner_expanded_tv (tensor view refactoring)
- 13 in TensorDescriptor value-type usage (mloPoolingHost, mloConvHost)
- 3 in .desc member assignments (CBAInferFusion)
- 2 in CallGemm (no public GEMM API)

Co-Authored-By: Claude Sonnet 4 <noreply@anthropic.com>

Add size_t-based tensor descriptor getter to the public API, matching
the existing miopenSetTensorDescriptorV2. This avoids silent truncation
of strides that exceed INT_MAX when using the int-based getter.
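A small sketch of the truncation risk described above, in plain C++ with no MIOpen calls. `PackedStrides` and `FitsInInt` are hypothetical helpers written for this illustration; they only show how quickly packed strides can exceed INT_MAX, not how the V2 getter is implemented.

```cpp
#include <climits>
#include <cstddef>
#include <vector>

// Packed (row-major) strides for the given lengths: the innermost stride is 1
// and each outer stride is the product of all inner lengths.
std::vector<std::size_t> PackedStrides(const std::vector<std::size_t>& lens)
{
    std::vector<std::size_t> strides(lens.size(), 1);
    for(std::size_t i = lens.size(); i-- > 1;)
        strides[i - 1] = strides[i] * lens[i];
    return strides;
}

// True when every value can be stored in an int without loss.
// An int-based getter would silently narrow any value that fails this check.
bool FitsInInt(const std::vector<std::size_t>& values)
{
    for(std::size_t v : values)
    {
        if(v > static_cast<std::size_t>(INT_MAX))
            return false;
    }
    return true;
}
```

For lengths {4, 65536, 32768} the outermost packed stride is 65536 * 32768 = 2^31, one more than INT_MAX on platforms with a 32-bit int, so `FitsInInt(PackedStrides({4, 65536, 32768}))` returns false.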

- Add miopenGetTensorDescriptorV2() to miopen.h (beta API) and
  implement in tensor_api.cpp
- Update tensor_utils::GetLengths/GetStrides to use V2 API and
  return vector<size_t>
- Fix tensor_layout.hpp: add missing #include <numeric>
- Fix tensor_holder.hpp: add size_t delegating constructor for
  layout+dims+strides
- Fix dropout_gpu_emulator.hpp: define MAX_PRNG_STATE locally
  after removing miopen/dropout.hpp internal include
- Add miopenHandle_t handle parameter to RNN/GRU/LSTM backward
  verify functions, needed by RunDropoutBackwardEmulator which
  now uses miopenGetDropoutDescriptor (public API) instead of
  miopen::deref()

Co-Authored-By: Claude Sonnet 4 <noreply@anthropic.com>
@BradPepersAMD BradPepersAMD force-pushed the users/brpepers/fix_symbol_leaks_2 branch from 27e6464 to 4221fbb Compare May 11, 2026 01:19