Skip to content

[hipDNN] ALMIOPEN-1943 Add pytest suite and CI workflow for Python bindings#7600

Draft
tvy-amd wants to merge 113 commits into
developfrom
users/tvy/python_binding_tests
Draft

[hipDNN] ALMIOPEN-1943 Add pytest suite and CI workflow for Python bindings#7600
tvy-amd wants to merge 113 commits into
developfrom
users/tvy/python_binding_tests

Conversation

@tvy-amd
Copy link
Copy Markdown
Contributor

@tvy-amd tvy-amd commented May 19, 2026

Summary

Add a pytest test suite (42 tests) for the hipDNN Python bindings and a GitHub Actions workflow to run them on GPU hardware with miopen-provider. Default HIPDNN_BUILD_PYTHON_BINDINGS to OFF so existing builds are unaffected.

Risk Assessment

Low risk. Adds new test infrastructure and a CI workflow without modifying production library code. The only product-code change is flipping a CMake option default from ON to OFF, which is additive — existing consumers who set the flag explicitly are unaffected.

Testing Summary

  • Full pytest suite run locally with miopen-provider on GPU hardware (42/42 passed).
  • Non-GPU subset validated without provider (6/6 passed).

Testing Checklist

  • Python binding tests (local) - pytest projects/hipdnn/python/hipdnn_frontend/test/ -v - ASICs: gfx942 - Status: Passed (42/42)
  • PR CI - GitHub PR checks - Status: Pending

Technical Changes

  • Add projects/hipdnn/python/hipdnn_frontend/test/ with tests covering tensor API, graph API, convolution (fprop/dgrad/wgrad), matmul, and device buffer operations.
  • Extract shared test helpers (build_conv_fprop_graph, execute_graph) into helpers.py since pytest does not support importing from conftest.py as a regular module.
  • Fix tests for updated Python binding API: Tensor.create() no longer auto-assigns UIDs, Graph.tensor(attrs) removed.
  • Register custom pytest markers (gpu, integration) in pyproject.toml.
  • Add .github/workflows/hipdnn-python-tests.yml that builds hipDNN + miopen-provider on azure-linux-scale-rocm and runs the full suite.
  • Default HIPDNN_BUILD_PYTHON_BINDINGS to OFF in CMakeLists.txt; CI workflow explicitly enables it.

tvy-amd and others added 30 commits May 12, 2026 11:50
…arking changes

Rewrite python/CMakeLists.txt to use scikit-build-core's SKBUILD variable
for wheel-specific install rules (following the stinkytofu pattern) instead
of unconditional dist-packages install. Regular cmake --install stages the
wheel via pip wheel in an else() branch, avoiding recursion.

Revert all dnn-benchmarking packaging changes to keep this branch focused
on hipdnn-frontend Python bindings only.

Co-Authored-By: Claude Opus 4 <noreply@anthropic.com>
Let scikit-build-core handle Python files via wheel.packages instead of
cmake install rules. Simplifies SKBUILD block to only install the .so.

Co-Authored-By: Claude Opus 4 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4 <noreply@anthropic.com>
Installs the built hipdnn-frontend wheel into a fresh venv and runs
17 smoke tests covering imports, enums, Tensor/Graph construction,
method chaining, serialization helpers, and engine ID management.

Co-Authored-By: Claude Opus 4 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4 <noreply@anthropic.com>
When built via TheRock superbuild, nanobind is provided as a third-party
dependency. For standalone builds, FetchContent fetches it from GitHub.
Also bumps FetchContent tag from v1.8.0 to v2.4.0.

Co-Authored-By: Claude Opus 4 <noreply@anthropic.com>
…ndings

Replace inline FetchContent for nanobind with the project's
hipdnn_add_dependency pattern so find-or-fetch behavior is consistent
with all other dependencies. Add tsl-robin-map as a separate dependency
since nanobind is configured with NB_USE_SUBMODULE_DEPS OFF.

Co-Authored-By: Claude Opus 4 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4 <noreply@anthropic.com>
Exclude Python, nanobind, and _deps headers from clang-tidy's header
filter to prevent false positives on third-party code. Rename binding
functions from snake_case to camelBack to match project naming
conventions, and add braces around bare if/else bodies.

Co-Authored-By: Claude Opus 4 <noreply@anthropic.com>
Add shared bindings.hpp header with forward declarations to resolve
misc-use-internal-linkage. Suppress unavoidable performance-no-int-to-ptr
for Python FFI pointer casts. Use const references for nb::object and
nb::bytes parameters. Add const to status/error variables and use auto
with reinterpret_cast.

Co-Authored-By: Claude Opus 4 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4 <noreply@anthropic.com>
Only the module development component is needed for nanobind extension
modules, not the full Development (which also requires Development.Embed
and libpython). This matches the pattern used by other Python bindings in
the repo (stinkytofu, origami, rocisa).

Co-Authored-By: Claude Opus 4 <noreply@anthropic.com>
…ding package locations

When pip wheel re-invokes CMake on python/CMakeLists.txt standalone,
hipdnn_add_dependency is not available. Include Dependencies.cmake as
a fallback. Forward nanobind_DIR, tsl-robin-map_DIR, and version
variables so find_package() picks up already-built packages from
TheRock without re-downloading.

Co-Authored-By: Claude Opus 4 <noreply@anthropic.com>
The version variables are unnecessary — Dependencies.cmake has matching
hardcoded defaults, and when packages are found via nanobind_DIR /
tsl-robin-map_DIR the version is not used.

Co-Authored-By: Claude Opus 4 <noreply@anthropic.com>
Add integration tests for conv_fprop, conv_dgrad, conv_wgrad, and matmul
that mirror the existing C++ samples. Add unit-style API tests for Tensor,
Graph, and DeviceBuffer. Convert the existing engine_id_overloads test to
pytest format. Configure pytest in pyproject.toml with gpu and integration
markers.

Co-Authored-By: Claude Opus 4 <noreply@anthropic.com>
…hardcoded compiler paths

Forward FETCHCONTENT_SOURCE_DIR for nanobind and tsl-robin-map so the
standalone wheel build reuses already-downloaded sources when the parent
used FetchContent (where _DIR variables are empty).

Remove hardcoded /opt/rocm/llvm/bin/clang compiler paths from
pyproject.toml — let the environment or -C flags control the compiler.

Co-Authored-By: Claude Opus 4 <noreply@anthropic.com>
The pip wheel subprocess could not find the HIP package because
hip_DIR was not forwarded from the parent CMake configuration.

Co-Authored-By: Claude Opus 4 <noreply@anthropic.com>
The pip wheel subprocess finds HIP via hip_DIR but hip-config-amd.cmake
calls find_dependency for AMDDeviceLibs, amd_comgr, and hsa-runtime64.
Without forwarding these _DIR variables the isolated build environment
cannot locate them, causing CMake configuration failure in TheRock CI.

Co-Authored-By: Claude Opus 4 <noreply@anthropic.com>
…nctions

Replace _determine_git_tag indirection with direct use of the VERSION
parameter for the git tag. The version is already passed from the caller
via hipdnn_add_dependency so the macro lookup is unnecessary.

Co-Authored-By: Claude Opus 4 <noreply@anthropic.com>
Replace the scikit-build-core + pip subprocess approach with a standalone
wheel assembly script, eliminating the need to forward CMake dependency
variables (hip_DIR, AMDDeviceLibs_DIR, etc.) through pip config settings.

The superbuild path now stages the compiled .so and calls
assemble_wheel.py (stdlib only) to produce the wheel directly. The
developer path uses hatchling with a custom build hook that either
compiles via CMake or packages a pre-built .so via HIPDNN_PREBUILT_SO.

Co-Authored-By: Claude Opus 4 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4 <noreply@anthropic.com>
tvy-amd and others added 5 commits May 27, 2026 15:06
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Reverts hipblas and miopen cpp/hpp diffs vs PR merge-base. Patch saved
in WIP/worktrees/rocm-libraries/python_binding_tests/cpphpp-pr-diffs-2.patch
for later reapply.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Reverts the codegen template; underlying generated .cpp files were
already reverted to PR merge-base in prior commit. Patch saved to
WIP/worktrees/rocm-libraries/python_binding_tests/descriptor-j2.patch.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Temporary CI hash bump for testing. Revert before merge.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…02fa9d9fe4d9ed

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@codecov-commenter
Copy link
Copy Markdown

codecov-commenter commented May 27, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

❌ Your project status has failed because the head coverage (77.83%) is below the target coverage (80.00%). You can increase the head coverage or adjust the target coverage.

Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #7600      +/-   ##
===========================================
- Coverage    62.06%   62.06%   -0.00%     
===========================================
  Files         2085     2085              
  Lines       357573   357573              
  Branches     54060    54060              
===========================================
- Hits        221914   221911       -3     
- Misses      116871   116874       +3     
  Partials     18788    18788              
Flag Coverage Δ *Carryforward flag
TensileLite 27.29% <ø> (ø) Carriedforward from ed4c922
hipBLAS 90.65% <ø> (ø) Carriedforward from ed4c922
hipBLASLt 41.27% <ø> (ø) Carriedforward from ed4c922
hipCUB 82.21% <ø> (ø) Carriedforward from ed4c922
hipDNN 86.61% <ø> (-0.01%) ⬇️
hipFFT 50.00% <ø> (ø) Carriedforward from ed4c922
hipRAND 76.12% <ø> (ø) Carriedforward from ed4c922
hipSOLVER 69.24% <ø> (ø) Carriedforward from ed4c922
hipSPARSE 85.42% <ø> (ø) Carriedforward from ed4c922
rocBLAS 48.09% <ø> (ø) Carriedforward from ed4c922
rocFFT 52.07% <ø> (ø) Carriedforward from ed4c922
rocRAND 57.04% <ø> (ø) Carriedforward from ed4c922
rocSOLVER 77.83% <ø> (ø) Carriedforward from ed4c922
rocSPARSE 72.68% <ø> (ø) Carriedforward from ed4c922

*This pull request uses carry forward flags. Click here to find out more.
see 2 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

tvy-amd added 2 commits May 27, 2026 18:34
…on bindings test

The hipdnn_python_bindings test script in TheRock expects rocm-libraries
at external-sources/rocm-libraries/ (treated as a submodule of TheRock).
Update the test-component workflow checkout path and the notify_teams
script reference to match the expected layout.
Reverts the external-sources/ workflow checkout path added in
da6cf75 and bumps the TheRock pin to pick up the script-side fix
(TheRock@649855473) that resolves the hipDNN pytest dir via
HIPDNN_PYTHON_TESTS_DIR / ROCM_LIBRARIES_DIR / sibling rocm-libraries
checkout instead of the hard-coded external-sources/ submodule path.
import numpy as np
import pytest

import hipdnn_frontend as hipdnn
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed because helpers call hipdnn_frontend for us.

tvy-amd and others added 3 commits May 28, 2026 10:38
tvy-amd and others added 8 commits May 28, 2026 11:57
Stage hipdnn_frontend/test/*.py into share/hipdnn/tests/python so
downstream packagers (e.g. TheRock) can include them in the hipDNN
test artifact. Keeps tests outside the package payload to avoid
accidentally testing the staged tree instead of the installed wheel.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…inding_tests

# Conflicts:
#	.github/workflows/therock-ci-linux.yml
#	.github/workflows/therock-ci-nightly.yml
#	.github/workflows/therock-ci-windows.yml
#	.github/workflows/therock-ci.yml
#	.github/workflows/therock-test-component.yml
#	.github/workflows/therock-test-packages.yml
TheRock CI uses its own pack_frontend_wheel.py for wheel packaging.
The local CMake build only stages files for TheRock to consume and
never invokes assemble_wheel.py.
tvy-amd and others added 4 commits May 28, 2026 18:06
Un-reverts the install rule the TheRock test runner depends on.
test_hipdnn_frontend_python.py expects the pytest directory at
build/share/hipdnn/tests/python; without it the test job fails with
FileNotFoundError.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Point all TheRock CI pins to 9c0ca0a4a, which fixes the frontend
wheel-pack script to use `uv build --wheel` instead of the invalid
`uv pip wheel`. Also strip trailing whitespace on the ref line that
failed the pre-commit trailing-whitespace hook.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants