1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -84,6 +84,7 @@ Also, that release drops support for Python 3.9, making Python 3.10 the minimum
* Resolved an issue with strides calculation in `dpnp.diagonal` to return correct values for empty diagonals [#2814](https://github.com/IntelPython/dpnp/pull/2814)
* Fixed test tolerance issues for float16 intermediate precision that became visible when testing against conda-forge's NumPy [#2828](https://github.com/IntelPython/dpnp/pull/2828)
* Ensured device aware dtype handling in `dpnp.identity` and `dpnp.gradient` [#2835](https://github.com/IntelPython/dpnp/pull/2835)
* Resolved a deadlock in `dpnp.linalg.qr` by releasing the GIL before the OneMKL `orgqr` call to prevent host task contention [#2850](https://github.com/IntelPython/dpnp/pull/2850)

### Security

9 changes: 9 additions & 0 deletions dpnp/backend/extensions/lapack/orgqr.cpp
@@ -87,8 +87,17 @@ static sycl::event orgqr_impl(sycl::queue &exec_q,

sycl::event orgqr_event;
try {
// Release GIL to avoid serialization of host task submissions
// to the same queue in OneMKL
py::gil_scoped_release lock{};

scratchpad = sycl::malloc_device<T>(scratchpad_size, exec_q);

// mkl_lapack::orgqr() is done through GPU-to-Host reverse offload:
// exec_q.submit([&](sycl::handler& cgh) {
// cgh.depends_on(depends);
// cgh.host_task([=]() { orgqr_host(...); });
// }).wait();
orgqr_event = mkl_lapack::orgqr(
exec_q,
m, // The number of rows in the matrix; (0 ≤ m).
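The locking pattern behind this change can be illustrated without SYCL or pybind11. The following is a minimal sketch under stated assumptions, not the dpnp implementation: `fake_gil`, `host_task`, and `call_blocking_routine` are hypothetical names, a `std::timed_mutex` stands in for the Python GIL, and a `std::async` task stands in for a host task that needs the GIL before it can finish. A timeout replaces the real deadlock so that the example terminates.

```cpp
#include <cassert>
#include <chrono>
#include <future>
#include <mutex>

using namespace std::chrono_literals;

// Hypothetical stand-in for the Python GIL (the real code uses pybind11).
static std::timed_mutex fake_gil;

// Hypothetical host task: it needs the "GIL" briefly before it can complete.
// Returns true if it obtained the lock within the timeout, false otherwise.
static bool host_task()
{
    std::unique_lock<std::timed_mutex> lk(fake_gil, 200ms);
    return lk.owns_lock();
}

// Caller that blocks until the host task has finished.
// With release_gil == false the caller keeps the lock while waiting, so the
// task cannot make progress (a real program would hang; here it times out).
static bool call_blocking_routine(bool release_gil)
{
    // The caller enters holding the "GIL", like a function called from Python.
    std::unique_lock<std::timed_mutex> gil(fake_gil);

    if (release_gil) {
        gil.unlock(); // plays the role of py::gil_scoped_release
    }

    std::future<bool> done = std::async(std::launch::async, host_task);
    bool ok = done.get(); // plays the role of waiting on the host task

    if (release_gil) {
        gil.lock(); // re-acquire before returning to "Python"
    }
    return ok;
}
```

Calling `call_blocking_routine(true)` lets the task acquire the lock and complete, while `call_blocking_routine(false)` reports that the task was starved for the whole timeout. In the sketch the lock is dropped only around the blocking wait, which mirrors the scope of `py::gil_scoped_release` in the `try` block above.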