Commit 5af1474

Merge branch 'master' into add-api-functions-to-usmarray
2 parents: 15664be + a48c4af

4 files changed

Lines changed: 14 additions & 3 deletions

.github/workflows/openssf-scorecard.yml

Lines changed: 1 addition & 1 deletion
@@ -72,6 +72,6 @@ jobs:

 # Upload the results to GitHub's code scanning dashboard.
 - name: "Upload to code-scanning"
-  uses: github/codeql-action/upload-sarif@c10b8064de6f491fea524254123dbe5e09572f13 # v4.35.1
+  uses: github/codeql-action/upload-sarif@95e58e9a2cdfd71adc6e0353d5c52f41a045d225 # v4.35.2
   with:
     sarif_file: results.sarif
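
Both the old and the new `uses:` lines pin the action to a full commit SHA and keep the release tag only as a trailing comment. As a rough illustration of that convention, here is a minimal Python sketch; `is_sha_pinned` and its regular expression are hypothetical helpers written for this note, not part of the repository or of any GitHub tooling.

```python
import re

# Hypothetical pattern: "<owner>/<repo>[/<path>]@<40 lowercase hex characters>".
PINNED_REF = re.compile(r"[\w.-]+/[\w./-]+@[0-9a-f]{40}")


def is_sha_pinned(uses_ref: str) -> bool:
    """Return True if a workflow `uses:` reference is pinned to a full commit SHA."""
    return PINNED_REF.fullmatch(uses_ref.strip()) is not None


# References taken from the diff above, plus a tag-only reference for contrast.
old_ref = "github/codeql-action/upload-sarif@c10b8064de6f491fea524254123dbe5e09572f13"
new_ref = "github/codeql-action/upload-sarif@95e58e9a2cdfd71adc6e0353d5c52f41a045d225"
tag_ref = "github/codeql-action/upload-sarif@v4"
```

Under this sketch, `old_ref` and `new_ref` count as pinned, while `tag_ref` does not because a tag can be moved to a different commit.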

CHANGELOG.md

Lines changed: 2 additions & 0 deletions
@@ -60,6 +60,7 @@ Also, that release drops support for Python 3.9, making Python 3.10 the minimum
 * Moved all SYCL kernel functors from `backend/extensions/` to a unified `backend/kernels/` directory hierarchy [#2816](https://github.com/IntelPython/dpnp/pull/2816)
 * `dpnp` uses pybind11 3.0.3 [#2834](https://github.com/IntelPython/dpnp/pull/2834)
 * Disabled `dpnp.tensor` tests by default in `conda build --test` to prevent OOM failures during package testing. Set `SKIP_TENSOR_TESTS=0` to re-enable them on systems with enough memory [#2860](https://github.com/IntelPython/dpnp/pull/2860)
+* `dpnp` uses pybind11 3.0.4 [#2865](https://github.com/IntelPython/dpnp/pull/2865)

 ### Deprecated

@@ -88,6 +89,7 @@ Also, that release drops support for Python 3.9, making Python 3.10 the minimum
 * Fixed test tolerance issues for float16 intermediate precision that became visible when testing against conda-forge's NumPy [#2828](https://github.com/IntelPython/dpnp/pull/2828)
 * Ensured device aware dtype handling in `dpnp.identity` and `dpnp.gradient` [#2835](https://github.com/IntelPython/dpnp/pull/2835)
 * Fixed `dpnp.tensor.round` to use device-aware output dtype for boolean input [#2851](https://github.com/IntelPython/dpnp/pull/2851)
+* Resolved a deadlock in `dpnp.linalg.qr` by releasing the GIL before OneMKL `orgqr` call to prevent host tasks contention [#2850](https://github.com/IntelPython/dpnp/pull/2850)

 ### Security


CMakeLists.txt

Lines changed: 2 additions & 2 deletions
@@ -106,8 +106,8 @@ find_package(Python 3.10...<3.15 REQUIRED COMPONENTS Development.Module NumPy)
 include(FetchContent)
 FetchContent_Declare(
     pybind11
-    URL https://github.com/pybind/pybind11/archive/refs/tags/v3.0.3.tar.gz
-    URL_HASH SHA256=787459e1e186ee82001759508fefa408373eae8a076ffe0078b126c6f8f0ec5e
+    URL https://github.com/pybind/pybind11/archive/refs/tags/v3.0.4.tar.gz
+    URL_HASH SHA256=74b6a2c2b4573a400cafb6ecbf60c98df300cd3d0041296b913d02b2cbbb2676
     FIND_PACKAGE_ARGS NAMES pybind11
 )
 FetchContent_MakeAvailable(pybind11)
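
The `URL_HASH SHA256=...` entry makes CMake reject a downloaded archive whose digest differs from the pinned value, which is why the hash changes together with the version. The sketch below shows the same kind of check in Python with the standard `hashlib` module. `sha256_matches` is a hypothetical helper written for this note, and the example does not download the real archive, so it does not confirm the pybind11 digest itself.

```python
import hashlib

# Digest pinned for the v3.0.4 archive in the diff above (copied, not recomputed).
PYBIND11_V304_SHA256 = (
    "74b6a2c2b4573a400cafb6ecbf60c98df300cd3d0041296b913d02b2cbbb2676"
)


def sha256_matches(data: bytes, expected_hex: str) -> bool:
    """Return True if the SHA-256 digest of `data` equals `expected_hex`."""
    return hashlib.sha256(data).hexdigest() == expected_hex.strip().lower()


# Usage with in-memory bytes; a real check would read the downloaded tarball,
# for example with pathlib.Path("v3.0.4.tar.gz").read_bytes().
sample = b"abc"
sample_ok = sha256_matches(
    sample, "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad"
)
```

Any payload other than the exact pinned archive fails the comparison against `PYBIND11_V304_SHA256`, which is the behaviour the build relies on.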

dpnp/backend/extensions/lapack/orgqr.cpp

Lines changed: 9 additions & 0 deletions
@@ -87,8 +87,17 @@ static sycl::event orgqr_impl(sycl::queue &exec_q,

     sycl::event orgqr_event;
     try {
+        // Release GIL to avoid serialization of host task submissions
+        // to the same queue in OneMKL
+        py::gil_scoped_release lock{};
+
         scratchpad = sycl::malloc_device<T>(scratchpad_size, exec_q);

+        // mkl_lapack::orgqr() is done through GPU-to-Host reverse offload:
+        // exec_q.submit([&](sycl::handler& cgh) {
+        //     cgh.depends_on(depends);
+        //     cgh.host_task([=]() { orgqr_host(...); });
+        // }).wait();
         orgqr_event = mkl_lapack::orgqr(
             exec_q,
             m, // The number of rows in the matrix; (0 ≤ m).
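
The hunk above releases the GIL before a call that, per the added comment, submits a host task and waits for it. The following Python sketch is only a simplified model of why that can matter. It assumes that some host task on another thread needs the GIL while the caller blocks on `wait()`; the diff does not spell out the exact mechanism. A plain `threading.Lock` stands in for the GIL, and `run_blocking_call` is a hypothetical name.

```python
import threading


def run_blocking_call(release_gil: bool, timeout: float = 0.5) -> bool:
    """Return True if the simulated host task finished before the timeout."""
    gil = threading.Lock()      # stand-in for the Python GIL
    done = threading.Event()    # set once the simulated host task has run

    def host_task() -> None:
        # Assumption of this model: the host task needs the GIL to make progress.
        with gil:
            done.set()

    gil.acquire()               # the caller enters holding the "GIL"
    worker = threading.Thread(target=host_task, daemon=True)
    worker.start()
    try:
        if release_gil:
            gil.release()       # analogue of constructing py::gil_scoped_release
            finished = done.wait(timeout)
            gil.acquire()       # analogue of the guard going out of scope
        else:
            # Waiting while still holding the "GIL": the host task cannot start.
            finished = done.wait(timeout)
    finally:
        gil.release()           # let the worker finish so the thread can be joined
    worker.join()
    return finished
```

In this model `run_blocking_call(release_gil=True)` completes, while `run_blocking_call(release_gil=False)` only returns because the wait has a timeout; without the timeout the caller and the host task would block each other indefinitely.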
