Commit 9c4aed2
Improve dpnp.partition implementation (#2766)
This PR proposes improving the implementation by using a `dpnp.sort`
call in the following cases:
- the input array has more than one dimension
- the input array has an integer dtype that was previously not supported
- the `axis` keyword is passed (previously not supported)
- a sequence of `kth` values is passed (previously not supported)

For `ndim > 1`, the implementation from the legacy backend was used
previously, and it is significantly slower (see the performance
comparison below). It copied the input data into shared USM memory and
involved computations on the host.

This PR proposes reusing `dpnp.sort` for all of the above cases.
Where the legacy implementation is stable and fast (a 1D input array),
it remains in place, because it relies on `std::nth_element` from
oneDPL.
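
Why a sort call can stand in for a partition in the cases listed above can be illustrated with a minimal sketch. It uses NumPy rather than dpnp so that it runs without a SYCL device, and the helper names `partition_via_sort` and `is_partitioned` are hypothetical; this is not the code added by the PR. The point is that a fully sorted array satisfies the partition contract for every `kth` at once, which is why `axis` and a sequence of `kth` values come for free.

```python
import numpy as np


def partition_via_sort(a, kth, axis=-1):
    """Hypothetical helper: satisfy the partition contract by fully sorting.

    `kth` may be an int or a sequence of ints. It is only validated here,
    because a sorted array is a valid partition around every index.
    """
    a = np.asarray(a)
    n = a.shape[axis]
    for k in np.atleast_1d(kth):
        if not -n <= k < n:
            raise ValueError(f"kth(={k}) out of bounds ({n})")
    return np.sort(a, axis=axis)


def is_partitioned(p, kth, axis=-1):
    """Check the partition contract around index `kth` along `axis`."""
    p = np.moveaxis(np.asarray(p), axis, -1)
    kth = kth % p.shape[-1]  # normalize a negative kth
    pivot = p[..., kth:kth + 1]
    # Nothing before the pivot is larger, nothing from the pivot on is smaller.
    return bool(np.all(p[..., :kth] <= pivot) and np.all(p[..., kth:] >= pivot))


# 2D input, unsigned integer dtype, explicit axis, sequence of kth values.
rng = np.random.default_rng(117)
a = rng.integers(0, 100, size=(4, 6), dtype=np.uint32)
p = partition_via_sort(a, kth=(1, 3), axis=0)
print(is_partitioned(p, 1, axis=0), is_partitioned(p, 3, axis=0))
```

The trade-off is that a full sort does more work than a selection algorithm such as `std::nth_element`, which is consistent with the 1D path keeping the legacy backend where it is already fast.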
The benchmark results were collected on PVC with the help of the code below:
```python
import dpnp, numpy as np
from dpnp.tests.helper import generate_random_numpy_array
a = generate_random_numpy_array(10**7, dtype=np.float64, seed_value=117)
ia = dpnp.array(a)
%timeit x = dpnp.partition(ia, 513); x.sycl_queue.wait()
```
The table below contains data for a 1D input array (shape=(10**7,)),
where the implementation path is unchanged, except that the previously
unsupported integer dtypes are now handled by falling back to the sort
function:
| Implementation | int32 | uint32 | int64 | uint64 | float32 | float64 | complex64 | complex128 |
|--------|--------|--------|--------|--------|--------|--------|--------|--------|
| old (legacy backend) | 7.46 ms | not supported | 9.46 ms | not supported | 7.39 ms | 8.92 ms | 10.9 ms | 21.2 ms |
| new (backend + sort) | 7.34 ms | 10.8 ms | 9.48 ms | 12.5 ms | 7.37 ms | 8.89 ms | 11 ms | 21.2 ms |
The following code was used for a 2D input array with
shape=(10**4, 10**4):
```python
import dpnp, numpy as np
from dpnp.tests.helper import generate_random_numpy_array
a = generate_random_numpy_array((10**4, 10**4), dtype=np.float64, seed_value=117)
ia = dpnp.array(a)
%timeit x = dpnp.partition(ia, 1513); x.sycl_queue.wait()
```
In that case the new implementation is fully based on the sort call:
| Implementation | int32 | int64 | float32 | float64 | complex64 | complex128 |
|--------|--------|--------|--------|--------|--------|--------|
| old (legacy backend) | 6.4 s | 6.89 s | 7.36 s | 7.66 s | 8.61 s | 10 s |
| new (sort) | 57.4 ms | 64.7 ms | 62.2 ms | 68 ms | 77 ms | 151 ms |

1 parent 9d6d5a5 commit 9c4aed2
File tree (7 files changed, +393 -300 lines changed):
- dpnp
- backend/kernels
- tests
- third_party/cupy/sorting_tests