Commit d428760
fix: count shared buffers once in hash join build-side memory accounting (#22862)
## Which issue does this PR close?
- Closes #22861.
## Rationale for this change
While using DataFusion Comet, I noticed that my hash join operator was
failing with the following error: `Failed to acquire 142606336 bytes
where 17142251456 bytes already reserved and the fair limit is
17179869184 bytes, 4 registered`. Looking into this further, DataFusion
reserves memory (without actually allocating it) for each batch (8192
rows by default) on the build side of a hash join, so the total
requested is roughly num_batches * batch_size. This is problematic when
the batches are zero-copy slices of a larger batch (e.g. from
GroupedHashAggregateStream), because the size of each slice is
evaluated as the size of the entire underlying buffer. That is
technically accurate, since the reference held by the slice keeps the
whole buffer from being freed. DataFusion doesn't over-allocate memory
(the underlying data is the same), but it does over-request it in the
centralized accounting system, which can lead to these
"ResourcesExhausted" errors.
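To make the over-request concrete, here is a minimal std-only sketch. It does not use Arrow or DataFusion types: `Slice`, `slice_into_batches`, `naive_reservation`, and `logical_bytes` are hypothetical names invented for this illustration, and an `Arc<Vec<u8>>` stands in for a shared Arrow buffer. It only shows how summing the backing-buffer size per slice inflates the reservation.

```rust
use std::sync::Arc;

/// Toy stand-in for a zero-copy slice (hypothetical, not an Arrow type).
/// Holding the `Arc` keeps the whole parent allocation alive.
struct Slice {
    buffer: Arc<Vec<u8>>,
    offset: usize,
    len: usize,
}

impl Slice {
    /// The bytes this slice actually exposes.
    fn bytes(&self) -> &[u8] {
        &self.buffer[self.offset..self.offset + self.len]
    }

    /// Size as naive accounting sees it: the entire backing buffer.
    fn reported_size(&self) -> usize {
        self.buffer.len()
    }
}

/// Cut one shared buffer into zero-copy slices of at most `batch_len` bytes.
fn slice_into_batches(buffer: &Arc<Vec<u8>>, batch_len: usize) -> Vec<Slice> {
    assert!(batch_len > 0, "batch_len must be non-zero");
    (0..buffer.len())
        .step_by(batch_len)
        .map(|offset| Slice {
            buffer: Arc::clone(buffer),
            offset,
            len: batch_len.min(buffer.len() - offset),
        })
        .collect()
}

/// Naive accounting: every slice is charged for the whole backing buffer.
fn naive_reservation(slices: &[Slice]) -> usize {
    slices.iter().map(Slice::reported_size).sum()
}

/// Bytes actually visible through the slices.
fn logical_bytes(slices: &[Slice]) -> usize {
    slices.iter().map(|s| s.bytes().len()).sum()
}

fn main() {
    let buffer = Arc::new(vec![0u8; 1 << 20]); // one 1 MiB allocation
    let slices = slice_into_batches(&buffer, 8192);
    println!("slices:           {}", slices.len());
    println!("logical bytes:    {}", logical_bytes(&slices));
    println!("naive reservation: {}", naive_reservation(&slices));
}
```

In this toy model a single 1 MiB allocation cut into 8 KiB slices is charged once per slice, so the requested total grows with the number of slices even though no additional memory exists.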
## What changes are included in this PR?
In this change, we keep track of every buffer we have already counted
in a set of pointers. This way, we don't redundantly request memory for
the whole Arrow buffer for each sub-slice of it. We chose this approach
over simply requesting a smaller amount of memory per batch because, as
mentioned above, the reference held by each batch technically keeps
the entire Arrow buffer from being freed.
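The pointer-set idea can be sketched in std-only Rust as follows. This is an illustration under assumptions, not the code in this PR: `Slice` and `deduplicated_reservation` are hypothetical names, an `Arc<Vec<u8>>` stands in for an Arrow buffer, and the address of the shared allocation is used as the deduplication key.

```rust
use std::collections::HashSet;
use std::ops::Range;
use std::sync::Arc;

/// Toy stand-in for a zero-copy slice (hypothetical, not an Arrow type).
struct Slice {
    buffer: Arc<Vec<u8>>,
    range: Range<usize>,
}

impl Slice {
    /// The bytes this slice actually exposes.
    fn bytes(&self) -> &[u8] {
        &self.buffer[self.range.clone()]
    }
}

/// Charge each distinct backing buffer exactly once.
/// `seen` persists across calls, so batches that arrive later and share
/// an already-counted buffer add nothing to the reservation.
fn deduplicated_reservation(slices: &[Slice], seen: &mut HashSet<usize>) -> usize {
    let mut total = 0;
    for slice in slices {
        // Address of the shared allocation identifies the buffer.
        let key = Arc::as_ptr(&slice.buffer) as usize;
        if seen.insert(key) {
            // First time this buffer is seen: charge its full size, since
            // any one slice keeps the whole buffer alive.
            total += slice.buffer.len();
        }
    }
    total
}

fn main() {
    let a = Arc::new(vec![0u8; 1024]);
    let b = Arc::new(vec![0u8; 512]);
    let slices = vec![
        Slice { buffer: Arc::clone(&a), range: 0..256 },
        Slice { buffer: Arc::clone(&a), range: 256..1024 },
        Slice { buffer: Arc::clone(&b), range: 0..512 },
    ];
    let mut seen = HashSet::new();
    let first = deduplicated_reservation(&slices, &mut seen);
    let again = deduplicated_reservation(&slices, &mut seen);
    println!("first slice exposes {} bytes", slices[0].bytes().len());
    println!("first pass reserves {first} bytes, second pass {again}");
}
```

Note that the full buffer size is still charged on first sight rather than only the slice length, which matches the reasoning above: one surviving slice is enough to keep the whole buffer resident.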
## Are these changes tested?
The new hash join test fails on main with ResourcesExhausted and passes
with this change.
## Are there any user-facing changes?
No breaking changes. Adds a new public helper
count_record_batch_memory_size to datafusion-common.
Co-authored-by: Jordan Epstein <jordan.epstein@imc.com>
1 parent cb2542c commit d428760
2 files changed
Lines changed: 140 additions & 16 deletions