Commit 6bdaec5
committed
GH-3522: Reduce peak memory during row group flush by eagerly releasing column buffers
During flushToFileWriter(), each column's compressed page buffers are now
released immediately after being written to disk, rather than held in memory
until the entire row group flush completes. For a schema with N columns,
this reduces peak flush memory from ~N columns' worth of compressed pages
to ~1 column's worth.
Changes:
- Add writeAllToAndRelease() to ConcatenatingByteBufferCollector for
progressive slab-by-slab memory release during write
- Make close() idempotent (safe to call after eager release or multiple times)
- Call pageWriter.close() after each column in flushToFileWriter()
- Add tests for eager release, double-close safety, and output equivalence1 parent 53d7842 commit 6bdaec5
3 files changed
Lines changed: 114 additions & 0 deletions
File tree
- parquet-common/src
- main/java/org/apache/parquet/bytes
- test/java/org/apache/parquet/bytes
- parquet-hadoop/src/main/java/org/apache/parquet/hadoop
Lines changed: 29 additions & 0 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
26 | 26 | | |
27 | 27 | | |
28 | 28 | | |
| 29 | + | |
29 | 30 | | |
30 | 31 | | |
31 | 32 | | |
| |||
64 | 65 | | |
65 | 66 | | |
66 | 67 | | |
| 68 | + | |
| 69 | + | |
| 70 | + | |
67 | 71 | | |
68 | 72 | | |
69 | 73 | | |
70 | 74 | | |
| 75 | + | |
71 | 76 | | |
72 | 77 | | |
73 | 78 | | |
| |||
78 | 83 | | |
79 | 84 | | |
80 | 85 | | |
| 86 | + | |
| 87 | + | |
| 88 | + | |
| 89 | + | |
| 90 | + | |
| 91 | + | |
| 92 | + | |
| 93 | + | |
| 94 | + | |
| 95 | + | |
| 96 | + | |
| 97 | + | |
| 98 | + | |
| 99 | + | |
| 100 | + | |
| 101 | + | |
| 102 | + | |
| 103 | + | |
| 104 | + | |
| 105 | + | |
| 106 | + | |
| 107 | + | |
| 108 | + | |
| 109 | + | |
81 | 110 | | |
82 | 111 | | |
83 | 112 | | |
| |||
Lines changed: 80 additions & 0 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
109 | 109 | | |
110 | 110 | | |
111 | 111 | | |
| 112 | + | |
| 113 | + | |
| 114 | + | |
| 115 | + | |
| 116 | + | |
| 117 | + | |
| 118 | + | |
| 119 | + | |
| 120 | + | |
| 121 | + | |
| 122 | + | |
| 123 | + | |
| 124 | + | |
| 125 | + | |
| 126 | + | |
| 127 | + | |
| 128 | + | |
| 129 | + | |
| 130 | + | |
| 131 | + | |
| 132 | + | |
| 133 | + | |
| 134 | + | |
| 135 | + | |
| 136 | + | |
| 137 | + | |
| 138 | + | |
| 139 | + | |
| 140 | + | |
| 141 | + | |
| 142 | + | |
| 143 | + | |
| 144 | + | |
| 145 | + | |
| 146 | + | |
| 147 | + | |
| 148 | + | |
| 149 | + | |
| 150 | + | |
| 151 | + | |
| 152 | + | |
| 153 | + | |
| 154 | + | |
| 155 | + | |
| 156 | + | |
| 157 | + | |
| 158 | + | |
| 159 | + | |
| 160 | + | |
| 161 | + | |
| 162 | + | |
| 163 | + | |
| 164 | + | |
| 165 | + | |
| 166 | + | |
| 167 | + | |
| 168 | + | |
| 169 | + | |
| 170 | + | |
| 171 | + | |
| 172 | + | |
| 173 | + | |
| 174 | + | |
| 175 | + | |
| 176 | + | |
| 177 | + | |
| 178 | + | |
| 179 | + | |
| 180 | + | |
| 181 | + | |
| 182 | + | |
| 183 | + | |
| 184 | + | |
| 185 | + | |
| 186 | + | |
| 187 | + | |
| 188 | + | |
| 189 | + | |
| 190 | + | |
| 191 | + | |
112 | 192 | | |
Lines changed: 5 additions & 0 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
692 | 692 | | |
693 | 693 | | |
694 | 694 | | |
| 695 | + | |
| 696 | + | |
| 697 | + | |
| 698 | + | |
| 699 | + | |
695 | 700 | | |
696 | 701 | | |
697 | 702 | | |
0 commit comments