Commit be65147
authored
Rework
## Summary
Tracking Issue: #7216
Adds a new `CompressionEstimate` type in
`vortex-compressor/src/estimate.rs` that the
`expected_compression_ratio` method now returns. It also moves some
code around for clarity.
Note that this is not just a refactor: subtle logic has changed in a
few places (I think for the better, but I'm not actually sure). I'm
happy to split some of this out into other PRs if that helps.
### Future Work
- I would also like to add a variant called `Exact` that returns the
fully compressed array, for cases where we can only determine whether a
scheme is a candidate by compressing the whole array without any errors.
The only case where we want to do this is `SequenceArray` (there may be
an argument for doing this for `ConstantArray` too, but the semantics
around `ConstantArray` should be even more special regardless, imo).
- This might be in a `ResolvedEstimate` enum instead.
- There are also a bunch of TODOs littered everywhere that are easily
fixed, but I want to do those in a followup.
- We probably want to hardcode the `ConstantScheme` logic into the
compressor since I cannot think of any reason why you would not want to
have a `ConstantScheme` (except when you have a very small array, and at
that point you don't care about perf regardless).
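To make the `Exact` idea above concrete, here is a minimal sketch rather than the planned implementation. Every name in it (`SequenceEncoded`, the variants of `ResolvedEstimate`, `estimate_sequence`) is hypothetical and chosen only for illustration; the real types in `vortex-compressor` may look quite different. It shows why a sequence-like scheme is a natural fit: checking whether the input is an arithmetic sequence is the same work as encoding it, so the estimate can hand back the finished result.

```rust
/// Hypothetical compact form of an arithmetic sequence: value[i] = base + i * step.
#[derive(Debug, Clone, PartialEq)]
pub struct SequenceEncoded {
    pub base: i64,
    pub step: i64,
    pub len: usize,
}

/// Hypothetical estimate type. `Exact` carries the already-compressed result
/// so the compressor does not have to run the scheme a second time.
#[derive(Debug, Clone, PartialEq)]
pub enum ResolvedEstimate {
    /// The scheme does not apply to this input.
    Skip,
    /// A plain ratio estimate (unused below, shown for contrast).
    Ratio(f64),
    /// The scheme applies, and this is its full output.
    Exact(SequenceEncoded),
}

/// Decide whether `values` is an arithmetic sequence. Verifying this requires
/// a full pass over the data, which is exactly the work of encoding it.
pub fn estimate_sequence(values: &[i64]) -> ResolvedEstimate {
    if values.len() < 2 {
        return ResolvedEstimate::Skip;
    }
    // Treat an overflowing difference as "not a candidate".
    let Some(step) = values[1].checked_sub(values[0]) else {
        return ResolvedEstimate::Skip;
    };
    let is_sequence = values
        .windows(2)
        .all(|w| w[1].checked_sub(w[0]) == Some(step));
    if is_sequence {
        ResolvedEstimate::Exact(SequenceEncoded {
            base: values[0],
            step,
            len: values.len(),
        })
    } else {
        ResolvedEstimate::Skip
    }
}
```

Under these assumptions, a caller that receives `Exact` can use the payload directly, while `Skip` costs nothing beyond the single verification pass.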
## API Changes
`expected_compression_ratio` now takes only the stats and compressor
context (it does not take the compressor at all) and returns a
`CompressionEstimate`. This method must be very cheap: the compressor
now defers any sampling or other expensive operations until later.
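The split between a cheap estimate and deferred expensive work can be sketched as follows. This is an illustration under assumptions, not the actual API: the variants of `CompressionEstimate`, the `Stats` fields, the `sample_ratio` hook, and `choose_scheme` are all hypothetical, and the compressor context argument is omitted for brevity.

```rust
/// Hypothetical stand-in for the statistics a scheme inspects.
#[derive(Debug, Clone)]
pub struct Stats {
    pub len: usize,
    pub distinct_count: usize,
}

/// Hypothetical variants; the real `CompressionEstimate` may differ.
#[derive(Debug, Clone, PartialEq)]
pub enum CompressionEstimate {
    /// The scheme is not applicable.
    Skip,
    /// A ratio derived from stats alone.
    Ratio(f64),
    /// Stats are not enough; the compressor should sample later.
    NeedsSample,
}

pub trait Scheme {
    fn name(&self) -> &'static str;
    /// Must be cheap: looks only at `stats`.
    fn expected_compression_ratio(&self, stats: &Stats) -> CompressionEstimate;
    /// Hypothetical expensive path, called only by the compressor.
    fn sample_ratio(&self, _stats: &Stats) -> f64 {
        1.0
    }
}

/// Example scheme whose ratio follows directly from stats.
pub struct DictLike;
impl Scheme for DictLike {
    fn name(&self) -> &'static str {
        "dict-like"
    }
    fn expected_compression_ratio(&self, stats: &Stats) -> CompressionEstimate {
        if stats.distinct_count == 0 {
            return CompressionEstimate::Skip;
        }
        CompressionEstimate::Ratio(stats.len as f64 / stats.distinct_count as f64)
    }
}

/// Example scheme that needs sampling; the fixed 3.0 is a placeholder.
pub struct SampledLike;
impl Scheme for SampledLike {
    fn name(&self) -> &'static str {
        "sampled-like"
    }
    fn expected_compression_ratio(&self, _stats: &Stats) -> CompressionEstimate {
        CompressionEstimate::NeedsSample
    }
    fn sample_ratio(&self, _stats: &Stats) -> f64 {
        3.0
    }
}

/// Keep `name` if it shrinks the data and beats the current best.
fn consider(best: &mut Option<(&'static str, f64)>, name: &'static str, ratio: f64) {
    if ratio > 1.0 && best.map_or(true, |(_, b)| ratio > b) {
        *best = Some((name, ratio));
    }
}

/// First collect all cheap estimates, then resolve the deferred ones.
pub fn choose_scheme(schemes: &[&dyn Scheme], stats: &Stats) -> Option<(&'static str, f64)> {
    let mut best = None;
    let mut deferred = Vec::new();
    for s in schemes {
        match s.expected_compression_ratio(stats) {
            CompressionEstimate::Skip => {}
            CompressionEstimate::Ratio(r) => consider(&mut best, s.name(), r),
            CompressionEstimate::NeedsSample => deferred.push(*s),
        }
    }
    for s in deferred {
        consider(&mut best, s.name(), s.sample_ratio(stats));
    }
    best
}
```

One reason to structure it this way is that schemes stay free of compressor internals: they only describe what they know from stats, and the compressor alone decides when the expensive path is worth running.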
## Testing
Just a few extra tests; I am relying on the existing test suite, since
this does not introduce completely new logic.
Signed-off-by: Connor Tsui <connor.tsui20@gmail.com>
Scheme estimation in compressor (#7230)
1 parent 06065ff commit be65147
File tree
37 files changed (+1504, -1229 lines)
- vortex-btrblocks
- src
- schemes
- vortex-compressor
- benches
- src
- builtins
- constant
- dict
- stats
- vortex-tensor
- src/encodings/turboquant/array