Commit 797d730
PERF: Skip header check in parquet reader (rapidsai#22679)
The parquet reader previously required three reads to read the parquet footer:
1. A 4 byte read to check the header for the parquet magic bytes
2. An 8 byte read to read the footer length and footer parquet magic bytes
3. A variable-length read for the footer metadata
We don't really care about ensuring that the header is valid. For high-latency storage, it's not worth the extra read.
Part of rapidsai#22668, which also proposes to remove the second 8-byte read. But this is a smaller change that should be less controversial.
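
For illustration, the two remaining reads can be sketched as below. This is a minimal Python sketch, not the code changed by this commit: the function name `read_footer_bytes`, its parameters, and the error messages are hypothetical. It assumes an unencrypted parquet file, which ends with the footer metadata, a 4-byte little-endian footer length, and the `PAR1` magic bytes.

```python
import struct

PARQUET_MAGIC = b"PAR1"
FOOTER_TAIL_SIZE = 8  # 4-byte little-endian footer length + 4-byte magic


def read_footer_bytes(source, file_size):
    """Return the raw footer metadata using two reads (hypothetical helper)."""
    if file_size < len(PARQUET_MAGIC) + FOOTER_TAIL_SIZE:
        raise ValueError("file too small to be a parquet file")

    # Read 1: the last 8 bytes hold the footer length and the trailing magic.
    source.seek(file_size - FOOTER_TAIL_SIZE)
    footer_len, magic = struct.unpack("<I4s", source.read(FOOTER_TAIL_SIZE))
    if magic != PARQUET_MAGIC:
        raise ValueError("missing parquet magic bytes in footer")
    max_len = file_size - len(PARQUET_MAGIC) - FOOTER_TAIL_SIZE
    if footer_len == 0 or footer_len > max_len:
        raise ValueError("invalid footer length")

    # Read 2: the variable-length footer metadata sits just before the tail.
    # The 4-byte header at offset 0 is never read.
    source.seek(file_size - FOOTER_TAIL_SIZE - footer_len)
    return source.read(footer_len)
```

The trailing magic bytes are still validated in the first read, so a file that is not parquet at all is still rejected in most cases; only a file with a damaged header and an intact footer goes undetected at this stage.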
Authors:
- Tom Augspurger (https://github.com/TomAugspurger)
Approvers:
- Bradley Dice (https://github.com/bdice)
- Vukasin Milovanovic (https://github.com/vuule)
URL: rapidsai#22679
1 parent ef0a96d commit 797d730
2 files changed
Lines changed: 24 additions & 5 deletions
File 1 of 2 (file name and line contents not preserved): original lines 40-43 and 45 removed; 3 lines added as new lines 41-43.
File 2 of 2 (file name and line contents not preserved): 21 lines added as new lines 4378-4398.