Commit 108ec64
[553] Parquet Schema and Column Stats Converters (#669)
* smaller PR for parquet
* read parquet file for metadataExtractor: compiling, not tested
* cleanups for statsExtractor: compiling, not tested
* refactoring for statsExtractor: compiling, not tested
* added avro dependency
* added tests for SchemaExtractor: int and string primitiveTypes test passes
* fixed some minor bugs in SchemaExtractor
* close fileReader and handle exception
* adjusted fromInternalSchema()
* added a test and adjusted SchemaExtractor
* added a testing code
* bug fix for Schema extractor: groupType
* bug fix for Schema extractor
* bug fix for tests
* bug fix for SchemaExtractor and added tests for nested lists support
* bug fix for tests for nested lists support
* bug fix for complex test which now passes!
* added test for Map
* schemaExtractor refactored
* bug fixed isNullable() schema
* fromInternalSchema : list and map types
* decimal primitive test added
* float primitive + list and map tests for fromInternalSchema
* added tests for primitive type (date and timestamp)
* refactoring for partitionValues extractor
* git build error fixed
* cleanups for schemaExtractor + refactoring for schemaExtractorTests + added test code for statsExtractor
* added assertsEqual test for stats + removed partitionFields from the test, TODO check if field is needed in ColumnStats
* bug fixed for stats tests: columnStats + tests data are read using FileReader
* bug fixed for stats tests, TODO equality test for two objects
* added compareFiles() in InternalDataFile for the statsExtractor tests to pass: OK
* added custom comparison test for ColumnStat and InternalDataFile, test passes, TODO: other stat types and other schema types testing
* added custom comparison test for ColumnStat (field) and exec spotless apply
* tempDir for parquet stats testing
* binaryStatistics test passes
* added int32 file schema test for statsExtractor
* cleanups + added fields comparison for InternalDataFile
* cleanups + added fixed_len_byte_array primitive type schema file test
* use of genericGetMax instead for stats extraction + cleanups
* boolean schema file test for statsExtractor added
* removed hard coded path in statsExtractor test
* cleanups + imports
* separate tests for int and binary for stats
* custom equals() not needed for InternalDataFile and ColumnStat
* removed parquet version from core sub-project pom
* statsExtractor tests as a suite, removed comments + run spotless apply
* removed unnecessary classes
* removed unnecessary classes: undo
* undo irrelevant changes
* fixed formatting issues with spotless:apply cmd
* cleanups for test class and fixes for failed build
* tmp file name fixed for failed build
* cleanups
* spotless apply run + assertion internalDataFile equality changed to display errors
* fixes for build, PhysicalPath and BinaryStats
* fixes for build, PhysicalPath and BinaryStats + synced fork
* fixes for build, PhysicalPath and BinaryStats + synced fork
* fixes for build and cleanups
* fixes for build and cleanups
* Parquet dep set as provided to use Spark's
* parquet dep version back to 1.15.1
* parquet-avro moved from core to project's pom
* parquet-avro moved after hadoop-common
* parquet dep scope removed
* run spotless:apply
---------
Co-authored-by: Selim Soufargi <ssoufargi.idealab.unical@gmail.com>
1 parent fc4d6e8 commit 108ec64
14 files changed
Lines changed: 1533 additions & 2 deletions
File tree
- xtable-api/src/main/java/org/apache/xtable
  - conversion
  - model
    - catalog
    - schema
- xtable-core
  - src
    - main/java/org/apache/xtable/parquet
    - test/java/org/apache/xtable
      - iceberg
      - parquet
- xtable-utilities/src/main/java/org/apache/xtable/utilities
Lines changed: 4 additions & 0 deletions
xtable-api/src/main/java/org/apache/xtable/model/catalog/ThreePartHierarchicalTableIdentifier.java
Lines changed: 1 addition & 0 deletions
Lines changed: 2 additions & 0 deletions
Lines changed: 2 additions & 1 deletion
Lines changed: 63 additions & 0 deletions