Commit d0081be

Merge remote-tracking branch 'apache/master' into feature-apache-parquet-2417-geospatial
2 parents: 23af52f + 7d1fe32

68 files changed

Lines changed: 623 additions & 7027 deletions

LICENSE

Lines changed: 0 additions & 10 deletions
@@ -197,16 +197,6 @@ License: Apache License Version 2.0 http://www.apache.org/licenses/LICENSE-2.0

 --------------------------------------------------------------------------------

-This product includes code from Apache Spark.
-
-* dev/merge_parquet_pr.py is based on Spark's dev/merge_spark_pr.py
-
-Copyright: 2014 The Apache Software Foundation.
-Home page: https://spark.apache.org/
-License: http://www.apache.org/licenses/LICENSE-2.0
-
---------------------------------------------------------------------------------
-
 This product includes code from Twitter's ElephantBird project.

 * parquet-hadoop's UnmaterializableRecordCounter.java includes code from

README.md

Lines changed: 6 additions & 24 deletions
@@ -73,7 +73,7 @@ Parquet is a very active project, and new features are being added quickly. Here

 * Type-specific encoding
 * Hive integration (deprecated)
-* Pig integration
+* Pig integration (deprecated)
 * Cascading integration (deprecated)
 * Crunch integration
 * Apache Arrow integration
@@ -132,24 +132,6 @@ See the APIs:
 * [Record conversion API](https://github.com/apache/parquet-java/tree/master/parquet-column/src/main/java/org/apache/parquet/io/api)
 * [Hadoop API](https://github.com/apache/parquet-java/tree/master/parquet-hadoop/src/main/java/org/apache/parquet/hadoop/api)

-## Apache Pig integration
-A [Loader](https://github.com/apache/parquet-java/blob/master/parquet-pig/src/main/java/org/apache/parquet/pig/ParquetLoader.java) and a [Storer](https://github.com/apache/parquet-java/blob/master/parquet-pig/src/main/java/org/apache/parquet/pig/ParquetStorer.java) are provided to read and write Parquet files with Apache Pig
-
-Storing data into Parquet in Pig is simple:
-```
--- options you might want to fiddle with
-SET parquet.page.size 1048576 -- default. this is your min read/write unit.
-SET parquet.block.size 134217728 -- default. your memory budget for buffering data
-SET parquet.compression lzo -- or you can use none, gzip, snappy
-STORE mydata into '/some/path' USING parquet.pig.ParquetStorer;
-```
-Reading in Pig is also simple:
-```
-mydata = LOAD '/some/path' USING parquet.pig.ParquetLoader();
-```
-
-If the data was stored using Pig, things will "just work". If the data was stored using another method, you will need to provide the Pig schema equivalent to the data you stored (you can also write the schema to the file footer while writing it -- but that's pretty advanced). We will provide a basic automatic schema conversion soon.
-
 ## Hive integration

 Hive integration is provided via the [parquet-hive](https://github.com/apache/parquet-java/tree/master/parquet-hive) sub-project.
@@ -167,29 +149,29 @@ The build runs in [GitHub Actions](https://github.com/apache/parquet-java/action

 ## Add Parquet as a dependency in Maven

-The current release is version `1.15.0`.
+The current release is version `1.15.1`.

 ```xml
 <dependencies>
   <dependency>
     <groupId>org.apache.parquet</groupId>
     <artifactId>parquet-common</artifactId>
-    <version>1.15.0</version>
+    <version>1.15.1</version>
   </dependency>
   <dependency>
     <groupId>org.apache.parquet</groupId>
     <artifactId>parquet-encoding</artifactId>
-    <version>1.15.0</version>
+    <version>1.15.1</version>
   </dependency>
   <dependency>
     <groupId>org.apache.parquet</groupId>
     <artifactId>parquet-column</artifactId>
-    <version>1.15.0</version>
+    <version>1.15.1</version>
   </dependency>
   <dependency>
     <groupId>org.apache.parquet</groupId>
     <artifactId>parquet-hadoop</artifactId>
-    <version>1.15.0</version>
+    <version>1.15.1</version>
   </dependency>
 </dependencies>
 ```

dev/README.md

Lines changed: 0 additions & 93 deletions
This file was deleted.
