This repository was archived by the owner on Nov 11, 2022. It is now read-only.
Version 1.5.0
With this release, we have begun preparing the Dataflow SDK for Java for an eventual move to Apache Beam (incubating). Specifically, we have refactored a number of internal APIs and removed classes that are used only within the worker from the SDK; the Google Cloud Dataflow Service now provides these classes during job execution. This refactoring should not affect any user code.
Additionally, the 1.5.0 release includes the following changes:
- Enabled an indexed side input format for batch pipelines executed on the Google Cloud Dataflow service. Indexed side inputs significantly increase performance for `View.asList`, `View.asMap`, `View.asMultimap`, and any non-globally-windowed `PCollectionView`s.
- Upgraded to Protocol Buffers version `3.0.0-beta-1`. If you use custom Protocol Buffers, you should recompile them with the corresponding version of the `protoc` compiler. You can continue using both version 2 and 3 of the Protocol Buffers syntax, and no user pipeline code needs to change.
- Added `ProtoCoder`, which is a `Coder` for Protocol Buffers messages that supports both version 2 and 3 of the Protocol Buffers syntax. This coder can detect when messages can be encoded deterministically. `Proto2Coder` is now deprecated; we recommend that all users switch to `ProtoCoder`.
- Added `withoutResultFlattening` to `BigQueryIO.Read` to disable flattening query results when reading from BigQuery.
- Added `BigtableIO`, enabling support for reading from and writing to Google Cloud Bigtable.
- Improved `CompressedSource` to detect compression format according to the file extension. Added support for reading `.gz` files that are transparently decompressed by the underlying transport logic.
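
To illustrate what detecting the compression format from the file extension means, here is a minimal standalone sketch that uses only the Java standard library. It is not the SDK's implementation: the class `ExtensionBasedDecompression`, the `Format` enum, and the helper methods are hypothetical names introduced for this example, and the case-insensitive match on `.gz` is an assumption rather than documented `CompressedSource` behavior.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Locale;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class ExtensionBasedDecompression {
    /** Hypothetical enum for this sketch; not an SDK type. */
    enum Format { GZIP, UNCOMPRESSED }

    /** Picks a format from the file name alone (assumption: case-insensitive ".gz"). */
    static Format detect(String fileName) {
        String lower = fileName.toLowerCase(Locale.ROOT);
        return lower.endsWith(".gz") ? Format.GZIP : Format.UNCOMPRESSED;
    }

    /** Wraps the raw stream in a decompressor only when the extension calls for it. */
    static InputStream open(String fileName, InputStream raw) throws IOException {
        return detect(fileName) == Format.GZIP ? new GZIPInputStream(raw) : raw;
    }

    /** Reads a stream to the end and decodes it as UTF-8. */
    static String readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    /** Test helper: gzip-compresses a string in memory. */
    static byte[] gzip(String text) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bytes)) {
            gz.write(text.getBytes(StandardCharsets.UTF_8));
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] plainBytes = "hello".getBytes(StandardCharsets.UTF_8);
        String fromGz = readAll(open("part-0000.gz", new ByteArrayInputStream(gzip("hello"))));
        String fromTxt = readAll(open("part-0000.txt", new ByteArrayInputStream(plainBytes)));
        // Both paths should yield the same decoded text.
        System.out.println(fromGz.equals(fromTxt));
    }
}
```

The caller never states the format explicitly; the file name decides whether a decompressing wrapper is applied, so the same read path serves compressed and uncompressed inputs.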