This repository was archived by the owner on Nov 11, 2022. It is now read-only.
Version 1.4.0
- Added a series of batch and streaming example pipelines in a mobile gaming domain that illustrate some advanced topics, including windowing and triggers.
- Added support for Combine functions to access pipeline options and side inputs through a context. See GlobalCombineFn and PerKeyCombineFn for further details.
- Modified ParDo.withSideInputs() such that successive calls are cumulative.
- Modified automatic coder detection of Protocol Buffer messages; such classes now have their coders provided automatically.
- Added support for limiting the number of results returned by DatastoreIO.Source. However, when this limit is set, the operation that reads from Cloud Datastore is performed by a single worker rather than executing in parallel across the worker pool.
- Modified definition of PaneInfo.{EARLY, ON_TIME, LATE} so that panes with only late data are always LATE, and an ON_TIME pane can never cause a later computation to yield a LATE pane.
- Modified GroupByKey to drop late data when that late data arrives for a window that has expired. An expired window means the end of the window is passed by more than the allowed lateness.
- When using GlobalWindows, you are no longer required to specify withAllowedLateness(), since no data is ever dropped.
- Added support for obtaining the default project ID from the default project configuration produced by newer versions of the gcloud utility. If the default project configuration does not exist, Dataflow reverts to using the old project configuration generated by older versions of the gcloud utility.
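
The window-expiry rule that GroupByKey now applies to late data can be illustrated with a small standalone sketch. It does not use the Dataflow SDK: the class WindowExpiry and the method isExpired are hypothetical names introduced here, and the sketch assumes that "passed" is measured against the pipeline's watermark, which the note above does not state explicitly.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical illustration only; not part of the Dataflow SDK.
final class WindowExpiry {

    // A window is treated as expired once the watermark (an assumption of
    // this sketch) has passed the end of the window by MORE than the
    // allowed lateness. Data arriving for an expired window is dropped.
    static boolean isExpired(Instant windowEnd, Duration allowedLateness, Instant watermark) {
        return watermark.isAfter(windowEnd.plus(allowedLateness));
    }

    public static void main(String[] args) {
        Instant windowEnd = Instant.parse("2016-01-01T00:10:00Z");
        Duration allowedLateness = Duration.ofMinutes(5);

        // Watermark 3 minutes past the window end: still within allowed lateness.
        System.out.println(isExpired(windowEnd, allowedLateness,
                windowEnd.plus(Duration.ofMinutes(3))));

        // Watermark 6 minutes past the window end: window expired, late data dropped.
        System.out.println(isExpired(windowEnd, allowedLateness,
                windowEnd.plus(Duration.ofMinutes(6))));
    }
}
```

Under this reading, a watermark exactly at the window end plus the allowed lateness does not yet expire the window, because the end has not been passed by more than the allowed lateness.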