This repository was archived by the owner on Nov 11, 2022. It is now read-only.
Version 1.6.0
- Added
InProcessPipelineRunner, an improvement over theDirectPipelineRunnerthat better implements the Dataflow model.InProcessPipelineRunnerruns on a user's local machine and supports multithreaded execution, unboundedPCollections, and triggers for speculative and late outputs. - Added display data, which allows annotating user functions (
DoFn,CombineFn, andWindowFn),Sources, andSinks with static metadata to be displayed in the Dataflow Monitoring Interface. Display data has been implemented for core components and is automatically applied to allPipelineOptions. - Added the ability to compose multiple
CombineFns into a singleCombineFnusingCombineFns.composeorCombineFns.composeKeyed. - Added the methods
getSplitPointsConsumedandgetSplitPointsRemainingto theBoundedReaderAPI to improve Dataflow's ability to automatically scale a job reading from these sources. Default implementations of these functions have been provided, but reader implementers should override them to provide better information when available. - Improved performance of side inputs when using workers with many cores.
- Improved efficiency when using
CombineFnWithContext. - Fixed several issues related to stability in the streaming mode.