[FLINK-20625] Implement V2 Google Cloud PubSub Source in accordance with FLIP-27#32
[FLINK-20625] Implement V2 Google Cloud PubSub Source in accordance with FLIP-27#32clmccart wants to merge 13 commits into
Conversation
fix: remove unwanted changes feat: create basic split enumerator fix: requested changes Update split.proto Update PubSubSink.java Update PubSubNotifyingPullSubscriber.java Update PubSubSinkTest.java Update PubSubCheckpointSerializerTest.java Update PubSubSplitEnumeratorTest.java feat: create pubsub source fix: comment Fix a typo in the ordering key prober readme (#353) Create a parent pom.xml and restructure the flink-connector source code to support directories for integration tests and sample code. Also, replace Java's Optional class with Guava's Optional, which is serializable, to enable starting a Flink job that uses CPS as a source. Add a WIP example of using CPS as a Flink source. Clean up POM files. Change example code to use both a PubSub sink and source. Fix interrupt not returning empty messages Add flow control setting to source builder. Remove limitExceededBehavior source option Create separate unit tests Add PubSubSource integration test using CPS emulator Remove emulatorEndpoint source builder option Replace PubSubEmulatorTestBase with PubSubEmulatorHelper Add initial documentation for Flink connector Address PR comments Add sink/it testing doc Add builder options to source/sink Document source/sink builder options Add at-least-once delivery guarantee for CPS sink Update README with connector status Improve example Flink job and documentation Update library version to 1.0.0-SNAPSHOT Use docker to start CPS emulator in it tests Add PubSubSinkEmulatorTest Do not add source messages to a checkpoint until after it is emitted (#369) Add contribution guide for CPS flink connector (#370)
gitignore move examples into streaming folder move sourcev2 example into right folder. does not compile big move + proto compile move to correct packages import FixedHeaderProvider add autovalue to pom compiles tests compile add vscode files to gitignore fix example pipeline builds revert flink-examples-streaming-gcp-pubsub pom java.lang.NoClassDefFoundError: org/apache/flink/shaded/io/netty/channel/ChannelFactory pipeline works
|
Thanks for opening this pull request! Please check out our contributing guidelines. (https://flink.apache.org/contributing/how-to-contribute.html) |
84cc42f to
3af0b29
Compare
|
looks like the sink rewrite was completed in FLINK-24298. will need to take a look at that and probably remove the sink changes from this PR |
the main reason we wrote a new sink implementation was because the previous implementation was deprecated. we didnt make any significant performance improvements so i went ahead and remove the sink rewrite from this PR. we can followup with changes to the sink implementation in a subsequent PR but no changes are pressing. |
|
@snuyanzin would you mind taking a look at this PR? or is there someone else who might be a better fit? |
|
@clmccart are you still interested in getting this through? @vahmed-hamdy is this something you can help review? we're hitting some limits with the blocking subscriber. I'm happy to take a look but I'm not proficient in this repo. thanks 🙏 |
yeah i'm happy to push it through if there is a reviewer |
|
Hi @clmccart , We are trying to use this code with our Flink setup and we found a couple of issues. First, we are migrating to Flink v2.0 and had to adjust the code - mostly remove the classes that are no longer supported. This is what we found:
I hope these findings will help to improve this MR and get it to the finish line. Best, P.S.: I'm not sure how to contribute changes back, since they are Flink v2 based. |
TLDR;
PubSubSplitEnumerator:
PubSubSourceReader:
Adds an example pipeline using the new source and adds the "Deprecated" tag to the old source implementation
Additional context:
Note that documentation will be updated in a separate PR