[FLINK-38915] FlinkBlueGreenDeployment in place suspension handler #1053
return patchFlinkDeployment(context, currentBlueGreenDeploymentType);
}

// Check if child is currently suspended - if so, just patch specs without restart
Is there a benefit of patching the spec if the deployment is suspended? A RESUME command can/will override these changes anyway, am I correct?
Hi! Yes, a RESUME will reconcile all changes made during suspension. However, without this patch, the FlinkDeployment and FlinkBlueGreenDeployment will be out of sync during that entire time.
We think keeping them in sync provides a better user experience: users can inspect the child FlinkDeployment at any time and see the exact spec that will be executed, rather than having to track changes across resources. We're hoping to eventually make the FlinkBlueGreenDeployment the single source of truth, with the child always reflecting the parent's current desired state.
That said, we're open to other perspectives! If you think the extra patch call isn't worth it, we'd be happy to discuss!
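To make the trade-off concrete, here is a minimal sketch of the "patch without restart" idea discussed above. It is not the operator's code: `ChildDeployment`, `SuspendedSpecSync`, and the map-based spec are hypothetical stand-ins chosen only to illustrate keeping the child in sync while it is suspended.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a child FlinkDeployment; not the operator's model class.
final class ChildDeployment {
    Map<String, String> spec = new HashMap<>();
    boolean suspended;
    int restarts; // counts restarts triggered by spec changes
}

// Hypothetical helper; the name and signature are assumptions for illustration.
final class SuspendedSpecSync {
    /**
     * Copies the parent's desired spec onto the child.
     * Returns true if a restart was triggered, false if the child was
     * only patched (suspended) or was already in sync.
     */
    static boolean apply(Map<String, String> parentSpec, ChildDeployment child) {
        if (child.spec.equals(parentSpec)) {
            return false; // already in sync, nothing to do
        }
        child.spec = new HashMap<>(parentSpec);
        if (child.suspended) {
            return false; // suspended: patch the spec only, no restart
        }
        child.restarts++; // running: the new spec requires a restart
        return true;
    }
}
```

With this shape, inspecting the child while it is suspended always shows the spec that a later RESUME would run, at the cost of one extra patch call per parent spec change.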
    return false;
}
return deployment.getStatus().getLifecycleState() == ResourceLifecycleState.SUSPENDED
        && isChildSuspended(deployment);
As of now, isFlinkDeploymentSuspended is always called after isChildSuspended, so it seems we don't need to make another call to it at line 454.
Thanks for that! Ya, that's a redundant check. Updated.
What is the purpose of the change
Tackles: https://issues.apache.org/jira/browse/FLINK-38915
Improve blue/green suspend/resume behavior: allow in-place suspension/resume without spawning new deployments, propagate spec changes while suspended, block suspend during transitions, and fix BG status sync bugs.
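The intended behavior can be sketched with a small state model. This is an illustrative assumption, not the controller's implementation: `BgColor`, `BlueGreenModel`, and its fields are hypothetical names, and the real operator tracks state through its own status types.

```java
// Hypothetical colors for the two child deployments.
enum BgColor { BLUE, GREEN }

// Hypothetical, simplified model of a blue/green parent resource.
final class BlueGreenModel {
    BgColor active = BgColor.BLUE;
    boolean transitioning;      // true while switching between blue and green
    boolean suspended;
    int deploymentsCreated = 1; // only the initial deployment exists

    /** Suspends the active child in place; refused during a transition. */
    boolean suspend() {
        if (transitioning || suspended) {
            return false; // blocked: mid-transition or already suspended
        }
        suspended = true; // same child, no new deployment spawned
        return true;
    }

    /** Resumes on the same color that was suspended. */
    boolean resume() {
        if (!suspended) {
            return false; // nothing to resume
        }
        suspended = false; // active color and deployment count are unchanged
        return true;
    }
}
```

In this model, a suspend followed by a resume leaves both `active` and `deploymentsCreated` unchanged, which is the in-place property the description asks for.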
Brief change log
BlueGreenDeploymentService. (This means if suspension was done on blue, the pipeline will be resumed on blue when state is set back to running).
job.state=SUSPENDED.
Verifying this change
This change added tests and can be verified as follows:
FlinkBlueGreenDeploymentControllerTest: suspend/resume in-place, suspend during transition blocked, initial suspended rejection.
FlinkBlueGreenDeploymentSpecDiffTest: SUSPEND/RESUME diff detection.
Does this pull request potentially affect one of the following parts:
CustomResourceDescriptors): no
Documentation