Skip to content

Commit 03d5220

Browse files
leonardBangclaude
andcommitted
[test][pipeline-e2e] Shrink MySqlToHudiE2eITCase write volume to fit MOR window
The products workload wrote ~20011 rows across 20 schema evolutions into a Hudi MERGE_ON_READ table, which could not fully materialize and be read back within validateSinkResult's 20-minute window (rows stalled and snapshot reads ballooned as log files piled up), making the test flaky and the suite hit the 90-minute CI limit. Reduce the per-batch insert count from 1000 to 100 (~2011 rows total) while keeping all 20 ALTER iterations, so schema-evolution coverage is unchanged but the table stays small enough to materialize and read quickly. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
1 parent 5315751 commit 03d5220

1 file changed

Lines changed: 12 additions & 7 deletions

File tree

flink-cdc-e2e-tests/flink-cdc-pipeline-e2e-tests/src/test/java/org/apache/flink/cdc/pipeline/tests/MySqlToHudiE2eITCase.java

Lines changed: 12 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -310,12 +310,12 @@ public void testSyncWholeDatabase() throws Exception {
310310
* <ol>
311311
* <li><b>Column Addition:</b> It sequentially adds 10 new columns, named {@code point_c_0}
312312
* through {@code point_c_9}, each with a {@code VARCHAR(10)} type. After each column is
313-
* added, it executes a batch of 1000 {@code INSERT} statements, populating the columns
314-
* that exist at that point.
313+
* added, it executes a batch of {@code statementBatchCount} {@code INSERT} statements,
314+
* populating the columns that exist at that point.
315315
* <li><b>Column Modification:</b> After all columns are added, it enters a second phase. In
316-
* each of the 10 iterations, it first inserts another 1000 rows and then modifies the
317-
* data type of the first new column ({@code point_c_0}), progressively increasing its
318-
* size from {@code VARCHAR(10)} to {@code VARCHAR(19)}.
316+
* each of the 10 iterations, it first inserts another {@code statementBatchCount} rows
317+
* and then modifies the data type of the first new column ({@code point_c_0}),
318+
* progressively increasing its size from {@code VARCHAR(10)} to {@code VARCHAR(19)}.
319319
* </ol>
320320
*
321321
* <p>Throughout this process, the method constructs and returns a list of strings. Each string
@@ -333,7 +333,12 @@ private List<String> createChangesAndValidate(Statement stat) throws SQLExceptio
333333

334334
// Auto-increment id will start from this
335335
int currentId = 113;
336-
final int statementBatchCount = 1000;
336+
// Keep the per-batch insert count small: a Hudi MERGE_ON_READ table accumulates a log
337+
// file per delta commit, and snapshot reads slow down sharply as those pile up. With 20
338+
// schema evolutions below, a large count makes the table unable to fully materialize (and
339+
// be read back) within validateSinkResult's window. The schema-evolution coverage comes
340+
// from the 20 ALTER iterations, not from the row volume.
341+
final int statementBatchCount = 100;
337342

338343
// Step 1 - Add Column: Add 10 columns with VARCHAR(10) sequentially
339344
for (int addColumnRepeat = 0; addColumnRepeat < 10; addColumnRepeat++) {
@@ -368,7 +373,7 @@ private List<String> createChangesAndValidate(Statement stat) throws SQLExceptio
368373

369374
// Step 2 - Modify type for the columns added in Step 1, increasing the VARCHAR length
370375
for (int modifyColumnRepeat = 0; modifyColumnRepeat < 10; modifyColumnRepeat++) {
371-
// Perform 1000 inserts as a batch, continuing the ID sequence from Step 1
376+
// Perform a batch of inserts, continuing the ID sequence from Step 1
372377
for (int statementCount = 0; statementCount < statementBatchCount; statementCount++) {
373378
stat.addBatch(
374379
String.format(

0 commit comments

Comments
 (0)