Commit e70237e
Support denormalized document joins for RDS source (opensearch-project#6762)
* Add join configuration model and metadata enricher for RDS source
Add JoinConfig, JoinRelation configuration classes and JoinMetadataEnricher
that enriches CDC events with join metadata (_table, _fields, _is_delete,
_primary_key) to enable denormalization on write in the OpenSearch sink.
Signed-off-by: Dinu John <86094133+dinujoh@users.noreply.github.com>
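As a rough illustration of the enrichment described above, the sketch below attaches the four metadata keys to a row held in a plain Map. The class name JoinMetadataSketch, the enrich signature, and the choice to store metadata alongside the row fields are assumptions made for the example; the actual JoinMetadataEnricher API is not shown in this commit message.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; not the actual JoinMetadataEnricher implementation.
final class JoinMetadataSketch {
    // Returns a copy of the row with the join metadata keys added.
    static Map<String, Object> enrich(String table, Map<String, Object> row,
                                      String primaryKeyColumn, boolean isDelete) {
        Map<String, Object> event = new LinkedHashMap<>(row);
        // Capture the column names before any metadata keys are added.
        List<String> fields = new ArrayList<>(row.keySet());
        event.put("_table", table);
        event.put("_fields", fields);
        event.put("_is_delete", isDelete);
        event.put("_primary_key", String.valueOf(row.get(primaryKeyColumn)));
        return event;
    }
}
```

A sink-side script could then read _fields to decide which document fields a given table is allowed to overwrite or remove.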
* Wire join metadata enrichment into RDS source event pipeline
Add joins config to RdsSourceConfig. Update RecordConverter.convert()
to accept columnNames and call JoinMetadataEnricher when table
participates in a join. Wire enricher creation in BinlogEventListener
and LogicalReplicationEventProcessor.
Signed-off-by: Dinu John <86094133+dinujoh@users.noreply.github.com>
* Add rds-joins template with auto-configured denormalization script
Add rds-joins-rule.yaml that matches when both source.rds and
source.rds.joins are present. Add rds-joins-template.yaml that
configures the OpenSearch sink with the upsert action and a Painless
script that selectively merges/removes fields based on join metadata.
Update RuleEvaluator to sort rules by specificity (most apply_when
conditions first) so the more specific rds-joins rule matches before
the generic rds rule.
Signed-off-by: Dinu John <86094133+dinujoh@users.noreply.github.com>
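The specificity ordering can be sketched as a stable sort on the number of apply_when conditions, descending. The Rule class and the condition strings below are hypothetical stand-ins; the real RuleEvaluator types are not shown here.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch of "most apply_when conditions first".
final class RuleSpecificitySketch {
    static final class Rule {
        final String name;
        final List<String> applyWhen;

        Rule(String name, List<String> applyWhen) {
            this.name = name;
            this.applyWhen = applyWhen;
        }
    }

    // Returns a new list ordered by descending condition count.
    // List.sort is stable, so rules with equal counts keep their load order.
    static List<Rule> sortBySpecificity(List<Rule> rules) {
        List<Rule> sorted = new ArrayList<>(rules);
        sorted.sort(Comparator.comparingInt((Rule r) -> r.applyWhen.size()).reversed());
        return sorted;
    }
}
```

With this ordering, a two-condition rule such as rds-joins is evaluated before the one-condition rds rule regardless of which was loaded first.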
* Add 1:N join support with nested array denormalization
Update JoinRelation to include child_primary_key for array element
identification. Update JoinMetadataEnricher to set _is_parent,
_child_table_name, _child_pk_name, _child_pk_value metadata and
exclude join key columns from _fields. Update the Painless script in
rds-joins template to handle parent flat merge, child nested array
insert/update/delete, and parent delete as full document delete.
Signed-off-by: Dinu John <86094133+dinujoh@users.noreply.github.com>
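The child insert/update/delete handling can be pictured with the Java sketch below, which mirrors the described behavior on an in-memory document. It is not the Painless script from the template: the class and method names are hypothetical, and an update here replaces the whole array element rather than merging individual fields.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Hypothetical sketch of nested-array denormalization for a 1:N child table.
final class ChildArraySketch {
    @SuppressWarnings("unchecked")
    static void applyChildEvent(Map<String, Object> doc, String childTable, String pkName,
                                Map<String, Object> childRow, boolean isDelete) {
        // The child rows live in an array keyed by the child table name.
        List<Map<String, Object>> children = (List<Map<String, Object>>)
                doc.computeIfAbsent(childTable, k -> new ArrayList<Map<String, Object>>());
        Object pkValue = childRow.get(pkName);
        // Remove any existing element with the same child primary key.
        children.removeIf(c -> Objects.equals(c.get(pkName), pkValue));
        // Insert and update both re-add the row; delete leaves it removed.
        if (!isDelete) {
            children.add(new HashMap<>(childRow));
        }
    }
}
```

Because each event first removes the matching element, replaying the same event leaves the array unchanged.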
* Override S3 partition key for join tables to use parent key
For child tables in a join, override the S3 partition key to use the
join primary key (parent key value) instead of the child table's own
primary key. This ensures related parent and child events hash to the
same S3 folder so they are processed together by the s3 source pipeline.
Signed-off-by: Dinu John <86094133+dinujoh@users.noreply.github.com>
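A minimal sketch of the partition key override, assuming folders are chosen by hashing the key. The folderFor function and the folder count are illustrative assumptions, not the hashing scheme the RDS source actually uses.

```java
// Hypothetical sketch: child rows in a join partition by the parent key value.
final class PartitionKeySketch {
    // For a child table in a join, use the parent key value; otherwise use the row's own key.
    static String partitionKey(boolean isChildInJoin, String ownPrimaryKey, String parentKeyValue) {
        return isChildInJoin ? parentKeyValue : ownPrimaryKey;
    }

    // Illustrative folder assignment; floorMod keeps the result non-negative.
    static int folderFor(String partitionKey, int folderCount) {
        return Math.floorMod(partitionKey.hashCode(), folderCount);
    }
}
```

A parent row with primary key 42 and a child row whose join key is 42 then produce the same partition key, so they land in the same folder whatever the child's own primary key is.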
* Remove external versioning from joins template
Multiple tables write to the same document in join mode. Events from
different tables in the same transaction can share the same timestamp,
causing version conflicts with external versioning since it requires
strictly greater versions. The script itself is idempotent so versioning
is not needed for correctness.
Signed-off-by: Dinu John <86094133+dinujoh@users.noreply.github.com>
* Add username/password passthrough to rds-joins template
Pass through username and password from customer's OpenSearch sink
config to support basic auth in addition to AWS IAM auth.
Signed-off-by: Dinu John <86094133+dinujoh@users.noreply.github.com>
* Add per-row versioning, join_type, max_child_records, and monotonic versioning for RDS join
Changes:
- Painless script: per-table versioning for parent/1:1, per-row versioning for 1:N children
- Configurable version_field (default __versions) to avoid field name collisions
- join_type: one_to_one (flat merge) and one_to_many (array with max_child_records cap)
- Monotonic version counter in StreamRecordConverter using AtomicLong
(timestamp_millis * 1000 + sequence) for unique versions within same second
- retryOnConflict(3) on scripted upsert bulk operations
- Export path: wire JoinMetadataEnricher into DataFileScheduler, pass column names
- Set default empty string for _child_pk_value on parent events to prevent NPE
Tested: parent/child CRUD, 1:1 join, 1:N with max cap, child-before-parent,
concurrent writes, rapid updates, delete+re-insert, NULL values, special chars,
bulk UPDATE, REPLACE INTO, load tests (300+ events verified against MySQL).
Signed-off-by: Dinu John <86094133+dinujoh@users.noreply.github.com>
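One way to realize the monotonic counter described above is sketched below. It follows the stated formula (timestamp_millis * 1000 + sequence) with an AtomicLong, but the class name and the exact update rule are assumptions, not the StreamRecordConverter code.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of a monotonic version counter.
final class MonotonicVersionSketch {
    private final AtomicLong lastVersion = new AtomicLong();

    // Returns a strictly increasing version derived from the event timestamp.
    long nextVersion(long timestampMillis) {
        long base = timestampMillis * 1000;
        // A newer timestamp restarts the sequence at base + 0; the same (or an
        // older) timestamp increments the previous version by one.
        return lastVersion.updateAndGet(prev -> prev >= base ? prev + 1 : base);
    }
}
```

Two events stamped 1000 ms yield 1000000 and 1000001, and the next event at 1001 ms yields 1001000. In this sketch, more than 1000 events on one timestamp would spill into the next timestamp's range while remaining strictly increasing.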
* Add composite FK support and FK change detection for RDS join
Composite FK (multi-column join key):
- JoinRelation: parent_key, child_key, child_primary_key changed from String to List<String>
with ACCEPT_SINGLE_VALUE_AS_ARRAY for backward compatibility
- JoinMetadataEnricher: composite key values joined with | for document ID and child PK
- Painless script: pkNames split by |, composite matching in removeIf and trim cleanup
FK change detection (before-image):
- BinlogEventListener.handleUpdateEvent: detects when child FK columns change
between before-image and after-image, emits DELETE for old parent doc
- Supports composite FK: checks all key columns, triggers on any column change
- RecordConverter: added getJoinMetadataEnricher() getter
- JoinMetadataEnricher: added getChildKeyColumns() returning List<String>
Requires binlog_row_image=FULL (Aurora MySQL default) for before-image availability.
Signed-off-by: Dinu John <86094133+dinujoh@users.noreply.github.com>
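The composite key construction can be sketched as joining the key column values with the | separator. The class and method names are hypothetical; only the separator comes from the description above.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical sketch of building a document ID from a multi-column key.
final class CompositeKeySketch {
    // Joins the values of the key columns, in declared order, with '|'.
    static String compositeKey(List<String> keyColumns, Map<String, Object> row) {
        return keyColumns.stream()
                .map(col -> String.valueOf(row.get(col)))
                .collect(Collectors.joining("|"));
    }
}
```

A single-column key produces the bare value, so a join configured with one key column gets the same ID as before. Note that in this sketch a key value that itself contains | would make the joined string ambiguous.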
* Add overlay directive for template transformer and update joins template
Template transformer:
- Added <<overlay path>> directive support in DynamicConfigTransformer
- Processes after placeholder resolution, before model conversion
- Supports [*] wildcard to apply overlay to all matching array elements
- Deep merge semantics: overlay fields override target, nested objects merged
- Example: <<overlay sink[*].opensearch>> merges join script into all OS sinks
Joins template:
- Changed from hardcoded OpenSearch sink to full customer sink passthrough
- sink: <<$.<<pipeline-name>>.sink>> preserves all customer sink config
(hosts, aws, index, dlq, routes, etc.)
- <<overlay sink[*].opensearch>> injects action, document_id, script,
scripted_upsert, retry_on_conflict into every OpenSearch sink entry
- Non-OpenSearch sinks are left untouched
Signed-off-by: Dinu John <86094133+dinujoh@users.noreply.github.com>
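The deep merge semantics of the overlay directive can be sketched on plain maps as follows. This is an illustration under assumed names (OverlayMergeSketch, deepMerge, overlayOpenSearchSinks), not the DynamicConfigTransformer implementation, which is not shown in this commit message.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of overlay semantics for sink[*].opensearch.
final class OverlayMergeSketch {
    // Overlay values win; when both sides hold a map, the maps are merged recursively.
    @SuppressWarnings("unchecked")
    static Map<String, Object> deepMerge(Map<String, Object> target, Map<String, Object> overlay) {
        Map<String, Object> merged = new LinkedHashMap<>(target);
        for (Map.Entry<String, Object> e : overlay.entrySet()) {
            Object existing = merged.get(e.getKey());
            if (existing instanceof Map && e.getValue() instanceof Map) {
                merged.put(e.getKey(), deepMerge((Map<String, Object>) existing,
                        (Map<String, Object>) e.getValue()));
            } else {
                merged.put(e.getKey(), e.getValue());
            }
        }
        return merged;
    }

    // Applies the overlay to every sink entry that has an "opensearch" map.
    @SuppressWarnings("unchecked")
    static void overlayOpenSearchSinks(List<Map<String, Object>> sinks, Map<String, Object> overlay) {
        for (Map<String, Object> sink : sinks) {
            Object os = sink.get("opensearch");
            if (os instanceof Map) {
                sink.put("opensearch", deepMerge((Map<String, Object>) os, overlay));
            }
        }
    }
}
```

Sinks without an opensearch entry are skipped, which matches the note that non-OpenSearch sinks are left untouched.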
* Address PR review comments from @oeyh
- Use Objects.equals() for FK comparison to handle NULL values (BinlogEventListener)
- Create JoinType enum (ONE_TO_ONE, ONE_TO_MANY) and use in JoinRelation
- Add proper ArrayList import in RuleEvaluator
Signed-off-by: Dinu John <86094133+dinujoh@users.noreply.github.com>
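The null-safe FK comparison can be sketched as below: Objects.equals treats two nulls as equal and a null versus a non-null value as different, without throwing. The class name, method name, and the map-based row representation are assumptions for the example.

```java
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Hypothetical sketch of FK change detection between before- and after-images.
final class FkChangeSketch {
    // Returns true if any FK column differs between the two row images.
    static boolean foreignKeyChanged(List<String> fkColumns,
                                     Map<String, Object> before,
                                     Map<String, Object> after) {
        for (String col : fkColumns) {
            if (!Objects.equals(before.get(col), after.get(col))) {
                return true;
            }
        }
        return false;
    }
}
```

Calling before.get(col).equals(...) directly would throw a NullPointerException when the old FK value is NULL, which is the case the review comment addresses.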
* Add unit tests for join metadata enricher, version counter, rule evaluator, and overlay directive
- JoinMetadataEnricherTest: parent/child enrichment, composite keys,
1:1 join type, delete events, isJoinTable, getChildKeyColumns
- StreamRecordConverterTest: monotonic version counter (same millis
increments, new millis resets, always > export version)
- RuleEvaluatorTest: more specific rule (2 conditions) matches before
generic rule (1 condition) regardless of load order
- DynamicConfigTransformerTest: overlay directive merges into opensearch
sinks, leaves non-opensearch sinks untouched, removes overlay key
Signed-off-by: Dinu John <86094133+dinujoh@users.noreply.github.com>
---------
Signed-off-by: Dinu John <86094133+dinujoh@users.noreply.github.com>
1 parent 569ae3a commit e70237e
27 files changed
Lines changed: 1176 additions & 17 deletions
File tree
- data-prepper-pipeline-parser/src
- main/java/org/opensearch/dataprepper/pipeline/parser
- rule
- transformer
- test
- java/org/opensearch/dataprepper/pipeline/parser
- rule
- transformer
- resources/transformation
- rules
- templates/testSource
- data-prepper-plugins
- opensearch/src/main/java/org/opensearch/dataprepper/plugins/sink/opensearch
- rds-source/src
- main
- java/org/opensearch/dataprepper/plugins/source/rds
- configuration
- converter
- export
- stream
- resources/org/opensearch/dataprepper/transforms
- rules
- templates
- test/java/org/opensearch/dataprepper/plugins/source/rds
- converter
- export