Feature: Foreign key check with null safe support #1106
All commits in PR should be signed ('git commit -S ...'). See https://docs.github.com/en/authentication/managing-commit-signature-verification/signing-commits
@IvannKurchenko thank you for the PR. We require all commits to be signed. Please follow the instructions: https://databrickslabs.github.io/dqx/docs/dev/contributing/#first-contribution
mwojtyczka left a comment
Thanks for the contribution. The idea of wrapping single columns in a struct to preserve a NULL-aware base_condition is sound. However, the join condition inside the apply closure is not updated for null-safe matching, so the null-safe tests should fail. See inline comments for details.
Summary:
| Severity | Item |
|---|---|
| Blocker | Join condition uses == which cannot match NULL struct fields — null-safe matching won't work |
| High | Missing validation for negate + null_safe despite being documented as raising an error |
| Low | Docstring grammar |
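To illustrate the blocker, here is a minimal sketch of why a plain equality join cannot match NULL keys while a null-safe comparison can. It uses Python's built-in `sqlite3` as a stand-in for Spark, so it is not the DQX implementation: the table names `child` and `ref_tbl` and the helper `unmatched` are hypothetical, and SQLite's `IS` operator plays the role that `<=>` (or `Column.eqNullSafe`) plays in Spark.

```python
import sqlite3

# In-memory database with a child table (foreign keys) and a reference table.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE child (a INTEGER);
    CREATE TABLE ref_tbl (b INTEGER);
    INSERT INTO child VALUES (1), (NULL), (3);
    INSERT INTO ref_tbl VALUES (1), (NULL);
    """
)


def unmatched(join_op: str) -> list:
    """Return child keys with no match in ref_tbl for the given operator.

    join_op is "=" (plain equality) or "IS" (SQLite's null-safe equality).
    ref_tbl.rowid is NULL only when the LEFT JOIN found no matching row,
    so it separates "matched a NULL key" from "not found at all".
    """
    sql = f"""
        SELECT child.a
        FROM child
        LEFT JOIN ref_tbl ON child.a {join_op} ref_tbl.b
        WHERE ref_tbl.rowid IS NULL
        ORDER BY child.a
    """
    return [row[0] for row in conn.execute(sql)]


# Plain equality: NULL = NULL evaluates to NULL, so the NULL key never matches.
print(unmatched("="))   # [None, 3]
# Null-safe equality: the NULL key matches the NULL reference row.
print(unmatched("IS"))  # [3]
```

With plain equality the NULL key is reported as missing alongside the truly absent value 3. With the null-safe operator only 3 remains, which is the distinction the `null_safe` option is meant to expose.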
@IvannKurchenko you need to re-push all commits; the first commit is still not signed.
68653ea to fbbd548
…thub.com:IvannKurchenko/dqx into feature/foreign-key-check-with-null-safe-support
Co-authored-by: Marcin Wojtyczka <marcin.wojtyczka@databricks.com>
fbbd548 to 7b2b51c
@mwojtyczka Thank you for the review and feedback. I have added the tests you requested, along with a case for negate enabled.
No worries, thanks for your awesome contribution! Will merge as soon as all tests pass.
Changes
Added `null_safe` parameter to the `foreign_key` check to explicitly enable `null` foreign key comparison. To distinguish the case where a `null` foreign key is matched from the case where the foreign key is not found at all, `null_safe=True` wraps even single-column comparisons in a struct.
Special behaviour
Enabling `null_safe=True` on a previously non-null-safe single-column FK changes the message and the auto-created check name:
- `a_not_exists_in_ref_b` → `struct_a_as_a_not_exists_in_ref_struct_b_as_a`
- `Value '...' in column 'a' not found in reference column 'b'` → `'struct(a AS a)' ... 'struct(b AS a)'`
Linked issues
#1069
Resolves #1069
Tests