Sync fork with awslabs/python-deequ upstream master #1

Draft
Copilot wants to merge 4 commits into master from copilot/sync-latest-from-remote
Copilot AI commented Jun 15, 2026

Fast-forward merges 4 commits from awslabs/python-deequ master that were missing from this fork.

Changes pulled from upstream

sudsali and others added 4 commits May 18, 2026 11:52
…ments (awslabs#263)

* fix: reduce PR review false positives, increase context budget

* fix: incremental PR review, auto-approve, and bot operational improvements

Co-authored-by: Mirochill <200482516+Mirochill@users.noreply.github.com>

* Add VerificationResult.rowLevelResultsAsDataFrame support

Wrap deequ's VerificationResult.rowLevelResultsAsDataFrame as a
classmethod on pydeequ's VerificationResult. This returns the original
DataFrame with additional Boolean columns indicating which rows passed
or failed each Check.

- Add rowLevelResultsAsDataFrame classmethod to VerificationResult
- Add tests covering completeness, containedIn, ANDed constraints,
  aggregate-only checks, column preservation, and pandas output
- Update README with usage example

Closes awslabs#261
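
The row-level behavior described in this commit can be pictured with a minimal pure-Python sketch. It is not pydeequ code and does not call Spark or deequ: `row_level_results`, the `checks` mapping, and the column name `name_is_complete` are hypothetical names chosen for illustration. It assumes only what the commit message states, namely that the output keeps the original rows and columns and adds one Boolean column per Check.

```python
from typing import Callable, Dict, List, Optional

Row = Dict[str, Optional[str]]
Constraint = Callable[[Row], bool]


def row_level_results(
    rows: List[Row], checks: Dict[str, List[Constraint]]
) -> List[Dict[str, object]]:
    """Hypothetical model: copy each row and add one Boolean column per check."""
    out = []
    for row in rows:
        enriched: Dict[str, object] = dict(row)  # original columns are preserved
        for check_name, constraints in checks.items():
            # A row passes a check only if it passes every constraint in it.
            enriched[check_name] = all(c(row) for c in constraints)
        out.append(enriched)
    return out


rows = [
    {"id": "1", "name": "a"},
    {"id": "2", "name": None},  # fails the completeness constraint
    {"id": "3", "name": "c"},
]

# One check with a single completeness-style constraint on "name".
checks = {"name_is_complete": [lambda r: r["name"] is not None]}

result = row_level_results(rows, checks)
for r in result:
    print(r)
```

The input rows are left unmodified, and the result has the same number of rows as the input, which mirrors the column-preservation and row-count properties the tests in this PR cover.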

* Add orderBy to tests for deterministic row ordering

Address review feedback: Spark DataFrames have no guaranteed row
order, so add explicit orderBy() before collect() in all tests that
assert row-level values.
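
Why the explicit sort matters can be shown with a small stand-in that needs no Spark session. This is a sketch under stated assumptions: `collect_unordered` is a hypothetical helper that imitates collecting an unsorted DataFrame by returning the same rows in an arbitrary order, and the `id` and `passed` columns are illustrative. In a PySpark test the corresponding step would be calling `orderBy()` on a key column before `collect()`.

```python
import random


def collect_unordered(rows, seed):
    """Hypothetical stand-in for collect() without orderBy(): same rows, arbitrary order."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    return shuffled


rows = [
    {"id": 1, "passed": True},
    {"id": 2, "passed": False},
    {"id": 3, "passed": True},
]

for seed in range(5):
    collected = collect_unordered(rows, seed)
    # Positional assertions on `collected` could fail depending on the order.
    # Sorting by a key column first makes the assertion independent of it.
    ordered = sorted(collected, key=lambda r: r["id"])
    assert [r["passed"] for r in ordered] == [True, False, True]
```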

* Add row count assertion to completeness test

Verify that rowLevelResultsAsDataFrame preserves the same number of
rows as the original DataFrame.

* test: add multi-Check test verifying separate Boolean columns per Check

Addresses review feedback requesting a test for addCheck(check1).addCheck(check2)
producing distinct Boolean columns in row-level results.

* Address review feedback: improve AND test and README clarity

- README: add sentence explaining multi-constraint AND behavior
- Test: use isContainedIn + isComplete so constraints disagree on
  different rows, properly validating AND logic
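
The reason for making the two constraints disagree on different rows can be sketched in plain Python. The code below is illustrative only and does not use pydeequ: `evaluate`, `ALLOWED`, and the `color` and `size` columns are hypothetical. It compares a dataset where both constraints fail on the same row with one where each constraint fails on a different row.

```python
ALLOWED = {"red", "blue"}


def evaluate(rows):
    """Evaluate two constraints per row and combine them with AND and with OR."""
    contained = [r["color"] in ALLOWED for r in rows]   # containment-style constraint
    complete = [r["size"] is not None for r in rows]    # completeness-style constraint
    return {
        "contained": contained,
        "complete": complete,
        "and": [a and b for a, b in zip(contained, complete)],
        "or": [a or b for a, b in zip(contained, complete)],
    }


# Weak fixture: both constraints fail on the same row.
weak = [
    {"color": "red", "size": "S"},
    {"color": "pink", "size": None},
]

# Stronger fixture: each constraint fails on a different row.
strong = [
    {"color": "red", "size": "S"},    # both pass
    {"color": "pink", "size": "M"},   # only containment fails
    {"color": "blue", "size": None},  # only completeness fails
]

weak_result = evaluate(weak)
strong_result = evaluate(strong)

print(weak_result)
print(strong_result)
```

With the weak fixture, the AND column, the OR column, and each single constraint are all identical, so an assertion on it cannot tell which combination was applied. With the stronger fixture, only a true AND yields a pass on the first row and a fail on the other two.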