Skip to content

fix(destination-motherduck): properly leverage source_defined_primary_key when defined (CDK bump)#62133

Merged
Aaron ("AJ") Steers (aaronsteers) merged 12 commits into
masterfrom
devin/1751072697-motherduck-use-cdk-dev-branch
Jul 7, 2025
Merged

fix(destination-motherduck): properly leverage source_defined_primary_key when defined (CDK bump)#62133
Aaron ("AJ") Steers (aaronsteers) merged 12 commits into
masterfrom
devin/1751072697-motherduck-use-cdk-dev-branch

Conversation

@devin-ai-integration
Copy link
Copy Markdown
Contributor

@devin-ai-integration devin-ai-integration Bot commented Jun 28, 2025

Related:

Point MotherDuck destination to CDK dev branch with primary key fix

Summary

This PR updates the MotherDuck destination to use a development branch of the Airbyte Python CDK that contains a fix for primary key handling in SQL destinations. The original issue was that the CatalogProvider.get_primary_keys() method was ignoring source-defined primary keys when configured primary keys were empty/None, affecting all SQL destinations including MotherDuck.

Dependencies: This PR depends on CDK PR #627 which implements the actual fix.

Changes Made:

  • Updated pyproject.toml to point airbyte-cdk dependency to dev branch devin/1751064114-fix-primary-key-fallback
  • Added poethepoet as dev dependency (required for the poe task used to update CDK reference)
  • Updated poetry.lock with new dependency resolution

Review & Testing Checklist for Human

⚠️ MEDIUM RISK - Dependency change affecting core destination functionality

  • End-to-end testing: Test MotherDuck destination with actual data to verify primary key handling works correctly, especially with deduplication sync modes
  • Primary key scenario testing: Create test cases with configured catalogs having empty primary_key but non-empty source_defined_primary_key to verify fallback behavior
  • Regression testing: Verify existing MotherDuck functionality still works (no breaking changes from CDK update)
  • Dependency management review: Confirm using a CDK dev branch is appropriate for this testing scenario and understand the merge/release plan

Recommended Test Plan

  1. Set up MotherDuck destination with test data
  2. Create configured catalogs with various primary key combinations:
    • Streams with only configured primary keys (should use configured)
    • Streams with empty configured primary keys but source-defined ones (should fall back)
    • Streams with neither (should handle gracefully)
  3. Run sync operations with deduplication modes that rely on primary keys
  4. Verify SQL generation and data integrity

Diagram

graph TD
    A[destination-motherduck/pyproject.toml]:::major-edit
    B[airbyte-cdk dependency]:::context
    C[CDK CatalogProvider.get_primary_keys]:::context
    D[CDK PR #627]:::context
    E[MotherDuck SQL Operations]:::context
    F[poetry.lock]:::major-edit

    A --> B
    B --> C
    C --> D
    A --> F
    B --> E
    C --> E

    subgraph Legend
        L1[Major Edit]:::major-edit
        L2[Minor Edit]:::minor-edit
        L3[Context/No Edit]:::context
    end

    classDef major-edit fill:#90EE90
    classDef minor-edit fill:#87CEEB
    classDef context fill:#FFFFFF
Loading

Notes

  • Coordination required: This PR creates a testing branch for the MotherDuck destination that depends on an unmerged CDK fix
  • Temporary state: This dependency on a dev branch should be reverted once CDK PR Move Postgres Source to AB Protocol #627 is merged and a new CDK version is released
  • Testing scope: While this PR only affects MotherDuck destination configuration, the underlying CDK fix affects all SQL destinations
  • CI considerations: Some CI checks may fail due to the dev branch dependency until the CDK PR is merged

Link to Devin run: https://app.devin.ai/sessions/c79bdd64852f4d7ebf155898492407d1
Requested by: Aaron ("AJ") Steers (@aaronsteers)

…y fix

- Update airbyte-cdk dependency to use dev branch devin/1751064114-fix-primary-key-fallback
- This branch contains the fix for primary key fallback logic in CatalogProvider
- Allows end-to-end testing of the primary key fix with MotherDuck destination
- References CDK PR: airbytehq/airbyte-python-cdk#627

Co-Authored-By: AJ Steers <aj@airbyte.io>
@devin-ai-integration
Copy link
Copy Markdown
Contributor Author

Original prompt from AJ Steers:

Received message in Slack channel #ask-devin-ai:

@Devin - we've identified a bug in the MotherDuck destination. Specifically, it looks like that implementation does not consider overridden primary key for streams described in the configured catalog. Instead, it seems to always use the source defined primary key. Can you look into this and (1) confirm, and (2) if confirmed propose a fix?

@devin-ai-integration
Copy link
Copy Markdown
Contributor Author

devin-ai-integration Bot commented Jun 28, 2025

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

  • Disable automatic comment and CI monitoring

@vercel
Copy link
Copy Markdown

vercel Bot commented Jun 28, 2025

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
airbyte-docs ✅ Ready (Inspect) Visit Preview 💬 Add feedback Jul 4, 2025 6:58pm

@github-actions
Copy link
Copy Markdown
Contributor

👋 Greetings, Contributor!

Here are some helpful tips and reminders for your convenience.

Helpful Resources

PR Slash Commands

Airbyte Maintainers (that's you!) can execute the following slash commands on your PR:

  • /format-fix - Fixes most formatting issues.
  • /bump-version - Bumps connector versions.
    • You can specify a custom changelog by passing changelog. Example: /bump-version changelog="My cool update"
    • Leaving the changelog arg blank will auto-populate the changelog from the PR title.
  • /run-cat-tests - Runs legacy CAT tests (Connector Acceptance Tests)
  • /build-connector-images - Builds and publishes a pre-release docker image for the modified connector(s).

📝 Edit this welcome message.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Jun 28, 2025

destination-motherduck Connector Test Results

43 tests   42 ✅  12s ⏱️
 2 suites   1 💤
 2 files     0 ❌

Results for commit 12ae85b.

♻️ This comment has been updated with latest results.

@aaronsteers
Copy link
Copy Markdown
Member

Aaron ("AJ") Steers (aaronsteers) commented Jun 28, 2025

/bump-version

Bump Version job started... Check job output.

✅ Changes applied successfully. (61d2bd4)

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR updates the MotherDuck destination to use a development branch of the Airbyte Python CDK that contains a fix for primary key handling, and bumps the version numbers accordingly.

  • Dependency update: points airbyte-cdk to the "devin/1751064114-fix-primary-key-fallback" branch.
  • Version bump: updates version in pyproject.toml and metadata.yaml from 0.1.19 to 0.1.20.
  • Dev dependency added: includes poethepoet for managing CDK reference tasks.

Reviewed Changes

Copilot reviewed 3 out of 4 changed files in this pull request and generated no comments.

File Description
docs/integrations/destinations/motherduck.md Adds a new version entry to the changelog.
airbyte-integrations/connectors/destination-motherduck/pyproject.toml Updates version, changes airbyte-cdk dependency to a dev branch, adds poethepoet.
airbyte-integrations/connectors/destination-motherduck/metadata.yaml Updates dockerImageTag version to match the new version.

@aaronsteers
Copy link
Copy Markdown
Member

Aaron ("AJ") Steers (aaronsteers) commented Jul 3, 2025

/build-connector-images

Connector Image Build Started

  • This workflow will build the connector image and run basic tests.
  • The connector image(s) will be pushed to the GitHub Container Registry.

Check job output.
Connector Image Builds job completed. See logs for details.

@aaronsteers Aaron ("AJ") Steers (aaronsteers) changed the title feat(destination-motherduck): Point to CDK dev branch with primary key fix Fix(destination-motherduck): properly leverage source defined primary key when defined Jul 3, 2025
@aaronsteers Aaron ("AJ") Steers (aaronsteers) changed the title Fix(destination-motherduck): properly leverage source defined primary key when defined fix(destination-motherduck): properly leverage source_defined_primary_key when defined (CDK bump) Jul 3, 2025
@aaronsteers

This comment was marked as outdated.

@aaronsteers
Copy link
Copy Markdown
Member

Aaron ("AJ") Steers (aaronsteers) commented Jul 4, 2025

/poe connector destination-motherduck lock

Running poe connector destination-motherduck lock...

Link to job logs.

🤖 Auto-commit successful: 64c60f3

🟦 Poe command connector destination-motherduck lock completed successfully.

@aaronsteers
Copy link
Copy Markdown
Member

Aaron ("AJ") Steers (aaronsteers) commented Jul 4, 2025

/poe destination motherduck use-cdk-latest

Running poe destination motherduck use-cdk-latest...

Link to job logs.

🤖 Auto-commit successful: e5c57ca

🟦 Poe command destination motherduck use-cdk-latest completed successfully.

@aaronsteers
Copy link
Copy Markdown
Member

Aaron ("AJ") Steers (aaronsteers) commented Jul 4, 2025

/poe destination motherduck lock

Running poe destination motherduck lock ...

Link to job logs.

🤖 Auto-commit successful: 856494f

🟦 Poe command destination motherduck lock completed successfully.

Comment thread docs/integrations/destinations/motherduck.md Outdated
Comment thread docs/integrations/destinations/motherduck.md Outdated
Comment thread docs/integrations/destinations/motherduck.md Outdated
@github-project-automation github-project-automation Bot moved this from Backlog to Ready to Ship in 🧑‍🏭 Community Pull Requests Jul 7, 2025
@aaronsteers Aaron ("AJ") Steers (aaronsteers) merged commit cd6e1fc into master Jul 7, 2025
26 checks passed
@aaronsteers Aaron ("AJ") Steers (aaronsteers) deleted the devin/1751072697-motherduck-use-cdk-dev-branch branch July 7, 2025 16:28
@github-project-automation github-project-automation Bot moved this from Ready to Ship to Done in 🧑‍🏭 Community Pull Requests Jul 7, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

Development

Successfully merging this pull request may close these issues.

4 participants