feat(ingest/snowflake): snowflake enhancements to support tasks,snowpipe and stages #16888

Merged
alokr-dhub merged 9 commits into master from snowflake_enhancements
Apr 27, 2026

@alokr-dhub
Contributor

@alokr-dhub alokr-dhub commented Apr 2, 2026

Summary

  • Add support for ingesting Snowflake Tasks as DataJob/DataFlow entities with DAG dependency tracking
    (predecessor tasks)
  • Add support for ingesting Snowflake Snowpipe objects as DataJob/DataFlow entities with COPY INTO lineage
    (stage → pipe → target table)
  • Add support for ingesting Snowflake Stages as containers with placeholder datasets for internal stages
    and cloud URN resolution for external stages (S3, GCS, Azure)
  • All three features are opt-in via stages.enabled, tasks.enabled, and pipes.enabled config flags (default:
    False)
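
As a rough illustration of the opt-in flags above, the sketch below builds a recipe as a Python dict and lists which of the three features are switched on. Only the names stages.enabled, tasks.enabled, and pipes.enabled and the False default come from this PR description. The nested layout under the source config is inferred from the dotted names, and the enabled_features helper is hypothetical. Connection settings are left out.

```python
from typing import Any, Dict, List

# Hypothetical recipe fragment. The nesting of "stages", "tasks" and "pipes"
# under the source config is an assumption inferred from the dotted flag names.
recipe: Dict[str, Any] = {
    "source": {
        "type": "snowflake",
        "config": {
            # Connection settings (account, credentials, ...) omitted on purpose.
            "stages": {"enabled": True},
            "tasks": {"enabled": True},
            # "pipes" is left out, so it falls back to the default (False).
        },
    },
}


def enabled_features(recipe: Dict[str, Any]) -> List[str]:
    """Return the opt-in features whose `enabled` flag is set to True.

    Illustrative helper only; it is not part of the connector.
    """
    config = recipe["source"]["config"]
    return [
        name
        for name in ("stages", "tasks", "pipes")
        if config.get(name, {}).get("enabled", False)
    ]


print(enabled_features(recipe))
```

Because every flag defaults to False, a recipe that mentions none of them should leave the existing Snowflake ingestion behaviour unchanged.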

Test plan

  • 32 unit tests covering tasks extractor, pipes extractor (including parse_copy_into regex), and stages
    extractor
  • 3 integration tests with golden file validation: full stages+tasks+pipes, pipes-without-stages lineage
    resolution, and tasks-only independence
  • lintFix passes clean (ruff + mypy)
  • Existing snowflake integration tests unaffected (new code paths are opt-in only)
  • The PR conforms to DataHub's Contributing Guideline (particularly PR Title Format)
  • Links to related issues (if applicable)
  • Tests for the changes have been added/updated (if applicable)
  • Docs related to the changes have been added/updated (if applicable). If a new feature has been added a Usage Guide has been added for the same.
  • For any breaking change/potential downtime/deprecation/big changes an entry has been made in Updating DataHub
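
The test plan mentions a parse_copy_into regex used to derive stage → pipe → target table lineage. The sketch below is one possible way to pull the target table and stage out of a simple COPY INTO statement. It is not the connector's implementation: the pattern, the CopyIntoLineage type, and the function name sketch_parse_copy_into are all hypothetical, and transformation subqueries (COPY INTO ... FROM (SELECT ...)) are deliberately not handled.

```python
import re
from typing import NamedTuple, Optional


class CopyIntoLineage(NamedTuple):
    """Hypothetical result type: the table loaded and the stage read from."""

    target_table: str
    stage: str


# Assumed pattern covering only "COPY INTO <table> [(cols)] FROM @<stage>".
_COPY_INTO_RE = re.compile(
    r"COPY\s+INTO\s+(?P<table>[\w.$\"]+)"  # target table, optionally qualified
    r"(?:\s*\([^)]*\))?"                   # optional column list
    r"\s+FROM\s+@(?P<stage>[\w.$\"]+)",    # stage name; any path suffix is ignored
    re.IGNORECASE,
)


def sketch_parse_copy_into(definition: str) -> Optional[CopyIntoLineage]:
    """Extract (target table, stage) from a pipe definition, or None if no match."""
    match = _COPY_INTO_RE.search(definition)
    if match is None:
        return None
    return CopyIntoLineage(match.group("table"), match.group("stage"))


if __name__ == "__main__":
    ddl = "copy into db.sch.orders from @db.sch.my_stage/daily file_format = (type = 'CSV')"
    print(sketch_parse_copy_into(ddl))
```

A regex keeps the sketch small, but it only recognises the simplest statement shape. Anything more complex would likely need a proper SQL parser.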

@github-actions github-actions Bot added the ingestion label ("PR or Issue related to the ingestion of metadata") on Apr 2, 2026
@codecov

codecov Bot commented Apr 2, 2026

@datahub-connector-tests

datahub-connector-tests Bot commented Apr 2, 2026

Connector Tests Results

All connector tests passed for commit 0565cc4

To skip connector tests, add the skip-connector-tests label (org members only).

Autogenerated by the connector-tests CI pipeline.

@codecov

codecov Bot commented Apr 6, 2026

Bundle Report

Changes will increase total bundle size by 27 bytes (0.0%) ⬆️. This is within the configured threshold ✅

Detailed changes
| Bundle name | Size | Change |
| --- | --- | --- |
| datahub-react-web-esm | 22.71MB | 27 bytes (0.0%) ⬆️ |

Affected Assets, Files, and Routes:

view changes for bundle: datahub-react-web-esm

Assets Changed:

| Asset Name | Size Change | Total Size | Change (%) |
| --- | --- | --- | --- |
| assets/index-*.js | 27 bytes | 12.46MB | 0.0% |

@alwaysmeticulous

alwaysmeticulous Bot commented Apr 6, 2026

🔴 Meticulous spotted visual differences in 928 of 1805 screens tested: view and approve differences detected.

Meticulous evaluated ~8 hours of user flows against your PR.

Last updated for commit 19c3288 fix: linting fixes. This comment will update as new commits are pushed.

@github-actions
Contributor

github-actions Bot commented Apr 7, 2026

Linear: ING-2198

@maggiehays maggiehays added the needs-review label ("Label for PRs that need review from a maintainer") on Apr 7, 2026
@treff7es
Contributor

@alokr-dhub Something is wrong with this PR. Why is the change so huge?

Contributor

@treff7es treff7es left a comment

This PR seems to contain unrelated changes. They need to be removed.

@alokr-dhub alokr-dhub force-pushed the snowflake_enhancements branch from 1986794 to 5987d54 on April 23, 2026 08:23
@maggiehays maggiehays added the pending-submitter-response label ("Issue/request has been reviewed but requires a response from the submitter") and removed the needs-review label ("Label for PRs that need review from a maintainer") on Apr 23, 2026
Comment thread metadata-ingestion/src/datahub/ingestion/source/snowflake/snowflake_schema.py Outdated
Comment thread metadata-ingestion/src/datahub/ingestion/source/snowflake/snowflake_schema.py Outdated
Comment thread metadata-ingestion/src/datahub/ingestion/source/snowflake/snowflake_pipes.py Outdated
Comment thread metadata-ingestion/src/datahub/ingestion/source/snowflake/snowflake_config.py Outdated
Comment thread metadata-ingestion/src/datahub/ingestion/source/snowflake/snowflake_pipes.py Outdated
@alokr-dhub
Contributor Author

@treff7es Addressed all comments. Please review.

@alokr-dhub alokr-dhub requested a review from treff7es April 23, 2026 19:04
@maggiehays maggiehays added the needs-review label ("Label for PRs that need review from a maintainer") and removed the pending-submitter-response label ("Issue/request has been reviewed but requires a response from the submitter") on Apr 23, 2026
Contributor

@treff7es treff7es left a comment

lgtm

@maggiehays maggiehays added the pending-submitter-merge label and removed the needs-review label ("Label for PRs that need review from a maintainer") on Apr 24, 2026
@alokr-dhub alokr-dhub merged commit 6d8af3c into master Apr 27, 2026
97 checks passed
@alokr-dhub alokr-dhub deleted the snowflake_enhancements branch April 27, 2026 07:58
Labels

ingestion ("PR or Issue related to the ingestion of metadata"), pending-submitter-merge

3 participants