fix: dedupe in int__mitxpro__coursesinprogram #2361

Merged
rachellougee merged 2 commits into main from rachellougee/mitxpro-coursesinprogram-dedupe
Jul 1, 2026
@rachellougee rachellougee commented Jul 1, 2026

Contributor

What are the relevant tickets?

https://pipelines.odl.mit.edu/runs/b9b212cb-af5f-410d-b142-baf8bd098668

Failure in test dbt_expectations_expect_compound_columns_to_be_unique_int__mitxpro__coursesinprogram_course_id__program_id (models/intermediate/mitxpro/_int_mitxpro__models.yml)

  Got 21 results, configured to fail if >10
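
For readers unfamiliar with the failing test, here is a minimal Python sketch of what a compound-uniqueness check does. It assumes the test reports one result per duplicated (course_id, program_id) combination; the sample rows and the `duplicate_key_groups` helper are hypothetical and not taken from the repository.

```python
from collections import Counter


def duplicate_key_groups(rows, key_columns):
    """Return the key combinations that appear in more than one row."""
    counts = Counter(tuple(row[col] for col in key_columns) for row in rows)
    return {key: n for key, n in counts.items() if n > 1}


# Hypothetical sample rows; the real model has more columns.
rows = [
    {"course_id": 1, "program_id": 10},
    {"course_id": 1, "program_id": 10},  # duplicate pair
    {"course_id": 2, "program_id": 10},
]

failures = duplicate_key_groups(rows, ["course_id", "program_id"])

# Mirrors "configured to fail if >10" from the log above.
threshold = 10
test_fails = len(failures) > threshold
```

With a threshold like this, a handful of duplicated pairs is tolerated, which is why the run only failed once the count reached 21.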

Description (What does it do?)

Deduplicate MIT xPro CMS courses-in-program staging data and remove duplicate (course_id, program_id) rows from int__mitxpro__coursesinprogram.

How can this be tested?

dbt build --select +int__mitxpro__coursesinprogram

Additional Context

Copilot AI review requested due to automatic review settings July 1, 2026 17:57

Copilot AI left a comment

Contributor

Pull request overview

This PR addresses duplicate records in the MIT xPro “courses in program” modeling path to resolve a failing dbt compound-uniqueness expectation on (course_id, program_id) in int__mitxpro__coursesinprogram.

Changes:

  • Added raw-table deduplication to the MIT xPro CMS coursesinprogrampage staging model using the shared deduplicate_raw_table macro.
  • Added a distinct selection in int__mitxpro__coursesinprogram to reduce duplicate output rows.
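
The raw-table deduplication step can be sketched roughly as follows. This is an illustration only: the internals of the deduplicate_raw_table macro are not shown in this PR, so the sketch assumes a common pattern (keep the most recently ingested row per key), and the `extracted_at` column and table name are hypothetical. It uses Python's built-in sqlite3 module, which needs SQLite 3.25 or newer for window functions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table raw_coursesinprogrampage (
        page_ptr_id integer,
        extracted_at text  -- hypothetical ingestion timestamp column
    );
    insert into raw_coursesinprogrampage values
        (1, '2026-06-01'),
        (1, '2026-07-01'),  -- re-ingested copy of the same page
        (2, '2026-06-15');
""")

# Keep only the newest row for each page_ptr_id.
deduped = conn.execute("""
    select page_ptr_id, extracted_at
    from (
        select
            page_ptr_id,
            extracted_at,
            row_number() over (
                partition by page_ptr_id
                order by extracted_at desc
            ) as row_num
        from raw_coursesinprogrampage
    )
    where row_num = 1
    order by page_ptr_id
""").fetchall()
```

Here `deduped` holds one row per page_ptr_id, with the later of the two copies for key 1.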

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| src/ol_dbt/models/staging/mitxpro/stg__mitxpro__app__postgres__cms_coursesinprogrampage.sql | Deduplicates raw Airbyte-ingested CMS coursesinprogrampage records by page_ptr_id before downstream modeling. |
| src/ol_dbt/models/intermediate/mitxpro/int__mitxpro__coursesinprogram.sql | Attempts to remove duplicates emitted by the intermediate model by adding distinct to the final projection. |
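
One caveat worth keeping in mind about the distinct approach: distinct removes only rows that are identical across every selected column, so it guarantees uniqueness on (course_id, program_id) only if no other selected column varies within a pair. The sketch below illustrates this with a hypothetical `sort_order` column and made-up data; it does not reflect the actual columns of int__mitxpro__coursesinprogram.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table coursesinprogram (
        course_id integer,
        program_id integer,
        sort_order integer  -- hypothetical extra column
    );
    insert into coursesinprogram values
        (1, 10, 0),
        (1, 10, 0),  -- identical row: removed by distinct
        (2, 10, 1),
        (2, 10, 2);  -- same key, different sort_order: kept by distinct
""")

distinct_rows = conn.execute("""
    select distinct course_id, program_id, sort_order
    from coursesinprogram
    order by course_id, program_id, sort_order
""").fetchall()

# Re-run the compound-uniqueness check on the distinct output.
remaining_duplicate_keys = conn.execute("""
    select course_id, program_id, count(*) as n_records
    from (
        select distinct course_id, program_id, sort_order
        from coursesinprogram
    )
    group by course_id, program_id
    having count(*) > 1
""").fetchall()
```

In this made-up data the pair (2, 10) still appears twice after distinct, so a compound-uniqueness test would continue to flag it.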

@rachellougee rachellougee merged commit 2abc1ed into main Jul 1, 2026
6 checks passed
@rachellougee rachellougee deleted the rachellougee/mitxpro-coursesinprogram-dedupe branch July 1, 2026 18:01
2 participants