feat(dimensional): add semester + passing_grade to dim_course_run from MicroMasters exam runs#2319
Open
quazi-h wants to merge 8 commits into
Pull request overview
This PR extends the dimensional dim_course_run table with MicroMasters-derived proctored exam metadata (semester, passing_grade) and begins migrating the MITxOnline portion of the DEDP proctored exam grades mart to use dimensional models (tfact_*/dim_*) instead of int__*.
Changes:
- Add semester and passing_grade columns to dim_course_run, sourced (for MITxOnline only) from stg__micromasters__app__postgres__exams_examrun, and include them in the SCD2 change-detection predicate.
- Rewrite the MITxOnline branch of marts__micromasters_dedp_exam_grades to use tfact_grade + dim_course_run + dim_course + dim_user, with dedup via an anti-join against the MicroMasters branch.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| src/ol_dbt/models/dimensional/dim_course_run.sql | Adds semester/passing_grade to the course-run SCD2 dimension via a MicroMasters examrun left join, and updates incremental change detection + expiration logic. |
| src/ol_dbt/models/marts/micromasters/marts__micromasters_dedp_exam_grades.sql | Migrates the MITxOnline branch to dimensional sources and replaces prior join-based dedup/email-bridge logic with a NOT EXISTS anti-join and dim_user.email. |
…partially migrate marts__micromasters_dedp_exam_grades

- Add micromasters_examruns CTE to dim_course_run sourced from stg__micromasters__app__postgres__exams_examrun; left-join to mitxonline_courseruns on courserun_readable_id = examrun_readable_id to populate semester and passing_grade for proctored exam runs.
- Add nullable semester (varchar) and passing_grade (double) columns to all non-mitxonline platform CTEs and thread through to final SELECT and records_to_expire for SCD2 correctness.
- Include semester and passing_grade in the incremental not-exists change-detection predicate so exam run config corrections trigger new SCD2 records.
- Rewrite MITxOnline branch of marts__micromasters_dedp_exam_grades to use tfact_grade + dim_course_run + dim_user instead of int__mitxonline__proctored_exam_grades and int__micromasters__users.
- MicroMasters branch retained on int__micromasters__dedp_proctored_exam_grades pending MicroMasters being added to tfact_grade (epic #2072).
- user_micromasters_email cast to NULL in MITxOnline branch pending dim_user exposing platform-specific emails (epic #2072).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
073c4e3 to
d5e4681
Compare
- Restore dim_course_run.sql to original single-quote/comma-leading style so PR diff against main shows only the ~27 functional added lines
- Fix coalesce(passing_grade, 0.0) -> coalesce(passing_grade, -1.0) in SCD2 change-detection predicate to distinguish NULL from a real 0.0 value
- Add YAML column docs for semester and passing_grade in _dim_course_run.yml

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
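For reviewers who want to see why the sentinel matters, here is a minimal sketch of the coalesce comparison. It runs on SQLite through Python's stdlib sqlite3 rather than on Trino, and the table names existing_row and incoming_row are hypothetical stand-ins, not the CTEs in dim_course_run.sql. It also assumes a passing grade is never negative, which is what makes -1.0 usable as a sentinel.

```python
import sqlite3

# Hypothetical old/new snapshots of one course run (illustrative names only).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table existing_row (courserun_readable_id text, passing_grade real);
    create table incoming_row (courserun_readable_id text, passing_grade real);
    insert into existing_row values ('run-1', null);  -- no threshold recorded yet
    insert into incoming_row values ('run-1', 0.0);   -- corrected to a real 0.0
""")


def changed(sentinel: float) -> bool:
    """Return True if the coalesce-based comparison flags the row as changed."""
    row = conn.execute(
        """
        select coalesce(i.passing_grade, ?) <> coalesce(e.passing_grade, ?)
        from incoming_row i
        join existing_row e using (courserun_readable_id)
        """,
        (sentinel, sentinel),
    ).fetchone()
    return bool(row[0])


missed = changed(0.0)     # NULL collapses onto 0.0, so the change is not seen
detected = changed(-1.0)  # NULL maps to -1.0 and stays distinct from 0.0
```

With a 0.0 sentinel the NULL to 0.0 correction compares as equal and would not open a new SCD2 record; with -1.0 it compares as different.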
KatelynGit
reviewed
Jun 24, 2026
The mart's course_run_dim CTE referenced dim_course_run.semester and dim_course_run.passing_grade which don't exist in the production table until this PR is deployed, causing COLUMN_NOT_FOUND on dbt build. Additionally, joining dim_course_run for the MicroMasters branch would produce null semester for ~9,987 older edxorg-era exam run rows (where the examrun_readable_id maps to an edxorg platform row in dim_course_run that has semester=null), breaking the not_null test on semester. Both int models already carry semester and passing_grade directly, so removing the dim join loses no data. The value of this PR is Part 1: making semester/passing_grade available on dim_course_run for future consumers — the mart migration itself is deferred to the follow-up work in epic #2072. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…tional changes in this PR) The mart is unchanged from main — semester/passing_grade continue to source from int models. The dim_course_run additions in this PR are for future dimensional consumers; the mart migration is deferred to epic #2072 follow-up. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
KatelynGit
approved these changes
Jun 25, 2026
KatelynGit
left a comment
Build now completes without errors and looks good, but we may still need to look over the Copilot comments.
…cs and null test

- Add QUALIFY ROW_NUMBER() dedup to micromasters_examruns CTE to prevent 1:N fan-out if multiple exam run records share the same examrun_readable_id; picks the most recently updated row per readable ID.
- Fix semester column description in _dim_course_run.yml: value format is '2T2022' (term identifier), not 'Fall 2023' (human-readable label).
- Remove not_null test on semester in _marts_micromasters__models.yml: MITxOnline exam grade rows without a corresponding MicroMasters exam run record will have semester = NULL, so the constraint was incorrect.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…no compatibility

Trino does not support the QUALIFY clause. Replace the micromasters_examruns dedup with the equivalent ROW_NUMBER() inner-subquery pattern, consistent with the existing pattern documented and used in dim_course.sql.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add examrun_id desc as secondary ORDER BY key so the ROW_NUMBER dedup is deterministic when two exam run rows share the same examrun_updated_on timestamp, preventing nondeterministic semester/passing_grade selection and unnecessary SCD2 churn. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
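A small sketch of the ROW_NUMBER() inner-subquery dedup with the examrun_id tiebreak, for anyone checking the behavior locally. It uses SQLite via Python's stdlib sqlite3 as a stand-in for Trino (needs SQLite 3.30+ for nulls last); the examrun table, the examrun_semester column name, and the sample rows are assumptions made up for illustration, not the staging model's actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical stand-in for the exam run staging model.
    create table examrun (
        examrun_id integer,
        examrun_readable_id text,
        examrun_semester text,
        examrun_updated_on text
    );
    insert into examrun values
        (1, 'exam-A', '1T2022', '2022-01-01'),
        (2, 'exam-A', '2T2022', '2022-06-01'),
        (3, 'exam-A', '3T2022', '2022-06-01'),  -- same timestamp as id 2
        (4, 'exam-B', '2T2022', null);
""")

# Rank rows per readable ID in an inner subquery, then keep rank 1 outside it.
# examrun_id desc breaks timestamp ties so the pick is repeatable.
rows = conn.execute("""
    select examrun_readable_id, examrun_id, examrun_semester
    from (
        select
            *,
            row_number() over (
                partition by examrun_readable_id
                order by examrun_updated_on desc nulls last, examrun_id desc
            ) as _row_num
        from examrun
    )
    where _row_num = 1
    order by examrun_readable_id
""").fetchall()
```

Each readable ID comes back exactly once. For exam-A, ids 2 and 3 tie on examrun_updated_on, and the secondary key picks id 3 on every run.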
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Comment on lines +24 to +28

    partition by examrun_readable_id order by examrun_updated_on desc nulls last, examrun_id desc
    ) as _row_num
    from {{ ref("stg__micromasters__app__postgres__exams_examrun") }}
    )
    where _row_num = 1
What are the relevant tickets?
Partially addresses #2088
Description (What does it do?)
Adds two nullable columns to dim_course_run:
- semester (varchar): academic term identifier for a proctored exam run (e.g. 2T2022)
- passing_grade (double): passing score threshold set per exam run

Sourced from stg__micromasters__app__postgres__exams_examrun via a new micromasters_examruns CTE, left-joined onto MITxOnline course runs on courserun_readable_id = examrun_readable_id. All other platforms get NULL. Both fields are included in the SCD2 change-detection predicate (using -1.0 as sentinel for passing_grade to distinguish NULL from an actual 0.0 value).
How can this be tested?
Additional Context
semester and passing_grade belong on dim_course_run (not tfact_grade) because they describe the exam run configuration, not the individual grade event.
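As a rough illustration of the join described in the Description, the sketch below left-joins exam run config onto course runs by readable ID. It runs on SQLite via Python's stdlib sqlite3, not Trino, and the two tables are simplified, hypothetical versions of the mitxonline_courseruns and micromasters_examruns CTEs with made-up rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Simplified, hypothetical versions of the two CTEs.
    create table mitxonline_courseruns (courserun_readable_id text);
    create table micromasters_examruns (
        examrun_readable_id text,
        semester text,
        passing_grade real
    );
    insert into mitxonline_courseruns values ('exam-run-1'), ('regular-run-2');
    insert into micromasters_examruns values ('exam-run-1', '2T2022', 0.5);
""")

# Left join keeps every course run; only runs with a matching exam run
# record pick up semester and passing_grade, the rest stay NULL.
rows = conn.execute("""
    select c.courserun_readable_id, e.semester, e.passing_grade
    from mitxonline_courseruns c
    left join micromasters_examruns e
        on c.courserun_readable_id = e.examrun_readable_id
    order by c.courserun_readable_id
""").fetchall()
```

The course run without an exam run record keeps its row with NULL in both new columns, which is why both columns are nullable and why a not_null test on semester does not hold downstream.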