Add the NWICU dataset #309
Draft
mmcdermott wants to merge 1 commit into
This was referenced May 13, 2026
Codecov Report: ✅ All modified and coverable lines are covered by tests.
mmcdermott added a commit that referenced this pull request on May 13, 2026
Prerequisite for the per-dataset registration PRs (#305 AUMCdb, #306 EHRShot, #307 HIRID, #308 INSPIRE, #309 NWICU, #310 SICdb, #311 eICU). Most of those datasets' upstream extractors don't ship a publicly installable demo, and the existing registry validation requires every dataset to declare a build_demo command.

Switches the convention to: a dataset has a demo iff its commands declare build_demo. Absence is the signal: no separate metadata field.

- `test_all_datasets_have_commands` now requires `build_full` (which every dataset still needs) and allows missing `build_demo`.
- `tests/conftest.py` drops datasets without `build_demo` from the integration test matrix, so a per-dataset CI lane for one collects zero parametrized tests and passes cleanly rather than trying to build data the dataset can't produce.
- `src/MEDS_DEV/datasets/__main__.py` raises a clear error when called with `demo=True` against a dataset that doesn't declare a build_demo command (instead of the previous KeyError).

No dataset.yaml files change here; those changes ship with the sister per-dataset PRs that depend on this one.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
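The convention in the commit message above can be illustrated with a minimal sketch. This is not the code from the commit: the `DATASETS` mapping, the placeholder command strings, the `example_with_demo` entry, the helper names (`has_demo`, `demo_test_matrix`, `get_build_command`), and the choice of `ValueError` are all assumptions made for illustration. Only the rule itself (a dataset has a demo iff its commands declare `build_demo`) comes from the message.

```python
from typing import Dict, List

# Hypothetical in-memory stand-in for the dataset registry. The real registry
# is built from each dataset's dataset.yaml; its schema is not shown in this PR,
# and the command strings below are placeholders, not real commands.
DATASETS: Dict[str, Dict[str, Dict[str, str]]] = {
    "example_with_demo": {
        "commands": {
            "build_full": "<full build command>",
            "build_demo": "<demo build command>",
        }
    },
    # NWICU declares build_full only, mirroring this PR's description.
    "NWICU": {"commands": {"build_full": "<full build command>"}},
}


def has_demo(name: str) -> bool:
    """A dataset has a demo iff its commands declare build_demo."""
    return "build_demo" in DATASETS[name]["commands"]


def demo_test_matrix() -> List[str]:
    """Names of datasets that can take part in demo-based integration tests."""
    return sorted(name for name in DATASETS if has_demo(name))


def get_build_command(name: str, demo: bool = False) -> str:
    """Return the build command, failing clearly when no demo is declared."""
    commands = DATASETS[name]["commands"]
    if demo and not has_demo(name):
        # An explicit message replaces the bare KeyError that a direct
        # commands["build_demo"] lookup would raise.
        raise ValueError(
            f"Dataset {name!r} does not declare a build_demo command, "
            "so it cannot be built with demo=True."
        )
    return commands["build_demo" if demo else "build_full"]
```

Under these assumptions, a CI lane parametrized over `demo_test_matrix()` would collect no tests for NWICU, and `get_build_command("NWICU", demo=True)` would fail with a readable message instead of a `KeyError`.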
Summary
Registers NWICU (Northwestern ICU) in `src/MEDS_DEV/datasets/NWICU/`. Extraction is via `NWICU_MEDS` from the upstream `nwicu-meds` package (pinned to `0.0.11`).

Files added/modified:
- `dataset.yaml`: metadata plus a `build_full` command that shells out to `NWICU_MEDS`. No `build_demo`, because upstream doesn't ship a demo recipe; #312 ("Make `build_demo` optional, skip demo-less datasets in tests") makes the key optional and ensures integration tests skip it.
- `predicates.yaml`: admission/discharge plus a few lab predicates.
- `requirements.txt`: `NWICU-MEDS==0.0.11`.
- `refs.bib`, `README.md`.
- `tasks/mortality/in_icu/first_24h.yaml`: adds NWICU to `supported_datasets`. NWICU defines the required `icu_admission`/`icu_discharge` predicates.

Depends on #312
#312 makes `build_demo` optional in the registry and excludes datasets that don't declare it from the integration test matrix. Targeted at `feat/dataset-demo-availability` for now; once #312 merges, this PR retargets to `dev`.

Test plan
`build_demo` optional, skip demo-less datasets in tests #312.

Supersedes / refs
🤖 Generated with Claude Code