Skip to content

Add the INSPIRE dataset#308

Draft
mmcdermott wants to merge 1 commit into
devfrom
feat/add-dataset-inspire
Draft

Add the INSPIRE dataset#308
mmcdermott wants to merge 1 commit into
devfrom
feat/add-dataset-inspire

Conversation

@mmcdermott
Copy link
Copy Markdown
Collaborator

@mmcdermott mmcdermott commented May 13, 2026

Summary

Registers INSPIRE (perioperative medicine) in src/MEDS_DEV/datasets/INSPIRE/. Extraction is via MEDS_extract-INSPIRE from the upstream inspire-meds package.

Files added:

  • dataset.yaml — metadata + build_full shelling out to MEDS_extract-INSPIRE. No build_demo (upstream doesn't ship a demo recipe; Make build_demo optional, skip demo-less datasets in tests #312 makes the key optional and ensures integration tests skip it).
  • predicates.yaml — admission/discharge plus a few lab/vital predicates. Regex syntax bugs (^X* form) and commented-out predicate stubs from the contributor's branch were cleaned up.
  • requirements.txtINSPIRE-MEDS==0.0.11.
  • refs.bib.
  • README.md.

INSPIRE is not added to any task's supported_datasets.

Depends on #312

#312 makes build_demo optional in the registry and skips datasets that don't declare it from the integration test matrix. Targeted at feat/dataset-demo-availability for now; once #312 merges, this PR retargets to dev.

Test plan

Supersedes / refs

🤖 Generated with Claude Code

@codecov
Copy link
Copy Markdown

codecov Bot commented May 13, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ All tests successful. No failed tests found.

Files with missing lines Coverage Δ
src/MEDS_DEV/datasets/__init__.py 97.43% <100.00%> (+0.06%) ⬆️

@mmcdermott mmcdermott changed the base branch from dev to feat/dataset-demo-availability May 13, 2026 16:33
@mmcdermott mmcdermott force-pushed the feat/add-dataset-inspire branch from 8048c83 to 48c84ef Compare May 13, 2026 16:33
mmcdermott added a commit that referenced this pull request May 13, 2026
Prerequisite for the per-dataset registration PRs (#305 AUMCdb, #306
EHRShot, #307 HIRID, #308 INSPIRE, #309 NWICU, #310 SICdb, #311 eICU).
Most of those datasets' upstream extractors don't ship a publicly
installable demo, and the existing registry validation requires every
dataset to declare a build_demo command.

Switches the convention to: a dataset has a demo iff its commands
declare build_demo. Absence is the signal — no separate metadata field.

- `test_all_datasets_have_commands` now requires `build_full` (which
  every dataset still needs) and allows missing `build_demo`.
- `tests/conftest.py` drops datasets without `build_demo` from the
  integration test matrix, so a per-dataset CI lane for one collects
  zero parametrized tests and passes cleanly rather than trying to
  build data the dataset can't produce.
- `src/MEDS_DEV/datasets/__main__.py` raises a clear error when called
  with `demo=True` against a dataset that doesn't declare a
  build_demo command (instead of the previous KeyError).

No dataset.yaml files change here — those changes ship with the sister
per-dataset PRs that depend on this one.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@mmcdermott mmcdermott force-pushed the feat/dataset-demo-availability branch from 17019a0 to a2d4038 Compare May 13, 2026 17:20
@mmcdermott mmcdermott force-pushed the feat/add-dataset-inspire branch from 48c84ef to d79e827 Compare May 13, 2026 17:22
@mmcdermott mmcdermott changed the base branch from feat/dataset-demo-availability to dev May 13, 2026 17:43
@mmcdermott mmcdermott force-pushed the feat/add-dataset-inspire branch from d79e827 to 5cd613c Compare May 13, 2026 17:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant