Register MEDS-EIC-AR as a model (micro/small/medium/large)#313
Draft
mmcdermott wants to merge 1 commit into
MEDS-EIC-AR ("Everything is Code" autoregressive) is a MEDS-native
transformer LM with a fully dataset-agnostic pipeline: tokenization,
tensorization, and autoregressive pre-training all run on any MEDS
dataset without per-dataset wiring.
Registered as four capacity variants — `meds_eic_ar/{micro,small,
medium,large}` — each selecting an upstream `lightning_module=<size>`
preset, matching the `meds_tab/tiny` capacity-variant pattern. Shared
README + refs.bib at the parent level. Pinned to `MEDS-EIC-AR==0.3.1`.
Wired:
- `unsupervised: train` — `MEICAR_process_data` (tokenize + tensorize)
then `MEICAR_pretrain`, switching to the upstream demo configs under
`{demo}=True` and selecting the variant's capacity preset otherwise.
- `supervised: predict` — `MEICAR_generate_trajectories` rolls future
patient timelines forward from each task sample's prediction time.
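The demo/full switch described above can be pictured with a minimal sketch. It is not the registry's actual wiring: the helper `pretrain_overrides` is hypothetical, and passing the demo config via Hydra's `--config-name` flag is an assumption about how the upstream `_demo_pretrain` config would be selected.

```python
from typing import List

# The four registered capacity variants.
VARIANTS = ("micro", "small", "medium", "large")


def pretrain_overrides(variant: str, demo: bool) -> List[str]:
    """Return illustrative Hydra-style arguments for the pre-training step.

    Hypothetical helper: the real commands live in each variant's registry entry.
    """
    if variant not in VARIANTS:
        raise ValueError(f"unknown variant: {variant!r}")
    if demo:
        # Assumption: demo mode swaps in the upstream demo config wholesale,
        # so every variant runs the same demo capacity.
        return ["--config-name=_demo_pretrain"]
    # Full mode: select the upstream capacity preset named after the variant.
    return [f"lightning_module={variant}"]


if __name__ == "__main__":
    for name in VARIANTS:
        print(name, pretrain_overrides(name, demo=False), pretrain_overrides(name, demo=True))
```

Under this sketch all four variants produce identical arguments in demo mode and differ only in full mode.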
Not yet wired (zero-shot prediction resolution), blocked on two gaps,
both filed and linked from the parent README:
- MEDS-DEV does not expose the ACES task criteria / dataset predicates
files to model `supervised` commands, which meds-trajectory-evaluation's
`ZSACES_label` needs (#314).
- meds-trajectory-evaluation has no CLI to aggregate per-trajectory
`ZSACES_label` output into an empirical-probability predictions.parquet
in meds-evaluation format (mmcdermott/MEDS_trajectory_evaluation#42).
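The first gap can be illustrated with a small sketch, assuming command templates are filled with Python's `str.format` over a fixed variable set (`dataset_dir`, `labels_dir`, `model_initialization_dir`, `output_dir`, `model_dir`, `split`, `demo`). The placeholder `{task_criteria_file}`, the example paths, and the command text are all hypothetical.

```python
from string import Formatter
from typing import Dict, List

# Template variables available to model `supervised` commands today.
# The values are made-up example paths.
AVAILABLE_VARS: Dict[str, str] = {
    "dataset_dir": "/data/meds",
    "labels_dir": "/data/labels",
    "model_initialization_dir": "/models/init",
    "output_dir": "/out",
    "model_dir": "/models/run",
    "split": "held_out",
    "demo": "True",
}

# Hypothetical command: `{task_criteria_file}` is not an existing variable.
TEMPLATE = "some_labeling_command {task_criteria_file} {output_dir}"


def missing_placeholders(template: str, variables: Dict[str, str]) -> List[str]:
    """List placeholders in `template` that `variables` cannot fill."""
    names = {field for _, field, _, _ in Formatter().parse(template) if field}
    return sorted(names - set(variables))


if __name__ == "__main__":
    print(missing_placeholders(TEMPLATE, AVAILABLE_VARS))
    try:
        TEMPLATE.format(**AVAILABLE_VARS)
    except KeyError as err:
        # str.format raises KeyError for a named placeholder with no value.
        print("cannot render:", err)
```

A pre-check like `missing_placeholders` is only one possible way to surface the problem early; the underlying fix is still to expose the missing files as template variables.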
Until both land, the supervised lane stops after trajectory generation
and won't produce a packaged result, so the model-lane integration test
is expected to stop short of evaluation.
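For the second gap, the missing aggregation step might look roughly like the sketch below. It is not part of meds-trajectory-evaluation: the row layout (`subject_id`, `prediction_time`, `valid`, `determinable`, `label`) and the rule that only valid, determinable trajectories count toward the estimate are assumptions made for illustration.

```python
from typing import Dict, Iterable, List, Mapping, Optional, Tuple

# A task sample is identified here by (subject_id, prediction_time).
Key = Tuple[int, str]


def empirical_probabilities(rows: Iterable[Mapping]) -> Dict[Key, Optional[float]]:
    """Aggregate per-trajectory boolean labels into one probability per task sample.

    Assumption: a trajectory contributes only if it is both valid and determinable.
    Samples with no usable trajectory map to None.
    """
    counts: Dict[Key, List[int]] = {}  # key -> [positives, usable trajectories]
    for row in rows:
        key = (row["subject_id"], row["prediction_time"])
        tally = counts.setdefault(key, [0, 0])
        if row["valid"] and row["determinable"]:
            tally[1] += 1
            tally[0] += int(bool(row["label"]))
    return {
        key: (positives / usable if usable else None)
        for key, (positives, usable) in counts.items()
    }


if __name__ == "__main__":
    t = "2020-01-01T00:00:00"
    trajectories = [
        {"subject_id": 1, "prediction_time": t, "valid": True, "determinable": True, "label": True},
        {"subject_id": 1, "prediction_time": t, "valid": True, "determinable": True, "label": False},
        {"subject_id": 1, "prediction_time": t, "valid": True, "determinable": True, "label": True},
        {"subject_id": 1, "prediction_time": t, "valid": True, "determinable": False, "label": False},
        {"subject_id": 2, "prediction_time": t, "valid": False, "determinable": False, "label": False},
    ]
    print(empirical_probabilities(trajectories))
```

Writing the result out as a meds-evaluation-format `predictions.parquet` is deliberately left out, since that schema and CLI are exactly what the upstream issue is meant to settle.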
Resolves #302.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
Adds MEDS-EIC-AR (PyPI `MEDS-EIC-AR==0.3.1`) as a registered model. Resolves #302.
Registered as four capacity variants, `meds_eic_ar/{micro,small,medium,large}`, each selecting an upstream `lightning_module=<size>` preset, matching the `meds_tab/tiny` capacity-variant pattern. Shared README + refs.bib at the parent `meds_eic_ar/` level.
- `meds_eic_ar/micro`
- `meds_eic_ar/small`
- `meds_eic_ar/medium`
- `meds_eic_ar/large`
The whole pipeline is dataset-agnostic: nothing is wired per-dataset.
What's wired
- `unsupervised: train`: `MEICAR_process_data` (tokenize + tensorize) then `MEICAR_pretrain`. Uses the upstream demo configs under `{demo}=True`; selects the variant's `lightning_module=<size>` capacity preset in full mode.
- `supervised: predict` (step 1 of 2): `MEICAR_generate_trajectories` rolls future patient timelines forward from each task sample's prediction time, consuming the pretrained model + tensorized cohort from `unsupervised: train` and the task labels dir.
What's not wired: zero-shot prediction resolution
Zero-shot inference is two steps upstream: generate trajectories (wired, above), then resolve trajectories into a `predictions.parquet`. The resolution step is blocked on two gaps, both now filed:
- meds-trajectory-evaluation's `ZSACES_label` CLI needs the ACES task criteria + dataset predicates files, and MEDS-DEV doesn't pass those to model `supervised` commands (the available template vars are `dataset_dir`, `labels_dir`, `model_initialization_dir`, `output_dir`, `model_dir`, `split`, `demo`). The command can't be written today: `str.format()` would raise `KeyError` on an unknown placeholder. Filed as "Expose ACES task criteria + dataset predicates to model `supervised` commands" (#314).
- Even with `ZSACES_label` runnable, it emits per-trajectory boolean labels (`valid`/`determinable`/`label`), not an aggregated empirical-probability `predictions.parquet`. No CLI aggregates across the N sampled trajectories per task sample into a `meds-evaluation`-compatible file. Filed upstream as mmcdermott/MEDS_trajectory_evaluation#42.
Once both land, each variant's `supervised: predict` gains the `ZSACES_label` + aggregation steps and produces `{output_dir}/predictions.parquet`. Until then the supervised lane stops after trajectory generation; see the parent README for the full writeup.
Open question (per #302): demo-mode capacity
In `{demo}=True` mode all four variants currently run the upstream demo capacity (`_demo_pretrain`), not their own architecture, as a CI-cost tradeoff. We can override `lightning_module/model=<size>` on top of the demo config so each variant exercises its real architecture under demo; flagging it rather than deciding unilaterally.
Test plan
- Registry validation (`test_registry_validation.py`).
- model-lane (`meds_eic_ar/*`): `unsupervised: train` should run the demo pre-train end-to-end; `supervised: predict` runs trajectory generation. The lane is expected to stop short of a packaged result until the two gaps above close (no `predictions.parquet` yet).
Refs
Expose ACES task criteria + dataset predicates to model `supervised` commands #314 (MEDS-DEV side), mmcdermott/MEDS_trajectory_evaluation#42 (upstream).
🤖 Generated with Claude Code