Graduate contacts-v1 inference into the marinfold CLI #92
Merged
Add predict/evaluate for the contacts-v1 document structure so the top-level `marinfold infer` / `marinfold evaluate` (and the per-impl `contacts-v1` driver) can run eric-czech's #61/#75 contacts-v1 1.5B model (eval loss 2.7566), exported to the open-athena bucket by exp89. Previously the graduated package only did generate/view/tokenizer.
- inference.py: InferenceConfig, structure_from_sequence, predict, evaluate. Pairwise P(contact) readout (exp82/exp89) over the existing Backend.next_token_probs primitive; --ensemble-k test-time augmentation; sklearn-free AUC + precision@{L,L/2,L/5,R} per range.
- plots.py: P(contact) heatmap writers for infer/evaluate.
- cli.py / __init__.py: infer/evaluate subcommands + dispatch exports.
- MODELS.yaml: 1.5B-contacts-v1 entry (now the default).
- README: "Try it out" headlines the contacts-v1 model; the distogram contacts-and-distances-v1 models stay as the previous generation.
Validated end-to-end with the published checkpoint: evaluate on 1QYS gives long-range contact AUC 0.957 (exp89 regime). 208 tests pass.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
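For readers unfamiliar with computing these metrics without scikit-learn, the following is a minimal NumPy sketch of a rank-based ROC AUC and precision@k over long-range residue pairs. It is an illustration, not the code in `inference.py`: the function names, the sequence-separation cutoff of 24 for "long-range" (a common convention, assumed here), and the choice of k values are all assumptions, and the `R` cutoff from the commit message is left out because its definition is not given here.

```python
import numpy as np


def rank_auc(scores, labels):
    """ROC AUC from the Mann-Whitney U statistic, using average ranks for ties."""
    scores = np.asarray(scores, dtype=float).ravel()
    labels = np.asarray(labels, dtype=bool).ravel()
    n_pos = int(labels.sum())
    n_neg = labels.size - n_pos
    if n_pos == 0 or n_neg == 0:
        return float("nan")  # AUC is undefined with a single class
    # Average 1-based rank of each distinct score value.
    _, inverse, counts = np.unique(scores, return_inverse=True, return_counts=True)
    avg_rank = np.cumsum(counts) - (counts - 1) / 2.0
    ranks = avg_rank[inverse]
    u_stat = ranks[labels].sum() - n_pos * (n_pos + 1) / 2.0
    return float(u_stat / (n_pos * n_neg))


def precision_at_k(scores, labels, k):
    """Fraction of true contacts among the k highest-scoring pairs."""
    scores = np.asarray(scores, dtype=float).ravel()
    labels = np.asarray(labels, dtype=bool).ravel()
    k = max(1, min(int(k), scores.size))
    top = np.argsort(-scores, kind="mergesort")[:k]
    return float(labels[top].mean())


def long_range_metrics(score, contact, min_sep=24):
    """Metrics over pairs (i, j) with j - i >= min_sep (assumed long-range cutoff)."""
    n = score.shape[0]
    i, j = np.triu_indices(n, k=min_sep)
    s, y = score[i, j], contact[i, j]
    return {
        "auc": rank_auc(s, y),
        "P@L": precision_at_k(s, y, n),
        "P@L/2": precision_at_k(s, y, n // 2),
        "P@L/5": precision_at_k(s, y, n // 5),
    }
```

The rank-based form avoids building an explicit ROC curve and handles tied scores by giving them their average rank, which is why no external metrics library is needed.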
The W&B URL was a reconstruction, not a verified link; leave it out (it is an informational-only field) with a comment pointing at where to find the real one.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
contacts-v1 infer/evaluate gain a `method` knob: the existing fast
`pairwise` P(contact) readout (default) plus exp82's settled best
LM-only recipe, `rollout`, which votes over N sampled contact-section
completions (each drawn from a freshly resampled document) and breaks
ties with the pairwise log-prob
(combined = votes + 0.5*minmax(pairwise sym)).
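The combined score above can be sketched in a few lines of NumPy. This is an illustrative reading of the formula, not the code in `_rollout_score_matrix`: the function names, the representation of a rollout as a list of `(i, j)` residue-index pairs, the averaging used for symmetrization, and the small `eps` in the normalization are all assumptions.

```python
import numpy as np


def votes_from_rollouts(rollouts, n):
    """Count, per residue pair, how many sampled completions list it as a contact."""
    votes = np.zeros((n, n), dtype=float)
    for pairs in rollouts:
        # Normalize pair order so a pair counts at most once per rollout.
        unique_pairs = {(min(i, j), max(i, j)) for i, j in pairs if i != j}
        for i, j in unique_pairs:
            votes[i, j] += 1
            votes[j, i] += 1
    return votes


def combined_score(votes, pairwise_logprob, eps=1e-9):
    """votes + 0.5 * minmax(symmetrized pairwise log-prob)."""
    # Assumed symmetrization: average the matrix with its transpose.
    sym = 0.5 * (pairwise_logprob + pairwise_logprob.T)
    lo, hi = sym.min(), sym.max()
    # eps avoids division by zero for constant input and, for modest
    # log-prob ranges, keeps the tie-break strictly below 0.5.
    tiebreak = 0.5 * (sym - lo) / (hi - lo + eps)
    return votes + tiebreak
```

Because votes are integers and the tie-break term stays below 0.5, the pairwise term can only reorder pairs that received the same number of votes; it never lets a pair with fewer votes outrank one with more.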
- inference/core.py: add a `sample_completions` sampling primitive to
the Backend protocol.
- inference/_vllm.py + _transformers.py: implement it (vLLM
SamplingParams + generate; transformers model.generate). MLX raises
NotImplementedError for now (pairwise still works there).
- contacts_v1/inference.py: `_rollout_score_matrix` (resample + vote +
pairwise tie-break); `_score_matrix` dispatch on cfg.method; the
predict/evaluate record schema is now method-agnostic (`score` +
`method`, was `p_contact`).
- contacts_v1/{plots,cli}.py: method-aware heatmap labels; --method /
--n-rollouts / --temperature / --top-p / --top-k.
- README: rollout (vLLM) documented as the best recipe; the headline
figure is now exp82's rollout result, so the stale ×10-ens copy is
fixed.
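To make the protocol change in the list above concrete, here is a hypothetical sketch of a `Backend` protocol that carries both a forward-pass primitive and a sampling primitive, plus a stub of the kind a unit test could use. Only the method names `next_token_probs` and `sample_completions` come from this commit; every signature, parameter default, and class name below is an assumption made for illustration.

```python
from typing import Protocol, Sequence


class Backend(Protocol):
    """Hypothetical protocol shape; only the two method names are from the commit."""

    def next_token_probs(self, prompt: str, candidates: Sequence[str]) -> dict[str, float]:
        ...

    def sample_completions(
        self, prompt: str, n: int, *, temperature: float = 1.0,
        top_p: float = 1.0, top_k: int = 0, max_tokens: int = 256,
    ) -> list[str]:
        ...


class StubBackend:
    """Deterministic stand-in that needs no model, e.g. for vote/tie-break tests."""

    def __init__(self, canned: Sequence[str]):
        self.canned = list(canned)

    def next_token_probs(self, prompt, candidates):
        # Uniform distribution over the candidate tokens.
        return {c: 1.0 / len(candidates) for c in candidates}

    def sample_completions(self, prompt, n, *, temperature=1.0,
                           top_p=1.0, top_k=0, max_tokens=256):
        # Cycle through the canned completions to return exactly n of them.
        return [self.canned[i % len(self.canned)] for i in range(n)]


class ForwardOnlyBackend(StubBackend):
    """Mimics a backend that supports the pairwise readout but not rollout."""

    def sample_completions(self, prompt, n, **kwargs):
        raise NotImplementedError("sampling is not implemented for this backend")
```

Keeping sampling as a separate protocol method means a forward-pass-only backend can still serve the `pairwise` method and fail loudly only when `rollout` is requested.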
Validated: 215 tests pass (incl. stub-backend rollout vote/tiebreak);
rollout runs E2E with the published checkpoint via transformers —
integer votes + [0,0.5) tiebreak as designed. vLLM sampling follows
vLLM's standard generate API but is unverified on this box (its CUDA
driver predates torch's bundled CUDA build).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Repoint notebooks/inference_example1.ipynb from the distogram model to our current best contacts-v1 1.5B: it now imports the contacts_v1 impl, exposes a pairwise/rollout METHOD selector (+ N_ROLLOUTS / ENSEMBLE_K), installs the contacts-v1 extra (pyconfind) for ground-truth contacts, and plots the GT vs predicted contact map inline instead of distance heatmaps. README's Colab bullet updated to match.
Verified the compute cells end-to-end against the published checkpoint (transformers): evaluate on 1QYS -> long-range AUC 0.957, and the metrics/PDF/PNG outputs write cleanly.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
What
Graduates contacts-v1 inference into the `marinfold` CLI so the README "Try it out" (and a Colab notebook) can run our current-best model: eric-czech's #61/#75 contacts-v1 1.5B (issue #61, eval loss 2.7566), exported to the open-athena bucket by #89. Previously the graduated `contacts_v1` package only did generate/view/tokenizer; inference lived only in the exp82/exp89 eval harnesses.

Supports both contact-prediction readouts, selectable via `--method`:

- `pairwise` (default, ~0.3 s/protein): symmetrized autoregressive P(contact) per pair, optionally averaged over `--ensemble-k` resampled realizations (the TTA from #89, "exp: evaluate best contacts-v1 model on current eval set").
- `rollout` (exp82's settled best LM-only recipe, ~50 s/protein): vote over `--n-rollouts` sampled contact-section completions, each from a freshly resampled document, tie-broken by the pairwise log-prob (combined = votes + ½·minmax(pairwise sym)).

How
- `inference/core.py`: adds a `sample_completions` sampling primitive to the `Backend` protocol (the existing surface was forward-pass-only).
- `inference/_vllm.py` + `_transformers.py`: implement it (vLLM `SamplingParams` + `generate`; transformers `model.generate`). MLX raises `NotImplementedError` for rollout (pairwise still works there).
- `contacts_v1/inference.py`: pairwise readout over the existing `Backend.next_token_probs`; `_rollout_score_matrix` (resample → vote → pairwise tie-break); `_score_matrix` dispatches on `method`. Records are method-agnostic (`score` + `method`).
- `contacts_v1/{plots,cli}.py`: method-aware heatmap labels; `infer`/`evaluate` gain `--method` / `--n-rollouts` / `--temperature` / `--top-p` / `--top-k` (contacts-v1-specific knobs stay on the per-impl driver; top-level `marinfold infer` stays pairwise/narrow).
- `MODELS.yaml`: new `1.5B-contacts-v1` entry, now the default (was `1B`).
- […] `pairwise` default, `rollout` via vLLM as the best recipe); `notebooks/inference_example1.ipynb` repointed to the contacts-v1 model with a pairwise/rollout selector; distogram models kept as "previous generation".

Validation
- […] votes + [0,0.5) tie-break bound, and method dispatch.
- `pairwise evaluate` on `tests/data/1QYS.cif` → long-range AUC 0.957 (exp89 regime).
- `rollout infer` on Trp-cage → integer votes + [0,0.5) pairwise tie-break, exactly as designed.

Notes for review
- […] `1B` → `1.5B-contacts-v1`.
- `MODELS.yaml` deliberately omits `wandb_url` for the new entry (I don't have the verified link; it's eric's `eric-czech/marin` run, so drop the exact URL in).
- `sample_completions` is unverified on this box (GPU driver too old for the bundled CUDA); it follows vLLM's standard `generate` API, and the rollout logic is verified via transformers. MLX rollout is a deliberate `NotImplementedError` (an exp82 follow-up).

🤖 Generated with Claude Code