Add best_direction() helper for canonical (kind, field) directions#212
Merged
Conversation
Closes #211. Surfaces the canonical "best direction" of a prediction field (max or min) as a queryable API alongside Kind / kind_support. Half of this was already implicit in BindingPredictionCollection's _best_by_score (max) and _best_by_rank (min); the new module-level ``best_direction(kind, field)`` makes the convention explicit and extends it to ``value``, whose direction is kind-dependent (IC50 nM for pMHC_affinity → min; half-life for pMHC_stability → max). - ``FIELD_BEST_DIRECTIONS``: ``{"score": "max", "percentile_rank": "min"}`` — uniform across kinds. - ``VALUE_BEST_DIRECTIONS``: ``{Kind.pMHC_affinity: "min", Kind.pMHC_stability: "max"}`` — extend per kind. - ``best_direction(kind, field)``: lookup helper that raises ValueError for ``field='value'`` on an unregistered kind, with a pointer to the table. Downstream consumers (topiary's BestAlleleField; vaxrank tier signals) can drop their local direction tables and import from mhctools instead. 7 new tests; full suite green. Bump to 3.14.0 (additive minor).
Adds a single direction-aware best_by(kind, field) that uses best_direction() for max/min, plus thin best_by_score / best_by_rank / best_by_value wrappers. The existing kind accessors (affinity, presentation, best_affinity_by_rank, ...) now route through the public methods. Behavior changes: - best_by_value(kind) raises ValueError for kinds without a registered value direction (consistent with best_direction()), since `value` semantics are kind-dependent. - The no-allele fallback that used to live in _best_by_score is now uniform across fields: prefer predictions with an allele, fall back to allele-less if none have an allele, skip None values. 8 new tests; full suite green. Bump to 3.14.1.
Two completeness tests: - best_by accepts string-form kinds (mirrors the existing best_direction string-kind test). - best_*_by_rank now returns allele-less rank predictions when no allele-bearing rank pred exists. This is a behavior change vs the old _best_by_rank (which had no fallback) introduced when best_by was unified across fields. Pinning so it can't silently regress.
iskandr
added a commit
to openvax/topiary
that referenced
this pull request
May 11, 2026
mhctools 3.14.1 ships ``best_direction(kind, field)`` (openvax/mhctools#212, closes #211). Drop topiary's local ``_BEST_FIELD_DIRECTIONS`` / ``_BEST_VALUE_DIRECTIONS`` / ``_best_direction`` and import the canonical helper instead — the direction table is mhctools-canonical metadata and shouldn't live in two places. - ``topiary/ranking/nodes.py``: replace the local table + helper with ``from mhctools import best_direction as _best_direction``. - ``requirements.txt``: bump mhctools floor 3.13.7 → 3.14.1. - ``tests/test_best_allele.py``: match the upstream error message ("best_direction undefined" instead of "best_value direction is undefined"). - Bump topiary 5.12.0 → 5.13.0 (additive minor — the mhctools floor bump tightens deps, but no public API change).
iskandr
added a commit
to openvax/topiary
that referenced
this pull request
May 11, 2026
) * Add best-allele aggregation accessors for haplotype-mode predictors Issue #158 calls out that pMHC_presentation from haplotype-mode predictors (MHCflurry's class1_presentation in haplotype mode) is conceptually a multi-allele input → per-peptide answer, but mhctools emits one row per (peptide, allele) so the DSL's per-allele grouping returns per-allele scores by default. Add an explicit aggregator that returns the best score across alleles plus the corresponding allele attribution, gated behind new accessors so the default per-allele behavior is preserved. - BestAlleleField DSLNode: per-peptide max/min of (kind, field) across alleles, broadcast back to ctx.group_index. Direction by field: score=max, value=min, percentile_rank=min. - KindAccessor properties: best_value, best_score, best_rank, plus *_allele variants returning the allele name string. - DSL string parser: presentation['mhcflurry'].best_score and best_score_allele (and rank/value/percentile spellings) now parse to BestAlleleField. - Refactor: factor _filter_kind_method_version helper out of Field.eval for reuse by BestAlleleField. - Bump topiary to 5.12.0 (additive minor). Use case (per #158): vaxrank's "presented anywhere" tier signal — ``Presentation['mhcflurry'].best_score`` returns the multi-allele aggregate, ``best_score_allele`` returns the attribution. * Address review: drop merge, per-kind directions, kind_support warnings Four fixes against PR #160 review: 1. Replace the fragile ``__best__``/merge broadcast with ``ctx.group_index.droplevel('allele')`` + ``per_peptide.reindex()``. No intermediate `__best__` column, no positional-merge ordering assumption, fewer allocations. ``sub.copy()`` is gone too — work on the filter's already-fresh slice via index masks + ``.assign()``. 2. Replace the field-only ``_BEST_DIRECTIONS`` constant with a per-(kind, field) lookup ``_best_direction(kind, field)``. ``score`` defaults max and ``percentile_rank`` defaults min for every kind, but ``value`` requires a per-kind entry — pMHC_affinity uses IC50 nM (min), pMHC_stability would use half-life (max). Asking for ``best_value`` on a kind without a registered direction raises loudly with a pointer to the table. 3. Plumb ``kind_support`` through ``EvalContext``, ``apply_filter``, ``apply_sort``, and ``evaluate_scores``. When ``best_*`` is invoked against a (model, kind) reporting ``mhc_dependence='single_allele'``, emit a UserWarning that the result is the best per-allele score, not a joint multi-allele aggregate. No warning for ``'haplotype'`` (correct usage) or when ``kind_support`` is absent. 4. Update BestAlleleField docstring with the verified semantics: in haplotype mode mhctools emits a single row per peptide with ``allele=`` MHCflurry's deconvolved ``best_allele``, so the aggregator is a no-op there; in per-allele mode it returns the best per-allele score (not joint). 11 new tests covering: per-kind direction table (best_value undefined on presentation raises; affinity uses min; score works for any kind), kind_support warning on single_allele / no warning on haplotype / no-op without metadata / method substring matching, broadcast ignores input row order, ties pick first deterministically, and apply_filter integration (vaxrank's "presented anywhere" use case) including warning forwarding. Full suite: 1318 passed, 3 skipped. * Use mhctools.best_direction; bump floor to 3.14.1 mhctools 3.14.1 ships ``best_direction(kind, field)`` (openvax/mhctools#212, closes #211). Drop topiary's local ``_BEST_FIELD_DIRECTIONS`` / ``_BEST_VALUE_DIRECTIONS`` / ``_best_direction`` and import the canonical helper instead — the direction table is mhctools-canonical metadata and shouldn't live in two places. - ``topiary/ranking/nodes.py``: replace the local table + helper with ``from mhctools import best_direction as _best_direction``. - ``requirements.txt``: bump mhctools floor 3.13.7 → 3.14.1. - ``tests/test_best_allele.py``: match the upstream error message ("best_direction undefined" instead of "best_value direction is undefined"). - Bump topiary 5.12.0 → 5.13.0 (additive minor — the mhctools floor bump tightens deps, but no public API change).
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Closes #211.
Summary
Surfaces the canonical "best direction" of a prediction field — whether higher or lower is better — as a queryable API alongside
Kindandkind_support(). Half of this was already implicit inPeptideResult._best_by_score(max for score) and._best_by_rank(min for percentile_rank); the new module-levelbest_direction(kind, field)makes the convention explicit and extends it tovalue, whose direction is kind-dependent (IC50 nM for pMHC_affinity → min; half-life for pMHC_stability → max).This PR also promotes the previously-private
_best_by_*helpers to a public, direction-aware API onPeptideResult.New API in
mhctools.pred(re-exported frommhctools)best_direction()raisesValueErrorforfield='value'on a kind without a registered direction, with a pointer to the table — the safer "fail loud" path for new value-bearing kinds.The existing convenience accessors (
affinity,presentation,best_affinity_by_rank, …) now route through the public methods. Internal_best_by_score/_best_by_rankare removed in favor of the unifiedbest_by.Why
Downstream aggregators (openvax/topiary#160's
BestAlleleField; vaxrank tier signals) need this to do "best across alleles" or "best across methods" without replicating the table. Same architectural pattern as #210 — per-kind metadata next toKind.Behavior change to note
best_*_by_rankaccessors now fall back to allele-less rank predictions when no allele-bearing rank prediction of that kind exists, matching the existing fallback inbest_*(score) accessors. The old_best_by_rankreturnedNonein that case. In practice rank predictions always come with an allele, so this should be invisible — but downstream consumers relying on "no allele ⇒ None for rank" should be aware. Pinned bytest_best_by_rank_falls_back_to_allele_less.Test plan
pytest tests/test_pred.py— 17 new tests covering: field-default constants, kind-specific value directions,best_direction()happy path for score/rank/value,valueon unregistered kind raises, unknown field raises, accepts string-or-Kind input,best_by_*happy paths for all three fields,best_by_valueraises for unregistered kinds, None-skipping, allele-less fallback for both score and rank, missing-kind returns None, string-kind input onbest_by.Bumps to 3.14.1.