feat(utils): expose STANDARD_SECTIONS as the canonical section order #2402
Open
katosh wants to merge 1 commit into scverse:main from
``iter_outer`` is currently the only way to enumerate AnnData's standard section names. That forces consumers that only need the names (membership checks, layout introspection, ecosystem packages mirroring the layout) to drive the generator and pay for a full ``getattr`` per section: reconstructing aligned mappings and, for backed AnnData, reopening and closing the backing file.

Expose the section order as a module-level ``tuple`` and have ``iter_outer`` iterate it:

- ``STANDARD_SECTIONS: tuple[AnnDataElem, ...]`` becomes the single source of truth for the section order.
- ``iter_outer`` now loops over ``STANDARD_SECTIONS``; the yield order and exception behaviour for existing callers are unchanged.
- Name-only consumers read the constant directly: ``from anndata.utils import STANDARD_SECTIONS``.

Behaviourally this is a pure refactor (no caller semantics change) plus a small addition to the public surface. Downstream consumers (rich HTML repr in PR scverse#2236, ecosystem packages) can iterate the constant with per-section ``try/except`` to stay usable when a single section's attribute access raises.
Codecov Report
✅ All modified and coverable lines are covered by tests.
Additional details and impacted files

@@            Coverage Diff             @@
##             main    #2402      +/-   ##
==========================================
- Coverage   87.43%   85.48%   -1.96%
==========================================
  Files          49       49
  Lines        7728     7729       +1
==========================================
- Hits         6757     6607     -150
- Misses        971     1122     +151
Expose the canonical section order as a public tuple alongside `iter_outer` so consumers that only need the names can read it directly: no generator, no per-section `getattr`, no backing-file toggling.

Closes #2401.
What changes

- `STANDARD_SECTIONS: tuple[AnnDataElem, ...]` in `src/anndata/utils.py` names the order `iter_outer` currently yields in.
- `iter_outer` now iterates `STANDARD_SECTIONS` instead of an inline list. Yield order, exception semantics, and the backing-file reopen/close behaviour are unchanged.
- Two new tests in `tests/test_utils.py` verify that the constant matches the generator's yield order and that every name resolves to an attribute on a plain `AnnData`.

What does not change

- `iter_outer`'s contract for every existing caller (`AnnData._gen_repr`, `to_memory`, `_reduce`). Same yields, same exception propagation.
- `iter_outer` itself. Callers that want resilience against broken sections (the second motivation in #2401, "Expose canonical section-name tuple alongside `iter_outer`") iterate `STANDARD_SECTIONS` directly with their own per-section `try/except`.

Usage
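A minimal sketch of the name-only pattern described above, written with stand-ins so it runs without this branch installed. The tuple contents, `FakeAnnData`, and `collect_sections` are hypothetical; against the PR the constant would come from `from anndata.utils import STANDARD_SECTIONS`, and the real names and order are whatever the PR defines.

```python
# Stand-in for `anndata.utils.STANDARD_SECTIONS`; contents are illustrative only.
STANDARD_SECTIONS = ("obs", "var", "uns", "obsm")


class FakeAnnData:
    """Hypothetical object with one section whose attribute access raises."""

    def __init__(self):
        self.obs = {"cell_type": ["a", "b"]}
        self.var = {}
        self.uns = {}

    @property
    def obsm(self):
        # Simulates a section that cannot be read, e.g. a backing-file problem.
        raise OSError("backing file unavailable")


def collect_sections(adata, names=STANDARD_SECTIONS):
    """Read each section by name, isolating failures per section."""
    ok, failed = {}, {}
    for name in names:
        try:
            ok[name] = getattr(adata, name)
        except Exception as exc:  # keep going when a single section is broken
            failed[name] = exc
    return ok, failed


# Membership checks need only the constant, no attribute access at all.
is_section = "obs" in STANDARD_SECTIONS

ok, failed = collect_sections(FakeAnnData())
```

Here the three readable sections land in `ok` and the broken one is recorded in `failed` instead of aborting the loop, which is the resilience the description leaves to callers rather than to `iter_outer`.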
Design notes
The section names are now declared in two places: the `AnnDataElem` `Literal` in `src/anndata/_types.py` (the type) and `STANDARD_SECTIONS` here (the order used by `iter_outer`). This is a deliberate trade, not a single source of truth. Alternatives considered:

1. Reorder `AnnDataElem` to match `iter_outer`'s order; drop `STANDARD_SECTIONS`; iterate `get_literal_members(AnnDataElem)`. This is genuinely one source of truth for both names and order. A grep of the repo shows that no consumer depends on the current declaration order: `ANNDATA_ELEMS` in `experimental/backed/_io.py` is only used as `**kwargs` into `AnnData(...)` and as `pytest.mark.parametrize` values / `filter(...)` inputs. The cost is that `AnnDataElem`'s member order becomes load-bearing: a stylistic reorder of the `Literal` silently changes `iter_outer`'s yield order (and `AnnData.__repr__` with it). Acceptable if documented with a comment and a test, but it overloads a typing construct with iteration semantics.
2. Replace `AnnDataElem` with a `StrEnum`. Also one source of truth, with native iteration. Changes the public shape of `AnnDataElem` (breaks `typing.get_args` consumers), a larger blast radius than this PR warrants.
3. This PR: two declarations, consistency pinned by tests. The `Literal` stays a pure type (set of valid names), the tuple carries order. Names are duplicated, but the concerns are decoupled: if `AnnDataElem` ever narrows (e.g., excludes `X`/`raw`/`uns` because they aren't aligned mappings), `iter_outer`'s order doesn't silently break.

Happy to switch to option 1 if reviewers prefer the tighter coupling; it's a one-commit refactor. Flagging the trade rather than burying it in a single line.
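To make the option 1 trade-off concrete, here is a small sketch of why a `Literal`'s declaration order becomes load-bearing once it is iterated. It uses the standard-library `typing.get_args` as a stand-in for `get_literal_members`; `Elem`, `ElemReordered`, `ORDER`, and `names_outside_type` are hypothetical names, not anndata's.

```python
from typing import Literal, get_args

# Hypothetical stand-ins for AnnDataElem before and after a stylistic reorder.
Elem = Literal["obs", "var", "uns"]
ElemReordered = Literal["uns", "obs", "var"]

# get_args returns members in declaration order, so iterating the Literal
# makes that order part of the behaviour.
order_before = get_args(Elem)
order_after = get_args(ElemReordered)

# Option 3 keeps order in a separate tuple (stand-in for STANDARD_SECTIONS).
ORDER = ("obs", "var", "uns")


def names_outside_type(order, literal_type):
    """Return names in `order` that are not valid members of the Literal."""
    valid = set(get_args(literal_type))
    return [name for name in order if name not in valid]


# A consistency check of this kind ignores Literal ordering entirely.
unknown = names_outside_type(ORDER, ElemReordered)
```

With the separate tuple, reordering the `Literal` changes `get_args` output but leaves `ORDER` and the consistency check untouched.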
Test plan

- `pytest tests/test_utils.py`: 4 passed (2 pre-existing, 2 new).
- `iter_outer` call sites (`AnnData._gen_repr`, `to_memory`, `_reduce`) continue to work: the refactor is a pure loop-variable substitution.
- `pre-commit.ci` + full test matrix.
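The two new checks could look roughly like the sketch below. Everything in it is a stand-in: the tuple contents are illustrative, `SimpleNamespace` replaces a plain `AnnData`, and the `(name, value)` yield shape of this `iter_outer` is an assumption, since the description does not show the generator's signature.

```python
from types import SimpleNamespace

# Illustrative stand-in for anndata.utils.STANDARD_SECTIONS.
STANDARD_SECTIONS = ("obs", "var", "uns")


def iter_outer(adata):
    """Stand-in generator; yielding (name, value) pairs is an assumption."""
    for name in STANDARD_SECTIONS:
        yield name, getattr(adata, name)


def make_plain():
    """Hypothetical replacement for constructing a plain AnnData."""
    return SimpleNamespace(obs={}, var={}, uns={})


def test_constant_matches_yield_order():
    yielded = tuple(name for name, _ in iter_outer(make_plain()))
    assert yielded == STANDARD_SECTIONS


def test_every_name_resolves():
    adata = make_plain()
    for name in STANDARD_SECTIONS:
        assert hasattr(adata, name)
```

Both functions follow pytest's naming convention, so under these assumptions they would be collected alongside the existing tests in a file such as `tests/test_utils.py`.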