feat(utils): expose STANDARD_SECTIONS as the canonical section order#2402

Open
katosh wants to merge 1 commit into scverse:main from settylab:feat/standard-sections

katosh commented Apr 21, 2026

Expose the canonical section order as a public tuple alongside iter_outer so consumers that only need the names can read it directly — no generator, no per-section getattr, no backing-file toggling.

Closes #2401.

What changes

  • New STANDARD_SECTIONS: tuple[AnnDataElem, ...] in src/anndata/utils.py. It lists the section names in the order iter_outer currently yields them.
  • iter_outer now iterates STANDARD_SECTIONS instead of an inline list. Yield order, exception semantics, and the backing-file reopen/close behaviour are unchanged.
  • Unit tests in tests/test_utils.py verify the constant matches the generator's yield order and that every name resolves to an attribute on a plain AnnData.
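
The shape of the refactor and of the new tests can be sketched roughly as follows. This is a minimal illustration, not the code in src/anndata/utils.py: the section names in the tuple, the (name, value) yield shape, and the `fake_adata` stand-in object are assumptions, and the backing-file handling is omitted.

```python
from collections.abc import Iterator
from types import SimpleNamespace

# Assumed section names and order, for illustration only; the real tuple
# lives in src/anndata/utils.py and may differ.
STANDARD_SECTIONS: tuple[str, ...] = (
    "obs", "var", "uns", "obsm", "varm", "layers", "obsp", "varp",
)


def iter_outer(adata: object) -> Iterator[tuple[str, object]]:
    """Simplified stand-in: yield (name, value) in STANDARD_SECTIONS order."""
    for name in STANDARD_SECTIONS:
        # One attribute access per section; any exception propagates.
        yield name, getattr(adata, name)


# Hypothetical stand-in for a plain AnnData: every section is an empty dict.
fake_adata = SimpleNamespace(**{name: {} for name in STANDARD_SECTIONS})

# The kind of check the new unit tests perform: the constant matches the
# generator's yield order, and every name resolves to an attribute.
yielded_names = tuple(name for name, _ in iter_outer(fake_adata))
assert yielded_names == STANDARD_SECTIONS
assert all(hasattr(fake_adata, name) for name in STANDARD_SECTIONS)
```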

What does not change

  • iter_outer's contract for every existing caller (AnnData._gen_repr, to_memory, _reduce). Same yields, same exception propagation.
  • No new flags on iter_outer. Callers that want resilience against broken sections (the second motivation in #2401, "Expose canonical section-name tuple alongside iter_outer") iterate STANDARD_SECTIONS directly with their own per-section try/except.
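
A caller-side resilience loop of the kind described above might look like the sketch below. The tuple contents, the `BrokenSections` class, and the `collect_sections` helper are hypothetical and exist only to make the example self-contained; real code would import STANDARD_SECTIONS from anndata.utils and pass an actual AnnData.

```python
# Stand-in for anndata.utils.STANDARD_SECTIONS; names assumed for illustration.
STANDARD_SECTIONS = ("obs", "var", "uns", "obsm")


class BrokenSections:
    """Hypothetical object whose `uns` access raises; other sections work."""

    obs = {"a": 1}
    var = {}
    obsm = {}

    @property
    def uns(self):
        raise OSError("backing file unavailable")


def collect_sections(adata):
    """Read each section independently so one failure does not stop the rest."""
    values, errors = {}, {}
    for name in STANDARD_SECTIONS:
        try:
            values[name] = getattr(adata, name)
        except Exception as exc:  # broad on purpose: report it and keep going
            errors[name] = exc
    return values, errors


values, errors = collect_sections(BrokenSections())
```

Here `values` holds the sections that could be read and `errors` maps each failing section name to the exception it raised, so a repr or report can show both.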

Usage

from anndata.utils import STANDARD_SECTIONS

# Check membership or ordering, or list the section names, without driving the iterator
if section in STANDARD_SECTIONS: ...
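
The ordering use case can be sketched the same way. The tuple below is a stand-in with assumed names, and `sort_sections` is a hypothetical helper, not part of anndata; it puts an arbitrary list of names into canonical order and leaves unknown names at the end.

```python
# Stand-in for anndata.utils.STANDARD_SECTIONS; names assumed for illustration.
STANDARD_SECTIONS = ("obs", "var", "uns", "obsm", "varm", "layers", "obsp", "varp")


def sort_sections(names):
    """Return known names in canonical order, then unknown names as given."""
    rank = {name: i for i, name in enumerate(STANDARD_SECTIONS)}
    known = sorted((n for n in names if n in rank), key=rank.__getitem__)
    unknown = [n for n in names if n not in rank]
    return known + unknown


ordered = sort_sections(["layers", "custom", "obs", "varm"])
```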

Design notes

The section names are now declared in two places: the AnnDataElem Literal in src/anndata/_types.py (the type) and STANDARD_SECTIONS here (the order used by iter_outer). This is a deliberate trade, not a single source of truth. Alternatives considered:

  1. Reorder AnnDataElem to match iter_outer's order; drop STANDARD_SECTIONS; iterate get_literal_members(AnnDataElem). This is genuinely one source of truth for both names and order. A grep of the repo shows no consumer of the current declaration order depends on it — ANNDATA_ELEMS in experimental/backed/_io.py is only used as **kwargs into AnnData(...) and as pytest.mark.parametrize values / filter(...) inputs. The cost is that AnnDataElem's member order becomes load-bearing: a stylistic reorder of the Literal silently changes iter_outer's yield order (and AnnData.__repr__ with it). Acceptable if documented with a comment and a test, but it overloads a typing construct with iteration semantics.

  2. Replace AnnDataElem with a StrEnum. Also one source of truth, with native iteration. Changes the public shape of AnnDataElem (breaks typing.get_args consumers) — a larger blast radius than this PR warrants.

  3. This PR: two declarations, consistency pinned by tests. The Literal stays a pure type (set of valid names), the tuple carries order. Names duplicated, but the concerns are decoupled — if AnnDataElem ever narrows (e.g., excludes X/raw/uns because they aren't aligned mappings), iter_outer's order doesn't silently break.

Happy to switch to option 1 if reviewers prefer the tighter coupling — it's a one-commit refactor. Flagging the trade rather than burying it in a single line.
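
For reviewers weighing option 3, one possible shape of a check tying the two declarations together is sketched below. Both definitions are stand-ins with assumed members (the real ones live in src/anndata/_types.py and src/anndata/utils.py), and `check_sections_consistent` is a hypothetical helper, not one of the tests in this PR.

```python
from typing import Literal, get_args

# Stand-in for the AnnDataElem Literal; members assumed for illustration.
AnnDataElem = Literal[
    "obs", "var", "obsm", "varm", "obsp", "varp", "layers", "X", "raw", "uns"
]

# Stand-in for STANDARD_SECTIONS; names and order assumed for illustration.
STANDARD_SECTIONS: tuple[AnnDataElem, ...] = (
    "obs", "var", "uns", "obsm", "varm", "layers", "obsp", "varp",
)


def check_sections_consistent() -> None:
    """Fail if the ordered tuple drifts away from the Literal's valid names."""
    valid = set(get_args(AnnDataElem))
    # The tuple carries order, so it must not repeat a name.
    assert len(set(STANDARD_SECTIONS)) == len(STANDARD_SECTIONS)
    # Every ordered name must still be a valid member of the type.
    missing = [name for name in STANDARD_SECTIONS if name not in valid]
    assert not missing, f"not in AnnDataElem: {missing}"


check_sections_consistent()
```

The check only requires the tuple to be a subset of the Literal, so the type can keep members that have no place in the iteration order.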

Test plan

  • pytest tests/test_utils.py — 4 passed (2 pre-existing, 2 new).
  • Existing iter_outer call sites (AnnData._gen_repr, to_memory, _reduce) continue to work: the refactor is a pure loop-variable substitution.
  • CI (pre-commit.ci + full test matrix).

``iter_outer`` is currently the only way to enumerate AnnData's standard
section names, which forces consumers that only need the names
(membership checks, layout introspection, ecosystem packages mirroring
the layout) to drive the generator and pay for a full ``getattr`` per
section — reconstructing aligned mappings and, for backed AnnData,
reopening and closing the backing file.

Expose the section order as a module-level ``tuple`` and have
``iter_outer`` iterate it:

- ``STANDARD_SECTIONS: tuple[AnnDataElem, ...]`` becomes the single
  source of truth for the section order.
- ``iter_outer`` now loops over ``STANDARD_SECTIONS``; the yield order
  and exception behaviour for existing callers is unchanged.
- Name-only consumers read the constant directly:
  ``from anndata.utils import STANDARD_SECTIONS``.

This is a pure refactor behaviourally — no caller semantics change —
and a small addition to the public surface. Downstream consumers
(rich HTML repr in PR scverse#2236, ecosystem packages) can iterate the
constant with per-section ``try/except`` to stay usable when a single
section's attribute access raises.

codecov Bot commented Apr 21, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 85.48%. Comparing base (7861448) to head (dd15a5c).
✅ All tests successful. No failed tests found.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2402      +/-   ##
==========================================
- Coverage   87.43%   85.48%   -1.96%     
==========================================
  Files          49       49              
  Lines        7728     7729       +1     
==========================================
- Hits         6757     6607     -150     
- Misses        971     1122     +151     
| Files with missing lines | Coverage Δ |
| --- | --- |
| src/anndata/utils.py | 86.63% <100.00%> (-0.81%) ⬇️ |

... and 7 files with indirect coverage changes
