Refactor/api redesign#60
Open
galenlynch wants to merge 9 commits intomainfrom
Open
Conversation
This is the first commit of the asset-centric API redesign described in the refactor plan. It moves I/O primitives out of zarr.py and pipeline_transformed.py into a new io/ subpackage: - io/metadata.py: direction_from_*, _units_to_meter, _unit_conversion - io/zarr.py: _open_zarr, zarr_to_numpy, _zarr_to_scaled, ensure_native_endian - io/paths.py: _asset_from_zarr_*, _zarr_base_name_*, alignment_zarr_uri_and_metadata_from_zarr_or_asset_pathlike - io/processing.py: _get_processing_pipeline_data, _get_zarr_import_process, _get_image_atlas_alignment_process, image_atlas_alignment_path_relative_from_processing - io/transforms.py: TransformChain, TemplatePaths, _PIPELINE_* constants, pipeline_transforms, all _pipeline_*_local_paths and pipeline_*_local_paths The original module paths re-export the moved symbols so existing imports and test patches keep working. mock_transform_path_resolution patches the new io.transforms location; mock_zarr_operations patches both io.zarr and the legacy zarr re-export so call sites in either module are intercepted. Behavior is unchanged. All 121 tests still pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…age (C2) Second commit of the asset-centric API redesign. Splits the overlay system into two cohesive modules: - domain/overlays.py: concrete overlay classes (SpacingScaleOverlay, FlipIndexAxesOverlay, PermuteIndexAxesOverlay, ForceCornerAnchorOverlay, ForceCornerAnchorOverlayWithVersionWarning, SetLpsWorldSpacingOverlay) plus cardinal-direction helpers (_require_cardinal, lps_world_to_index_spacing_cardinal), Vec3 / _PIPELINE_MULTISCALE_FACTOR, and estimate_pipeline_multiscale. - domain/selector.py: the Overlay protocol, OverlayRule, OverlaySelector, apply_overlays, _as_date, _base_rules, get_selector, extend_selector, make_selector. Imports the two concrete overlays referenced by the default rule set from domain.overlays — selector → overlays only. pipeline_domain_selector.py becomes a re-export shim (also re-exporting fix_corner_compute_origin so the legacy patch target still resolves). mock_overlay_selector now patches both the new domain.selector home and the legacy pipeline_transformed re-export; the corner-anchor overlay test patches both domain.overlays and pipeline_domain_selector. Behavior is unchanged. All 121 tests still pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Third commit of the asset-centric API redesign. Splits format-specific readers into their own subpackage: - formats/neuroglancer.py: full move of neuroglancer.py contents (neuroglancer_annotations_to_*, get_image_sources, all helpers). - formats/swc.py: a pure swc_data_to_indices(spacing, ...) helper that takes level-0 spacing explicitly and performs unit conversion + axis reordering + rounding. No Zarr opening, no Asset coupling. The legacy ``aind_zarr_utils.neuroglancer`` is now a re-export shim. The orchestration wrapper ``swc_data_to_zarr_indices`` in ``pipeline_transformed`` keeps its signature but delegates to ``formats.swc.swc_data_to_indices`` after fetching the spacing from ``_zarr_to_scaled``. mock_annotation_functions in conftest.py patches both the new formats.neuroglancer home and the legacy neuroglancer re-export so that call sites in either module are intercepted. Behavior is unchanged. All 121 tests still pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… image.py (C4)
Fourth commit of the asset-centric API redesign. Splits the heaviest
duplication block (sitk vs ANTs apply-overlay paths, ~320 LOC) into one
type-dispatching function plus a small pure helper for the level > 0
ANTs convention conversion.
New module:
- image.py:
- _pipeline_anatomical_check_args (moved from pipeline_transformed)
- _apply_pipeline_overlays_to_header (moved)
- _build_pipeline_header (moved + renamed from _mimic_pipeline_anatomical_header)
- _to_ants_convention(corrected_header_sitk, level, ants_shape, ants_direction)
-> (spacing, origin) — pure helper. Encodes the spacing-reverse + scale +
opposite-corner origin recompute that the level>0 ANTs path requires
because ANTs and SimpleITK transpose numpy arrays relative to each other.
- apply_pipeline_overlays(img, ...) — type-dispatches on sitk.Image vs
ANTsImage. Level=0 path uses AnatomicalHeader.from_<backend>/update_<backend>;
level>0 sitk scales spacing by 2**level; level>0 ants delegates to
_to_ants_convention. Direction matrix is left unchanged (assumes
overlays don't modify direction; true for the default rule set).
In pipeline_transformed.py:
- _pipeline_anatomical_check_args, _apply_pipeline_overlays_to_header,
_mimic_pipeline_anatomical_header, _build_pipeline_header become
re-exports from image.py.
- apply_pipeline_overlays_to_sitk and _to_ants are now 6-line wrappers
calling apply_pipeline_overlays. The old ~150-LOC bodies are gone.
New regression tests (tests/test_image.py, +10 tests):
- _to_ants_convention math is pinned for identity and RPI directions at
levels 1, 2, 3.
- The opposite-corner code derivation is verified.
- Tests use the original inline formula as the regression oracle.
Behavior is unchanged. All 121 existing tests still pass; 131 total now.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Fifth commit of the asset-centric API redesign. Introduces the new
user-facing surface as a façade over the existing free functions —
no behaviour change, no public-API removal.
New modules:
- origin.py: Origin tagged union with classmethods Origin.default(),
Origin.at(point), Origin.at_corner(code, lps_point). Internal
_legacy_kwargs() translates to the existing
(set_origin, set_corner, set_corner_lps) triple so Asset can delegate
to zarr_to_sitk / zarr_to_ants / zarr_to_sitk_stub without those
functions changing.
- asset.py:
- TransformPaths: small frozen dataclass holding the four resolved
transform-chain values (point_paths, point_invert, image_paths,
image_invert) so callers don't have to remember tuple order.
- Asset: dataclass owning zarr_uri + metadata + processing plus
per-asset config. Three eager constructors (from_zarr, from_root,
from_neuroglancer) that pre-open the Zarr and load metadata files;
bare Asset(zarr_uri, metadata, processing) does no I/O.
Lazy properties opened_zarr (caches _open_zarr) and transforms
(caches pipeline_transforms_local_paths). Methods image(), stub(),
apply_overlays() delegate to the existing free functions, threading
opened_zarr=self.opened_zarr so a single workflow opens the Zarr
exactly once.
overlay_selector defaults via field(default_factory=get_selector) so
the singleton is resolved at instance-creation time (the lru_cache
guarantees it's the same singleton every call) — avoiding the
import-time binding trap of `= get_selector()` default-arg.
Asset and Origin are exported from the package __init__; the legacy
free-function exports are unchanged.
Tests (tests/test_asset.py, +14 tests):
- Origin classmethods + frozen invariant + _legacy_kwargs translation.
- Explicit Asset(...) does no I/O (mocks confirm zero S3 calls).
- Default overlay_selector is the cached singleton.
- opened_zarr is lazy and called exactly once per asset.
- from_root pre-opens the Zarr; from_neuroglancer pulls the URL.
- A single image() + stub() flow opens the Zarr once and threads the
cached tuple to both delegated functions.
- Origin defaults / at_corner propagate correctly to zarr_to_sitk.
All 145 tests pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sixth commit of the asset-centric API redesign. Coordinate spaces
become first-class types: every (N,3) point array now carries a Space
tag, and Asset.transform(points, to=Space.X) walks a small transform
graph instead of dispatching among a half-dozen named free functions.
New module:
- points.py:
- Space enum: ZARR_INDICES, LS_SCALED_MM, LS_ANATOMICAL_MM,
LS_PIPELINE_ANATOMICAL_MM, CCF_MM.
- Points frozen dataclass: values (dict[str, NDArray]) + space +
optional per-point descriptions (list[str|None]).
- Points.from_neuroglancer wraps the existing reader and types
descriptions as lists. Points.from_swc converts SWC coords to
LS_SCALED_MM (unit conversion + axis reorder, no Zarr access).
- Eight edge functions, one per directed adjacency:
INDICES↔SCALED : per-axis multiply / divide
INDICES↔ANAT : annotation_indices_to_anatomical with the base stub
INDICES↔PIPE_ANAT: same with the pipeline-corrected stub
PIPE_ANAT→CCF : per-layer apply_ants_transforms (point chain)
CCF→PIPE_ANAT : batched apply_ants_transforms (image chain)
Mirrors the math in the existing indices_to_ccf / ccf_to_indices
helpers, including the per-layer-vs-batched distinction across
direction.
- Adjacency table forms a tree rooted at ZARR_INDICES; _path() walks
it via BFS. transform_points(asset, pts, to=) does the dispatch.
In asset.py:
- Asset.transform(points, *, to) method delegates to transform_points
via a local import (breaks the points → asset cycle that points.py
resolves with TYPE_CHECKING).
Asset, Origin, Points, Space all exported from the package __init__.
Test/fixture updates:
- conftest.py mock_apply_overlays now accepts **kwargs (the real
apply_overlays gained a zarr_import_version kwarg in C2 — the mock
was technically broken since then but didn't surface until C6 wired
apply_overlays into the new path).
- mock_overlay_selector now patches aind_zarr_utils.image.apply_overlays
too (C4 introduced that call site).
- mock_ants_transforms now patches aind_zarr_utils.points.apply_ants_transforms_to_point_arr
for the new transform-graph edges.
Tests (tests/test_points.py, +26 tests):
- Points construction and from_swc / from_neuroglancer.
- _path resolver: identity, adjacent, through-the-tree, sibling-via-root.
- Each edge with the real on-disk Zarr fixture; CCF-crossing edges
use mock_ants_transforms.
- Asset.transform end-to-end across multi-hop paths.
- Identity transform returns the input unchanged.
- Descriptions propagate across hops.
- Adjacency / dispatch consistency check.
Behavior unchanged for legacy code paths. All 171 tests pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Seventh commit of the asset-centric API redesign. Most C7 work landed alongside C5 and C6 (Asset/Origin/Points/Space tests, transform-graph tests, edge tests against the real Zarr fixture). This commit closes the two outstanding gaps the plan called out: 1. Multi-call cache invariant. A single Asset workflow that touches image() + stub() + transform() (across CCF) opens the Zarr exactly once and resolves the pipeline transform-chain paths exactly once. Catches future regressions where someone forgets to thread asset.opened_zarr or asset.transforms. 2. New-vs-legacy numerical equivalence. With identical metadata, processing, and overlay-selector inputs (and ANTs mocked deterministically), Asset.transform(Points.from_neuroglancer(ng), to=Space.CCF_MM) produces numerical values identical to the legacy neuroglancer_to_ccf(ng, ...) helper. Tests added: - tests/test_asset.py::TestAssetCachesAcrossCalls::test_full_workflow_opens_once_resolves_transforms_once - tests/test_points.py::test_neuroglancer_to_ccf_numerical_equivalence_new_vs_legacy The C4 regression test (level>0 ANTs convention) was already promoted to a permanent fixture as part of C4; no further action needed. The untracked reference demos (tests/demo_ng_ccf.py, tests/demo_swc.py, scripts/neuroglancer_injections_coords_utils.py) were also rewritten in-place against the new Asset/Points/Space API but remain untracked per the plan; they're not part of this commit. All 173 tests pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Final commit of the asset-centric API redesign branch. Marks the four auto-discovery convenience helpers as deprecated (they have clean ``Asset``-based replacements with no internal use), bumps the version, and documents the redesign in the CHANGELOG. Deprecated (DeprecationWarning, will be removed in a future release): - ``neuroglancer_to_ccf_auto_metadata`` - ``swc_data_to_ccf_auto_metadata`` - ``indices_to_ccf_auto_metadata`` - ``ccf_to_indices_auto_metadata`` Each warning message names the precise ``Asset.from_*().transform(...)`` replacement. The lower-level free functions (``zarr_to_sitk`` / ``zarr_to_ants`` / ``zarr_to_sitk_stub`` / ``mimic_pipeline_zarr_to_anatomical_stub`` / ``neuroglancer_to_ccf`` / ``ccf_to_indices``) are NOT deprecated this release: they are still called internally by the new ``Asset`` façade and the transform-graph edges. Deprecating them would generate spurious warnings from internal use. They will be deprecated once ``Asset`` calls are routed away from them in a future commit. A pyproject ``filterwarnings`` entry suppresses the new DeprecationWarnings inside our own test suite — the legacy paths are still exercised as regression coverage for the deprecation cycle. Version: 0.14.0 → 0.15.0. Stays on 0.x per the agreed BC plan. All 173 tests pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
| @property | ||
| def name(self) -> str: | ||
| """Human-friendly identifier for the overlay (used in logs/audits).""" | ||
| ... |
| @property | ||
| def priority(self) -> int: | ||
| """Execution priority; lower numbers run earlier.""" | ||
| ... |
| zarr_import_version: str | None = None, | ||
| ) -> AnatomicalHeader: | ||
| """Apply the overlay and return a new ``AnatomicalHeader``.""" | ||
| ... |
Comment on lines
+14
to
+16
| from aind_zarr_utils.annotations import ( | ||
| annotation_indices_to_anatomical as annotation_indices_to_anatomical, | ||
| ) |
Comment on lines
+17
to
+19
| from aind_zarr_utils.formats.neuroglancer import ( | ||
| _extract_spacing as _extract_spacing, | ||
| ) |
Comment on lines
+20
to
+22
| from aind_zarr_utils.formats.neuroglancer import ( | ||
| _get_layer_by_name as _get_layer_by_name, | ||
| ) |
Comment on lines
+23
to
+25
| from aind_zarr_utils.formats.neuroglancer import ( | ||
| _process_annotation_layers as _process_annotation_layers, | ||
| ) |
Comment on lines
+26
to
+28
| from aind_zarr_utils.formats.neuroglancer import ( | ||
| _process_layer_and_descriptions as _process_layer_and_descriptions, | ||
| ) |
Comment on lines
+29
to
+31
| from aind_zarr_utils.formats.neuroglancer import ( | ||
| _resolve_layer_names as _resolve_layer_names, | ||
| ) |
Comment on lines
+32
to
+34
| from aind_zarr_utils.formats.neuroglancer import ( | ||
| _sanitize_source_url as _sanitize_source_url, | ||
| ) |
55d3c35 to
87678cd
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
No description provided.