
Commit b4667ad

Author: Donglai Wei (committed)
Refactor v2 pipeline boundaries
1 parent afdc0a7 · commit b4667ad

61 files changed
Lines changed: 2439 additions & 670 deletions


.claude/refactor/config_v2.md

Lines changed: 124 additions & 0 deletions
# Config V2 Plan

## Goal

Make config strict, typed, and minimal. V2 should reject old field names instead
of translating them, and every runtime option should map to one schema field
with one meaning.

Baseline: `config.md` reports a mature stage-aware Hydra/OmegaConf system. V2
keeps that architecture but removes remaining compatibility tolerance.

## Public API

Keep a small public API under `connectomics.config`:

- `Config`
- stage dataclasses: `DefaultConfig`, `TrainConfig`, `TestConfig`, `TuneConfig`
- section dataclasses for `system`, `model`, `data`, `optimization`,
  `monitor`, `inference`, `decoding`, and `evaluation`
- `load_config`
- `save_config`
- `validate_config`
- `resolve_default_profiles`
- `as_plain_dict`
- `cfg_get`

Hardware planning helpers are public only under `connectomics.config.hardware`.
All other helpers are private to `config.pipeline`, `config.schema`, or
`config.hardware`.
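For orientation, a minimal usage sketch of this public API. Only the names come
from the list above; the signatures, the `stage` keyword, and the YAML path are
assumptions.

```python
from connectomics.config import as_plain_dict, load_config, validate_config

# Hypothetical call pattern; exact signatures are assumptions.
cfg = load_config("tutorials/train_example.yaml", stage="train")
validate_config(cfg)       # should raise on any v1 field instead of translating it
print(as_plain_dict(cfg))  # plain dict view for logging and debugging
```
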
## Delete

- Any old `hydra_config` or `hydra_utils` facade if reintroduced.
- Any alias fields kept for old tutorials.
- Any `shared` stage handling.
- Any nested `inference.decoding` or `inference.evaluation` fields.
- Any profile selector accepted outside canonical paths.
- Any runtime fallback that probes a removed field with `getattr` (see the
  sketch after this list).
- Any scheduler option duplicated both as a top-level field and inside `params`
  unless it is genuinely shared by multiple schedulers.
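For concreteness, a sketch of the `getattr` fallback pattern being deleted. The
`iteration_total` alias is hypothetical; `n_steps_per_epoch` is the canonical
field named in the contract below.

```python
# V1 anti-pattern: silently tolerates a removed field (alias name is hypothetical).
steps = getattr(cfg.optimization, "iteration_total", None) \
    or cfg.optimization.n_steps_per_epoch

# V2: one canonical field, accessed directly. A missing field is a schema bug
# that should fail at validation time, not be papered over at runtime.
steps = cfg.optimization.n_steps_per_epoch
```
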
## Move/Rename

- Top-level stage sections should be canonical:
  - `default.decoding`
  - `test.decoding`
  - `tune.decoding`
  - `default.evaluation`
  - `test.evaluation`
  - `tune.evaluation`
- Keep hardware helpers under `config.hardware`.
- Keep schema-only dataclasses under `config.schema`.
- Keep load, merge, profile, and stage-resolution logic under
  `config.pipeline`.
## Config Contract

Stage allowlists:

| Stage | Allowed sections |
| --- | --- |
| `default` | `system`, `model`, `data`, `optimization`, `monitor`, `inference`, `decoding`, `evaluation` |
| `train` | `system`, `model`, `data`, `optimization`, `monitor` |
| `test` | `system`, `model`, `data`, `inference`, `decoding`, `evaluation` |
| `tune` | `system`, `model`, `data`, `inference`, `decoding`, `evaluation` |

Mode names:
- use `train`, `test`, `tune`;
- if a future CLI exposes `infer`, `decode`, or `evaluate`, they should resolve
  from `test`-compatible sections unless separate dataclasses are needed.

Required v2 fields (see the schema sketch after this list):
- `inference.decode_after_inference: bool`
- `inference.chunking.output_mode: "decoded" | "raw_prediction"`
- `inference.saved_prediction_path: Optional[str]`
- top-level `decoding` list or decoding stage config
- top-level `evaluation` config
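A minimal schema sketch of the required fields above. Field names and the
`output_mode` literals come from the contract; the class names and defaults are
assumptions.

```python
from dataclasses import dataclass, field
from typing import Literal, Optional

@dataclass
class ChunkingConfig:
    # "decoded" | "raw_prediction", per the contract above
    output_mode: Literal["decoded", "raw_prediction"] = "decoded"

@dataclass
class InferenceConfig:
    decode_after_inference: bool = True
    chunking: ChunkingConfig = field(default_factory=ChunkingConfig)
    saved_prediction_path: Optional[str] = None
```
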
Rejected v1 patterns (see the validator sketch after this list):
- `inference.decoding`
- `inference.evaluation`
- `shared`
- optimizer step aliases other than `n_steps_per_epoch`
- WandB fields with duplicated `wandb_` prefix
- MONAI arch aliases that duplicate canonical schema fields
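A sketch of how a strict loader could reject these patterns. The rejected keys
come from the list above; the helper name, key table, and error type are
assumptions.

```python
# Key paths that must hard-fail instead of being translated.
REJECTED_V1_KEYS = {
    ("inference", "decoding"),
    ("inference", "evaluation"),
    ("shared",),
}

def reject_v1_patterns(raw: dict) -> None:
    """Fail fast on v1 keys before dataclass conversion."""
    for path in REJECTED_V1_KEYS:
        node = raw
        for key in path:
            if not isinstance(node, dict) or key not in node:
                break
            node = node[key]
        else:  # every key on the path resolved, so the v1 field is present
            raise ValueError(f"v1 config field '{'.'.join(path)}' is removed in v2")
```
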
## Boundary Rules

- Runtime packages import dataclasses and loader functions, not profile-engine
  internals.
- Config code may validate cross-section consistency, but it must not import
  training, inference, decoding, or evaluation execution code.
- YAML profile application happens before dataclass conversion.
- Stage resolution happens before execution.
## Implementation Order

1. Add or confirm top-level `decoding` and `evaluation` stage schema.
2. Add `inference.decode_after_inference` and `inference.chunking.output_mode`.
3. Remove any old aliases and compatibility validators.
4. Update all tutorial YAMLs to v2 fields.
5. Add strict failure tests for removed fields.
6. Run config load tests for every tutorial.
7. Run stage-specific runtime tests.
## Tests

Add or update (a pytest sketch follows this list):
- config rejects `shared`;
- config rejects nested `inference.decoding`;
- config rejects nested `inference.evaluation`;
- config rejects removed optimizer and WandB aliases;
- all tutorials load with v2 schema;
- profile selectors work only at canonical paths;
- stage allowlists reject invalid sections;
- `--debug-config` prints the resolved v2 structure.
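A pytest sketch for the first three rejection tests. The fixture paths and the
raised error type are assumptions.

```python
import pytest

from connectomics.config import load_config

@pytest.mark.parametrize("yaml_path", [
    "tests/fixtures/v1_shared.yaml",             # uses `shared`
    "tests/fixtures/v1_nested_decoding.yaml",    # uses `inference.decoding`
    "tests/fixtures/v1_nested_evaluation.yaml",  # uses `inference.evaluation`
])
def test_v1_patterns_are_rejected(yaml_path):
    with pytest.raises(ValueError):
        load_config(yaml_path)
```
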
## Open Decisions

- Whether `infer`, `decode`, and `evaluate` should become first-class CLI modes
  in v2 or remain subcommands/wrappers around `test` stage config.
- Whether scheduler-specific `ReduceLROnPlateau` fields should move fully into
  `scheduler.params`.

.claude/refactor/data_v2.md

Lines changed: 104 additions & 0 deletions
# Data V2 Plan

## Goal

Keep `connectomics.data` domain-based and remove compatibility leftovers. The
module should own IO, preprocessing, augmentation, label transforms, datasets,
sampling, and splitting with one canonical path for each object.

Baseline: `data.md` reports most data cleanup as complete. V2 mainly deletes
remaining shims and turns the documented structure into the enforced structure.

## Public API

Keep these public subpackages:

- `connectomics.data.io`
- `connectomics.data.augmentation`
- `connectomics.data.processing`
- `connectomics.data.datasets`

Public concepts:
- volume IO: read, save, shape, format detection
- MONAI dictionary transforms for IO, augmentation, and label processing
- pure augmentation operations in `augmentation.augment_ops`
- deterministic label/target operations in `processing`
- dataset classes and sampling/split helpers in `datasets`

Everything else should be private or package-internal.
## Delete

- `connectomics.data.utils` compatibility shim.
- Duplicate copies of `sampling.py` and `split.py` outside `data.datasets`.
- Any `monai_transforms.py` compatibility names.
- Any unused dataset classes kept for legacy code.
- Any transform builder functions that are no longer called.
- Any old exports in `data.__init__` that point to deleted modules.
## Move/Rename

Canonical paths:

| Concept | V2 owner |
| --- | --- |
| Train/val volume split | `connectomics.data.datasets.split` |
| Sample counting | `connectomics.data.datasets.sampling` |
| IO transforms | `connectomics.data.io.transforms` |
| Augmentation transforms | `connectomics.data.augmentation.transforms` |
| Label/process transforms | `connectomics.data.processing.transforms` |
| nnUNet preprocessing | `connectomics.data.processing.nnunet_preprocess` |
| Pure augmentation functions | `connectomics.data.augmentation.augment_ops` |
## Config Contract

Data config should use domain terms:
- `data.input`
- `data.label_transform`
- `data.augmentation`
- `data.dataloader`
- `data.split`

Rules (see the constructor sketch after this list):
- no config objects inside transform classes;
- transform constructors receive plain Python parameters;
- no hidden old-key handling in builders;
- train/test/tune data shorthands must be documented in schema, not handled
  through ad hoc fallback logic.
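A sketch of the plain-parameter rule. The transform class and its parameters
are hypothetical; the point is that the builder unpacks config and the
transform never sees a config object.

```python
# V1 anti-pattern: the transform reaches into a config object.
#   def __init__(self, cfg): self.sigma = cfg.data.augmentation.blur_sigma

class GaussianBlurOp:
    """V2 style: plain Python parameters only."""

    def __init__(self, sigma: float = 1.0, prob: float = 0.5):
        self.sigma = sigma
        self.prob = prob

# The builder, not the transform, owns the config access:
#   op = GaussianBlurOp(sigma=cfg.data.augmentation.blur_sigma)
```
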
## Boundary Rules

- `data` must not import `training`.
- `data` must not import `inference`.
- `data` must not import `decoding` except through tests or examples.
- Pure functions must not import MONAI.
- MONAI wrappers should be thin adapters around pure functions (see the sketch
  after this list).
- Dataset code should not know about model architectures or losses.
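A sketch of the pure-op/adapter split, shown in one file for brevity; in the
package the pure op and the adapter would sit in separate modules so the pure
side stays MONAI-free. The function and class names are hypothetical.

```python
import numpy as np
from monai.transforms import MapTransform

def flip_volume(vol: np.ndarray, axis: int) -> np.ndarray:
    """Pure augmentation op: NumPy only, trivially unit-testable."""
    return np.flip(vol, axis=axis).copy()

class FlipVolumed(MapTransform):
    """Thin MONAI dictionary-transform adapter around the pure function."""

    def __init__(self, keys, axis: int = 0):
        super().__init__(keys)
        self.axis = axis

    def __call__(self, data):
        d = dict(data)
        for key in self.key_iterator(d):
            d[key] = flip_volume(d[key], self.axis)
        return d
```
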
## Implementation Order

1. Delete `connectomics/data/utils`.
2. Update tests and tutorials to import from `data.datasets`.
3. Remove stale exports from `data.__init__` and subpackage `__init__` files.
4. Search for deleted names across Python and YAML.
5. Tighten transform builders so unsupported config shapes raise early.
6. Add boundary tests for no `data.utils` import path.
7. Run data, training data-factory, and tutorial config tests.
## Tests

Add or update (an import-boundary sketch follows this list):
- importing `connectomics.data.utils` fails;
- production imports use `data.datasets.split` and `data.datasets.sampling`;
- every `RandomizableTransform` calls `self.randomize()` in `__call__`;
- pure augmentation ops import no MONAI symbols;
- transform builders reject unsupported config types;
- datasets preserve image/label padding semantics;
- nnUNet preprocessing round-trips metadata needed by inference output.
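A pytest sketch for the shim-removal and MONAI-free checks. The test bodies are
assumptions, and the second check is approximate because it relies on
`sys.modules` bookkeeping.

```python
import importlib
import sys

import pytest

def test_data_utils_shim_is_gone():
    with pytest.raises(ModuleNotFoundError):
        importlib.import_module("connectomics.data.utils")

def test_pure_augment_ops_do_not_pull_in_monai():
    # Force a fresh import of the pure ops, then assert MONAI never loaded.
    sys.modules.pop("connectomics.data.augmentation.augment_ops", None)
    sys.modules.pop("monai", None)
    importlib.import_module("connectomics.data.augmentation.augment_ops")
    assert "monai" not in sys.modules
```
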
## Open Decisions

- Whether `connectomics.data.__init__` should export only subpackages or also
  selected high-use functions like `read_volume` and `save_volume`.
- Whether test/tune `data.batch_size` shorthand should stay or move under
  `data.dataloader.batch_size` only.

.claude/refactor/decoding_v2.md

Lines changed: 121 additions & 0 deletions
# Decoding V2 Plan

## Goal

Make decoding a standalone stage that consumes raw prediction artifacts or
arrays and writes segmentation artifacts. It should not construct models, load
checkpoints, run sliding-window inference, or compute final evaluation reports.

Baseline: `decoding.md` reports the decoder registry and implementations are
mostly clean. V2 focuses on stage separation and artifact contracts.
## Public API

Keep (a registry usage sketch follows this list):

- `Decoder`
- `DecoderOutput`
- `register_decoder`
- `get_decoder`
- `list_decoders`
- `decode_prediction`
- `run_decoding_stage`
- decoder implementations for affinity, watershed, synapse, and ABISS paths
- decoding parameter tuning helpers if they operate only on saved predictions
  and labels
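A sketch of how the registry API fits together. Only the function names come
from this plan; whether `get_decoder` returns a class or an instance, the
constructor kwargs, and the `decode` method name are assumptions. The decoder
name reuses the YAML example further down.

```python
import numpy as np

from connectomics.decoding import get_decoder, list_decoders

print(list_decoders())  # discover registered decoder names

raw_prediction = np.zeros((3, 64, 64, 64), dtype=np.float32)  # dummy affinity map
decoder_cls = get_decoder("decode_affinity_cc")
decoder = decoder_cls(threshold=0.7, backend="numba")
segmentation = decoder.decode(raw_prediction)  # assumed method name
```
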
## Delete

- Any decoder registration call duplicated outside package initialization.
- Any decode path that assumes model inference is happening in the same call.
- Any final metric reporting that belongs to `evaluation`.
- Any postprocessing module names that conflict with inference output
  postprocessing semantics.
- Any print-based operational output.
- Any old config nesting under `inference`.
## Move/Rename

Canonical responsibilities:

| Concept | V2 owner |
| --- | --- |
| Decoder registry | `connectomics.decoding.registry` |
| Decoder base types | `connectomics.decoding.base` |
| Segmentation decoders | `connectomics.decoding.segmentation` |
| Synapse decoder | `connectomics.decoding.synapse` |
| Decoding postprocess | `connectomics.decoding.postprocess` |
| Decoding stage runner | `connectomics.decoding.stage` |
| Decoding tuning | `connectomics.decoding.tuning` or current tuner module |

If `optuna_tuner.py` remains large, split stage-independent parameter search
from CLI/reporting helpers.
## Config Contract

Top-level `decoding` config owns:
- decoder name;
- decoder kwargs;
- postprocessing kwargs;
- output segmentation artifact path;
- optional tuning search space for decoder/postprocess parameters.

It does not own:
- model checkpoint;
- sliding-window parameters;
- raw input image paths, except for metadata or visualization;
- metric selection.

Decode-only mode:

```yaml
inference:
  saved_prediction_path: /path/to/raw_prediction.h5
decoding:
  - name: decode_affinity_cc
    kwargs:
      threshold: 0.7
      backend: numba
```

V2 may rename this so `decoding.input_prediction_path` is canonical. If so,
`inference.saved_prediction_path` should be deleted and decode-only mode should
not depend on the inference section.
## Boundary Rules

- Decoding may import data processing helpers and top-level utils.
- Decoding may import config dataclasses.
- Decoding must not import training or inference managers.
- Decoding must not write metrics reports except optional decoder diagnostics.
- Decoding output artifacts must be accepted by evaluation without special
  decoder-specific handling.
## Implementation Order

1. Decide whether prediction input path lives under `decoding` or `inference`.
2. Add `run_decoding_stage` that accepts a prediction artifact path and decoder
   config (see the sketch after this list).
3. Update combined test path to call `run_decoding_stage`.
4. Remove decode logic from inference/test orchestration.
5. Ensure all decoders consume canonical artifact arrays and metadata.
6. Move final metric calls out to evaluation.
7. Add decode-only tests.
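A sketch of the `run_decoding_stage` entry point named in step 2. Only the
function name comes from the plan; the signature, the IO helpers, and the
output-path convention are assumptions.

```python
from pathlib import Path

from connectomics.data.io import read_volume, save_volume  # assumed helpers
from connectomics.decoding import get_decoder

def run_decoding_stage(prediction_path: str, decoding_cfg: list[dict]) -> Path:
    """Decode a saved raw prediction into a segmentation artifact.

    Deliberately takes an artifact path plus plain decoder configs: no model
    construction, no checkpoint loading, no sliding-window inference.
    """
    volume = read_volume(prediction_path)
    for step in decoding_cfg:
        decoder = get_decoder(step["name"])(**step.get("kwargs", {}))
        volume = decoder.decode(volume)  # assumed method name
    out_path = Path(prediction_path).with_suffix(".seg.h5")
    save_volume(str(out_path), volume)
    return out_path
```
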
## Tests

Add or update:
- decode-only mode works without model construction;
- decoding consumes a raw prediction artifact from inference;
- decoded segmentation artifact metadata records decoder name and params;
- all registered decoders are discoverable;
- invalid decoder names raise clear errors;
- decoding does not import training or inference execution modules;
- tuning can run from saved predictions.
## Open Decisions

- Whether decode input path should be `decoding.input_prediction_path` instead
  of `inference.saved_prediction_path`.
- Whether decoding postprocessing should remain a subpackage or be folded into
  decoder-specific modules where only one decoder uses it.
