Commit 2f50e1e: "Finalize BANIS crop cleanup"
author: Donglai Wei
1 parent 282f239

18 files changed: 2611 additions & 22 deletions
File renamed without changes.

.claude/refactor/v3_claude.md: 607 additions & 0 deletions (large diff not rendered)

The file below: 341 additions & 0 deletions
# Feedback on V3 Refactor Plan

Reviewed against the current tree after commit `ba0f482`. The plan has the
right architectural direction, but it should not be implemented exactly as
written. The main issue is that several "dead code" and "zero caller" claims
are too broad, while the ordering understates breakage risk.
## Executive Summary

The strongest parts of the plan are:

- Moving tuning orchestration out of `connectomics.decoding`.
- Moving evaluation logic out of Lightning.
- Moving runtime dispatch out of `scripts/main.py`.
- Enforcing strict config loading instead of warning on unknown keys.
- Removing ghost config reads such as `inference.test_time_augmentation.act`.
- Separating raw inference artifacts from decoding and evaluation concerns.

The weakest parts are:

- Theme A overclaims dead code. Some entries are public APIs, builder paths, or
  test-covered behavior rather than unreachable code.
- The proposed PR order is unsafe. Public API trimming, strict config, schema
  migration, and runtime extraction depend on each other.
- The file-size target is too mechanical. Splits should follow ownership and
  dependency direction first, not a hard line-count rule.
- The high-performance objective is not developed enough. Most items improve
  architecture, but few directly address runtime hot spots.

Bottom line: implement v3 as a boundary-first refactor with import-boundary
tests and small benchmarks, not as a broad deletion and file-split sweep.
## What The Plan Gets Right

### Decoding/Tuning Boundary

`connectomics/decoding/tuning/optuna_tuner.py` is still the worst boundary
violation. It lazy-imports naming helpers from `connectomics.training`:

- `tta_cache_suffix`
- `tuning_best_params_filename`
- `tuning_best_params_filename_candidates`
- `tuning_study_db_filename`

It also creates a datamodule and calls `trainer.test(...)` inside the decoding
package. That orchestration belongs in `connectomics.runtime`, while decoding
tuning should operate on arrays/artifacts and metric callables.

The proposed split into pure tuner modules plus `runtime/tune_runner.py` is the
right direction.
### Inference/Decoding Boundary

`connectomics/inference/chunked.py` currently contains a streamed affinity-CC
decode path. That means inference imports decoding and performs segmentation
inside an inference module. Moving `run_chunked_affinity_cc_inference` and its
decode-kwargs resolver into a decoding/runtime module is a clean boundary fix.

Keep `run_chunked_prediction_inference` in inference. It should remain raw
prediction only.
### Evaluation Boundary

`training/lightning/test_pipeline.py` owns too much evaluation logic. Moving
the binary, instance, and NERL metrics, report writing, and metric aggregation
into `connectomics.evaluation` would reduce private `module._*` coupling and
make decode-only/evaluate-only workflows possible without Lightning.

This should be one of the highest-priority v3 changes.
### Strict Config

`config/pipeline/config_io.py::_warn_unconsumed_keys` still warns for unknown
top-level keys. With "no backward compatibility", unknown keys should raise.
This is a good mechanical cleanup and would catch drift earlier.

The ghost activation reads in `inference/tta.py` should also be removed:

- `cfg.inference.test_time_augmentation.act`
- `cfg.inference.output_act`

The canonical path is `inference.test_time_augmentation.channel_activations`.
## Corrections Needed Before Implementation

### Theme A Is Too Aggressive

Do not treat all listed items as safe deletions.

Examples:

- `RandMixupd` is not dead. It is imported by tests and referenced by the
  augmentation builder. Removing it is a behavior/API removal, not dead-code
  cleanup.
- `auto_plan_config`, `AutoConfigPlanner`, and `AutoPlanResult` are not dead.
  They are intentionally tested under `connectomics.config.hardware`. Removing
  them is a product decision, not a cleanup.
- `WandbConfig` may have no active logger consumer, but deleting the schema
  field still changes the accepted config surface. That belongs in the
  schema-break PR, with tutorial/docs validation.
- `SaveVolumed` and `TileLoaderd` may have no production caller, but they are
  exported from `connectomics.data.io`. Removing exported names should be part
  of an API-trim PR, not mixed into dead-code deletion.
- `create_combined_loss`, `create_loss_from_config`, `CombinedLoss`, and
  `GANLoss` are exported through model/loss package APIs. If they are removed,
  update package exports and tests explicitly.
- `TestConfig.output_path` and `TestConfig.cache_suffix` are not obviously
  "never read". There is stage fallback behavior in `inference/output.py`, and
  tests currently set `cfg.test.output_path`.

The plan should split Theme A into three buckets:

1. Safe artifacts/dead modules: delete after grep and import checks.
2. Public API removals: delete only with export/test updates.
3. Product feature removals: require an explicit decision.
### PR Ordering Is Not Safe

The statement "PRs 1-7 are mergeable in any order" is not correct.

Safer dependencies:

- `runtime/output_naming.py` should land before splitting `optuna_tuner.py`,
  because it removes decoding's dependency on Lightning naming helpers.
- The public API trim should happen after the target modules exist, not before.
- Strict config should avoid removing schema fields before tutorial migration.
- The schema split and tutorial migration must be one coordinated change.
- Evaluation extraction should happen before broad `test_pipeline.py` file
  splitting, because it removes the largest and most meaningful coupling first.
### File Size Should Be Secondary

"Every file < 600 lines" is a useful smell check, not a design rule. Splitting
large files without resolving ownership can create more import churn and weaker
locality.

Prioritize splits that fix dependency-direction problems:

- decoding no longer imports training
- inference no longer imports decoding
- config no longer imports data execution machinery
- scripts no longer own runtime orchestration
- evaluation no longer depends on Lightning module internals

After those are fixed, split the remaining large files by stable concepts.
### Performance Work Needs More Specific Targets

The plan says "high-performance" but mostly proposes architecture cleanup. Add
concrete performance targets:

- Avoid unnecessary `tensor.cpu().float().numpy()` conversions where downstream
  operations can stay in torch or preserve dtype.
- Benchmark raw chunked-inference write throughput before changing artifact
  writing paths.
- Verify HDF5 chunk shapes and compression settings for prediction artifacts.
- Reduce duplicated raw prediction read/write loops between standard and
  chunked inference.
- Measure TTA memory use and data movement, especially patch-first-local and
  distributed reductions.
- Avoid forcing every streamed path through an artifact abstraction if it hurts
  large-volume throughput.
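For the write-throughput item, a tiny stdlib-only smoke benchmark is enough to bracket a change to the writing path. A minimal sketch, assuming nothing about the real writer: `write_chunk` below is a hypothetical stand-in for whatever callable performs the artifact write.

```python
# Minimal write-throughput smoke benchmark (stdlib only). The write_chunk
# callable is a hypothetical stand-in for the real artifact writer.
import io
import time

def bench_write(write_chunk, n_chunks: int = 64, chunk_bytes: int = 1 << 20) -> float:
    payload = b"\x00" * chunk_bytes
    start = time.perf_counter()
    for _ in range(n_chunks):
        write_chunk(payload)
    elapsed = time.perf_counter() - start
    # Bytes per second; guard against a zero-duration clock reading.
    return (n_chunks * chunk_bytes) / max(elapsed, 1e-9)

# Example: an in-memory sink gives an upper-bound baseline to compare the
# real HDF5/zarr writer against.
baseline_bytes_per_sec = bench_write(io.BytesIO().write)
```

Running this before and after a refactor of the writing path gives a cheap regression signal without standing up a full inference run.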
## Recommended Revised Order

### PR 0: Verification Guardrails

Add tests before moving code:

- Import-boundary tests:
  - `connectomics.decoding` must not import `connectomics.training`
  - `connectomics.inference` must not import `connectomics.decoding`
  - `connectomics.config` must not import MONAI/data execution code
- Strict-config tests for unknown top-level and nested removed keys.
- Snapshot tests for public imports that are intentionally kept.
- A small chunked-inference artifact benchmark or smoke test.
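The import-boundary tests can be a single small helper. This is a sketch, not the project's test code, and the demo at the bottom checks a stdlib pair rather than the real connectomics packages:

```python
# Sketch of an import-boundary check: import `package` from a clean state,
# then assert the forbidden package never appeared in sys.modules.
import importlib
import sys

def assert_no_cross_import(package: str, forbidden: str) -> None:
    # Drop any cached copies of the forbidden package so a transitive
    # import during the load below becomes visible.
    stale = [m for m in sys.modules
             if m == forbidden or m.startswith(forbidden + ".")]
    for name in stale:
        del sys.modules[name]
    importlib.import_module(package)
    leaked = [m for m in sys.modules
              if m == forbidden or m.startswith(forbidden + ".")]
    assert not leaked, f"{package} transitively imported {leaked}"

# Demo with stdlib packages; the real tests would assert e.g. that importing
# connectomics.decoding never pulls in connectomics.training.
assert_no_cross_import("json", "xml")
```

One caveat worth encoding in the real tests: if the package under test is already cached in `sys.modules`, the check is weaker, so the boundary tests should run early in the suite or in a fresh subprocess.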
### PR 1: Safe Dead Artifacts Only

Delete only files/symbols that are truly unreachable and not exported. For
example, `decoding/tuning/auto_tuning.py` looks like a strong candidate because
it imports `from .segmentation import decode_affinity_cc`, which is broken.

Required cleanup:

- Remove its re-exports from `connectomics/decoding/tuning/__init__.py`.
- Add an import test for `connectomics.decoding.tuning`.

Do not include public API removals in this PR.
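The "grep and import checks" gate can be automated so it runs in CI rather than relying on a manual `rg` pass. A stdlib approximation (the real check should still use `rg` across `connectomics`, `scripts`, `tests`, `tutorials`, and `docs`; the function name here is an invention):

```python
# Stdlib approximation of the pre-deletion reference check: list every
# remaining mention of a symbol under a root before deleting it.
from pathlib import Path

def find_references(root, symbol, exts=(".py", ".rst", ".yaml", ".md")):
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix in exts:
            text = path.read_text(errors="ignore")
            for lineno, line in enumerate(text.splitlines(), start=1):
                if symbol in line:
                    hits.append((str(path), lineno, line.strip()))
    return hits

# A symbol is only a deletion candidate when this returns [] for every
# search root listed in the guardrails.
```

This is substring matching, so it over-reports (comments, docstrings) rather than under-reports, which is the right bias for a deletion gate.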
### PR 2: Runtime Naming Extraction

Move the cache-suffix, checkpoint-tag, output-head-tag, and tuning-filename
helpers from `training/lightning/utils.py` to
`connectomics/runtime/output_naming.py`.

Then update the imports in `optuna_tuner.py`, `scripts/main.py`, `model.py`,
and `test_pipeline.py`.

This is the cleanest first boundary move because it removes the lazy
decoding-to-training imports without changing tuning behavior yet.
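To make "pure naming helper" concrete: values in, a string out, no Lightning or datamodule imports. The signature and filename format below are invented for illustration only; the real helper keeps whatever naming scheme it has today.

```python
# Illustrative shape of a pure naming helper after the move: no framework
# imports, just plain values in and a filename out. Signature and format
# are hypothetical, not the real implementation.
def tuning_best_params_filename(study_name: str, cache_suffix: str = "") -> str:
    return f"{study_name}_best_params{cache_suffix}.json"
```

Because callers in decoding or runtime pass plain strings, nothing in this module needs Lightning, which is exactly what lets `connectomics.decoding` stop importing `connectomics.training`.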
### PR 3: Strict Config Pass

Make unknown top-level keys raise in `load_config`.

Delete the verified ghost reads:

- `inference.test_time_augmentation.act`
- `inference.output_act`

Keep genuinely optional section-root access where sections may be `None`.
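A minimal sketch of the strict behavior, assuming a dict-shaped config and an explicit allowlist of top-level sections. Both the allowlist contents and the function name are illustrative, not the real `load_config` internals:

```python
# Hypothetical strict top-level validation: unknown keys raise instead of
# warn. The section names in the allowlist are illustrative.
KNOWN_TOP_LEVEL = {"data", "model", "training", "inference", "decoding", "evaluation"}

def validate_top_level(cfg: dict) -> dict:
    unknown = sorted(set(cfg) - KNOWN_TOP_LEVEL)
    if unknown:
        raise KeyError(f"unknown top-level config keys: {unknown}")
    return cfg
```

Raising with the sorted list of offending keys makes typos like a misspelled section name fail loudly at load time instead of silently producing a half-applied config.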
### PR 4: Evaluation Extraction

Move binary/instance/NERL metrics and report writing into
`connectomics.evaluation`.

Target layout:

```text
connectomics/evaluation/
    __init__.py
    stage.py
    metrics.py
    nerl.py
    report.py
    curvilinear.py
```

`training/lightning/test_pipeline.py` should call evaluation as a service, not
own the metric implementation.
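"Evaluation as a service" can be illustrated with a toy boundary; the dataclass, function name, and the stand-in metric are all hypothetical. The point is the shape: the Lightning side hands over plain arrays and receives a report object, with no `module._*` access in either direction.

```python
# Hypothetical evaluation-as-a-service boundary: plain inputs in, a report
# object out, no Lightning imports and no private module attributes.
from dataclasses import dataclass, field

@dataclass
class EvalReport:
    metrics: dict = field(default_factory=dict)

def evaluate_labels(pred, gt) -> EvalReport:
    # Toy voxelwise agreement standing in for the real binary/instance/NERL
    # metric implementations that would live in connectomics.evaluation.
    agree = sum(1 for p, g in zip(pred, gt) if p == g)
    return EvalReport(metrics={"agreement": agree / len(gt)})
```

A boundary like this is what makes decode-only and evaluate-only workflows possible without constructing a Lightning module at all.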
### PR 5: Tuning Runtime Split

Move `run_tuning`, `load_and_apply_best_params`, and the temporary
inference-override orchestration into `connectomics/runtime/tune_runner.py`.

Keep `connectomics.decoding.tuning` focused on pure parameter search over saved
arrays/artifacts and metric callables.
### PR 6: Chunked Decode Boundary

Move the streamed affinity-CC decode out of `connectomics.inference`.

Suggested targets:

- `connectomics/decoding/streamed_chunked.py`, if it is a decoder concern
- or `connectomics/runtime/chunked_decode.py`, if it remains orchestration-heavy

`connectomics/inference/chunked.py` should only produce raw prediction
artifacts.
### PR 7: Runtime CLI Extraction

Move the dispatch-heavy code out of `scripts/main.py` into runtime modules:

- `runtime/cli.py`
- `runtime/checkpoint_dispatch.py`
- `runtime/cache_resolver.py`
- `runtime/sharding.py`
- `runtime/torch_safe_globals.py`

Keep `scripts/main.py` as a thin entrypoint.
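What "thin entrypoint" means in practice is roughly the following; the mode names and the idea of a dispatch table are assumptions for illustration, since the real CLI surface is not shown here:

```python
# Sketch of a thin entrypoint: argument parsing plus dispatch, nothing else.
# Mode names are illustrative; real work lives in the runtime modules.
import argparse

def main(argv=None) -> str:
    parser = argparse.ArgumentParser(prog="main")
    parser.add_argument("--mode", choices=("train", "test", "tune"), default="train")
    args = parser.parse_args(argv)
    # In the real script each mode would map to a runtime handler, e.g.
    # {"tune": tune_runner.run, ...}; here we just return the chosen mode.
    return args.mode
```

Keeping `main()` at this size makes the runtime modules the only place where behavior can live, which is what the extraction is for.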
### PR 8: Schema Split And Tutorial Migration

Split the decoding and evaluation config out of `config/schema/inference.py`.

Do this atomically with the tutorial migration because it changes config keys:

- `inference.postprocessing.*` to `decoding.postprocessing.*`
- `inference.saved_prediction_path` to `decoding.input_prediction_path`
- `inference.decoding_path` to `decoding.output_path`
- evaluation config into its own schema module

Run tutorial config validation across all tutorial YAMLs in the same PR.
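Since there is no backward compatibility, the renames can be applied as a one-shot rewrite over the tutorial YAMLs. A dict-level sketch of the scalar-key moves (the YAML load/dump around it, and the `postprocessing.*` subtree move, are omitted but follow the same pattern):

```python
# One-shot key migration for the renames above, as a pure dict transform.
# Nested subtree moves (e.g. inference.postprocessing.*) follow the same
# pattern with a whole-section pop instead of a single key.
RENAMES = {
    ("inference", "saved_prediction_path"): ("decoding", "input_prediction_path"),
    ("inference", "decoding_path"): ("decoding", "output_path"),
}

def migrate(cfg: dict) -> dict:
    out = {section: dict(body) for section, body in cfg.items()}
    for (old_sec, old_key), (new_sec, new_key) in RENAMES.items():
        if old_key in out.get(old_sec, {}):
            out.setdefault(new_sec, {})[new_key] = out[old_sec].pop(old_key)
    return out
```

Running this rewrite and the strict-config validation in the same PR is what makes the break atomic: any tutorial YAML the migration misses fails validation immediately.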
### PR 9: Public API Trim

Trim the `__init__.py` exports after the new homes exist.

Do this with explicit import tests so the kept public surface is intentional.
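An import-surface test can be as simple as comparing exported names against an explicit allowlist; the demo below snapshots the stdlib `json` module rather than the connectomics packages, and the helper name is an invention:

```python
# Snapshot-style check of a module's public surface: any accidental export
# or trim shows up as a set difference in the test failure message.
def public_surface(module) -> set:
    names = getattr(module, "__all__", None)
    if names is None:
        # Fall back to non-underscore attributes when __all__ is absent.
        names = [n for n in vars(module) if not n.startswith("_")]
    return set(names)

import json

EXPECTED = {
    "dump", "dumps", "load", "loads",
    "JSONDecoder", "JSONDecodeError", "JSONEncoder",
}
assert public_surface(json) == EXPECTED
```

One such assertion per trimmed package turns "the kept public surface is intentional" from a review convention into a failing test.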
### PR 10: File Splits

Split the remaining large files only after the ownership boundaries are fixed.

Focus on:

- `inference/tta.py`
- `inference/lazy.py`
- `data/augmentation/transforms.py`
- `data/processing/transforms.py`
- `training/lightning/callbacks.py`
- `config/pipeline/config_io.py`
### PR 11: Architecture Rename And Docs Refresh

Rename `nnunet_pretrained` to `nnunet` only with tutorial and docs migration.

Refresh the stale docs in:

- `docs/source/tutorials/*.rst`
- `docs/source/notes/*.rst`
- `docs/source/modules/model.rst`
- `docs/source/modules/models.rst`

Delete duplicate docs only after confirming they are not linked by Sphinx
toctrees.
## Implementation Guardrails

- Every deletion must pass `rg` checks across `connectomics`, `scripts`,
  `tests`, `tutorials`, and `docs`.
- Public export removals need an explicit before/after import-surface test.
- Schema changes need tutorial validation in the same commit.
- Boundary changes need import-graph tests.
- Runtime changes need focused tests for cache-hit, tune, tune-test, and
  checkpoint-derived output paths.
- Performance-sensitive paths need at least smoke benchmarks before and after:
  chunked inference, TTA, lazy prediction, and artifact writes.
## Verdict

Use the v3 plan as a strong architectural roadmap, not as an execution spec.

The high-value sequence is:

1. Add guardrails.
2. Extract runtime naming.
3. Enforce strict config.
4. Move evaluation out of Lightning.
5. Move tuning orchestration out of decoding.
6. Move chunked decoding out of inference.
7. Split the schema and migrate tutorials.
8. Trim public APIs.
9. Split large files by ownership.
10. Refresh docs.

This preserves the intended "no backward compatibility" cleanup while avoiding
accidental deletion of still-reachable APIs and review-hostile churn.
