fix: address issue #1948 #3036
Open
github-actions[bot] wants to merge 1 commit into
Fixes #1948
What was broken
When NaturalImage2DIO was used for 2D PNG datasets, label files saved as RGB (3-channel) PNGs were silently loaded as (3, 1, H, W) instead of (1, 1, H, W). This passed verify_dataset_integrity, which only compared the spatial dimensions of images and segmentations and never checked that the segmentation itself was single-channel. The problem only surfaced later during training, inside the data loader, as a cryptic error raised at nnunetv2/training/dataloading/data_loader_2d.py:92 when the seg array was padded into the pre-allocated single-channel seg_all buffer.
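The shape mismatch described above can be reproduced in isolation. The following is a minimal NumPy sketch, not the nnU-Net code itself: the helper to_channel_first is hypothetical and only assumes that a 2D reader moves the channel axis to the front and inserts a singleton axis, and the buffer assignment only approximates what the data loader does.

```python
import numpy as np


def to_channel_first(img: np.ndarray) -> np.ndarray:
    """Hypothetical helper: mimic a 2D reader that returns (C, 1, H, W)."""
    if img.ndim == 3:
        # (H, W, C) -> (C, 1, H, W): an RGB label ends up with 3 channels
        return img.transpose((2, 0, 1))[:, None]
    # (H, W) -> (1, 1, H, W): a grayscale label has a single channel
    return img[None, None]


H, W = 4, 5
gray_label = np.zeros((H, W), dtype=np.uint8)    # well-formed label
rgb_label = np.zeros((H, W, 3), dtype=np.uint8)  # label saved as RGB

gray_seg = to_channel_first(gray_label)
rgb_seg = to_channel_first(rgb_label)
print(gray_seg.shape)  # (1, 1, 4, 5)
print(rgb_seg.shape)   # (3, 1, 4, 5)

# Pre-allocated single-channel batch buffer: (batch, channel, H, W)
seg_all = np.zeros((2, 1, H, W), dtype=np.int16)
seg_all[0] = gray_seg[:, 0]  # (1, H, W) fits into the slot

try:
    seg_all[1] = rgb_seg[:, 0]  # (3, H, W) does not fit into (1, H, W)
    buffer_error = None
except ValueError as exc:
    buffer_error = str(exc)

print(buffer_error)
```

The failure only appears at the assignment into the buffer, which is why the root cause (a multi-channel label file) is hard to identify from the traceback alone.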
What this fixes
Two minimal, targeted changes make the failure happen early with a clear message instead of crashing mid-training:

1. nnunetv2/imageio/natural_image_reader_writer.py: NaturalImage2DIO.read_seg now checks the channel dimension of the loaded segmentation and raises RuntimeError when a multi-channel label file is encountered. The message explains the cause and the fix (re-save labels as single-channel grayscale images with integer class indices).
2. nnunetv2/experiment_planning/verify_dataset_integrity.py: check_cases additionally asserts segmentation.shape[0] == 1 and prints a clear error pointing at the offending label file, so verify_dataset_integrity now actually catches this dataset issue instead of giving a false "Done" and letting training crash later.

No behavioral change for users with correctly formatted single-channel
labels.
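As a rough illustration of the kind of early check described above, here is a standalone sketch. The function name ensure_single_channel_seg and the message wording are hypothetical; the actual change lives in NaturalImage2DIO.read_seg and check_cases and may differ in detail.

```python
import numpy as np


def ensure_single_channel_seg(seg: np.ndarray, seg_file: str) -> np.ndarray:
    """Hypothetical check: reject segmentations with more than one channel.

    Assumes seg is laid out as (C, 1, H, W), as described in this PR.
    """
    if seg.shape[0] != 1:
        # Illustrative message only, not the text used in the PR
        raise RuntimeError(
            f"Segmentation {seg_file} has {seg.shape[0]} channels, expected 1. "
            "Re-save the label as a single-channel grayscale image with "
            "integer class indices."
        )
    return seg


good_seg = np.zeros((1, 1, 4, 5), dtype=np.uint8)
bad_seg = np.zeros((3, 1, 4, 5), dtype=np.uint8)

# Well-formed labels pass through unchanged
checked = ensure_single_channel_seg(good_seg, "case_000.png")

# Multi-channel labels fail immediately with an actionable message
try:
    ensure_single_channel_seg(bad_seg, "case_001.png")
    check_error = None
except RuntimeError as exc:
    check_error = str(exc)

print(check_error)
```

Failing at load time keeps the error next to the offending file name, rather than at the point where the array is copied into a batch buffer.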
How to verify
1. Build a tiny 2D PNG dataset of the kind that triggers the bug (any 2D dataset using NaturalImage2DIO) and save one of the label PNGs as RGB (e.g. with Pillow's mode="RGB").
2. Run nnUNetv2_plan_and_preprocess -d <ID> --verify_dataset_integrity. With the fix, verify_dataset_integrity fails immediately with an error message pointing at the offending label file, instead of printing "verify_dataset_integrity Done." and proceeding into preprocessing/training before crashing inside the data loader.
3. Re-save the label as a single-channel mode="L" PNG and confirm the pipeline runs through preprocessing as before, with no regression for well-formed datasets.
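The label manipulation in the steps above (saving a label as RGB, then re-saving it as mode="L") could be scripted with Pillow along these lines. This is a sketch under assumptions: the file names and the helper resave_as_grayscale are hypothetical, and convert("L") only preserves class indices when the three RGB channels are identical, as they are for a grayscale label that was merely saved in RGB mode.

```python
import os
import tempfile

from PIL import Image


def resave_as_grayscale(src_path: str, dst_path: str) -> None:
    """Hypothetical helper: re-save a label PNG as single-channel mode "L"."""
    with Image.open(src_path) as img:
        img.convert("L").save(dst_path)


with tempfile.TemporaryDirectory() as tmp:
    # A small label with background 0 and one foreground pixel of class 1
    label = Image.new("L", (8, 8), 0)
    label.putpixel((2, 3), 1)

    # Save it as RGB to produce the kind of file that triggers the bug
    bad_path = os.path.join(tmp, "case_000_rgb.png")
    label.convert("RGB").save(bad_path)
    with Image.open(bad_path) as img:
        bad_mode = img.mode

    # Re-save as single-channel grayscale and confirm the class index survives
    fixed_path = os.path.join(tmp, "case_000.png")
    resave_as_grayscale(bad_path, fixed_path)
    with Image.open(fixed_path) as img:
        fixed_mode = img.mode
        fixed_pixel = img.getpixel((2, 3))

print(bad_mode, fixed_mode, fixed_pixel)
```

Labels that encode classes as distinct colors would need an explicit color-to-index mapping instead of convert("L").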