get_largest_connected_component_mask materializes redundant nonzero index arrays before bincount #8974

Description

@aymuos15

get_largest_connected_component_mask (monai/transforms/utils.py) ranks component sizes with:

nonzeros = features[lib.nonzero(features)]
features_to_keep = lib.argsort(lib.bincount(nonzeros))[::-1][:num_components]

lib.nonzero(features) materializes one full index array per spatial dimension, and features[...] then gathers a copy of every non-background voxel, all just to drop background before bincount. That is unnecessary: bincount over the flattened label field already counts every label, and background is always index 0, so its count can simply be zeroed afterwards.
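To make the cost concrete, here is a minimal sketch on a tiny label field. It assumes lib is NumPy, and the features array is a made-up example, not data from MONAI.

```python
import numpy as np

# Hypothetical 2-D label field: 0 is background, 1 and 2 are components.
features = np.array([[0, 1, 1],
                     [2, 0, 1],
                     [2, 2, 2]])

# Current approach: a tuple with one index array per dimension...
idx = np.nonzero(features)
# ...followed by a gathered copy of every non-background label.
nonzeros = features[idx]
fg_counts = np.bincount(nonzeros)

# bincount over the flattened field counts every label; background is index 0.
full_counts = np.bincount(features.reshape(-1))

print(len(idx), idx[0].size)  # number of index arrays, entries in each
print(fg_counts, full_counts)
```

The two count arrays differ only at index 0, which is the entry the proposed change overwrites.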

Proposed fix

counts = lib.bincount(features.reshape(-1))
counts[0] = 0  # ignore background
features_to_keep = lib.argsort(counts)[::-1][:num_components]

The output is bit-identical (verified). The change avoids both the nonzero index arrays and the gathered copy, which makes it faster and lowers peak memory.
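A rough way to reproduce the equivalence check is sketched below. It assumes lib is NumPy. The helper names keep_original and keep_proposed are hypothetical wrappers around the two snippets above, and the input is random non-negative integers rather than a real connected-component labelling.

```python
import numpy as np


def keep_original(features, num_components):
    # Current code path: drop background first, then count.
    nonzeros = features[np.nonzero(features)]
    return np.argsort(np.bincount(nonzeros))[::-1][:num_components]


def keep_proposed(features, num_components):
    # Proposed code path: count everything, then zero the background bin.
    counts = np.bincount(features.reshape(-1))
    counts[0] = 0  # ignore background
    return np.argsort(counts)[::-1][:num_components]


rng = np.random.default_rng(0)
# Stand-in label field with labels 0..5 (0 plays the role of background).
features = rng.integers(0, 6, size=(8, 8, 8))

for k in (1, 2, 3):
    assert np.array_equal(keep_original(features, k), keep_proposed(features, k))
```

Both paths feed the same count array to argsort whenever at least one non-background label is present. For an all-background input they differ, since the original produces an empty bincount and the proposed version produces a single zeroed bin. Whether the function can reach this line with such an input is not stated here.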
