Title: Labels categorical coloring fails for label ids missing from table rows when *_colors is present in adata.uns #392

@ArneDefauw

Hi, I was developing a napari plugin that uses napari-spatialdata, and observed a bug in the labels categorical coloring path.

Summary

For labels layers, napari-spatialdata colors by merging the label ids from the layer against a categorical vector from adata.obs.

If some label ids are present in the labels layer but do not have a matching row in the table, and the corresponding *_colors key is present in adata.uns, coloring can fail with:

  • DirectLabelColormap validation errors because some color values become NaN
  • ValueError: zip() argument 2 is shorter than argument 1 when the stored palette length does not exactly match the category count

Why this looks incorrect

A labels element can legitimately contain ids that are not represented in the table, for example when:

  • the segmentation is larger than the annotated subset
  • the table only contains active / labeled / filtered instances
  • the table comes from a left join against the labels element

In that case, labels coloring should still succeed and give missing ids a fallback color.
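
As a quick diagnostic, the label ids that have no table row can be listed before any coloring happens. This is only a sketch: the variable names are illustrative, and the ids are the same toy values used in the repro further down, not data from a real dataset.

```python
import pandas as pd

# Ids present in the labels layer (toy values).
element_indices = pd.Series([1, 2, 3], name="element_indices")

# Instance ids that actually have a row in the table (toy values).
table_instance_ids = pd.Series([1, 2], name="instance_id")

# Label ids without a matching table row; these are the ones that need a fallback color.
missing_ids = element_indices[~element_indices.isin(table_instance_ids)].tolist()
print(missing_ids)
```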

Current behavior

In the categorical labels path, napari-spatialdata:

  1. takes layer.metadata["indices"]
  2. merges those ids against a categorical adata.obs vector
  3. builds a category-to-color mapping from adata.uns[f"{key}_colors"] if present
  4. maps merged values to colors
  5. builds a DirectLabelColormap

If *_colors is absent, a missing-value fallback color is added.

If *_colors is present, no missing fallback is added, so merge-missing rows stay NaN and later become invalid label colors.

Also, the stored palette branch uses strict zipping between categories and colors, so incomplete palettes raise immediately.

Expected behavior

  • Missing label ids should get a default “unannotated / missing” color instead of crashing.
  • Stored palettes should be handled defensively when their length does not exactly match the category count.

Suggested fix

In the categorical labels coloring path:

  • always handle merge-missing values explicitly, even when *_colors already exists in adata.uns
  • avoid zip(..., strict=True) for stored palettes, or validate/fallback gracefully before zipping

Environment

Observed with:

  • napari-spatialdata 0.7.0
  • spatialdata 0.7.2

Minimal repro:

import numpy as np
import pandas as pd
from anndata import AnnData
from napari.utils import DirectLabelColormap

# Simulate a labels layer with ids 1, 2, 3
element_indices = pd.Series([1, 2, 3], name="element_indices")

# Simulate a table that only has rows for ids 1 and 2
adata = AnnData(
    shape=(2, 0),
    obs=pd.DataFrame(
        {
            "instance_id": [1, 2],
            "user_class": pd.Categorical([0, 1], categories=[0, 1]),
        }
    ),
)

# Stored categorical palette is present
adata.uns["user_class_colors"] = np.array(["#80808099", "#ff0000"], dtype=object)

# Same logic as napari-spatialdata categorical labels coloring
vec = adata.obs[["instance_id", "user_class"]].set_index("instance_id")["user_class"]
colors = adata.uns["user_class_colors"]
color_dict = dict(zip(vec.cat.categories, colors.tolist(), strict=True))

merge_df = pd.merge(
    element_indices,
    vec,
    left_on="element_indices",
    right_index=True,
    how="left",
)

merge_df["color"] = merge_df["user_class"].map(color_dict)
print(merge_df)
# element_indices=3 has no table row, so color becomes float NaN

index_color_mapping = dict(zip(merge_df["element_indices"], merge_df["color"], strict=False))
print(index_color_mapping)
# {1: '#80808099', 2: '#ff0000', 3: nan}

DirectLabelColormap(color_dict=index_color_mapping)
# Fails because the color for label id 3 is NaN, not a valid color

I think I should be able to draft a PR for this issue next week.
