
MaskGenerationPipeline: is_last never True on final partial batch, silently dropping results #46123

@J3r3myPerera

System Info

transformers version: current main
Affected file: src/transformers/pipelines/mask_generation.py

Who can help?

No response

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Description

I was looking into the MaskGenerationPipeline and noticed that when you set a points_per_batch value that doesn't divide evenly into the total number of grid points, the pipeline quietly drops the results from the last batch — no error, no warning, just missing masks.

The root cause is this line in preprocess:
is_last = i == n_points - points_per_batch

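For example, take n_points=100 and points_per_batch=64. The loop runs at i=0 and i=64. At i=64, the check evaluates 64 == 100 - 64, i.e. 64 == 36, which is False. So the final batch never gets flagged as the last one. The check can only be True when n_points is an exact multiple of points_per_batch.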
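
The arithmetic can be checked in isolation. This is a minimal sketch that mirrors only the loop bounds and the flag expression quoted above. `last_flags` is a hypothetical helper written for illustration and is not part of transformers.

```python
def last_flags(n_points, points_per_batch):
    """Return (i, is_last) for each batch start, using the check quoted above."""
    return [
        (i, i == n_points - points_per_batch)
        for i in range(0, n_points, points_per_batch)
    ]


# Partial final batch: no batch is ever flagged as last.
print(last_flags(100, 64))  # [(0, False), (64, False)]

# Exact multiple: the final batch is flagged as expected.
print(last_flags(128, 64))  # [(0, False), (64, True)]
```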

The pipeline's PipelinePackIterator relies on this is_last flag to know when to stop accumulating results. When it never sees is_last=True, it calls next() on an already-finished generator, hits StopIteration, and exits — leaving the last batch's masks on the floor.
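
To illustrate why a missing flag loses data, here is a simplified stand-in for that accumulation pattern. It is not the actual PipelinePackIterator code: `pack` and the dict keys are assumptions made for this sketch, and the real iterator may surface the StopIteration differently.

```python
def pack(batches):
    """Group batches into lists, closing a group when is_last is True.

    Simplified stand-in for the accumulation described above.
    """
    iterator = iter(batches)
    while True:
        accumulator = []
        is_last = False
        while not is_last:
            try:
                item = next(iterator)
            except StopIteration:
                # Source exhausted before a closing flag arrived:
                # whatever is in `accumulator` is never yielded.
                return
            is_last = item["is_last"]
            accumulator.append(item["masks"])
        yield accumulator


flagged = [{"masks": "a", "is_last": False}, {"masks": "b", "is_last": True}]
unflagged = [{"masks": "a", "is_last": False}, {"masks": "b", "is_last": False}]

print(list(pack(flagged)))    # [['a', 'b']]
print(list(pack(unflagged)))  # []
```

In this sketch, everything accumulated since the last closing flag is discarded once the source runs dry, with no error raised to the caller.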

With SAM's default point grid, n_points is rarely a round multiple of the default points_per_batch=64, so this silently affects most real-world usage.

Reproduction

from transformers import pipeline
from PIL import Image
import requests

image = Image.open(requests.get("http://images.cocodataset.org/val2017/000000039769.jpg", stream=True).raw)

generator = pipeline("mask-generation", model="facebook/sam-vit-base")

# points_per_batch=50 causes n_points % points_per_batch != 0 for typical grids
outputs_partial = generator(image, points_per_batch=50)
outputs_full    = generator(image, points_per_batch=None)  # all at once, no batching

# outputs_partial["masks"] will have fewer masks than outputs_full["masks"]
print(len(outputs_partial["masks"]), "vs", len(outputs_full["masks"]))

Expected behavior

All generated masks should be returned regardless of whether n_points is a multiple of points_per_batch.
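
One possible way to get this behavior is to flag the batch whose end reaches or passes n_points, rather than comparing against an exact offset. The sketch below is only a suggestion under that assumption. `iter_batches` is a hypothetical helper that models the loop bounds, not the pipeline's preprocess method.

```python
def iter_batches(n_points, points_per_batch):
    """Yield (start, end, is_last) for each batch, including a partial final one."""
    for i in range(0, n_points, points_per_batch):
        # The last batch is the one whose end reaches or passes n_points.
        is_last = i + points_per_batch >= n_points
        yield i, min(i + points_per_batch, n_points), is_last


print(list(iter_batches(100, 64)))  # [(0, 64, False), (64, 100, True)]
print(list(iter_batches(128, 64)))  # [(0, 64, False), (64, 128, True)]
```

With this check, exactly one batch is flagged per image whether or not points_per_batch divides n_points.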
