Quality of scene level consistency #21

@physics-constrained-Real2Sim

Description

Hi, thank you for open-sourcing the great work. I really enjoyed reading it.

I have a question about scene-level consistency under occlusion. In the demo figures in your paper, most objects appear cleanly visible, but occlusions are common in real-world scenes. For instance, in the example where the chair is heavily occluded by the toy bear, it is not clear how well the method handles such cases. (The picture is from SAM3D.)
Image

Image

I was wondering whether this limitation might be related to the underlying assumptions in methods like VGGT and TRELLIS, since they do not explicitly address amodal segmentation (i.e., reasoning about the full extent of partially occluded objects).

I’m not raising this as criticism. Instead, I’m genuinely interested in understanding the current best practices. Specifically:

  1. What are some recommended approaches for multi-view 3D reconstruction with strong scene-level consistency under occlusion?
  2. Do you think recent works like the multi-view SAM3D paper (e.g., the arXiv version) move in this direction, or is this still an open challenge?

I would really appreciate any insights or pointers you could share. Thanks again for your great work!
