
Fix VGF runtime aborts on 0-dim tensor inputs and Python scalar coercion (#19446) #19446

Open: psiddh wants to merge 1 commit into pytorch:main from psiddh:export-D104603739


psiddh commented May 11, 2026

Summary:

The ARM-backend VGF tests (e.g. test_sum_dim_intlist_vgf_quant and
test_sum_dim_intlist_vgf_no_quant for all 19 parametrizations) were
hard-aborting the pytest process because of two latent bugs that compounded
(items 1 and 2 below); item 3 describes the regression tests:

  1. C++ aten_bridge nullptr assert on 0-dim tensors (T270603238).
    executorch/extension/aten_util/aten_bridge.cpp::check_tensor_meta had two
    unconditional ET_CHECK_MSG(b.{sizes,strides}().data() != nullptr, ...)
    asserts. For 0-dim (scalar) tensors, sizes()/strides() are empty
    IntArrayRefs whose .data() may legitimately return nullptr. The
    process aborted on every valid scalar tensor input. Fix: gate the nullptr
    checks on b.dim() > 0. The subsequent loops are no-ops when dim() == 0,
    and dim_order_to_stride_nocheck already early-returns for dims == 0
    (dim_order_util.h:132-134), so the relaxed asserts are safe.

  2. VGF Python runner over-wrapping non-tensor inputs (Error::InvalidArgument
    0x12).
    runner_fb.run_vgf previously called torch.tensor(x) on
    every non-tensor input (including None/bool/int), producing
    0-dim tensors. The lowered method's signature, however, expects EValue
    tags Int/Bool/None for those slots, so receiving a Tensor
    caused Method::set_inputs to reject the inputs. The pybindings layer
    (pybindings.cpp:804-809) already natively handles
    None/bool/int Python objects; the runner just had to stop
    interfering. Fix: only wrap Python float (and other unknown types) as
    0-dim tensors, which covers the original motivation for the wrapping
    (addmm alpha/beta). Pass None/bool/int through unchanged.

  3. Regression tests for the C++ fix in
    executorch/extension/aten_util/test/aten_bridge_test.cpp:
    AliasETensorToATenTensorZeroDim and AliasATTensorToETensorZeroDim
    construct true 0-dim tensors via at::scalar_tensor and verify the
    bridge does not abort. The existing AliasETensorToATenTensorFail death
    test still fires for ranked tensors with empty strides because that case
    has dim() == 3 > 0.
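
The runner-side rule from item 2 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the actual runner_fb.run_vgf code: `coerce_vgf_inputs`, `ScalarTensor`, `is_tensor`, and `wrap_scalar` are hypothetical names, and `ScalarTensor` is a stand-in for a 0-dim torch tensor so that the sketch runs without torch installed.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Sequence


@dataclass
class ScalarTensor:
    """Hypothetical stand-in for a 0-dim tensor (not a real torch type)."""
    value: Any


def coerce_vgf_inputs(
    inputs: Sequence[Any],
    is_tensor: Callable[[Any], bool] = lambda x: isinstance(x, ScalarTensor),
    wrap_scalar: Callable[[Any], Any] = ScalarTensor,
) -> List[Any]:
    """Wrap only values that are not tensors, None, bool, or int."""
    coerced: List[Any] = []
    for x in inputs:
        if is_tensor(x) or x is None or isinstance(x, (bool, int)):
            # Tensors and None/bool/int pass through unchanged, so the
            # binding layer can map them to their native tags.
            coerced.append(x)
        else:
            # Python float (and any other unknown type) is wrapped.
            coerced.append(wrap_scalar(x))
    return coerced


result = coerce_vgf_inputs([None, True, 3, 2.5])
```

With torch available, one would presumably pass `is_tensor=lambda x: isinstance(x, torch.Tensor)` and `wrap_scalar=torch.tensor` instead of the stand-ins.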

Fixes T270603238.

Differential Revision: D104603739

Copilot AI review requested due to automatic review settings May 11, 2026 04:16
psiddh requested a review from JacobSzwejbka as a code owner May 11, 2026 04:16
pytorch-bot Bot commented May 11, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/19446

Note: Links to docs will display an error until the docs builds have been completed.

⏳ 1 Pending, 2 Unrelated Failures

As of commit 358b4e2 with merge base a49171d:

FLAKY - The following jobs failed but were likely due to flakiness present on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

meta-cla Bot added the CLA Signed label May 11, 2026

meta-codesync Bot commented May 11, 2026

@psiddh has exported this pull request. If you are a Meta employee, you can view the originating Diff in D104603739.

github-actions commented

This PR needs a release notes: label

If your change should be included in the release notes (i.e. would users of this library care about this change?), please use a label starting with release notes:. This helps us keep track and include your important work in the next release notes.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "release notes: none"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

psiddh requested review from GregoryComer and rascani May 11, 2026 04:20

Copilot AI left a comment


Pull request overview

Fixes hard aborts when bridging between ATen tensors and ExecuTorch tensors for valid 0-dim (scalar) tensors by avoiding invalid nullptr assertions on empty metadata arrays.

Changes:

  • Relax check_tensor_meta to only assert non-null sizes/strides storage for ranked tensors (dim() > 0).
  • Add regression tests covering zero-dimensional aliasing in both bridge directions.
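
The gating described in the first bullet can be modelled in a few lines. The sketch below is a Python model of the idea only, not the C++ in aten_bridge.cpp: `check_meta_sketch` is a hypothetical name, `None` stands in for a null `.data()` pointer, `ValueError` stands in for an `ET_CHECK_MSG` abort, and the per-dimension check is a placeholder for whatever validation the real function performs.

```python
from typing import Optional, Sequence


def check_meta_sketch(
    dim: int,
    sizes: Optional[Sequence[int]],
    strides: Optional[Sequence[int]],
) -> None:
    """Model of a metadata check whose null checks are gated on dim > 0."""
    if dim > 0:
        # Only ranked tensors must have backing storage for their metadata.
        if sizes is None:
            raise ValueError("sizes storage is null for a ranked tensor")
        if strides is None:
            raise ValueError("strides storage is null for a ranked tensor")
    # Placeholder per-dimension validation; runs zero times when dim == 0,
    # so null metadata is never dereferenced for a scalar tensor.
    for i in range(dim):
        if sizes[i] < 0:
            raise ValueError(f"negative size at dimension {i}")


# A 0-dim tensor with null metadata passes.
check_meta_sketch(0, None, None)

# A ranked tensor with null strides is still rejected.
try:
    check_meta_sketch(3, [2, 3, 4], None)
    ranked_rejected = False
except ValueError:
    ranked_rejected = True
```

The second call mirrors the case the existing death test is said to cover: the tensor has dim() == 3, so the gate does not skip the null check.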

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| extension/aten_util/aten_bridge.cpp | Gates sizes/strides nullptr checks on dim() > 0 to allow valid scalar tensor metadata. |
| extension/aten_util/test/aten_bridge_test.cpp | Adds regression tests for 0-dim tensor aliasing to ensure the bridge doesn’t abort. |


Comment thread extension/aten_util/aten_bridge.cpp Outdated
Comment thread extension/aten_util/test/aten_bridge_test.cpp Outdated
Comment thread extension/aten_util/test/aten_bridge_test.cpp Outdated
psiddh requested a review from digantdesai May 11, 2026 04:20
psiddh force-pushed the export-D104603739 branch from 2700570 to dcbbbf0 May 11, 2026 04:33
meta-codesync Bot changed the title from "Fix VGF runtime aborts on 0-dim tensor inputs and Python scalar coercion" to "Fix VGF runtime aborts on 0-dim tensor inputs and Python scalar coercion (#19446)" May 11, 2026
psiddh force-pushed the export-D104603739 branch from dcbbbf0 to 358b4e2 May 11, 2026 04:36

Labels: CLA Signed, fb-exported, meta-exported
