Arm bug: bilinear downscale by exactly 1/16 lowers to invalid TOSA RESIZE #19069

@Rob-Hughes-Arm

Description

🐛 Describe the bug

Summary

ExecuTorch Arm lowering can emit a TOSA RESIZE operator that violates the
TOSA lower bound on bilinear downscale factors.

For a bilinear resize with align_corners=False and scale factor exactly
1/16, the lowered TOSA graph contains:

scales = [2, 32, 2, 32]
offset = [15, 15]
border = [-15, -15]

That encodes an exact spatial downscale of 1/16 in both dimensions, but
TOSA requires:

scale_y_d < 16 * scale_y_n
scale_x_d < 16 * scale_x_n

So the exact boundary case 1/16 is invalid and should not be emitted.
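The ERROR_IF condition can be checked directly against the emitted scales vector (a minimal sketch; variable names mirror the TOSA spec):

```python
# scales is laid out as [scale_y_n, scale_y_d, scale_x_n, scale_x_d]
scales = [2, 32, 2, 32]
scale_y_n, scale_y_d, scale_x_n, scale_x_d = scales

# TOSA's ERROR_IF condition for bilinear RESIZE
invalid = (scale_y_d >= 16 * scale_y_n) or (scale_x_d >= 16 * scale_x_n)
print(invalid)  # True: 32 == 16 * 2, exactly on the forbidden boundary
```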

Why this is a real bug

This is a backend correctness issue before any downstream compiler or runtime
is involved:

  • the ExecuTorch Arm lowering succeeds
  • the emitted .tosa flatbuffer is rejected by the official TOSA Reference
    Model
  • the failure is reproducible with a one-op model using public PyTorch APIs

Minimal model

import torch
import torch.nn as nn
import torch.nn.functional as F


class TinyResizeProbe(nn.Module):
    def forward(self, x):
        return F.interpolate(
            x,
            scale_factor=1.0 / 16.0,
            mode="bilinear",
            align_corners=False,
        )

Repro: stock ExecuTorch Arm lowering

import shutil
from pathlib import Path

import torch
import torch.nn as nn
import torch.nn.functional as F
from executorch.backends.arm.quantizer import (
    VgfQuantizer,
    get_symmetric_quantization_config,
)
from executorch.backends.arm.tosa.compile_spec import TosaCompileSpec
from executorch.backends.arm.tosa.partitioner import TOSAPartitioner
from executorch.backends.arm.tosa.specification import TosaSpecification
from executorch.exir import to_edge_transform_and_lower
from executorch.exir.capture._config import EdgeCompileConfig


ARTIFACT_DIR = Path("artifacts/tiny_resize_invalid_tosa")


class TinyResizeProbe(nn.Module):
    def forward(self, x):
        return F.interpolate(
            x,
            scale_factor=1.0 / 16.0,
            mode="bilinear",
            align_corners=False,
        )


def strip_unused_guard_nodes(graph_module):
    for node in list(graph_module.graph.nodes):
        if node.op == "call_module" and node.target == "_guards_fn" and len(node.users) == 0:
            graph_module.graph.erase_node(node)
    graph_module.graph.lint()
    graph_module.recompile()


shutil.rmtree(ARTIFACT_DIR, ignore_errors=True)
ARTIFACT_DIR.mkdir(parents=True, exist_ok=True)

x = torch.randn(1, 3, 256, 448)
model = TinyResizeProbe().eval()

exported_program = torch.export.export(model, (x,), strict=True)
graph_module = exported_program.module()
strip_unused_guard_nodes(graph_module)

quantizer = VgfQuantizer(TosaSpecification.create_from_string("TOSA-1.0+INT+int16"))
qconfig = get_symmetric_quantization_config(
    is_per_channel=True,
    is_qat=False,
    is_dynamic=False,
    act_qmin=-127,
    act_qmax=127,
    weight_qmin=-127,
    weight_qmax=127,
)
quantizer.set_global(qconfig).set_io(qconfig)

quantized_graph = quantizer.quantize_with_submodules(
    graph_module,
    calibration_samples=[(x,)],
    is_qat=False,
)
quantized_exported = torch.export.export(quantized_graph, (x,))

compile_spec = TosaCompileSpec(
    TosaSpecification.create_from_string("TOSA-1.0+INT+int16")
).dump_intermediate_artifacts_to(str(ARTIFACT_DIR))

partitioner = TOSAPartitioner(compile_spec)
to_edge_transform_and_lower(
    quantized_exported,
    partitioner=[partitioner],
    compile_config=EdgeCompileConfig(_check_ir_validity=False),
)

This lowering succeeds and emits:

artifacts/tiny_resize_invalid_tosa/output_tag0_TOSA-1.0+INT+int16.tosa

Reference-model validation

Build the TOSA Reference Model from any tosa-tools checkout:

cmake -S <tosa-tools-root> -B <tosa-tools-root>/build-refmodel \
  -DCMAKE_BUILD_TYPE=Release \
  -DTOSA_ENABLE_PROJECTS=reference_model
cmake --build <tosa-tools-root>/build-refmodel -j

For the reproduced artifact, the graph input and output names were:

ifm_name = quantized_decomposed_quantize_per_tensor_default
ofm_name = tosa_transpose_default_1

Use a descriptor like:

{
  "tosa_file": "output_tag0_TOSA-1.0+INT+int16.tosa",
  "ifm_name": [
    "quantized_decomposed_quantize_per_tensor_default"
  ],
  "ifm_file": [
    "input.npy"
  ],
  "ofm_name": [
    "tosa_transpose_default_1"
  ],
  "ofm_file": [
    "output.npy"
  ]
}

Then run:

<tosa-tools-root>/build-refmodel/reference_model/reference_model/tosa_reference_model \
  --test_desc=desc.json \
  -d ALL -l HIGH

Observed failure

The TOSA Reference Model rejects the generated graph:

ERROR_IF() fails ... ((scale_y_d >= 16 * scale_y_n) || (scale_x_d >= 16 * scale_x_n))
OpResize: invalid attribute scale
...
Input[1]  Name: tosa_resize_default_scales DType=SHAPE ... ShapeValue=[2, 32, 2, 32]
Input[2]  Name: tosa_resize_default_offset DType=SHAPE ... ShapeValue=[15, 15]
Input[3]  Name: tosa_resize_default_border DType=SHAPE ... ShapeValue=[-15, -15]
Graph result: ERROR.

Root cause

The Arm backend computes resize parameters by reducing the output/input ratio
to lowest terms and then doubling numerator and denominator for TOSA encoding.

For this repro:

  • 256 -> 16 becomes (scale_n, scale_d, offset, border) = (2, 32, 15, -15)
  • 448 -> 28 becomes (scale_n, scale_d, offset, border) = (2, 32, 15, -15)

Those values are mathematically consistent with an exact 1/16 downscale, but
TOSA does not allow the boundary case. The backend currently checks only
integer range constraints before emission, not the stricter TOSA legality rule
for minimum supported downscale.

Affected code paths

In the local tree used for validation, the bug traces to these paths:

  • backends/arm/tosa/utils.py
  • backends/arm/operators/op_tosa_resize.py

The former computes the doubled reduced ratio plus the offset and border
terms; the latter serializes the resulting RESIZE attributes after checking
only that the values fit int16 ranges.

Expected behavior

ExecuTorch Arm should not emit invalid TOSA here.

Reasonable fixes include:

  • reject the model during lowering with a clear error explaining that bilinear
    downscale must be strictly greater than 1/16
  • legalize the resize into a valid sequence before TOSA emission
  • add a regression test that covers the exact 1/16 boundary
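The first option could be a small guard in the lowering path. A sketch, using a hypothetical helper name chosen here for illustration only:

```python
def check_tosa_resize_scale(scale_n: int, scale_d: int, axis: str) -> None:
    """Reject bilinear RESIZE parameters that TOSA's ERROR_IF would flag."""
    if scale_d >= 16 * scale_n:
        raise ValueError(
            f"TOSA bilinear RESIZE requires scale_{axis}_d < 16 * scale_{axis}_n; "
            f"got scale_{axis}_n={scale_n}, scale_{axis}_d={scale_d} "
            f"(downscale factor must be strictly greater than 1/16)"
        )


# The repro's parameters fail the check:
try:
    check_tosa_resize_scale(2, 32, "y")
except ValueError as e:
    print(e)
```

Raising during lowering turns a silent invalid artifact into an actionable error at export time.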

Actual behavior

Lowering succeeds and emits a .tosa flatbuffer that the official TOSA
Reference Model rejects as invalid.

Versions

Collecting environment information...
PyTorch version: 2.10.0+cpu
Is debug build: False
CUDA used to build PyTorch: Could not collect
ROCM used to build PyTorch: N/A

OS: Microsoft Windows 11 Enterprise (10.0.26100 64-bit)
GCC version: Could not collect
Clang version: Could not collect
CMake version: Could not collect
Libc version: N/A

Python version: 3.10.5 (tags/v3.10.5:f377153, Jun 6 2022, 16:14:13) [MSC v.1929 64 bit (AMD64)] (64-bit runtime)
Python platform: Windows-10-10.0.26100-SP0
Is CUDA available: False
CUDA runtime version: Could not collect
CUDA_MODULE_LOADING set to: N/A
GPU models and configuration: GPU 0: NVIDIA GeForce RTX 4070 Ti
Nvidia driver version: 576.88
cuDNN version: Could not collect
Is XPU available: False
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True
Caching allocator config: N/A

CPU:
Name: 13th Gen Intel(R) Core(TM) i9-13900KF
Manufacturer: GenuineIntel
Family: 207
Architecture: 9
ProcessorType: 3
DeviceID: CPU0
CurrentClockSpeed: 3000
MaxClockSpeed: 3000
L2CacheSize: 32768
L2CacheSpeed: None
Revision: None

Versions of relevant libraries:
[pip3] executorch==1.2.0.dev20260305+cpu
[pip3] numpy==2.1.3
[pip3] pytorch_tokenizers==1.1.0
[pip3] torch==2.10.0
[pip3] torchao==0.15.0
[pip3] torchvision==0.25.0
[conda] Could not collect

cc @digantdesai @freddan80 @per @zingo @oscarandersson8218 @mansnils @Sebastian-Larsson @robell
