[Relax][ONNX] Support 3D AffineGrid by guan404ming · Pull Request #19863 · apache/tvm

guan404ming · 2026-06-22T06:49:34Z

Related Issue

closes #19689

Why

The Relax AffineGrid op only handled 2D (4D theta/grid); 5D 3D inputs from ONNX failed.

How

Generalize struct-info inference to 2D/3D via spatial = size_sinfo->ndim.
Branch TOPI affine_grid compute on 2D vs 3D.
Add the 3D permute path in the frontend and a test_affine_grid_3d case.

gemini-code-assist

Code Review

This pull request extends the affine_grid operator to support 3D sampling grids in addition to the existing 2D support, updating the ONNX frontend, TOPI implementation, type inference, and adding a test case. The review feedback suggests refactoring the ONNX frontend to avoid duplicating the grid emission and permutation logic, and replacing the branched 2D/3D computation in TOPI with a generic N-D implementation to improve maintainability.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

cbalint13

LGTM @guan404ming , thanks !

cbalint13 · 2026-06-22T09:52:43Z

@guan404ming , could you update to solve the conflicts ?

guan404ming · 2026-06-22T09:58:47Z

Thanks @cbalint13, just resolved!

tqchen · 2026-06-22T10:33:46Z

+    )
+
+    model = helper.make_model(graph, producer_name="affine_grid_3d_test")
+    check_correctness(model, opset=20)


seems we have been bringing in some of the tests that involves check_correctness against onxx runtime.
I know that we have been doing so in past.

Unfortunately keep doing so would slow down our ci and and a better way would be not doing integration tests in frontends and instead rely on cheaper structural verify, would be great if we can hold off merging changes that involves integration tests in frontends. For some of the key relax op), they should be constructed as relax ast and run verification per OP as UT (perhaps against pytorch.

Make sense to me, thanks for letting me know. Just switched test_affine_grid_3d to a structural verify, no more onnxruntime execution, just imports and asserts on the IR (affine_grid + permute_dims and output shape).

@tqchen ,

Apologise to approve with no oversee on this issue.

I addressed a comment for a possible solution to replace onnxruntime with onnx internal ReferenceEvaluator, see the pros/cons there.

cbalint13 · 2026-06-22T10:44:13Z

@guan404ming , @tlopex
Cc @Hzfengsy , @Msr-H, @tqchen

Perhaps not here is the best discussion for this (out of context), but I rise it here:

Why testcases did not use onnx own provided ReferenceEvaluator ?

To my understanding (and some experience) ReferenceEvaluator is not only for functional correctness (indeed onnxruntime also does it) but provides specification compliance too. It comes with onnx package itself, provides coverage for operator histories too. Also any particular operator can be also sweeped over its all existing op versions using history schema, (see a random testcase example) . ONNX really excels at reflecting its own schema with reference validation too.

I am not saying we should drop onnxruntime now, but if we want excelent onnx coverage with full history schema and compliance we may consider this.

Later EDIT:

Concerning speed , ~514 tests (include ops all own versions) runs in ~20sec on the example provided.

Signed-off-by: Guan-Ming (Wesley) Chiu <105915352+guan404ming@users.noreply.github.com>

tqchen · 2026-06-22T10:58:45Z

On reference evaluator, my guess is reference evaluator would even be slower because it generates reference results and also is integration runs, so it would be worse than onxxruntime in terms of ci time.

I think the main issue is to consider the balance of ci pressure and our core features. Ideally the invariance are:

For importers, we cross validate the ingestion correctness via stuctural check
We have some levels of correctness checks in op UT, which might help

An altnertative could be we split out the frontend modules into separate repos, but still there would be an ci pressure concern there

cbalint13 · 2026-06-22T11:09:39Z

On reference evaluator, my guess is reference evaluator would even be slower because it generates reference results and also is integration runs, so it would be worse than onxxruntime in terms of ci time.

I think the main issue is to consider the balance of ci pressure and our core features. Ideally the invariance are:

For importers, we cross validate the ingestion correctness via stuctural check

We have some levels of correctness checks in op UT, which might help

An altnertative could be we split out the frontend modules into separate repos, but still there would be an ci pressure concern there

@tqchen

Despite its pure python its pretty lightweight, not slower, I was also surprised.

I give you a ruff number: ~514 tests (include ops all own versions) runs in ~20sec on the example , in this 20 sec the lowering to mlir and instantiating the mlir runtime to compute the operator is also included.

I am up for a task to see if we can benefit ReferenceEvaluator over onnxruntime, can migrate all testcases within a Draft PR to see numbers, I dont mind if experiment proves the otherwise and we drop the idea.

Let me know what you think.

tqchen · 2026-06-22T11:22:10Z

For onnx tests, i think it is still better to move to the principle of structural verification

For importers, we cross validate the ingestion correctness via stuctural check
We have some levels of correctness checks in op UT, which might help

So for onnx tests we can rely on structural verification. We can perhaps have some form of side integration checks that are not run by every ci run, eg. nightly.

cbalint13 · 2026-06-23T11:06:43Z

@guan404ming , can't be merged it until change-request (by @tqchen) is not in resolved state.
Thank you !

guan404ming · 2026-06-23T15:51:46Z

Agreed on structural verification for importer tests. For the heavier numerical checks, I think we could borrow Airflow's canary model: one CI workflow where pull_request runs only fast structural/op-UT checks, while schedule, post-merge push, or a canary label additionally run integration tests via a reusable workflow_call job. Fast PRs, end-to-end coverage, opt-in escape hatch.

Refs: triggers, canary gating , integration as workflow_call

guan404ming · 2026-06-23T15:52:03Z

Thanks @cbalint13!

gemini-code-assist Bot reviewed Jun 22, 2026

View reviewed changes

Comment thread python/tvm/relax/frontend/onnx/onnx_frontend.py Outdated

Comment thread python/tvm/topi/image/grid_sample.py Outdated

guan404ming force-pushed the fix/onnx-affinegrid-3d branch from 8f704ef to 4957ad7 Compare June 22, 2026 07:08

guan404ming requested review from cbalint13 and tlopex June 22, 2026 09:07

cbalint13 approved these changes Jun 22, 2026

View reviewed changes

[Relax][ONNX] Support 3D AffineGrid

35eddd9

guan404ming force-pushed the fix/onnx-affinegrid-3d branch from 4957ad7 to 35eddd9 Compare June 22, 2026 09:58

cbalint13 approved these changes Jun 22, 2026

View reviewed changes

tqchen requested changes Jun 22, 2026

View reviewed changes

Switch to structural verify

3a59dde

Signed-off-by: Guan-Ming (Wesley) Chiu <105915352+guan404ming@users.noreply.github.com>

guan404ming requested a review from tqchen June 22, 2026 10:54

cbalint13 approved these changes Jun 23, 2026

View reviewed changes

Conversation

guan404ming commented Jun 22, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Related Issue

Why

How

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

Uh oh!

Uh oh!

cbalint13 left a comment

Choose a reason for hiding this comment

Uh oh!

cbalint13 commented Jun 22, 2026

Uh oh!

guan404ming commented Jun 22, 2026

Uh oh!

tqchen Jun 22, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

guan404ming Jun 22, 2026

Choose a reason for hiding this comment

Uh oh!

cbalint13 Jun 22, 2026

Choose a reason for hiding this comment

Uh oh!

cbalint13 commented Jun 22, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

tqchen commented Jun 22, 2026

Uh oh!

cbalint13 commented Jun 22, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

tqchen commented Jun 22, 2026

Uh oh!

cbalint13 commented Jun 23, 2026

Uh oh!

guan404ming commented Jun 23, 2026

Uh oh!

guan404ming commented Jun 23, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

guan404ming commented Jun 22, 2026 •

edited

Loading

tqchen Jun 22, 2026 •

edited

Loading

cbalint13 commented Jun 22, 2026 •

edited

Loading

cbalint13 commented Jun 22, 2026 •

edited

Loading