Support PyTorch 2.9 by hemanth1999k · Pull Request #2743 · apple/coremltools

hemanth1999k · 2026-06-18T03:56:00Z

Summary

Starting with torch 2.9, torch.export.export() returns an ExportedProgram in the new TRAINING IR dialect by default (it used to be ATEN). The PyTorch frontend only accepts ATEN/EDGE, so _validate_conversion_arguments rejects every torch.export-based model on torch 2.9 before conversion even starts:

NotImplementedError: Conversion for models with only ATEN or EDGE dialect is supported/tested.
Provided Dialect: TRAINING. Run '.run_decompositions({})' on your exported PyTorch Model prior to conversion.

This isn't one broken op — it breaks essentially every ct.convert(exported_program, ...) call the moment you upgrade to torch 2.9.

Part of #2615.

The error message already tells the user the remedy (run_decompositions({})), and the converter's own testing_utils runs exactly that after every torch.export.export(...). This PR just moves that one step inside convert() so existing user code keeps working without changes.

Fix

In convert(), before the argument validation, if the model is an ExportedProgram whose dialect is not ATEN/EDGE, lower it with model.run_decompositions({}).
The lowered (ATEN) program then flows through validation and into mil_convert unchanged.
No-op for torch <= 2.8 (already ATEN) and for EDGE (ExecuTorch), so those paths are untouched.
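
The guard described above can be sketched roughly as follows. This is a duck-typed illustration under stated assumptions, not the code from this PR: `lower_if_needed` and `FakeProgram` are hypothetical names, and the stand-in class only mimics the two members the sketch relies on (`dialect` and `run_decompositions`).

```python
from dataclasses import dataclass, replace

# Dialects the PyTorch frontend accepts, per the PR description.
SUPPORTED_DIALECTS = ("ATEN", "EDGE")


@dataclass(frozen=True)
class FakeProgram:
    """Hypothetical stand-in for torch.export.ExportedProgram."""

    dialect: str

    def run_decompositions(self, decomp_table=None):
        # The real method returns a new, lowered program. Here we only model
        # the dialect change described in the PR (TRAINING -> ATEN).
        return replace(self, dialect="ATEN")


def lower_if_needed(program):
    """Return `program` lowered to ATEN unless it is already ATEN or EDGE."""
    if program.dialect in SUPPORTED_DIALECTS:
        return program  # no-op: torch <= 2.8 exports and ExecuTorch programs
    return program.run_decompositions({})


training = FakeProgram(dialect="TRAINING")
edge = FakeProgram(dialect="EDGE")
print(lower_if_needed(training).dialect)
print(lower_if_needed(edge) is edge)
```

Because the helper only touches `dialect` and `run_decompositions`, the same shape of check should also apply to a real `ExportedProgram` returned by `torch.export.export(...)`; that pairing is an assumption of this sketch and is not exercised here.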

Test

Adds TestPyTorchConverterExamples.test_convert_exported_program_training_dialect: it exports a small Linear+ReLU model and calls ct.convert(...) directly, with no manual run_decompositions(). On torch 2.9 the exported program is in the TRAINING dialect (so this is the regression guard); on older torch it's ATEN and the test still passes.

Verification

Built against coremltools 9.0 + torch 2.9.0 on macOS (arm64):

Without this change: every torch.export conversion I tried — linear/relu, layer_norm, conv1d, sdpa, where, pow, floor_divide, instance_norm3d — fails with the Provided Dialect: TRAINING error above.
With this change: the same models convert, and predictions match PyTorch within fp16 tolerance (e.g. layer_norm + linear: max abs diff ~3.5e-4).
torch <= 2.8 is unaffected; the new branch only fires for a non-ATEN/EDGE dialect.
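
For readers who want to reproduce the "within fp16 tolerance" comparison, a minimal sketch of a max-abs-diff check is shown below. The tolerance `FP16_ATOL` and the sample values are assumptions chosen for illustration; they are not taken from the PR or from the coremltools test utilities.

```python
def max_abs_diff(expected, actual):
    """Largest element-wise absolute difference between two flat sequences."""
    if len(expected) != len(actual):
        raise ValueError("sequences must have the same length")
    if not expected:
        raise ValueError("sequences must not be empty")
    return max(abs(e - a) for e, a in zip(expected, actual))


# Assumed tolerance for fp16 outputs; pick a value that suits your model.
FP16_ATOL = 1e-3

# Hypothetical outputs standing in for PyTorch and Core ML predictions.
torch_out = [0.1250, -0.5000, 0.7500]
coreml_out = [0.1252, -0.5003, 0.7499]

diff = max_abs_diff(torch_out, coreml_out)
print(f"max abs diff: {diff:.1e}")
assert diff <= FP16_ATOL, "prediction mismatch exceeds the assumed tolerance"
```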

One thing this PR deliberately leaves alone: _TORCH_MAX_VERSION and reqs/pytorch.pip. A few op-level signature changes in 2.9 still need their own fixes (e.g. hann_window now reports a different arg count, which breaks stft), so I didn't want to claim 2.9 is fully tested. This is just the dialect-level unblock that everything else on 2.9 sits behind.

Starting with torch 2.9, torch.export.export() returns an ExportedProgram in the new TRAINING IR dialect by default instead of the ATEN dialect. The converter only accepts ATEN/EDGE, so every torch.export-based conversion failed on torch 2.9 with a NotImplementedError telling users to run run_decompositions() themselves. convert() now lowers any non-ATEN/EDGE ExportedProgram to ATEN via run_decompositions() automatically, so existing convert() calls keep working on torch 2.9 with no source changes. No-op for torch <= 2.8 (ATEN default) and for EDGE (ExecuTorch). Adds a regression test. Part of apple#2615.

TobyRoseman

I'm a bit hesitant to merge any change that gives only partial PyTorch 2.9 support, as we will not be able to properly test those changes in CI without bumping the PyTorch version it uses.

Any chance you could look into making us fully support 2.9?

TobyRoseman · 2026-06-18T23:00:11Z


+    @staticmethod
+    @pytest.mark.skipif(not _HAS_TORCH_EXPORT_API, reason="torch.export API not available.")
+    def test_convert_exported_program_training_dialect():


We can't properly test this in CI until we update the version of PyTorch that it uses.

hemanth1999k (reply, Jun 28, 2026): Bumped the CI torch pin to 2.9.0 (and _TORCH_MAX_VERSION), so this now runs against a torch where export() defaults to the TRAINING dialect and actually covers the lowering path.

TobyRoseman · 2026-06-18T23:00:39Z

+        exact_source == "pytorch"
+        and _HAS_TORCH_EXPORT_API
+        and isinstance(model, ExportedProgram)
+        and model.dialect not in ("ATEN", "EDGE")


Wouldn't we also want to test the version of PyTorch installed?

hemanth1999k (reply, Jun 28, 2026): Good call. Added assert exported_program.dialect not in ("ATEN", "EDGE") (guarded on torch >= 2.9) so the test provably drives this path on the installed torch instead of passing as a no-op.
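
A version-guarded assertion of this kind might look like the sketch below. The helper names `parse_version` and `check_exercises_lowering` are hypothetical, and the version is parsed by hand so the example needs only the standard library; the actual test may compare versions differently.

```python
import re


def parse_version(version):
    """Extract (major, minor) from strings such as '2.9.0' or '2.10.0+cu121'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def check_exercises_lowering(dialect, torch_version):
    """Fail if an export from torch >= 2.9 would bypass the lowering branch."""
    if parse_version(torch_version) >= (2, 9):
        assert dialect not in ("ATEN", "EDGE"), (
            f"expected a non-ATEN/EDGE dialect on torch {torch_version}, "
            f"got {dialect}"
        )


# On torch >= 2.9 the export is expected to be in the TRAINING dialect.
check_exercises_lowering("TRAINING", "2.9.0")
# On older torch the assertion is skipped, so ATEN is accepted.
check_exercises_lowering("ATEN", "2.8.0")
```

Comparing integer tuples rather than raw strings matters here: as strings, "2.10.0" sorts before "2.9.0", which would silently disable the guard on later releases.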

hemanth1999k · 2026-06-18T23:12:16Z

Makes sense. I'll bump the torch pin to 2.9 and fix the remaining op breakages so CI can test it properly, then update this PR.

…er hann_window.periodic

Bumps _TORCH_MAX_VERSION and the arm64 torch pin to 2.9.0 so CI exercises 2.9. Fixes the op-level breakages that 2.9's torch.export path surfaces:
- hann_window: the handler required 5/6 positional inputs (TorchScript shape); torch.export/ExecuTorch pass only window_length (+ periodic). Use per-frontend expected/min_expected and detect 'periodic' by input count + frontend.
- hann_window.periodic overload was unregistered (sanitize_op_kind doesn't strip the 'periodic' suffix) -> register it as a torch_alias.
- rms_norm: required exactly 4 inputs; export omits the optional weight/eps when defaulted. Relax to min 2 and index weight/eps defensively.

Adds frontend coverage to test_hann_window and a new TestRMSNorm, so CI validates the export path. Verified locally against torch 2.9.0: convert + predict match PyTorch within fp16 tolerance for both periodic variants and weight/no-weight.
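
The per-frontend input-count detection for hann_window could be sketched like this. The function name and the frontend labels (`"torchscript"`, `"torchexport"`, `"executorch"`) are hypothetical; the input counts (5/6 for TorchScript, 1/2 for export) are the ones quoted in the commit message above.

```python
def hann_window_has_periodic(num_inputs, frontend):
    """Decide whether a hann_window node carries an explicit `periodic` input.

    TorchScript nodes carry window_length plus dtype/layout/device/pin_memory
    (5 inputs), with `periodic` as an optional sixth. Export-based frontends
    pass only window_length, optionally followed by `periodic`.
    """
    if frontend == "torchscript":
        without_periodic, with_periodic = 5, 6
    elif frontend in ("torchexport", "executorch"):
        without_periodic, with_periodic = 1, 2
    else:
        raise ValueError(f"unknown frontend: {frontend!r}")

    if num_inputs == with_periodic:
        return True
    if num_inputs == without_periodic:
        return False
    raise ValueError(
        f"hann_window expects {without_periodic} or {with_periodic} inputs "
        f"on {frontend}, got {num_inputs}"
    )


print(hann_window_has_periodic(6, "torchscript"))
print(hann_window_has_periodic(1, "torchexport"))
```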

hemanth1999k · 2026-06-18T23:22:53Z

Done — pushed full 2.9 support on top of the dialect fix:

Bumped _TORCH_MAX_VERSION and the arm64 torch pin to 2.9.0 so CI exercises 2.9.
Fixed the op breakages 2.9's torch.export path surfaces:
- hann_window: handler required 5/6 positional inputs (TorchScript shape); export passes only window_length (+ periodic). Made it per-frontend, and registered the hann_window.periodic overload (sanitize_op_kind doesn't strip periodic) — this also unblocks stft.
- rms_norm: required exactly 4 inputs; export omits optional weight/eps when defaulted. Relaxed to min 2.
Added frontend coverage to test_hann_window + a new TestRMSNorm so CI validates the export path.
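
The "min 2, index defensively" handling for rms_norm can be illustrated with a small sketch. `unpack_rms_norm_inputs` is a hypothetical helper, and the input order (input, normalized_shape, weight, eps) is assumed to follow the ATen rms_norm signature; the converter's real handler operates on MIL variables rather than plain Python values.

```python
def unpack_rms_norm_inputs(inputs):
    """Split rms_norm inputs, tolerating omitted trailing optionals.

    Accepts 2 to 4 inputs: (x, normalized_shape[, weight[, eps]]).
    Missing optionals are returned as None so the caller can apply defaults.
    """
    if not 2 <= len(inputs) <= 4:
        raise ValueError(f"rms_norm expects 2 to 4 inputs, got {len(inputs)}")
    x, normalized_shape = inputs[0], inputs[1]
    weight = inputs[2] if len(inputs) > 2 else None
    eps = inputs[3] if len(inputs) > 3 else None
    return x, normalized_shape, weight, eps


# Export omits weight/eps when they are left at their defaults.
print(unpack_rms_norm_inputs(["x", [8]]))
# The TorchScript-style call still supplies all four inputs.
print(unpack_rms_norm_inputs(["x", [8], "w", 1e-6]))
```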

Ran a ~40-op probe against torch 2.9.0 locally: everything converts and matches PyTorch within fp16 tolerance, except hamming/blackman/bartlett/kaiser_window — those were never implemented (not 2.9 regressions). Should be safe to bump CI now.

TobyRoseman · 2026-06-18T23:39:07Z

CI: https://gitlab.com/coremltools1/coremltools/-/pipelines/2612956158

executorch>=0.7.0 resolved to the latest (1.3.1, which needs torch>=2.12), making the install ResolutionImpossible against torch==2.9.0. executorch 1.0.x is the release built for torch 2.9 (requires torch>=2.9,<2.10 and torchao==0.14.0), so pin to it and bump torchao to 0.14.0 to match (also fixes the test_coreml_quantizer collection error under torch 2.9).

…ompositions bug)

torch 2.9's ExportedProgram.run_decompositions({}) raises 'NameError: name L is not defined' while interpreting the _guards_fn submodule it generates for dynamic-shape exports that carry shape guards (e.g. unfold's H/W >= f(kernel, dilation, padding, stride) constraint). This is an upstream torch regression, not a converter bug: static-shape unfold and every other export op are unaffected (verified: 240 passed / 240 skipped / 0 failed for TestUnfold on the export frontend). Skip the guarded dynamic-shape cases on torch>=2.9 until the torch issue is resolved.

hemanth1999k · 2026-06-20T03:40:06Z

Thanks for running CI — went through the 3 failures:

1 & 2 (test_executorch, coremltools_test) — dependency resolution. executorch>=0.7.0 resolved to 1.3.1, which requires torch>=2.12, so the install was ResolutionImpossible against torch 2.9. ExecuTorch 1.0.x is the release built for torch 2.9 (needs torch>=2.9,<2.10 + torchao==0.14.0), so I pinned executorch>=1.0.0,<1.1.0 and bumped torchao 0.12.0 → 0.14.0 (which also clears the test_coreml_quantizer collection error).
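
To see why the open-ended executorch>=0.7.0 requirement failed while the bounded pin resolves, the range logic can be mimicked with a simplified sketch. `release_tuple` and `in_range` are hypothetical helpers that handle only plain dotted numeric versions; pip's real resolver follows the full PEP 440 rules, and "1.0.1" is an illustrative 1.0.x version string rather than a confirmed release.

```python
def release_tuple(version):
    """Turn a plain dotted version such as '1.3.1' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def in_range(version, lower, upper):
    """True if lower <= version < upper (simplified, numeric-only compare)."""
    return release_tuple(lower) <= release_tuple(version) < release_tuple(upper)


# The new pin: executorch>=1.0.0,<1.1.0
print(in_range("1.0.1", "1.0.0", "1.1.0"))  # an illustrative 1.0.x version
print(in_range("1.3.1", "1.0.0", "1.1.0"))  # the release pip picked before

# executorch 1.0.x in turn requires torch>=2.9,<2.10, which 2.9.0 satisfies.
print(in_range("2.9.0", "2.9", "2.10"))
```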

3 (test_pytorch_export) — all 208 failures were test_unfold[is_dynamic_hw=True], and it's an upstream torch 2.9 bug. torch 2.9's ExportedProgram.run_decompositions({}) raises NameError: name 'L' is not defined while interpreting the _guards_fn submodule it generates for dynamic-shape exports that carry shape guards (unfold constrains H/W ≥ f(kernel, dilation, padding, stride)). I reproduced it minimally — it fails on both strict=True and strict=False, and it's not specific to the converter (static-shape unfold and every other export op convert fine). Locally, TestUnfold on the export frontend is now 240 passed / 240 skipped / 0 failed.

Since it's a torch regression rather than something we can fix here, I skipped the guarded dynamic-shape unfold cases on torch>=2.9 with a comment to re-enable once torch fixes it. Happy to file/track the torch issue if you'd like it referenced by number instead.

TobyRoseman · 2026-06-22T23:19:57Z

+            # a converter bug; static-shape unfold and all other export ops are
+            # unaffected. Re-enable once the torch regression is resolved.
+            pytest.skip(
+                "rdar://torch-2.9 run_decompositions() NameError on _guards_fn "


Remove the "rdar://torch-2.9". That doesn't make any sense.

hemanth1999k (reply, Jun 28, 2026): Done. Dropped the rdar:// prefix; the skip reason is now just the torch 2.9 run_decompositions() / _guards_fn explanation.

TobyRoseman · 2026-06-22T23:22:04Z

Updated CI: https://gitlab.com/coremltools1/coremltools/-/pipelines/2621391035

hemanth1999k · 2026-06-28T00:32:11Z

Pushed updates addressing all review comments: removed the rdar reference, and added a dialect assertion so the training-dialect test provably exercises the auto-lowering path on the installed torch (the CI pin is now 2.9.0). Ready for another look whenever you can re-run CI — thanks!

TobyRoseman · 2026-06-30T18:35:24Z

Updated CI: https://gitlab.com/coremltools1/coremltools/-/pipelines/2641356509

TobyRoseman reviewed Jun 18, 2026

hemanth1999k changed the title from "Auto-lower torch 2.9 TRAINING export dialect in convert()" to "Support PyTorch 2.9" on Jun 18, 2026

hemanth1999k mentioned this pull request Jun 19, 2026

Add hamming/blackman/bartlett window op converters #2744

Open

hemanth1999k added 2 commits June 19, 2026 21:57

TobyRoseman reviewed Jun 22, 2026

hemanth1999k added 2 commits June 27, 2026 18:08

Drop internal rdar reference from torch 2.9 unfold skip message

466c124

Assert installed torch drives the training-dialect lowering path in test

7ae3e38

2 participants