Wrap iOS18 quantization errors with ExecuTorch-specific hint #19249
john-rocky wants to merge 2 commits into pytorch:main
Conversation
When the user lowers a model that was prepared with torchao's `quantize_(...)` (e.g. blockwise int4) but does not pass an iOS18+ `minimum_deployment_target` to the CoreML partitioner, coremltools raises a generic ValueError pointing at coremltools internals. The user has no obvious way to discover that the target is set via `CoreMLBackend.generate_compile_specs` and plumbed through `CoreMLPartitioner(compile_specs=...)`. Catch the ValueError around the two coremltools utilities used by our overridden `dequantize_affine` / `dequantize_codebook` handlers and re-raise it with an ExecuTorch-flavored hint that shows the exact partitioner call to make. Fixes pytorch#13122.
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/19249
Note: Links to docs will display an error until the docs builds have been completed.
Hi @john-rocky! Thank you for your pull request and welcome to our community.

Action Required

In order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you.

Process

In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (e.g. your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with `CLA signed`. If you have received this in error or have any questions, please contact us at cla@meta.com. Thanks!
This PR needs a `release notes:` label.
Summary
When a model prepared with torchao's `quantize_(...)` (e.g. blockwise int4) is lowered without an iOS18+ `minimum_deployment_target`, coremltools raises a `ValueError` from inside `_construct_constexpr_dequant_op`. The message is technically correct, but it does not tell the ExecuTorch user how to set the deployment target: the answer is buried in `CoreMLBackend.generate_compile_specs(...)` plus `CoreMLPartitioner(compile_specs=...)`, which is not obvious unless you've already been through the docs.
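For context, the call the new hint points users at looks roughly like this. This is a minimal sketch following the usual ExecuTorch CoreML flow; the import paths are assumptions, not copied from this PR:

```python
# A minimal sketch of the partitioner call the hint recommends,
# assuming the standard ExecuTorch CoreML import paths.
import coremltools as ct

from executorch.backends.apple.coreml.compiler import CoreMLBackend
from executorch.backends.apple.coreml.partition import CoreMLPartitioner

# Build compile specs with an iOS18+ target and hand them to the partitioner.
compile_specs = CoreMLBackend.generate_compile_specs(
    minimum_deployment_target=ct.target.iOS18
)
partitioner = CoreMLPartitioner(compile_specs=compile_specs)
```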
The two `dequantize_affine` / `dequantize_codebook` handlers in `backends/apple/coreml/compiler/torch_ops.py` are the only call sites where the failing coremltools utilities are invoked from ExecuTorch code, so I wrap them and re-raise the error with an additional hint that shows the exact partitioner call to make. After this change, the raised error carries that hint alongside the original coremltools message.
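The shape of the change is roughly the following. This is a sketch, not the PR diff: the decorator name and the exact hint wording are illustrative.

```python
# A minimal sketch of the re-raise pattern used around the coremltools
# utilities; the decorator name and hint text here are illustrative,
# not the exact code in this PR.
from functools import wraps

_IOS18_HINT = (
    "ExecuTorch: this quantization scheme needs an iOS18+ deployment "
    "target. Build specs with CoreMLBackend.generate_compile_specs("
    "minimum_deployment_target=ct.target.iOS18) and pass them via "
    "CoreMLPartitioner(compile_specs=...)."
)

def _with_ios18_hint(fn):
    @wraps(fn)
    def inner(*args, **kwargs):
        try:
            return fn(*args, **kwargs)
        except ValueError as e:
            # Preserve the original coremltools message and chain the
            # traceback, so nothing is lost -- only annotated.
            raise ValueError(f"{e}\n{_IOS18_HINT}") from e
    return inner
```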
Fixes #13122.
Test plan
Added `test_dequantize_affine_below_ios18_raises_with_hint`, which lowers a PerGroup int4-quantized linear with `minimum_deployment_target=ct.target.iOS17` and asserts that the raised `ValueError` mentions both `iOS18` and the `CoreMLPartitioner` / `minimum_deployment_target` keywords (a sketch of the test's shape is below). The existing iOS18 quantization tests still pass (`test_dequantize_affine_b4w_linear` was exercised locally to confirm the wrapper does not affect the success path).

Authored with Claude.
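A rough sketch of the test's shape; the module paths and the torchao config below are assumptions, not copied from the PR's test file:

```python
# A rough sketch of the new test's shape; module paths and the torchao
# config are assumptions, not copied from the PR's test file.
import coremltools as ct
import pytest
import torch
from executorch.backends.apple.coreml.compiler import CoreMLBackend
from executorch.backends.apple.coreml.partition import CoreMLPartitioner
from executorch.exir import to_edge_transform_and_lower
from torchao.quantization import quantize_
from torchao.quantization.granularity import PerGroup
from torchao.quantization.quant_api import IntxWeightOnlyConfig


def test_dequantize_affine_below_ios18_raises_with_hint():
    model = torch.nn.Linear(64, 64).eval()
    # Group-wise int4 weight-only quantization (assumed config).
    quantize_(
        model,
        IntxWeightOnlyConfig(weight_dtype=torch.int4, granularity=PerGroup(32)),
    )
    ep = torch.export.export(model, (torch.randn(1, 64),))

    # iOS17 on purpose: below the iOS18 floor for this quantization scheme.
    specs = CoreMLBackend.generate_compile_specs(
        minimum_deployment_target=ct.target.iOS17
    )
    with pytest.raises(ValueError, match="iOS18") as excinfo:
        to_edge_transform_and_lower(
            ep, partitioner=[CoreMLPartitioner(compile_specs=specs)]
        )
    # The re-raised error should name the knobs the user must touch.
    assert "CoreMLPartitioner" in str(excinfo.value)
    assert "minimum_deployment_target" in str(excinfo.value)
```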