Qualcomm AI Engine Direct - AMD backend error #18098
abhinaykukkadapu merged 3 commits into pytorch:main
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/18098
Note: Links to docs will display an error until the docs builds have been completed.
❌ 6 New Failures, 1 Cancelled Job, 4 Unrelated Failures as of commit ae571ae with merge base c7f1d72.
NEW FAILURES - The following jobs have failed:
CANCELLED JOB - The following job was cancelled. Please retry:
BROKEN TRUNK - The following jobs failed but were present on the merge base: 👉 Rebase onto the `viable/strict` branch to avoid these failures
This comment was automatically generated by Dr. CI and updates every 15 minutes.
This PR needs a
Hi @cccclai, @abhinaykukkadapu,
digantdesai left a comment
I assume this is during eager model runs?
Would you mind creating a ticket on PyTorch/PyTorch?
Hi @digantdesai,
@winskuo-quic there are a bunch of failures; can we rebase? I will monitor and push if the CI is green. Thanks
Force-pushed from 6455008 to f54c655
@abhinaykukkadapu,
```python
import os
from .scripts.download_qnn_sdk import install_qnn_sdk, is_linux_x86, QNN_ZIP_URL
import cpuinfo
```
Move this to a lazy import within the `if "amd" in vendor:` branch.
I have moved all of these into the try/except block.
```python
info = cpuinfo.get_cpu_info()
vendor = info.get("vendor_id_raw", "").lower()
if "amd" in vendor:
```
Please also consider surrounding this in try/except in case of an ImportError, with logging to let the user know to install py-cpuinfo.
Added a try/except block.
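For reference, a minimal sketch of the guarded, lazy import discussed above (the helper name `is_amd_cpu` is hypothetical; `cpuinfo.get_cpu_info()` and the `vendor_id_raw` key follow the py-cpuinfo API):

```python
import logging


def is_amd_cpu() -> bool:
    """Return True if the host CPU vendor string contains "amd"."""
    try:
        # Lazy import: py-cpuinfo is only needed here and may be absent.
        import cpuinfo
    except ImportError:
        logging.warning(
            "py-cpuinfo is not installed; run `pip install py-cpuinfo` "
            "to enable CPU vendor detection."
        )
        return False
    info = cpuinfo.get_cpu_info()
    vendor = info.get("vendor_id_raw", "").lower()
    return "amd" in vendor
```

With this shape, a missing py-cpuinfo degrades to a logged warning and a `False` result instead of a hard import failure at module load time.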
Thanks @winskuo-quic, provided a few suggestions. Also, we would want something like this: `executorch/.ci/scripts/setup-openvino.sh`, line 47 in 28f3cf3. If you can, add it in this PR; otherwise I can take it up.
Force-pushed from f54c655 to 5e7314f
Hi @abhinaykukkadapu,
Thanks @winskuo-quic for addressing the comments. I think we missed one more spot to run the pip install requirements.
@abhinaykukkadapu has imported this pull request. If you are a Meta employee, you can view this in D99858579.
Summary
We noticed that when performing inference on an AMD CPU, we run into a `Floating point exception (core dumped)`. This can be easily reproduced with the following lines of code:
```python
import torch
import torch.nn as nn

w2_conv = nn.Conv2d(1536, 32, 1, bias=False)
x = torch.randn(1, 1536, 1, 32)
w2_conv(x)
```
A temporary solution is to disable mkldnn:

```python
torch.backends.mkldnn.enabled = False
```
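Combining the vendor check from the review thread with this workaround, a conditional version could look like the sketch below (the function name is illustrative, not the code merged in this PR; both imports are guarded so the helper degrades gracefully when a dependency is missing):

```python
import logging


def maybe_disable_mkldnn() -> bool:
    """Disable torch's mkldnn backend when running on an AMD CPU.

    Returns True only if the flag was actually flipped. This is a sketch
    of the temporary workaround, not the PR's actual implementation.
    """
    try:
        # Both dependencies may be absent; skip the workaround if so.
        import cpuinfo
        import torch
    except ImportError as err:
        logging.warning("Skipping mkldnn workaround: %s", err)
        return False
    vendor = cpuinfo.get_cpu_info().get("vendor_id_raw", "").lower()
    if "amd" in vendor:
        torch.backends.mkldnn.enabled = False
        return True
    return False
```

Keeping the flag flip behind a vendor check avoids paying the mkldnn performance cost on CPUs that are not affected by the crash.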
Test plan
NA
cc @cccclai @cbilgin