Support for Intel XPU / Intel ARC GPUs #1329

Open
Sikerdebaard wants to merge 2 commits into MIC-DKFZ:master from Sikerdebaard:xpu

@Sikerdebaard

Hi everyone,

I am excited to announce that I have begun adding Intel XPU support to nnU-Net through IPEX (Intel Extension for PyTorch), which will allow training and inference on Intel ARC GPUs. However, the code needs further testing and optimization before merging, so I am sharing it with the community in the hope that others can contribute.

Currently, the code has only been tested on CPU and Intel XPU, so there may be bugs that still need to be addressed. Furthermore, I have noticed that training on an AMD 7900X CPU is faster than training on the Intel ARC A770 GPU with this code. Additionally, the XPU backend only supports BFloat16 precision at this time.

If you are interested in helping with this project, please feel free to contribute or provide feedback.
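
A minimal sketch of the kind of device selection the description above implies, assuming a PyTorch build where Intel GPUs are exposed as `torch.xpu` (either natively or after importing `intel_extension_for_pytorch`). The helper names `pick_device` and `autocast_dtype_name` are hypothetical and are not taken from this PR. Only the XPU-to-BFloat16 mapping comes from the description; the CUDA entry is an assumption.

```python
from typing import Optional, Sequence


def pick_device(preferred: Sequence[str] = ("cuda", "xpu", "mps")) -> str:
    """Return the first available accelerator name, falling back to "cpu"."""
    try:
        import torch
    except ImportError:
        return "cpu"  # PyTorch is not installed, so there is nothing to probe

    try:
        # Assumption: on older PyTorch builds, importing IPEX registers torch.xpu.
        import intel_extension_for_pytorch  # noqa: F401
    except (ImportError, OSError):
        pass  # IPEX is optional; newer PyTorch builds may ship torch.xpu natively

    checks = {
        "cuda": lambda: torch.cuda.is_available(),
        "xpu": lambda: hasattr(torch, "xpu") and torch.xpu.is_available(),
        "mps": lambda: hasattr(torch.backends, "mps")
        and torch.backends.mps.is_available(),
    }
    for name in preferred:
        if name in checks and checks[name]():
            return name
    return "cpu"


def autocast_dtype_name(device: str) -> Optional[str]:
    """Name of the reduced-precision dtype to use on a device, if any.

    "xpu" -> "bfloat16" follows the PR description; "cuda" -> "float16"
    is an assumption made for illustration.
    """
    return {"xpu": "bfloat16", "cuda": "float16"}.get(device)


if __name__ == "__main__":
    device = pick_device()
    print(device, autocast_dtype_name(device))
```

On a machine without any accelerator (or without PyTorch), `pick_device()` returns `"cpu"` and `autocast_dtype_name("cpu")` returns `None`, which a caller could treat as "run in full precision".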

@FabianIsensee
Member

Hey, thanks for this amazing work! I like how you are abstracting the backends into separate classes. Today we released nnU-Net v2, which already extends the supported devices to cuda, cpu and mps.
For the next couple of weeks I will be quite busy with an upcoming evaluation, but after that I would like to discuss how we can use this principle in nnU-Net v2 to make integrating new devices less tedious. May I get back to you on that?
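
For illustration, the "backends as separate classes" idea mentioned above could look roughly like the following. This is a sketch under stated assumptions, not the code from this PR: the class names (`Backend`, `CpuBackend`, `XpuBackend`), the registry, and `get_backend` are all hypothetical.

```python
from abc import ABC, abstractmethod
from typing import Dict, Optional, Type


class Backend(ABC):
    """Hypothetical base class: one subclass per compute device."""

    name: str = ""

    @abstractmethod
    def is_available(self) -> bool:
        """Report whether this device can be used on the current machine."""

    def autocast_dtype(self) -> Optional[str]:
        """Reduced-precision dtype name, or None for full precision."""
        return None


_REGISTRY: Dict[str, Type[Backend]] = {}


def register(cls: Type[Backend]) -> Type[Backend]:
    """Class decorator that makes a backend discoverable by name."""
    _REGISTRY[cls.name] = cls
    return cls


@register
class CpuBackend(Backend):
    name = "cpu"

    def is_available(self) -> bool:
        return True  # the CPU is always present


@register
class XpuBackend(Backend):
    name = "xpu"

    def is_available(self) -> bool:
        try:
            import torch
        except ImportError:
            return False
        return hasattr(torch, "xpu") and torch.xpu.is_available()

    def autocast_dtype(self) -> Optional[str]:
        return "bfloat16"  # per the PR description, XPU is BFloat16-only for now


def get_backend(name: str) -> Backend:
    """Instantiate a registered backend; raises KeyError for unknown names."""
    return _REGISTRY[name]()
```

With this layout, supporting a new device means adding one decorated subclass rather than touching device checks scattered through the training and inference code.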

@FabianIsensee
Member

Hey Thomas, I think Fabric is the way to go for this in the future. I will work on adding Fabric to nnU-Net soon.
https://lightning.ai/pages/open-source/fabric/

@Sikerdebaard
Author

Sikerdebaard commented Mar 31, 2023

Hi Fabian,

If you are looking into frameworks as a solution, then ONNX might be worth considering as well. It is backed by Microsoft.
It seems that neither framework, Lightning nor ONNX, supports Intel XPU out of the box for training yet, but for inference ONNX can already use XPU through the oneDNN API. Furthermore, with ONNX it is possible to convert the model to TensorFlow and then to TensorFlow.js, which could be a useful addition.

@FabianIsensee
Member

Hey, I am quite confident that Fabric will support XPUs soon. I talked to one of their developers recently and they seem highly motivated to include everything that is needed for broad adoption. Fabric integrates seamlessly into existing PyTorch code, which is why I like this solution. It works for both training and inference.
If certain formats, like ONNX, are required for running inference in some circumstances, then it would be better to have some ONNX export code that takes care of that.
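
To make the Fabric point above concrete, here is a minimal sketch of how Lightning Fabric wraps an otherwise ordinary PyTorch training loop. It assumes the `lightning` package is installed. The toy linear model, the random data, and the function name `train_with_fabric` are placeholders for illustration and have nothing to do with nnU-Net's actual trainer. Whether `accelerator="auto"` selects an Intel XPU depends on the installed Fabric version, which this sketch does not check.

```python
from typing import List


def train_with_fabric(steps: int = 3, accelerator: str = "auto") -> List[float]:
    """Run a few optimizer steps on a toy model and return the losses."""
    # Imported lazily so that merely defining the function needs no extra packages.
    import torch
    from lightning.fabric import Fabric

    fabric = Fabric(accelerator=accelerator, devices=1)
    fabric.launch()

    # Plain PyTorch objects; Fabric moves them to the selected device.
    model = torch.nn.Linear(4, 1)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
    model, optimizer = fabric.setup(model, optimizer)

    losses: List[float] = []
    for _ in range(steps):
        x = torch.randn(8, 4, device=fabric.device)
        y = torch.randn(8, 1, device=fabric.device)
        optimizer.zero_grad()
        loss = torch.nn.functional.mse_loss(model(x), y)
        fabric.backward(loss)  # replaces loss.backward()
        optimizer.step()
        losses.append(loss.item())
    return losses
```

Compared with a hand-written loop, the only Fabric-specific lines are the `Fabric(...)` construction, `fabric.setup(...)`, and `fabric.backward(loss)`; the device string never appears in the loop itself, which is what makes adding a new accelerator less invasive.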

@FabianIsensee
Member

Is XPU integration still something that you need? If so, I can look into enabling that in nnU-Net. Since I don't have an ARC GPU, I would need someone to test that it works.
