Motivation
The newer Macs with Apple Silicon (M1 and up) are quite powerful; even the lowest-end M1 MacBook Air is impressive. In addition, the Apple platform is well suited to ML workloads thanks to its unified memory architecture (all system RAM can be used as GPU memory with no performance penalty).
Apple's accelerated API is called MPS (Metal Performance Shaders). It is not at all compatible with CUDA, so supporting it requires porting all the kernels as well as writing the stub code.
Additionally, the Mac is a very popular platform for developers. Supporting macOS natively in the popular torch libraries (as a longer-term goal) means we don't have to resort to expensive Nvidia cloud VMs for every single task.
Proposed solution
@Titus-von-Koeller Feel free to edit this issue as you see fit, for example if you want a different structure for it.