Commit f5be4fa

Minor edits

1 parent d13fad5

1 file changed: 3 additions, 4 deletions

docs/blog/posts/benchmark-amd-vms.md

Lines changed: 3 additions & 4 deletions
```diff
@@ -69,7 +69,7 @@ This demonstrates that from a model training and hardware utilization perspectiv
 
 This initial benchmark deliberately focused on a single-GPU setup to establish a baseline. A more production-representative evaluation would compare multi-GPU VMs with multi-GPU bare-metal systems. In multi-GPU inference, bare-metal’s direct hardware access could offer an advantage. For distributed training, however, where all GPUs are fully engaged, the performance between VM and bare-metal would likely be even closer.
 
-Furthermore, it's important to note that the performance gap in virtualized setups can often be narrowed significantly with expert hypervisor tuning, such as CPU pinning and NUMA node alignment.
+Furthermore, it's important to note that the performance gap in virtualized setups can potentially be narrowed significantly with expert hypervisor tuning, such as CPU pinning and NUMA node alignment.
 
 **Multi-node**
 
```
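The sentence changed in this hunk names CPU pinning and NUMA node alignment as hypervisor tuning levers. As an illustrative sketch only (not taken from the benchmark repo, and Linux-only), the same pinning idea can be shown from userspace with the standard library; hypervisor-level pinning such as libvirt's `<vcpupin>` applies it one layer down, fixing each vCPU thread to a dedicated host core so the scheduler cannot migrate it across NUMA nodes:

```python
import os

# Hedged illustration of CPU pinning (Linux-only): restrict this process
# to a fixed core so the scheduler cannot move it. Core 0 is chosen only
# because every Linux host has it; real tuning would pick cores on the
# NUMA node closest to the GPU.
os.sched_setaffinity(0, {0})            # 0 = the calling process
print(sorted(os.sched_getaffinity(0)))  # -> [0]
```

In a virtualized GPU setup the analogous knobs live in the hypervisor config rather than the guest, which is why the commit attributes the narrowing of the gap to "expert hypervisor tuning".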
````diff
@@ -199,10 +199,9 @@ python3 trl/scripts/sft.py \
     --lora_alpha 16
 ```
 
-<!-- ## Source code
-
-All source code and findings are available in [our GitHub repo :material-arrow-top-right-thin:{ .external }](https://github.com/dstackai/benchmarks/tree/main/amd/baremetal_vms){:target="_blank"}. -->
+## Source code
 
+All source code and findings are available in our [GitHub repo :material-arrow-top-right-thin:{ .external }](https://github.com/dstackai/benchmarks/tree/main/amd/single_gpu_vm_vs_bare-metal){:target="_blank"}.
 
 ## References
 
````
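The training command in this hunk ends with `--lora_alpha 16`. As a hedged aside (the rank flag is truncated out of the visible diff context, so `r=8` below is an assumption), the standard LoRA convention does not apply `lora_alpha` raw: the low-rank update `B @ A` is scaled by `alpha / r` before being added to the frozen weight:

```python
# Hypothetical helper, not from the benchmark repo: computes the scaling
# factor that LoRA applies to the low-rank update (standard convention).
def lora_scaling(lora_alpha: int, r: int) -> float:
    return lora_alpha / r

# With the commit's --lora_alpha 16 and an assumed rank r=8:
print(lora_scaling(16, 8))  # -> 2.0
```

This is why `lora_alpha` is usually tuned relative to the rank rather than in isolation.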
