Commit ec3eed6

committed
ready to review
1 parent 754eb6c commit ec3eed6

1 file changed

Lines changed: 4 additions & 4 deletions

docs/codeflash-concepts/benchmarking-gpu-code.mdx

@@ -110,8 +110,8 @@ With synchronize: 152.277 ms
 
 # How Codeflash measures execution time involving GPUs
 
-Codeflash automatically inserts synchronization barriers before measuring performance. It currently supports GPU code written in `Pytorch`, `Tensorflow` and `JAX` for NVIDIA GPUs (CUDA) and MacOS Metal Performance Shaders (MPS).
+Codeflash automatically inserts synchronization barriers before measuring performance. It currently supports GPU code written in `Pytorch`, `Tensorflow` and `JAX` for NVIDIA GPUs (`CUDA`) and MacOS Metal Performance Shaders (`MPS`).
 
-- **PyTorch**: Uses `torch.cuda.synchronize()` (CUDA) or `torch.mps.synchronize()` (MPS) depending on the device.
-- **JAX**: Uses `jax.block_until_ready()` to wait for computation to complete. It works for both CUDA and MPS devices.
-- **TensorFlow**: Uses `tf.test.experimental.sync_devices()` for device synchronization. It works for both CUDA and MPS devices.
+- **PyTorch**: Uses `torch.cuda.synchronize()` (`CUDA`) or `torch.mps.synchronize()` (`MPS`) depending on the device.
+- **JAX**: Uses `jax.block_until_ready()` to wait for computation to complete. It works for both `CUDA` and `MPS` devices.
+- **TensorFlow**: Uses `tf.test.experimental.sync_devices()` for device synchronization. It works for both `CUDA` and `MPS` devices.
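
For readers of this diff, the synchronization barriers described in the changed lines can be illustrated with a minimal Python sketch. This is not Codeflash's actual implementation: the helper names `pick_sync` and `timed_ms` are hypothetical, only the PyTorch case is covered, and the code falls back to a no-op when PyTorch or a GPU is unavailable.

```python
import time


def pick_sync():
    """Return a callable that blocks until queued GPU work has finished.

    Hypothetical helper for illustration; returns a no-op on CPU-only setups.
    """
    try:
        import torch
    except ImportError:
        return lambda: None  # PyTorch not installed: nothing to wait for
    if torch.cuda.is_available():
        return torch.cuda.synchronize  # NVIDIA GPUs (CUDA)
    mps_backend = getattr(torch.backends, "mps", None)
    if mps_backend is not None and mps_backend.is_available() and hasattr(torch, "mps"):
        return torch.mps.synchronize  # Apple Metal Performance Shaders (MPS)
    return lambda: None


def timed_ms(fn, *args, sync=None):
    """Run fn(*args) and return (result, elapsed milliseconds).

    A barrier runs before the clock starts and again before it stops, so
    asynchronously launched kernels are included in the measurement.
    """
    if sync is None:
        sync = pick_sync()
    sync()  # drain work queued before the measurement
    start = time.perf_counter()
    result = fn(*args)
    sync()  # wait for work launched by fn
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return result, elapsed_ms
```

Calling `timed_ms(model, batch)` (with `model` and `batch` supplied by the caller) would time a forward pass including the GPU work it launches. Passing a different `sync` callable is one way the same pattern could be adapted to the JAX or TensorFlow calls named in the diff.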
