docs/source/en/training/distributed_inference.md
1 addition & 1 deletion
@@ -360,4 +360,4 @@ We ran a benchmark with Ulysess, Ring, and Unified Attention with [this script](
| ring | 13076.492 | 3.82 | 56.02 |
| unified_balanced | 11068.705 | 4.52 | 33.85 |
-From the above table, it's clear that Unified Attention as a CP backend provides the best trade-off between speed and memory.
+From the above table, it's clear that Ulysses provides better throughput, but the number of devices it can use remains limited to number of attention-heads, a limitation that is solved by unified attention.