Once a UCP endpoint has been created, it would be very useful to be able to query it for the theoretical bandwidth to its peer, given a pair of memory types.
For example, if I connect GPU0 and GPU1 on an NVLinked machine, I'd expect NVLink bandwidth. The API would simply report this, indicating that the connection between the two can use NVLink (for GPU->GPU memory). If I picked host-to-device (H-D), device-to-host (D-H), or host-to-host (H-H), I should see the corresponding bandwidth numbers.
If I connect across machines with RoCE/InfiniBand or TCP, I'd expect similar output, but with the network taken into account.
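
To make the request concrete, here is a rough sketch of the shape such a query could take. Nothing in it is existing UCX API: `mem_type_t`, `bw_estimate_t`, `mock_ep_t`, and `mock_ep_query_bandwidth` are hypothetical names, the endpoint is a plain lookup table standing in for whatever UCP would compute internally, and the bandwidth values are placeholders in arbitrary units rather than measurements.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical types for illustration only; none of these exist in UCX. */
typedef enum { MEM_HOST = 0, MEM_DEVICE = 1, MEM_TYPE_COUNT } mem_type_t;

typedef struct {
    const char *transport; /* link the endpoint would use for this pair */
    double      bandwidth; /* theoretical peak, arbitrary units here */
} bw_estimate_t;

/* Stand-in for an endpoint: one estimate per (local, remote) memory pair. */
typedef struct {
    bw_estimate_t table[MEM_TYPE_COUNT][MEM_TYPE_COUNT];
} mock_ep_t;

/* Look up the estimate for a memory-type pair.
 * Returns 0 on success, -1 on invalid arguments. */
int mock_ep_query_bandwidth(const mock_ep_t *ep, mem_type_t local,
                            mem_type_t remote, bw_estimate_t *out)
{
    if (ep == NULL || out == NULL ||
        (int)local < 0 || (int)local >= MEM_TYPE_COUNT ||
        (int)remote < 0 || (int)remote >= MEM_TYPE_COUNT) {
        return -1;
    }
    *out = ep->table[local][remote];
    return 0;
}

/* Fill a mock endpoint roughly the way an intra-node, NVLinked connection
 * might look. The numbers are placeholders, not real link speeds. */
void mock_ep_init_intranode(mock_ep_t *ep)
{
    ep->table[MEM_HOST][MEM_HOST]     = (bw_estimate_t){ "shm",    1.0 };
    ep->table[MEM_HOST][MEM_DEVICE]   = (bw_estimate_t){ "pcie",   2.0 };
    ep->table[MEM_DEVICE][MEM_HOST]   = (bw_estimate_t){ "pcie",   2.0 };
    ep->table[MEM_DEVICE][MEM_DEVICE] = (bw_estimate_t){ "nvlink", 3.0 };
}

/* Print the estimate for one pair, or a note if the query is rejected. */
void mock_ep_print_bandwidth(const mock_ep_t *ep, mem_type_t local,
                             mem_type_t remote)
{
    bw_estimate_t est;
    if (mock_ep_query_bandwidth(ep, local, remote, &est) != 0) {
        printf("query failed\n");
        return;
    }
    printf("%s: %.1f (placeholder units)\n", est.transport, est.bandwidth);
}
```

With this mock, calling `mock_ep_init_intranode(&ep)` and then `mock_ep_print_bandwidth(&ep, MEM_DEVICE, MEM_DEVICE)` reports the `nvlink` entry, while the H-D, D-H, and H-H pairs report their own entries. A cross-machine endpoint would populate the same table with whatever the RoCE/InfiniBand or TCP path allows.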