NVLink is a wire-based, serial, multi-lane, short-range communications link developed by Nvidia. Unlike PCI Express, a single device can have multiple NVLink connections, and devices communicate over a mesh network rather than through a central hub.
In the above figure, GPU-to-GPU memory transfers via NVLink are at most two hops away: a memory request may have to be routed through the NVLink controllers on two GPUs. For example, if GPU 0 needs data in GPU 5's memory, the request takes two hops (such as GPU 0 -> GPU 1 -> GPU 5). Each NVLink controller adds its own memory access latency, so the total latency is the per-hop latency multiplied by the number of hops.
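
The hop-count reasoning can be illustrated with a minimal Python sketch. The figure's exact wiring is not reproduced here, so the link list below is an assumption: an 8-GPU hybrid cube-mesh in which each group of four GPUs is fully connected and each GPU has one link into the other group. The names `LINKS`, `build_adjacency`, `hop_count`, and `total_latency_ns` are hypothetical, and the per-hop latency is a placeholder parameter, not a measured value.

```python
from collections import deque

# ASSUMED topology (not taken from the figure): 8-GPU hybrid cube-mesh.
LINKS = [
    (0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3),  # first group of four
    (4, 5), (4, 6), (4, 7), (5, 6), (5, 7), (6, 7),  # second group of four
    (0, 4), (1, 5), (2, 6), (3, 7),                  # links between groups
]


def build_adjacency(links):
    """Turn an undirected link list into a {gpu: set(neighbours)} map."""
    adjacency = {}
    for a, b in links:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def hop_count(adjacency, src, dst):
    """Return the minimum number of NVLink hops from src to dst (BFS)."""
    if src == dst:
        return 0
    seen = {src}
    queue = deque([(src, 0)])
    while queue:
        gpu, hops = queue.popleft()
        for neighbour in adjacency[gpu]:
            if neighbour == dst:
                return hops + 1
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, hops + 1))
    return None  # dst is not reachable over NVLink


def total_latency_ns(hops, per_hop_latency_ns):
    """Total latency under the simple model: per-hop latency times hops."""
    return hops * per_hop_latency_ns


if __name__ == "__main__":
    adjacency = build_adjacency(LINKS)
    hops = hop_count(adjacency, 0, 5)
    # per_hop_latency_ns=100 is an arbitrary placeholder, not a real figure.
    print(hops, total_latency_ns(hops, per_hop_latency_ns=100))
```

Under this assumed wiring, GPU 0 has no direct link to GPU 5, so the breadth-first search finds a two-hop route such as GPU 0 -> GPU 1 -> GPU 5, and the modelled latency is twice the per-hop value. A different topology only requires changing `LINKS`.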