Google Virgo Network: Megascale AI Data Center Fabric
🚀 Google announces the Virgo Network, a megascale, flat two-layer fabric purpose-built for modern AI workloads that unifies accelerators across pods into a single compute domain. The design separates a high-bandwidth scale-up domain, an east-west RDMA scale-out accelerator fabric, and the Jupiter north-south network to deliver deterministic low latency and massive non-blocking bandwidth. Virgo uses high-radix switches and multi-planar control domains to reduce layers and isolate faults, while sub-millisecond telemetry and automated straggler detection aim to preserve cluster goodput. The fabric targets predictable performance and rapid recovery for large distributed training and serving.
