In-Network Collective Operations: Game Changer or Challenge for AI Workloads?
Computer (IEEE Computer), 2026
Torsten Hoefler
Mikhail Khalilov
Josiah Clark
Surendra Anubolu
Mohan Kalkunte
Karen Schramm
Eric Spada
Duncan Roweth
Keith Underwood
Adrian Caulfield
Abdul Kabbani
Amirreza Rastegari
- GNN
Main: 6 Pages
5 Figures
Bibliography: 2 Pages
Appendix: 2 Pages
Abstract
This paper summarizes the opportunities that in-network collective operations (INC) offer for accelerating collectives in AI workloads. We provide sufficient detail to make this important field accessible to non-experts in AI or networking, fostering a connection between these communities. We consider two types of INC: Edge-INC, where the system is implemented at the node level, and Core-INC, where the system is embedded within network switches. We outline the potential performance benefits as well as six key obstacles, in the context of both Edge-INC and Core-INC, that may hinder their adoption. Finally, we present a set of predictions for the future development and application of INC.
