IEEE Network / 2024
When In-Network Computing Meets Distributed Machine Learning
The emerging In-Network Computing (INC) technique provides a new opportunity to improve application performance by exploiting the network programmability, computational capability, and storage capacity of programmable switches. One typical application is Distributed Machine Learning (DML), which accelerates machine learning training by employing multiple workers to train a model in parallel. This paper introduces INC-based DML systems, analyzes the performance improvements gained from using INC, and surveys current studies of INC-based DML systems. We also propose potential research directions for applying INC to DML systems.