
DOPPLER: Dual-Policy Learning for Device Assignment in Asynchronous Dataflow Graphs

Main: 9 Pages
20 Figures
Bibliography: 3 Pages
10 Tables
Appendix: 20 Pages
Abstract

We study the problem of assigning operations in a dataflow graph to devices to minimize execution time in a work-conserving system, with emphasis on complex machine learning workloads. Prior learning-based methods often struggle due to three key limitations: (1) reliance on bulk-synchronous systems like TensorFlow, which under-utilize devices due to barrier synchronization; (2) lack of awareness of the scheduling mechanism of underlying systems when designing learning-based methods; and (3) exclusive dependence on reinforcement learning, ignoring the structure of effective heuristics designed by experts. In this paper, we propose DOPPLER, a three-stage framework for training dual-policy networks consisting of (1) a SEL policy for selecting operations and (2) a PLC policy for placing chosen operations on devices. Our experiments show that DOPPLER outperforms all baseline methods across tasks by reducing system execution time and additionally demonstrates sampling efficiency by reducing per-episode training time.
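
The select-then-place decomposition named in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not the paper's method: the function names (`assign`, `sel_longest_first`, `plc_earliest_free`), the cost model (a fixed duration per operation, no communication cost), and the greedy simulation. The two hand-written heuristics merely stand in for the learned SEL and PLC policies.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical representation: each operation maps to its list of predecessors.
Graph = Dict[str, List[str]]


def sel_longest_first(ready: List[str], cost: Dict[str, float]) -> str:
    """Stand-in for a SEL policy: pick the ready operation with the largest cost."""
    return max(ready, key=lambda op: cost[op])


def plc_earliest_free(op: str, device_free: List[float]) -> int:
    """Stand-in for a PLC policy: pick the device that becomes free first."""
    return min(range(len(device_free)), key=lambda d: device_free[d])


def assign(
    graph: Graph,
    cost: Dict[str, float],
    num_devices: int,
    sel: Callable[[List[str], Dict[str, float]], str],
    plc: Callable[[str, List[float]], int],
) -> Tuple[Dict[str, int], float]:
    """Alternate SEL and PLC decisions until every operation has a device.

    Returns the placement and a simulated makespan under the simplified
    cost model (assumption: no data-transfer time between devices).
    """
    finish: Dict[str, float] = {}            # op -> simulated finish time
    device_free = [0.0] * num_devices        # time at which each device is idle
    placement: Dict[str, int] = {}
    remaining = set(graph)

    while remaining:
        # An operation is ready once all of its predecessors have been scheduled.
        ready = sorted(
            op for op in remaining if all(p in finish for p in graph[op])
        )
        if not ready:
            raise ValueError("graph contains a cycle")
        op = sel(ready, cost)                # step 1: select an operation
        dev = plc(op, device_free)           # step 2: place it on a device
        start = max([device_free[dev]] + [finish[p] for p in graph[op]])
        finish[op] = start + cost[op]
        device_free[dev] = finish[op]
        placement[op] = dev
        remaining.remove(op)

    return placement, max(finish.values())


# A small diamond-shaped graph: a feeds b and c, which both feed d.
graph = {"a": [], "b": ["a"], "c": ["a"], "d": ["b", "c"]}
cost = {"a": 1.0, "b": 2.0, "c": 3.0, "d": 1.0}
placement, makespan = assign(graph, cost, 2, sel_longest_first, plc_earliest_free)
print(placement, makespan)
```

Passing the two policies as plain callables keeps the selection and placement decisions separable, so either heuristic could be swapped for a different rule or a learned model without touching the loop.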

@article{yao2025_2505.23131,
  title={DOPPLER: Dual-Policy Learning for Device Assignment in Asynchronous Dataflow Graphs},
  author={Xinyu Yao and Daniel Bourgeois and Abhinav Jain and Yuxin Tang and Jiawen Yao and Zhimin Ding and Arlei Silva and Chris Jermaine},
  journal={arXiv preprint arXiv:2505.23131},
  year={2025}
}