
AC/DC: Alternating Compressed/DeCompressed Training of Deep Neural Networks
Papers citing "AC/DC: Alternating Compressed/DeCompressed Training of Deep Neural Networks"
46 / 46 papers shown
Title | Authors |
---|---|
Sparse Networks from Scratch: Faster Training without Losing Performance | Tim Dettmers, Luke Zettlemoyer |
Large Batch Optimization for Deep Learning: Training BERT in 76 minutes | Yang You, Jing Li, Sashank J. Reddi, Jonathan Hseu, Sanjiv Kumar, Srinadh Bhojanapalli, Xiaodan Song, J. Demmel, Kurt Keutzer, Cho-Jui Hsieh |