arXiv:2301.12443
Pipe-BD: Pipelined Parallel Blockwise Distillation
29 January 2023
Hongsun Jang
Jaewon Jung
Jaeyong Song
Joonsang Yu
Youngsok Kim
Jinho Lee
MoE
AI4CE
Papers citing "Pipe-BD: Pipelined Parallel Blockwise Distillation"
4 / 4 papers shown
Title
Distilling Optimal Neural Networks: Rapid Search in Diverse Spaces
Bert Moons
Parham Noorzad
Andrii Skliar
G. Mariani
Dushyant Mehta
Chris Lott
Tijmen Blankevoort
145
43
0
16 Dec 2020
Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
M. Shoeybi
M. Patwary
Raul Puri
P. LeGresley
Jared Casper
Bryan Catanzaro
MoE
245
1,821
0
17 Sep 2019
MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications
Andrew G. Howard
Menglong Zhu
Bo Chen
Dmitry Kalenichenko
Weijun Wang
Tobias Weyand
M. Andreetto
Hartwig Adam
3DH
950
20,567
0
17 Apr 2017
On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima
N. Keskar
Dheevatsa Mudigere
J. Nocedal
M. Smelyanskiy
P. T. P. Tang
ODL
308
2,890
0
15 Sep 2016