Pipe-BD: Pipelined Parallel Blockwise Distillation

29 January 2023
Hongsun Jang, Jaewon Jung, Jaeyong Song, Joonsang Yu, Youngsok Kim, Jinho Lee
MoE, AI4CE

Papers citing "Pipe-BD: Pipelined Parallel Blockwise Distillation"

Showing 4 of 4 papers
Distilling Optimal Neural Networks: Rapid Search in Diverse Spaces
Bert Moons, Parham Noorzad, Andrii Skliar, G. Mariani, Dushyant Mehta, Chris Lott, Tijmen Blankevoort
16 Dec 2020

Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
M. Shoeybi, M. Patwary, Raul Puri, P. LeGresley, Jared Casper, Bryan Catanzaro
MoE
17 Sep 2019

MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications
Andrew G. Howard, Menglong Zhu, Bo Chen, Dmitry Kalenichenko, Weijun Wang, Tobias Weyand, M. Andreetto, Hartwig Adam
3DH
17 Apr 2017

On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima
N. Keskar, Dheevatsa Mudigere, J. Nocedal, M. Smelyanskiy, P. T. P. Tang
ODL
15 Sep 2016