Mixed Precision Training

10 October 2017

Paulius Micikevicius

Sharan Narang

Boris Ginsburg

Papers citing "Mixed Precision Training"

50 / 380 papers shown

Deep Volumetric Ambient Occlusion Dominik Engel Timo Ropinski 21 22 0 19 Aug 2020
Compute, Time and Energy Characterization of Encoder-Decoder Networks with Automatic Mixed Precision Training S. Samsi Michael Jones Mark S. Veillette AI4CE 17 4 0 18 Aug 2020
Self-Supervised GAN Compression Chong Yu Jeff Pool 9 9 0 03 Jul 2020
Data Movement Is All You Need: A Case Study on Optimizing Transformers A. Ivanov Nikoli Dryden Tal Ben-Nun Shigang Li Torsten Hoefler 36 131 0 30 Jun 2020
LAMP: Large Deep Nets with Automated Model Parallelism for Image Segmentation Wentao Zhu Can Zhao Wenqi Li H. Roth Ziyue Xu Daguang Xu 3DV 32 18 0 22 Jun 2020
Sparse GPU Kernels for Deep Learning Trevor Gale Matei A. Zaharia C. Young Erich Elsen 17 228 0 18 Jun 2020
DS6, Deformation-aware Semi-supervised Learning: Application to Small Vessel Segmentation with Noisy Training Data S. Chatterjee Kartik Prabhu Mahantesh Pattadkal Gerda Bortsova Chompunuch Sarasaen Florian Dubost Hendrik Mattern Marleen de Bruijne Oliver Speck Andreas Nürnberger 19 18 0 18 Jun 2020
Deep Encoder, Shallow Decoder: Reevaluating Non-autoregressive Machine Translation Jungo Kasai Nikolaos Pappas Hao Peng James Cross Noah A. Smith 38 134 0 18 Jun 2020
Unsupervised Learning of Visual Features by Contrasting Cluster Assignments Mathilde Caron Ishan Misra Julien Mairal Priya Goyal Piotr Bojanowski Armand Joulin OCL SSL 48 3,998 0 17 Jun 2020
Multi-Precision Policy Enforced Training (MuPPET): A precision-switching strategy for quantised fixed-point training of CNNs A. Rajagopal D. A. Vink Stylianos I. Venieris C. Bouganis MQ 16 14 0 16 Jun 2020
Automatic heterogeneous quantization of deep neural networks for low-latency inference on the edge for particle detectors C. Coelho Aki Kuusela Shane Li Zhuang Hao T. Aarrestad Vladimir Loncar J. Ngadiuba M. Pierini Adrian Alan Pol S. Summers MQ 32 175 0 15 Jun 2020
FastPitch: Parallel Text-to-speech with Pitch Prediction Adrian Łańcucki 30 332 0 11 Jun 2020
VirTex: Learning Visual Representations from Textual Annotations Karan Desai Justin Johnson SSL VLM 30 432 0 11 Jun 2020
Linformer: Self-Attention with Linear Complexity Sinong Wang Belinda Z. Li Madian Khabsa Han Fang Hao Ma 63 1,647 0 08 Jun 2020
An Overview of Neural Network Compression James O'Neill AI4CE 45 98 0 05 Jun 2020
High-Fidelity Audio Generation and Representation Learning with Guided Adversarial Autoencoder Kazi Nazmul Haque R. Rana Björn W Schuller DRL 26 12 0 01 Jun 2020
Vector-quantized neural networks for acoustic unit discovery in the ZeroSpeech 2020 challenge Benjamin van Niekerk Leanne Nortje Herman Kamper 13 115 0 19 May 2020
Optimizing Deep Learning Recommender Systems' Training On CPU Cluster Architectures Dhiraj D. Kalamkar E. Georganas Sudarshan Srinivasan Jianping Chen Mikhail Shiryaev A. Heinecke 53 47 0 10 May 2020
NTIRE 2020 Challenge on Spectral Reconstruction from an RGB Image Boaz Arad Radu Timofte Ohad Ben-Shahar Yi Lin G. Finlayson Shai Givati Mohamed H. Sedky 54 122 0 07 May 2020
FlexSA: Flexible Systolic Array Architecture for Efficient Pruned DNN Model Training Sangkug Lym M. Erez 18 25 0 27 Apr 2020
MXR-U-Nets for Real Time Hyperspectral Reconstruction Atmadeep Banerjee Akash Palrecha SupR 25 11 0 15 Apr 2020
Reducing Data Motion to Accelerate the Training of Deep Neural Networks Sicong Zhuang C. Malossi Marc Casas 19 0 0 05 Apr 2020
A Survey of Convolutional Neural Networks: Analysis, Applications, and Prospects Zewen Li Wenjie Yang Shouheng Peng Fan Liu HAI 3DV 54 2,600 0 01 Apr 2020
TorchIO: A Python library for efficient loading, preprocessing, augmentation and patch-based sampling of medical images in deep learning Fernando Pérez-García Rachel Sparks Sébastien Ourselin MedIm LM&MA 144 427 0 09 Mar 2020
Towards Rapid and Robust Adversarial Training with One-Step Attacks Leo Schwinn René Raab Björn Eskofier AAML 33 6 0 24 Feb 2020
Training Question Answering Models From Synthetic Data Raul Puri Ryan Spring M. Patwary M. Shoeybi Bryan Catanzaro ELM 24 159 0 22 Feb 2020
Stochastic Latent Residual Video Prediction Jean-Yves Franceschi E. Delasalles Mickaël Chen Sylvain Lamprier Patrick Gallinari VGen 26 159 0 21 Feb 2020
fastai: A Layered API for Deep Learning Jeremy Howard Sylvain Gugger AI4CE 6 857 0 11 Feb 2020
Compounding the Performance Improvements of Assembled Techniques in a Convolutional Neural Network Jungkyu Lee Taeryun Won Tae Kwan Lee Hyemin Lee Geonmo Gu K. Hong 34 57 0 17 Jan 2020
Shifted and Squeezed 8-bit Floating Point format for Low-Precision Training of Deep Neural Networks Léopold Cambier Anahita Bhiwandiwalla Ting Gong M. Nekuii Oguz H. Elibol Hanlin Tang MQ 21 48 0 16 Jan 2020
Fast is better than free: Revisiting adversarial training Eric Wong Leslie Rice J. Zico Kolter AAML OOD 99 1,158 0 12 Jan 2020
Towards Unified INT8 Training for Convolutional Neural Network Feng Zhu Ruihao Gong F. Yu Xianglong Liu Yanfei Wang Zhelong Li Xiuqi Yang Junjie Yan MQ 35 150 0 29 Dec 2019
PANTHER: A Programmable Architecture for Neural Network Training Harnessing Energy-efficient ReRAM Aayush Ankit I. E. Hajj S. R. Chalamalasetti S. Agarwal M. Marinella M. Foltin J. Strachan D. Milojicic Wen-mei W. Hwu Kaushik Roy 21 65 0 24 Dec 2019
MG-WFBP: Merging Gradients Wisely for Efficient Communication in Distributed Deep Learning S. Shi X. Chu Bo Li FedML 22 25 0 18 Dec 2019
Zero-shot Text Classification With Generative Language Models Raul Puri Bryan Catanzaro VLM 16 102 0 10 Dec 2019
JParaCrawl: A Large Scale Web-Based English-Japanese Parallel Corpus Makoto Morishita Jun Suzuki Masaaki Nagata LRM 35 64 0 25 Nov 2019
REVAMP$^2$T: Real-time Edge Video Analytics for Multi-camera Privacy-aware Pedestrian Tracking Christopher Neff Matías Mendieta Shrey Mohan Mohammadreza Baharani Samuel Rogers Hamed Tabkhi 24 56 0 20 Nov 2019
Understanding Top-k Sparsification in Distributed Deep Learning S. Shi X. Chu Ka Chun Cheung Simon See 22 93 0 20 Nov 2019
Distributed Low Precision Training Without Mixed Precision Zehua Cheng Weiyan Wang Yan Pan Thomas Lukasiewicz MQ 18 5 0 18 Nov 2019
HAWQ-V2: Hessian Aware trace-Weighted Quantization of Neural Networks Zhen Dong Z. Yao Yaohui Cai Daiyaan Arfeen A. Gholami Michael W. Mahoney Kurt Keutzer MQ 34 274 0 10 Nov 2019
ConveRT: Efficient and Accurate Conversational Representations from Transformers Matthew Henderson I. Casanueva Nikola Mrkšić Pei-hao Su Tsung-Hsien Wen Ivan Vulić 15 196 0 09 Nov 2019
Blockwise Self-Attention for Long Document Understanding J. Qiu Hao Ma Omer Levy Scott Yih Sinong Wang Jie Tang 11 251 0 07 Nov 2019
Post-Training 4-bit Quantization on Embedding Tables Hui Guan Andrey Malevich Jiyan Yang Jongsoo Park Hector Yuen MQ 13 32 0 05 Nov 2019
On-Device Machine Learning: An Algorithms and Learning Theory Perspective Sauptik Dhar Junyao Guo Jiayi Liu S. Tripathi Unmesh Kurup Mohak Shah 28 141 0 02 Nov 2019
Characterizing Deep Learning Training Workloads on Alibaba-PAI Mengdi Wang Chen Meng Guoping Long Chuan Wu Jun Yang Wei Lin Yangqing Jia 17 53 0 14 Oct 2019
MLPerf Training Benchmark Arya D. McCarthy Christine Cheng Cody Coleman Greg Diamos Paulius Micikevicius ... Carole-Jean Wu Lingjie Xu Masafumi Yamazaki C. Young Matei A. Zaharia 33 305 0 02 Oct 2019
NeMo: a toolkit for building AI applications using Neural Modules Oleksii Kuchaiev Jason Chun Lok Li Huyen Nguyen Oleksii Hrinchuk Ryan Leary ... Jack Cook P. Castonguay Mariya Popova Jocelyn Huang Jonathan M. Cohen 211 292 0 14 Sep 2019
On Extractive and Abstractive Neural Document Summarization with Transformer Language Models Sandeep Subramanian Raymond Li Jonathan Pilault C. Pal 246 215 0 07 Sep 2019
Training Deep Neural Networks Using Posit Number System Jinming Lu Siyuan Lu Zhisheng Wang Chao Fang Jun Lin Zhongfeng Wang Li Du MQ 19 13 0 06 Sep 2019
Real-time Person Re-identification at the Edge: A Mixed Precision Approach Mohammadreza Baharani Shrey Mohan Hamed Tabkhi 26 10 0 19 Aug 2019