1710.03740
Cited By
Mixed Precision Training
10 October 2017
Paulius Micikevicius
Sharan Narang
Jonah Alben
G. Diamos
Erich Elsen
David García
Boris Ginsburg
Michael Houston
Oleksii Kuchaiev
Ganesh Venkatesh
Hao Wu
Papers citing
"Mixed Precision Training"
50 / 380 papers shown
Title
TorchBench: Benchmarking PyTorch with High API Surface Coverage
Yueming Hao
Xu Zhao
Bin Bao
David Berard
William Constable
Adnan Aziz
Xu Liu
30
5
0
27 Apr 2023
ComGAN: Toward GANs Exploiting Multiple Samples
Hae-Hwan Lee
GAN
31
0
0
24 Apr 2023
How Will It Drape Like? Capturing Fabric Mechanics from Depth Images
Carlos Rodriguez-Pardo
Melania Prieto-Martin
Dan Casas
Elena Garces
31
12
0
13 Apr 2023
DetCLIPv2: Scalable Open-Vocabulary Object Detection Pre-training via Word-Region Alignment
Lewei Yao
Jianhua Han
Xiaodan Liang
Danqian Xu
Wei Zhang
Zhenguo Li
Hang Xu
VLM
ObjD
CLIP
56
74
0
10 Apr 2023
HyperINR: A Fast and Predictive Hypernetwork for Implicit Neural Representations via Knowledge Distillation
Qi Wu
David Bauer
Yuyang Chen
Kwan-Liu Ma
33
14
0
09 Apr 2023
EnforceSNN: Enabling Resilient and Energy-Efficient Spiking Neural Network Inference considering Approximate DRAMs for Embedded Systems
Rachmad Vidya Wicaksana Putra
Muhammad Abdullah Hanif
Muhammad Shafique
32
11
0
08 Apr 2023
On Efficient Training of Large-Scale Deep Learning Models: A Literature Review
Li Shen
Yan Sun
Zhiyuan Yu
Liang Ding
Xinmei Tian
Dacheng Tao
VLM
30
41
0
07 Apr 2023
UNICORN: A Unified Backdoor Trigger Inversion Framework
Zhenting Wang
Kai Mei
Juan Zhai
Shiqing Ma
LLMSV
35
44
0
05 Apr 2023
Effective Theory of Transformers at Initialization
Emily Dinan
Sho Yaida
Susan Zhang
30
14
0
04 Apr 2023
DrBERT: A Robust Pre-trained Model in French for Biomedical and Clinical domains
Yanis Labrak
Adrien Bazoge
Richard Dufour
Mickael Rouvier
Emmanuel Morin
B. Daille
P. Gourraud
LM&MA
25
54
0
03 Apr 2023
ByteTrackV2: 2D and 3D Multi-Object Tracking by Associating Every Detection Box
Yifu Zhang
Xing-Hui Wang
Xiaoqing Ye
Wei Zhang
Jincheng Lu
Xiao Tan
Errui Ding
Pei Sun
Jingdong Wang
VOT
31
20
0
27 Mar 2023
Sparse-IFT: Sparse Iso-FLOP Transformations for Maximizing Training Efficiency
Vithursan Thangarasa
Shreyas Saxena
Abhay Gupta
Sean Lie
31
3
0
21 Mar 2023
Rediscovering Hashed Random Projections for Efficient Quantization of Contextualized Sentence Embeddings
Ulf A. Hamster
Ji-Ung Lee
Alexander Geyken
Iryna Gurevych
26
0
0
13 Mar 2023
One Neuron Saved Is One Neuron Earned: On Parametric Efficiency of Quadratic Networks
Fenglei Fan
Hangcheng Dong
Zhongming Wu
Lecheng Ruan
T. Zeng
Yiming Cui
Jing-Xiao Liao
59
8
0
11 Mar 2023
Angel-PTM: A Scalable and Economical Large-scale Pre-training System in Tencent
Xiaonan Nie
Yi Liu
Fangcheng Fu
Jinbao Xue
Dian Jiao
Xupeng Miao
Yangyu Tao
Bin Cui
MoE
31
16
0
06 Mar 2023
Dissolving Is Amplifying: Towards Fine-Grained Anomaly Detection
Jian Shi
Pengyi Zhang
Ni Zhang
Hakim Ghazzai
Y. Massoud
MedIm
35
6
0
28 Feb 2023
Towards Unifying Medical Vision-and-Language Pre-training via Soft Prompts
Zhihong Chen
Shizhe Diao
Benyou Wang
Guanbin Li
Xiang Wan
MedIm
25
29
0
17 Feb 2023
With Shared Microexponents, A Little Shifting Goes a Long Way
Bita Darvish Rouhani
Ritchie Zhao
V. Elango
Rasoul Shafipour
Mathew Hall
...
Eric S. Chung
Zhaoxia Deng
S. Naghshineh
Jongsoo Park
Maxim Naumov
MQ
43
36
0
16 Feb 2023
Cluster-Level Contrastive Learning for Emotion Recognition in Conversations
Kailai Yang
Tianlin Zhang
Hassan Alhuzali
Sophia Ananiadou
34
43
0
07 Feb 2023
GPS++: Reviving the Art of Message Passing for Molecular Property Prediction
Dominic Masters
Josef Dean
Kerstin Klaser
Zhiyi Li
Sam Maddrell-Mander
...
D. Beker
Andrew Fitzgibbon
Shenyang Huang
Ladislav Rampášek
Dominique Beaini
38
8
0
06 Feb 2023
PDPU: An Open-Source Posit Dot-Product Unit for Deep Learning Applications
Qiong Li
Chao Fang
Zhongfeng Wang
21
4
0
03 Feb 2023
A Survey on Efficient Training of Transformers
Bohan Zhuang
Jing Liu
Zizheng Pan
Haoyu He
Yuetian Weng
Chunhua Shen
31
47
0
02 Feb 2023
The Hidden Power of Pure 16-bit Floating-Point Neural Networks
Juyoung Yun
Byungkon Kang
Zhoulai Fu
MQ
26
1
0
30 Jan 2023
Cross-domain Neural Pitch and Periodicity Estimation
Max Morrison
Caedon Hsieh
Nathan Pruyne
Bryan Pardo
18
17
0
28 Jan 2023
Byte Pair Encoding for Symbolic Music
Nathan Fradet
Nicolas Gutowski
F. Chhel
Jean-Pierre Briot
29
15
0
27 Jan 2023
RedBit: An End-to-End Flexible Framework for Evaluating the Accuracy of Quantized CNNs
A. M. Ribeiro-dos-Santos
João Dinis Ferreira
O. Mutlu
G. Falcão
MQ
21
1
0
15 Jan 2023
A Comprehensive Survey of Dataset Distillation
Shiye Lei
Dacheng Tao
DD
31
88
0
13 Jan 2023
SMMix: Self-Motivated Image Mixing for Vision Transformers
Yonghong Tian
Mingbao Lin
Zhihang Lin
Yuxin Zhang
Rongrong Ji
53
10
0
26 Dec 2022
Randomized Quantization: A Generic Augmentation for Data Agnostic Self-supervised Learning
Huimin Wu
Chenyang Lei
Xiao Sun
Pengju Wang
Qifeng Chen
Kwang-Ting Cheng
Stephen Lin
Zhirong Wu
MQ
38
5
0
19 Dec 2022
Universal Object Detection with Large Vision Model
Feng-Huei Lin
Wenze Hu
Yaowei Wang
Yonghong Tian
Guangming Lu
Fanglin Chen
Yong-mei Xu
Xiaoyu Wang
VLM
ObjD
32
8
0
19 Dec 2022
The Effects of In-domain Corpus Size on pre-training BERT
Chris Sanchez
Zheyu Zhang
AI4CE
16
4
0
15 Dec 2022
Numerical Stability of DeepGOPlus Inference
Inés Gonzalez Pepe
Yohan Chatelain
Gregory Kiar
Tristan Glatard
BDL
24
2
0
13 Dec 2022
Pivotal Role of Language Modeling in Recommender Systems: Enriching Task-specific and Task-agnostic Representation Learning
Kyuyong Shin
Hanock Kwak
Wonjae Kim
Jisu Jeong
Seungjae Jung
KyungHyun Kim
Jung-Woo Ha
Sang-Woo Lee
27
4
0
07 Dec 2022
MEDIAR: Harmony of Data-Centric and Model-Centric for Multi-Modality Microscopy
Gihun Lee
Sangmook Kim
Joonkee Kim
Se-Young Yun
MedIm
19
18
0
07 Dec 2022
On-device Training: A First Overview on Existing Systems
Shuai Zhu
Thiemo Voigt
Jeonggil Ko
Fatemeh Rahimian
34
14
0
01 Dec 2022
CREPE: Open-Domain Question Answering with False Presuppositions
Xinyan Velocity Yu
Sewon Min
Luke Zettlemoyer
Hannaneh Hajishirzi
16
45
0
30 Nov 2022
High-Fidelity Guided Image Synthesis with Latent Diffusion Models
Jaskirat Singh
Stephen Gould
Liang Zheng
DiffM
41
40
0
30 Nov 2022
MegaBlocks: Efficient Sparse Training with Mixture-of-Experts
Trevor Gale
Deepak Narayanan
C. Young
Matei A. Zaharia
MoE
19
103
0
29 Nov 2022
RAMP: A Flat Nanosecond Optical Network and MPI Operations for Distributed Deep Learning Systems
Alessandro Ottino
Joshua L. Benjamin
G. Zervas
30
7
0
28 Nov 2022
Tell Me What Happened: Unifying Text-guided Video Completion via Multimodal Masked Video Generation
Tsu-jui Fu
Licheng Yu
Ning Zhang
Cheng-Yang Fu
Jong-Chyi Su
William Yang Wang
Sean Bell
VGen
56
37
0
23 Nov 2022
Spikeformer: A Novel Architecture for Training High-Performance Low-Latency Spiking Neural Network
Yudong Li
Yunlin Lei
Xu Yang
29
26
0
19 Nov 2022
GPS++: An Optimised Hybrid MPNN/Transformer for Molecular Property Prediction
Dominic Masters
Josef Dean
Kerstin Klaser
Zhiyi Li
Sam Maddrell-Mander
Adam Sanders
Hatem Helal
D. Beker
Ladislav Rampášek
Dominique Beaini
29
23
0
18 Nov 2022
MelHuBERT: A simplified HuBERT on Mel spectrograms
Tzu-Quan Lin
Hung-yi Lee
Hao Tang
SSL
32
13
0
17 Nov 2022
Random-LTD: Random and Layerwise Token Dropping Brings Efficient Training for Large-scale Transformers
Z. Yao
Xiaoxia Wu
Conglong Li
Connor Holmes
Minjia Zhang
Cheng-rong Li
Yuxiong He
28
11
0
17 Nov 2022
Language models are good pathologists: using attention-based sequence reduction and text-pretrained transformers for efficient WSI classification
Juan Pisula
Katarzyna Bozek
VLM
MedIm
36
3
0
14 Nov 2022
Massively Multilingual ASR on 70 Languages: Tokenization, Architecture, and Generalization Capabilities
Andros Tjandra
Nayan Singhal
David C. Zhang
Ozlem Kalinli
Abdel-rahman Mohamed
Duc Le
M. Seltzer
37
12
0
10 Nov 2022
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model
BigScience Workshop
Teven Le Scao
Angela Fan
Christopher Akiki
...
Zhongli Xie
Zifan Ye
M. Bras
Younes Belkada
Thomas Wolf
VLM
118
2,310
0
09 Nov 2022
Unsupervised vocal dereverberation with diffusion-based generative models
Koichi Saito
Naoki Murata
Toshimitsu Uesaka
Chieh-Hsin Lai
Yuhta Takida
Takao Fukui
Yuki Mitsufuji
DiffM
47
23
0
08 Nov 2022
MuMIC -- Multimodal Embedding for Multi-label Image Classification with Tempered Sigmoid
Feng Wang
Sarai Mizrachi
Moran Beladev
Guy Nadav
Gil Amsalem
Karen Lastmann Assaraf
Hadas Harush Boker
VLM
22
13
0
02 Nov 2022
Precision Machine Learning
Eric J. Michaud
Ziming Liu
Max Tegmark
24
34
0
24 Oct 2022