Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2007.13512
Cited By
Add a SideNet to your MainNet
14 July 2020
Adrien Morisot
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Add a SideNet to your MainNet"
21 / 21 papers shown
Title
Language Models are Few-Shot Learners
Tom B. Brown
Benjamin Mann
Nick Ryder
Melanie Subbiah
Jared Kaplan
...
Christopher Berner
Sam McCandlish
Alec Radford
Ilya Sutskever
Dario Amodei
BDL
469
41,106
0
28 May 2020
Calibration of Pre-trained Transformers
Shrey Desai
Greg Durrett
UQLM
266
295
0
17 Mar 2020
Controlling Computation versus Quality for Neural Sequence Models
Ankur Bapna
N. Arivazhagan
Orhan Firat
40
30
0
17 Feb 2020
Emergent Properties of Finetuned Language Representation Models
Alexandre Matton
Luke de Oliveira
SSL
40
1
0
23 Oct 2019
DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter
Victor Sanh
Lysandre Debut
Julien Chaumond
Thomas Wolf
121
7,437
0
02 Oct 2019
What Does BERT Look At? An Analysis of BERT's Attention
Kevin Clark
Urvashi Khandelwal
Omer Levy
Christopher D. Manning
MILM
186
1,586
0
11 Jun 2019
SCAN: A Scalable Neural Networks Framework Towards Compact and Efficient Models
Linfeng Zhang
Zhanhong Tan
Jiebo Song
Jingwei Chen
Chenglong Bao
Kaisheng Ma
28
71
0
27 May 2019
Streaming End-to-end Speech Recognition For Mobile Devices
Yanzhang He
Tara N. Sainath
Rohit Prabhavalkar
Ian McGraw
R. Álvarez
...
K. Sim
Tom Bagby
Shuo-yiin Chang
Kanishka Rao
A. Gruenstein
68
624
0
15 Nov 2018
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM
SSL
SSeg
943
93,936
0
11 Oct 2018
Deep Learning Scaling is Predictable, Empirically
Joel Hestness
Sharan Narang
Newsha Ardalani
G. Diamos
Heewoo Jun
Hassan Kianinejad
Md. Mostofa Ali Patwary
Yang Yang
Yanqi Zhou
80
728
0
01 Dec 2017
On Calibration of Modern Neural Networks
Chuan Guo
Geoff Pleiss
Yu Sun
Kilian Q. Weinberger
UQCV
195
5,774
0
14 Jun 2017
Adaptive Neural Networks for Efficient Inference
Tolga Bolukbasi
Joseph Wang
O. Dekel
Venkatesh Saligrama
41
354
0
25 Feb 2017
Deep Residual Learning for Image Recognition
Kaiming He
Xinming Zhang
Shaoqing Ren
Jian Sun
MedIm
1.4K
192,638
0
10 Dec 2015
Conditional Computation in Neural Networks for faster models
Emmanuel Bengio
Pierre-Luc Bacon
Joelle Pineau
Doina Precup
AI4CE
94
320
0
19 Nov 2015
Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding
Song Han
Huizi Mao
W. Dally
3DGS
194
8,793
0
01 Oct 2015
Distilling the Knowledge in a Neural Network
Geoffrey E. Hinton
Oriol Vinyals
J. Dean
FedML
236
19,523
0
09 Mar 2015
Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
Sergey Ioffe
Christian Szegedy
OOD
298
43,154
0
11 Feb 2015
Adam: A Method for Stochastic Optimization
Diederik P. Kingma
Jimmy Ba
ODL
806
149,474
0
22 Dec 2014
Explaining and Harnessing Adversarial Examples
Ian Goodfellow
Jonathon Shlens
Christian Szegedy
AAML
GAN
161
18,922
0
20 Dec 2014
Going Deeper with Convolutions
Christian Szegedy
Wei Liu
Yangqing Jia
P. Sermanet
Scott E. Reed
Dragomir Anguelov
D. Erhan
Vincent Vanhoucke
Andrew Rabinovich
299
43,511
0
17 Sep 2014
Very Deep Convolutional Networks for Large-Scale Image Recognition
Karen Simonyan
Andrew Zisserman
FAtt
MDE
923
99,991
0
04 Sep 2014
1