1701.06548
Regularizing Neural Networks by Penalizing Confident Output Distributions
23 January 2017
Gabriel Pereyra
George Tucker
J. Chorowski
Lukasz Kaiser
Geoffrey E. Hinton
NoLa
Papers citing
"Regularizing Neural Networks by Penalizing Confident Output Distributions"
50 / 640 papers shown
Title
Omni-Scale Feature Learning for Person Re-Identification
Kaiyang Zhou
Yongxin Yang
Andrea Cavallaro
Tao Xiang
30
821
0
02 May 2019
Introducing Graph Smoothness Loss for Training Deep Learning Architectures
Myriam Bontonou
Carlos Lassance
G. B. Hacene
Vincent Gripon
Jian Tang
Antonio Ortega
14
18
0
01 May 2019
Performance Monitoring for End-to-End Speech Recognition
Ruizhi Li
Gregory Sell
H. Hermansky
15
2
0
09 Apr 2019
Noise-Tolerant Paradigm for Training Face Recognition CNNs
Wei Hu
Yangyu Huang
Fan Zhang
Ruirui Li
NoLa
CVBM
30
62
0
25 Mar 2019
Pre-trained Language Model Representations for Language Generation
Sergey Edunov
Alexei Baevski
Michael Auli
27
129
0
22 Mar 2019
Calibration of Encoder Decoder Models for Neural Machine Translation
Aviral Kumar
Sunita Sarawagi
27
98
0
03 Mar 2019
Deep learning in bioinformatics: introduction, application, and perspective in big data era
Yu Li
Chao Huang
Lizhong Ding
Zhongxiao Li
Yijie Pan
Xin Gao
AI4CE
29
295
0
28 Feb 2019
Improving Neural Response Diversity with Frequency-Aware Cross-Entropy Loss
Shaojie Jiang
Pengjie Ren
Christof Monz
Maarten de Rijke
25
86
0
25 Feb 2019
Analyzing and Improving Representations with the Soft Nearest Neighbor Loss
Nicholas Frosst
Nicolas Papernot
Geoffrey E. Hinton
17
157
0
05 Feb 2019
Incremental Learning with Maximum Entropy Regularization: Rethinking Forgetting and Intransigence
Dahyun Kim
Jihwan Bae
Yeonsik Jo
Jonghyun Choi
OOD
CLL
36
20
0
03 Feb 2019
Pay Less Attention with Lightweight and Dynamic Convolutions
Felix Wu
Angela Fan
Alexei Baevski
Yann N. Dauphin
Michael Auli
11
604
0
29 Jan 2019
Neural network gradient-based learning of black-box function interfaces
Alon Jacovi
Guy Hadash
Einat Kermany
Boaz Carmeli
Ofer Lavi
George Kour
Jonathan Berant
18
13
0
13 Jan 2019
Vector representations of text data in deep learning
Karol Grzegorczyk
24
12
0
07 Jan 2019
Moment Matching Training for Neural Machine Translation: A Preliminary Study
Cong Duy Vu Hoang
Ioan Calapodescu
Marc Dymetman
16
1
0
24 Dec 2018
Densely Semantically Aligned Person Re-Identification
Zhizheng Zhang
Cuiling Lan
Wenjun Zeng
Zhibo Chen
28
266
0
21 Dec 2018
Adapting Auxiliary Losses Using Gradient Similarity
Yunshu Du
Wojciech M. Czarnecki
Siddhant M. Jayakumar
Mehrdad Farajtabar
Razvan Pascanu
Balaji Lakshminarayanan
35
156
0
05 Dec 2018
Improving robustness of classifiers by training against live traffic
K. Sricharan
Kumar Kallurupalli
Ashok Srivastava
OOD
TTA
12
0
0
01 Dec 2018
Snapshot Distillation: Teacher-Student Optimization in One Generation
Chenglin Yang
Lingxi Xie
Chi Su
Alan Yuille
10
193
0
01 Dec 2018
Deep Bayesian Self-Training
Fabio De Sousa Ribeiro
Francesco Calivá
M. Swainson
Kjartan Gudmundsson
Georgios Leontidis
Stefanos D. Kollias
UQCV
21
1
0
26 Nov 2018
Limited Gradient Descent: Learning With Noisy Labels
Yi Sun
Yan Tian
Yiping Xu
Jianxiang Li
NoLa
35
13
0
20 Nov 2018
A Variational Dirichlet Framework for Out-of-Distribution Detection
Wenhu Chen
Yilin Shen
Xin Eric Wang
Wenjie Wang
UQCV
30
9
0
18 Nov 2018
Abstractive Summarization of Reddit Posts with Multi-level Memory Networks
Byeongchang Kim
Hyunwoo J. Kim
Gunhee Kim
26
182
0
02 Nov 2018
Excessive Invariance Causes Adversarial Vulnerability
J. Jacobsen
Jens Behrmann
R. Zemel
Matthias Bethge
AAML
33
166
0
01 Nov 2018
Towards Linear Time Neural Machine Translation with Capsule Networks
Mingxuan Wang
Jun Xie
Zhixing Tan
Jinsong Su
Deyi Xiong
Lei Li
AIMat
16
27
0
01 Nov 2018
Sequence to Sequence Mixture Model for Diverse Machine Translation
Xuanli He
Gholamreza Haffari
Mohammad Norouzi
20
57
0
17 Oct 2018
Optimal Completion Distillation for Sequence Learning
S. Sabour
William Chan
Mohammad Norouzi
27
45
0
02 Oct 2018
Learning for Single-Shot Confidence Calibration in Deep Neural Networks through Stochastic Inferences
Seonguk Seo
Paul Hongsuck Seo
Bohyung Han
FedML
UQCV
BDL
13
75
0
28 Sep 2018
Dropout Distillation for Efficiently Estimating Model Confidence
Corina Gurau
Alex Bewley
Ingmar Posner
BDL
UQCV
19
19
0
27 Sep 2018
Semi-Supervised Sequence Modeling with Cross-View Training
Kevin Clark
Minh-Thang Luong
Christopher D. Manning
Quoc V. Le
SSL
11
333
0
22 Sep 2018
Maximum-Entropy Fine-Grained Classification
Abhimanyu Dubey
O. Gupta
Ramesh Raskar
Nikhil Naik
28
156
0
16 Sep 2018
Distilled Wasserstein Learning for Word Embedding and Topic Modeling
Hongteng Xu
Wenlin Wang
Wen Liu
Lawrence Carin
MedIm
FedML
35
84
0
12 Sep 2018
Why are Sequence-to-Sequence Models So Dull? Understanding the Low-Diversity Problem of Chatbots
Shaojie Jiang
Maarten de Rijke
18
43
0
06 Sep 2018
Parameter Sharing Methods for Multilingual Self-Attentional Translation Models
Devendra Singh Sachan
Graham Neubig
MoE
45
114
0
01 Sep 2018
Hypernetwork Knowledge Graph Embeddings
Ivana Balazevic
Carl Allen
Timothy M. Hospedales
GNN
11
180
0
21 Aug 2018
Confidence penalty, annealing Gaussian noise and zoneout for biLSTM-CRF networks for named entity recognition
Antonio Jimeno Yepes
26
2
0
13 Aug 2018
LemmaTag: Jointly Tagging and Lemmatizing for Morphologically-Rich Languages with BRNNs
Dan Kondratyuk
T. Gavenčiak
Milan Straka
Jan Hajic
6
33
0
10 Aug 2018
Noise Contrastive Priors for Functional Uncertainty
Danijar Hafner
Dustin Tran
Timothy Lillicrap
A. Irpan
James Davidson
AAML
BDL
UQCV
35
74
0
24 Jul 2018
Acoustic-to-Word Recognition with Sequence-to-Sequence Models
Shruti Palaskar
Florian Metze
10
19
0
23 Jul 2018
Weakly-Supervised Convolutional Neural Networks for Multimodal Image Registration
Yipeng Hu
Marc Modat
Eli Gibson
Wenqi Li
N. Ghavami
...
M. Emberton
Sébastien Ourselin
J. A. Noble
D. Barratt
Tom Kamiel Magda Vercauteren
60
382
0
09 Jul 2018
Gradient Adversarial Training of Neural Networks
Ayan Sinha
Zhao Chen
Vijay Badrinarayanan
Andrew Rabinovich
AAML
30
33
0
21 Jun 2018
Extending Recurrent Neural Aligner for Streaming End-to-End Speech Recognition in Mandarin
Linhao Dong
Shiyu Zhou
Wei Chen
Bo Xu
24
22
0
17 Jun 2018
Improving Regression Performance with Distributional Losses
Ehsan Imani
Martha White
UQCV
13
65
0
12 Jun 2018
Spreading vectors for similarity search
Alexandre Sablayrolles
Matthijs Douze
Cordelia Schmid
Hervé Jégou
MQ
27
115
0
08 Jun 2018
Learn from Your Neighbor: Learning Multi-modal Mappings from Sparse Annotations
Ashwin Kalyan
Stefan Lee
A. Kannan
Dhruv Batra
7
6
0
08 Jun 2018
Scaling Neural Machine Translation
Myle Ott
Sergey Edunov
David Grangier
Michael Auli
AIMat
71
610
0
01 Jun 2018
Theory and Experiments on Vector Quantized Autoencoders
Aurko Roy
Ashish Vaswani
Arvind Neelakantan
Niki Parmar
19
85
0
28 May 2018
Pushing the bounds of dropout
Gábor Melis
Charles Blundell
Tomás Kociský
Karl Moritz Hermann
Chris Dyer
Phil Blunsom
16
13
0
23 May 2018
Measuring and regularizing networks in function space
Ari S. Benjamin
David Rolnick
Konrad Paul Kording
21
138
0
21 May 2018
Knowledge Distillation in Generations: More Tolerant Teachers Educate Better Students
Chenglin Yang
Lingxi Xie
Siyuan Qiao
Alan Yuille
35
135
0
15 May 2018
Token-level and sequence-level loss smoothing for RNN language models
Maha Elbayad
Laurent Besacier
Jakob Verbeek
24
19
0
14 May 2018