Regularizing Neural Networks by Penalizing Confident Output Distributions

23 January 2017

Papers citing "Regularizing Neural Networks by Penalizing Confident Output Distributions"

50 / 640 papers shown

Omni-Scale Feature Learning for Person Re-Identification Kaiyang Zhou Yongxin Yang Andrea Cavallaro Tao Xiang 30 821 0 02 May 2019
Introducing Graph Smoothness Loss for Training Deep Learning Architectures Myriam Bontonou Carlos Lassance G. B. Hacene Vincent Gripon Jian Tang Antonio Ortega 14 18 0 01 May 2019
Performance Monitoring for End-to-End Speech Recognition Ruizhi Li Gregory Sell H. Hermansky 15 2 0 09 Apr 2019
Noise-Tolerant Paradigm for Training Face Recognition CNNs Wei Hu Yangyu Huang Fan Zhang Ruirui Li NoLa CVBM 30 62 0 25 Mar 2019
Pre-trained Language Model Representations for Language Generation Sergey Edunov Alexei Baevski Michael Auli 27 129 0 22 Mar 2019
Calibration of Encoder Decoder Models for Neural Machine Translation Aviral Kumar Sunita Sarawagi 27 98 0 03 Mar 2019
Deep learning in bioinformatics: introduction, application, and perspective in big data era Yu Li Chao Huang Lizhong Ding Zhongxiao Li Yijie Pan Xin Gao AI4CE 29 295 0 28 Feb 2019
Improving Neural Response Diversity with Frequency-Aware Cross-Entropy Loss Shaojie Jiang Pengjie Ren Christof Monz Maarten de Rijke 25 86 0 25 Feb 2019
Analyzing and Improving Representations with the Soft Nearest Neighbor Loss Nicholas Frosst Nicolas Papernot Geoffrey E. Hinton 17 157 0 05 Feb 2019
Incremental Learning with Maximum Entropy Regularization: Rethinking Forgetting and Intransigence Dahyun Kim Jihwan Bae Yeonsik Jo Jonghyun Choi OOD CLL 36 20 0 03 Feb 2019
Pay Less Attention with Lightweight and Dynamic Convolutions Felix Wu Angela Fan Alexei Baevski Yann N. Dauphin Michael Auli 11 604 0 29 Jan 2019
Neural network gradient-based learning of black-box function interfaces Alon Jacovi Guy Hadash Einat Kermany Boaz Carmeli Ofer Lavi George Kour Jonathan Berant 18 13 0 13 Jan 2019
Vector representations of text data in deep learning Karol Grzegorczyk 24 12 0 07 Jan 2019
Moment Matching Training for Neural Machine Translation: A Preliminary Study Cong Duy Vu Hoang Ioan Calapodescu Marc Dymetman 16 1 0 24 Dec 2018
Densely Semantically Aligned Person Re-Identification Zhizheng Zhang Cuiling Lan Wenjun Zeng Zhibo Chen 28 266 0 21 Dec 2018
Adapting Auxiliary Losses Using Gradient Similarity Yunshu Du Wojciech M. Czarnecki Siddhant M. Jayakumar Mehrdad Farajtabar Razvan Pascanu Balaji Lakshminarayanan 35 156 0 05 Dec 2018
Improving robustness of classifiers by training against live traffic K. Sricharan Kumar Kallurupalli Ashok Srivastava OOD TTA 12 0 0 01 Dec 2018
Snapshot Distillation: Teacher-Student Optimization in One Generation Chenglin Yang Lingxi Xie Chi Su Alan Yuille 10 193 0 01 Dec 2018
Deep Bayesian Self-Training Fabio De Sousa Ribeiro Francesco Calivá M. Swainson Kjartan Gudmundsson Georgios Leontidis Stefanos D. Kollias UQCV 21 1 0 26 Nov 2018
Limited Gradient Descent: Learning With Noisy Labels Yi Sun Yan Tian Yiping Xu Jianxiang Li NoLa 35 13 0 20 Nov 2018
A Variational Dirichlet Framework for Out-of-Distribution Detection Wenhu Chen Yilin Shen Xin Eric Wang Wenjie Wang UQCV 30 9 0 18 Nov 2018
Abstractive Summarization of Reddit Posts with Multi-level Memory Networks Byeongchang Kim Hyunwoo J. Kim Gunhee Kim 26 182 0 02 Nov 2018
Excessive Invariance Causes Adversarial Vulnerability J. Jacobsen Jens Behrmann R. Zemel Matthias Bethge AAML 33 166 0 01 Nov 2018
Towards Linear Time Neural Machine Translation with Capsule Networks Mingxuan Wang Jun Xie Zhixing Tan Jinsong Su Deyi Xiong Lei Li AIMat 16 27 0 01 Nov 2018
Sequence to Sequence Mixture Model for Diverse Machine Translation Xuanli He Gholamreza Haffari Mohammad Norouzi 20 57 0 17 Oct 2018
Optimal Completion Distillation for Sequence Learning S. Sabour William Chan Mohammad Norouzi 27 45 0 02 Oct 2018
Learning for Single-Shot Confidence Calibration in Deep Neural Networks through Stochastic Inferences Seonguk Seo Paul Hongsuck Seo Bohyung Han FedML UQCV BDL 13 75 0 28 Sep 2018
Dropout Distillation for Efficiently Estimating Model Confidence Corina Gurau Alex Bewley Ingmar Posner BDL UQCV 19 19 0 27 Sep 2018
Semi-Supervised Sequence Modeling with Cross-View Training Kevin Clark Minh-Thang Luong Christopher D. Manning Quoc V. Le SSL 11 333 0 22 Sep 2018
Maximum-Entropy Fine-Grained Classification Abhimanyu Dubey O. Gupta Ramesh Raskar Nikhil Naik 28 156 0 16 Sep 2018
Distilled Wasserstein Learning for Word Embedding and Topic Modeling Hongteng Xu Wenlin Wang Wen Liu Lawrence Carin MedIm FedML 35 84 0 12 Sep 2018
Why are Sequence-to-Sequence Models So Dull? Understanding the Low-Diversity Problem of Chatbots Shaojie Jiang Maarten de Rijke 18 43 0 06 Sep 2018
Parameter Sharing Methods for Multilingual Self-Attentional Translation Models Devendra Singh Sachan Graham Neubig MoE 45 114 0 01 Sep 2018
Hypernetwork Knowledge Graph Embeddings Ivana Balazevic Carl Allen Timothy M. Hospedales GNN 11 180 0 21 Aug 2018
Confidence penalty, annealing Gaussian noise and zoneout for biLSTM-CRF networks for named entity recognition Antonio Jimeno Yepes 26 2 0 13 Aug 2018
LemmaTag: Jointly Tagging and Lemmatizing for Morphologically-Rich Languages with BRNNs Dan Kondratyuk T. Gavenčiak Milan Straka Jan Hajic 6 33 0 10 Aug 2018
Noise Contrastive Priors for Functional Uncertainty Danijar Hafner Dustin Tran Timothy Lillicrap A. Irpan James Davidson AAML BDL UQCV 35 74 0 24 Jul 2018
Acoustic-to-Word Recognition with Sequence-to-Sequence Models Shruti Palaskar Florian Metze 10 19 0 23 Jul 2018
Weakly-Supervised Convolutional Neural Networks for Multimodal Image Registration Yipeng Hu Marc Modat Eli Gibson Wenqi Li N. Ghavami ... M. Emberton Sébastien Ourselin J. A. Noble D. Barratt Tom Kamiel Magda Vercauteren 60 382 0 09 Jul 2018
Gradient Adversarial Training of Neural Networks Ayan Sinha Zhao Chen Vijay Badrinarayanan Andrew Rabinovich AAML 30 33 0 21 Jun 2018
Extending Recurrent Neural Aligner for Streaming End-to-End Speech Recognition in Mandarin Linhao Dong Shiyu Zhou Wei Chen Bo Xu 24 22 0 17 Jun 2018
Improving Regression Performance with Distributional Losses Ehsan Imani Martha White UQCV 13 65 0 12 Jun 2018
Spreading vectors for similarity search Alexandre Sablayrolles Matthijs Douze Cordelia Schmid Hervé Jégou MQ 27 115 0 08 Jun 2018
Learn from Your Neighbor: Learning Multi-modal Mappings from Sparse Annotations Ashwin Kalyan Stefan Lee A. Kannan Dhruv Batra 7 6 0 08 Jun 2018
Scaling Neural Machine Translation Myle Ott Sergey Edunov David Grangier Michael Auli AIMat 71 610 0 01 Jun 2018
Theory and Experiments on Vector Quantized Autoencoders Aurko Roy Ashish Vaswani Arvind Neelakantan Niki Parmar 19 85 0 28 May 2018
Pushing the bounds of dropout Gábor Melis Charles Blundell Tomás Kociský Karl Moritz Hermann Chris Dyer Phil Blunsom 16 13 0 23 May 2018
Measuring and regularizing networks in function space Ari S. Benjamin David Rolnick Konrad Paul Kording 21 138 0 21 May 2018
Knowledge Distillation in Generations: More Tolerant Teachers Educate Better Students Chenglin Yang Lingxi Xie Siyuan Qiao Alan Yuille 35 135 0 15 May 2018
Token-level and sequence-level loss smoothing for RNN language models Maha Elbayad Laurent Besacier Jakob Verbeek 24 19 0 14 May 2018