Unbiased scalable softmax optimization

22 March 2018

Francois Fagan

Papers citing "Unbiased scalable softmax optimization"

19 / 19 papers shown

Title | Authors | Tag and counts | Date
Augment and Reduce: Stochastic Inference for Large Categorical Distributions | Francisco J. R. Ruiz, Michalis K. Titsias, Adji Bousso Dieng, David M. Blei | BDL 103 22 0 | 12 Feb 2018
Aggressive Sampling for Multi-class to Binary Reduction with Applications to Text Classification | Bikash Joshi, Massih-Reza Amini, Ioannis Partalas, F. Iutzeler, Yury Maximov | MQ 68 16 0 | 23 Jan 2017
The Concrete Distribution: A Continuous Relaxation of Discrete Random Variables | Chris J. Maddison, A. Mnih, Yee Whye Teh | BDL 200 2,541 0 | 02 Nov 2016
Simultaneous Learning of Trees and Representations for Extreme Classification and Density Estimation | Yacine Jernite, A. Choromańska, David Sontag | 152 36 0 | 14 Oct 2016
One-vs-Each Approximation to Softmax for Scalable Estimation of Probabilities | Michalis K. Titsias | UQCV 72 54 0 | 23 Sep 2016
Efficient softmax approximation for GPUs | Edouard Grave, Armand Joulin, Moustapha Cissé, David Grangier, Hervé Jégou | 114 272 0 | 14 Sep 2016
Accelerating Stochastic Composition Optimization | Mengdi Wang, Ji Liu, Ethan X. Fang | 80 148 0 | 25 Jul 2016
Logarithmic Time One-Against-Some | Hal Daumé, Nikos Karampatziakis, John Langford, Paul Mineiro | 48 46 0 | 15 Jun 2016
DS-MLR: Exploiting Double Separability for Scaling up Distributed Multinomial Logistic Regression | Parameswaran Raman, Sriram Srinivasan, Shin Matsushima, Xinhua Zhang, Hyokun Yun, S.V.N. Vishwanathan | 72 12 0 | 16 Apr 2016
From Softmax to Sparsemax: A Sparse Model of Attention and Multi-Label Classification | André F. T. Martins, Ramón Fernández Astudillo | 190 726 0 | 05 Feb 2016
BlackOut: Speeding up Recurrent Neural Network Language Models With Very Large Vocabularies | Shihao Ji, S.V.N. Vishwanathan, N. Satish, Michael J. Anderson, Pradeep Dubey | 104 77 0 | 21 Nov 2015
An Exploration of Softmax Alternatives Belonging to the Spherical Loss Family | A. D. Brébisson, Pascal Vincent | 78 98 0 | 16 Nov 2015
The Proximal Robbins-Monro Method | Panos Toulis, Thibaut Horel, E. Airoldi | 59 30 0 | 04 Oct 2015
Towards stability and optimality in stochastic gradient descent | Panos Toulis, Dustin Tran, E. Airoldi | 102 56 0 | 10 May 2015
LSHTC: A Benchmark for Large-Scale Text Classification | Ioannis Partalas, Aris Kosmopoulos, Nicolas Baskiotis, Thierry Artières, George Giannakopoulos, Éric Gaussier, Ion Androutsopoulos, Massih-Reza Amini, Patrick Gallinari | 54 181 0 | 30 Mar 2015
Efficient Exact Gradient Update for training Deep Networks with Very Large Sparse Targets | Pascal Vincent, A. D. Brébisson, Xavier Bouthillier | 82 49 0 | 22 Dec 2014
One Billion Word Benchmark for Measuring Progress in Statistical Language Modeling | Ciprian Chelba, Tomas Mikolov, M. Schuster, Qi Ge, T. Brants, P. Koehn, T. Robinson | 190 1,109 0 | 11 Dec 2013
A simpler approach to obtaining an O(1/t) convergence rate for the projected stochastic subgradient method | Simon Lacoste-Julien, Mark Schmidt, Francis R. Bach | 190 261 0 | 10 Dec 2012
A Fast and Simple Algorithm for Training Neural Probabilistic Language Models | A. Mnih, Yee Whye Teh | 185 578 0 | 27 Jun 2012