1803.08577
Cited By
Unbiased scalable softmax optimization
22 March 2018
Francois Fagan
G. Iyengar
Papers citing "Unbiased scalable softmax optimization"
19 / 19 papers shown
Title
Augment and Reduce: Stochastic Inference for Large Categorical Distributions
Francisco J. R. Ruiz
Michalis K. Titsias
Adji Bousso Dieng
David M. Blei
BDL
103
22
0
12 Feb 2018
Aggressive Sampling for Multi-class to Binary Reduction with Applications to Text Classification
Bikash Joshi
Massih-Reza Amini
Ioannis Partalas
F. Iutzeler
Yury Maximov
MQ
68
16
0
23 Jan 2017
The Concrete Distribution: A Continuous Relaxation of Discrete Random Variables
Chris J. Maddison
A. Mnih
Yee Whye Teh
BDL
200
2,541
0
02 Nov 2016
Simultaneous Learning of Trees and Representations for Extreme Classification and Density Estimation
Yacine Jernite
A. Choromańska
David Sontag
152
36
0
14 Oct 2016
One-vs-Each Approximation to Softmax for Scalable Estimation of Probabilities
Michalis K. Titsias
UQCV
72
54
0
23 Sep 2016
Efficient softmax approximation for GPUs
Edouard Grave
Armand Joulin
Moustapha Cissé
David Grangier
Hervé Jégou
114
272
0
14 Sep 2016
Accelerating Stochastic Composition Optimization
Mengdi Wang
Ji Liu
Ethan X. Fang
80
148
0
25 Jul 2016
Logarithmic Time One-Against-Some
Hal Daumé
Nikos Karampatziakis
John Langford
Paul Mineiro
48
46
0
15 Jun 2016
DS-MLR: Exploiting Double Separability for Scaling up Distributed Multinomial Logistic Regression
Parameswaran Raman
Sriram Srinivasan
Shin Matsushima
Xinhua Zhang
Hyokun Yun
S.V.N. Vishwanathan
72
12
0
16 Apr 2016
From Softmax to Sparsemax: A Sparse Model of Attention and Multi-Label Classification
André F. T. Martins
Ramón Fernández Astudillo
190
726
0
05 Feb 2016
BlackOut: Speeding up Recurrent Neural Network Language Models With Very Large Vocabularies
Shihao Ji
S.V.N. Vishwanathan
N. Satish
Michael J. Anderson
Pradeep Dubey
104
77
0
21 Nov 2015
An Exploration of Softmax Alternatives Belonging to the Spherical Loss Family
A. D. Brébisson
Pascal Vincent
78
98
0
16 Nov 2015
The Proximal Robbins-Monro Method
Panos Toulis
Thibaut Horel
E. Airoldi
59
30
0
04 Oct 2015
Towards stability and optimality in stochastic gradient descent
Panos Toulis
Dustin Tran
E. Airoldi
102
56
0
10 May 2015
LSHTC: A Benchmark for Large-Scale Text Classification
Ioannis Partalas
Aris Kosmopoulos
Nicolas Baskiotis
Thierry Artières
George Giannakopoulos
Éric Gaussier
Ion Androutsopoulos
Massih-Reza Amini
Patrick Gallinari
54
181
0
30 Mar 2015
Efficient Exact Gradient Update for training Deep Networks with Very Large Sparse Targets
Pascal Vincent
A. D. Brébisson
Xavier Bouthillier
82
49
0
22 Dec 2014
One Billion Word Benchmark for Measuring Progress in Statistical Language Modeling
Ciprian Chelba
Tomas Mikolov
M. Schuster
Qi Ge
T. Brants
P. Koehn
T. Robinson
190
1,109
0
11 Dec 2013
A simpler approach to obtaining an O(1/t) convergence rate for the projected stochastic subgradient method
Simon Lacoste-Julien
Mark Schmidt
Francis R. Bach
190
261
0
10 Dec 2012
A Fast and Simple Algorithm for Training Neural Probabilistic Language Models
A. Mnih
Yee Whye Teh
185
578
0
27 Jun 2012