ResearchTrend.AI
Efficient softmax approximation for GPUs

14 September 2016
Edouard Grave, Armand Joulin, Moustapha Cissé, David Grangier, Hervé Jégou
arXiv (abs) · PDF · HTML · GitHub (394★)
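The paper's adaptive softmax replaces one large softmax over the full vocabulary with a small "head" softmax over frequent words plus gated tail clusters for rare words, so most predictions never touch the long tail. A minimal pure-Python sketch of the two-level factorization is below; the function names and toy scores are illustrative, not the paper's or any library's API (PyTorch ships a full implementation as `nn.AdaptiveLogSoftmaxWithLoss`).

```python
import math

def log_softmax(scores):
    """Numerically stable log-softmax over a list of raw scores."""
    m = max(scores)
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return [s - log_z for s in scores]

def adaptive_log_prob(head_scores, cluster_scores, target):
    """log P(target) under a head-plus-one-tail-cluster softmax.

    head_scores: scores for the frequent ("head") words, with the
        LAST entry acting as the gate for the tail cluster.
    cluster_scores: scores for the rare ("tail") words; in a real
        model these are only computed when a tail word is needed.
    target: ("head", i) for head word i, or ("tail", j) for tail word j.
    """
    head_lp = log_softmax(head_scores)
    kind, idx = target
    if kind == "head":
        return head_lp[idx]
    # Rare word: P(word) = P(tail cluster) * P(word | tail cluster).
    return head_lp[-1] + log_softmax(cluster_scores)[idx]
```

Because the tail cluster's softmax is only evaluated for rare targets, the expected cost per prediction drops from O(V) to roughly the head size, which is the source of the GPU speedup the paper measures.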

Papers citing "Efficient softmax approximation for GPUs"

16 / 16 papers shown
| Title | Authors | Tags | Citations | Date |
|---|---|---|---|---|
| Large Vocabulary Size Improves Large Language Models | Sho Takase, Ryokan Ri, Shun Kiyono, Takuya Kato | | 4 | 24 Jun 2024 |
| Simultaneous Learning of Trees and Representations for Extreme Classification and Density Estimation | Yacine Jernite, A. Choromańska, David Sontag | | 36 | 14 Oct 2016 |
| Exploring the Limits of Language Modeling | Rafal Jozefowicz, Oriol Vinyals, M. Schuster, Noam M. Shazeer, Yonghui Wu | | 1,145 | 07 Feb 2016 |
| Strategies for Training Large Vocabulary Neural Language Models | Welin Chen, David Grangier, Michael Auli | VLM | 139 | 15 Dec 2015 |
| BlackOut: Speeding up Recurrent Neural Network Language Models With Very Large Vocabularies | Shihao Ji, S.V.N. Vishwanathan, N. Satish, Michael J. Anderson, Pradeep Dubey | | 77 | 21 Nov 2015 |
| Learning Visual Features from Large Weakly Supervised Data | Armand Joulin, Laurens van der Maaten, Allan Jabri, Nicolas Vasilache | SSL | 408 | 06 Nov 2015 |
| A Simple Way to Initialize Recurrent Networks of Rectified Linear Units | Quoc V. Le, Navdeep Jaitly, Geoffrey E. Hinton | ODL | 721 | 03 Apr 2015 |
| Learning Longer Memory in Recurrent Neural Networks | Tomas Mikolov, Armand Joulin, S. Chopra, Michaël Mathieu, Marc'Aurelio Ranzato | | 259 | 24 Dec 2014 |
| Efficient Exact Gradient Update for training Deep Networks with Very Large Sparse Targets | Pascal Vincent, A. de Brébisson, Xavier Bouthillier | | 49 | 22 Dec 2014 |
| Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling | Junyoung Chung, Çağlar Gülçehre, Kyunghyun Cho, Yoshua Bengio | | 12,734 | 11 Dec 2014 |
| On Using Very Large Target Vocabulary for Neural Machine Translation | Sébastien Jean, Kyunghyun Cho, Roland Memisevic, Yoshua Bengio | | 1,011 | 05 Dec 2014 |
| Sequence to Sequence Learning with Neural Networks | Ilya Sutskever, Oriol Vinyals, Quoc V. Le | AIMat | 20,584 | 10 Sep 2014 |
| One Billion Word Benchmark for Measuring Progress in Statistical Language Modeling | Ciprian Chelba, Tomas Mikolov, M. Schuster, Qi Ge, T. Brants, P. Koehn, T. Robinson | | 1,109 | 11 Dec 2013 |
| Speech Recognition with Deep Recurrent Neural Networks | Alex Graves, Abdel-rahman Mohamed, Geoffrey E. Hinton | | 8,523 | 22 Mar 2013 |
| Efficient Estimation of Word Representations in Vector Space | Tomas Mikolov, Kai Chen, G. Corrado, J. Dean | 3DV | 31,538 | 16 Jan 2013 |
| A Fast and Simple Algorithm for Training Neural Probabilistic Language Models | A. Mnih, Yee Whye Teh | | 578 | 27 Jun 2012 |