An Analysis of Neural Language Modeling at Multiple Scales


22 March 2018
Stephen Merity
N. Keskar
R. Socher

Papers citing "An Analysis of Neural Language Modeling at Multiple Scales"

24 / 24 papers shown
Deep contextualized word representations
Matthew E. Peters
Mark Neumann
Mohit Iyyer
Matt Gardner
Christopher Clark
Kenton Lee
Luke Zettlemoyer
NAI
227
11,565
0
15 Feb 2018
Breaking the Softmax Bottleneck: A High-Rank RNN Language Model
Zhilin Yang
Zihang Dai
Ruslan Salakhutdinov
William W. Cohen
BDL
68
372
0
10 Nov 2017
Dynamic Evaluation of Neural Sequence Models
Ben Krause
Emmanuel Kahembwe
Iain Murray
Steve Renals
73
135
0
21 Sep 2017
Regularizing and Optimizing LSTM Language Models
Stephen Merity
N. Keskar
R. Socher
166
1,096
0
07 Aug 2017
Revisiting Activation Regularization for Language RNNs
Stephen Merity
Bryan McCann
R. Socher
60
44
0
03 Aug 2017
On the State of the Art of Evaluation in Neural Language Models
Gábor Melis
Chris Dyer
Phil Blunsom
68
536
0
18 Jul 2017
Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour
Priya Goyal
Piotr Dollár
Ross B. Girshick
P. Noordhuis
Lukasz Wesolowski
Aapo Kyrola
Andrew Tulloch
Yangqing Jia
Kaiming He
3DH
128
3,685
0
08 Jun 2017
Fast-Slow Recurrent Neural Networks
Asier Mujika
Florian Meier
Angelika Steger
83
77
0
24 May 2017
Learning to Generate Reviews and Discovering Sentiment
Alec Radford
Rafal Jozefowicz
Ilya Sutskever
97
510
0
05 Apr 2017
Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer
Noam M. Shazeer
Azalia Mirhoseini
Krzysztof Maziarz
Andy Davis
Quoc V. Le
Geoffrey E. Hinton
J. Dean
MoE
251
2,683
0
23 Jan 2017
Language Modeling with Gated Convolutional Networks
Yann N. Dauphin
Angela Fan
Michael Auli
David Grangier
242
2,403
0
23 Dec 2016
Improving Neural Language Models with a Continuous Cache
Edouard Grave
Armand Joulin
Nicolas Usunier
KELM
55
301
0
13 Dec 2016
Neural Architecture Search with Reinforcement Learning
Barret Zoph
Quoc V. Le
468
5,378
0
05 Nov 2016
Quasi-Recurrent Neural Networks
James Bradbury
Stephen Merity
Caiming Xiong
R. Socher
170
444
0
05 Nov 2016
Tying Word Vectors and Word Classifiers: A Loss Framework for Language Modeling
Hakan Inan
Khashayar Khosravi
R. Socher
110
385
0
04 Nov 2016
Surprisal-Driven Zoneout
K. Rocki
Tomasz Kornuta
Tegan Maharaj
56
8
0
24 Oct 2016
HyperNetworks
David R Ha
Andrew M. Dai
Quoc V. Le
164
1,632
0
27 Sep 2016
Pointer Sentinel Mixture Models
Stephen Merity
Caiming Xiong
James Bradbury
R. Socher
RALM
334
2,895
0
26 Sep 2016
Efficient softmax approximation for GPUs
Edouard Grave
Armand Joulin
Moustapha Cissé
David Grangier
Hervé Jégou
83
272
0
14 Sep 2016
Hierarchical Multiscale Recurrent Neural Networks
Junyoung Chung
Sungjin Ahn
Yoshua Bengio
BDL
104
537
0
06 Sep 2016
Using the Output Embedding to Improve Language Models
Ofir Press
Lior Wolf
79
736
0
20 Aug 2016
Recurrent Highway Networks
J. Zilly
R. Srivastava
Jan Koutník
Jürgen Schmidhuber
83
418
0
12 Jul 2016
Zoneout: Regularizing RNNs by Randomly Preserving Hidden Activations
David M. Krueger
Tegan Maharaj
János Kramár
Mohammad Pezeshki
Nicolas Ballas
Nan Rosemary Ke
Anirudh Goyal
Yoshua Bengio
Aaron Courville
C. Pal
82
317
0
03 Jun 2016
Adam: A Method for Stochastic Optimization
Diederik P. Kingma
Jimmy Ba
ODL
2.0K
150,260
0
22 Dec 2014