Adaptive Input Representations for Neural Language Modeling
Alexei Baevski, Michael Auli
arXiv:1809.10853, 28 September 2018

Papers citing "Adaptive Input Representations for Neural Language Modeling"

Showing 19 of 269 citing papers.
  • One Epoch Is All You Need
    Aran Komatsuzaki (16 Jun 2019)
  • Real or Fake? Learning to Discriminate Machine from Human Generated Text
    A. Bakhtin, Sam Gross, Myle Ott, Yuntian Deng, Marc'Aurelio Ranzato, Arthur Szlam (07 Jun 2019)
  • Scaling Autoregressive Video Models
    Dirk Weissenborn, Oscar Täckström, Jakob Uszkoreit (06 Jun 2019)
  • Multi-News: a Large-Scale Multi-Document Summarization Dataset and Abstractive Hierarchical Model
    Alexander R. Fabbri, Irene Li, Tianwei She, Suyi Li, Dragomir R. Radev (04 Jun 2019)
  • Improving Neural Language Models by Segmenting, Attending, and Predicting the Future
    Hongyin Luo, Lan Jiang, Yonatan Belinkov, James R. Glass (04 Jun 2019)
  • Multimodal Transformer for Unaligned Multimodal Language Sequences
    Yao-Hung Hubert Tsai, Shaojie Bai, Paul Pu Liang, J. Zico Kolter, Louis-Philippe Morency, Ruslan Salakhutdinov (01 Jun 2019)
  • PowerSGD: Practical Low-Rank Gradient Compression for Distributed Optimization
    Thijs Vogels, Sai Praneeth Karimireddy, Martin Jaggi (31 May 2019)
  • Instant Quantization of Neural Networks using Monte Carlo Methods
    Gonçalo Mordido, Matthijs Van Keirsbilck, A. Keller (29 May 2019)
  • Language Modeling with Deep Transformers
    Kazuki Irie, Albert Zeyer, Ralf Schlüter, Hermann Ney (10 May 2019)
  • Language Models with Transformers
    Chenguang Wang, Mu Li, Alex Smola (20 Apr 2019)
  • Dynamic Evaluation of Transformer Language Models
    Ben Krause, Emmanuel Kahembwe, Iain Murray, Steve Renals (17 Apr 2019)
  • fairseq: A Fast, Extensible Toolkit for Sequence Modeling
    Myle Ott, Sergey Edunov, Alexei Baevski, Angela Fan, Sam Gross, Nathan Ng, David Grangier, Michael Auli (01 Apr 2019)
  • Pre-trained Language Model Representations for Language Generation
    Sergey Edunov, Alexei Baevski, Michael Auli (22 Mar 2019)
  • Cloze-driven Pretraining of Self-attention Networks
    Alexei Baevski, Sergey Edunov, Yinhan Liu, Luke Zettlemoyer, Michael Auli (19 Mar 2019)
  • Tensorized Embedding Layers for Efficient Model Compression
    Oleksii Hrinchuk, Valentin Khrulkov, L. Mirvakhabova, Elena Orlova, Ivan Oseledets (30 Jan 2019)
  • Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context
    Zihang Dai, Zhilin Yang, Yiming Yang, J. Carbonell, Quoc V. Le, Ruslan Salakhutdinov (09 Jan 2019)
  • Extractive Summary as Discrete Latent Variables
    Aran Komatsuzaki (14 Nov 2018)
  • Understanding Recurrent Neural Architectures by Analyzing and Synthesizing Long Distance Dependencies in Benchmark Sequential Datasets
    Abhijit Mahalunkar, John D. Kelleher (06 Oct 2018)
  • Analysis of DAWNBench, a Time-to-Accuracy Machine Learning Performance Benchmark
    Cody Coleman, Daniel Kang, Deepak Narayanan, Luigi Nardi, Tian Zhao, Jian Zhang, Peter Bailis, K. Olukotun, Christopher Ré, Matei A. Zaharia (04 Jun 2018)