Using the Output Embedding to Improve Language Models

20 August 2016
Ofir Press
Lior Wolf

Papers citing "Using the Output Embedding to Improve Language Models"

50 / 147 papers shown
Faithful Target Attribute Prediction in Neural Machine Translation
Xing Niu
Georgiana Dinu
Prashant Mathur
Anna Currey
28
4
0
24 Sep 2021
Learning Opinion Summarizers by Selecting Informative Reviews
Arthur Brazinskas
Mirella Lapata
Ivan Titov
53
29
0
09 Sep 2021
Rare Tokens Degenerate All Tokens: Improving Neural Text Generation via Adaptive Gradient Gating for Rare Token Embeddings
Sangwon Yu
Jongyoon Song
Heeseung Kim
SeongEun Lee
Woo-Jong Ryu
Sung-Hoon Yoon
19
31
0
07 Sep 2021
How Suitable Are Subword Segmentation Strategies for Translating Non-Concatenative Morphology?
Chantal Amrhein
Rico Sennrich
27
13
0
02 Sep 2021
Interactive Machine Comprehension with Dynamic Knowledge Graphs
Xingdi Yuan
34
3
0
31 Aug 2021
SimVLM: Simple Visual Language Model Pretraining with Weak Supervision
Zirui Wang
Jiahui Yu
Adams Wei Yu
Zihang Dai
Yulia Tsvetkov
Yuan Cao
VLM
MLLM
51
780
0
24 Aug 2021
Training Graph Neural Networks with 1000 Layers
Guohao Li
Matthias Müller
Bernard Ghanem
V. Koltun
GNN
AI4CE
51
235
0
14 Jun 2021
Which transformer architecture fits my data? A vocabulary bottleneck in self-attention
Noam Wies
Yoav Levine
Daniel Jannai
Amnon Shashua
40
20
0
09 May 2021
Representing Numbers in NLP: a Survey and a Vision
Avijit Thawani
Jay Pujara
Pedro A. Szekely
Filip Ilievski
32
114
0
24 Mar 2021
Finetuning Pretrained Transformers into RNNs
Jungo Kasai
Hao Peng
Yizhe Zhang
Dani Yogatama
Gabriel Ilharco
Nikolaos Pappas
Yi Mao
Weizhu Chen
Noah A. Smith
44
63
0
24 Mar 2021
Improving the Lexical Ability of Pretrained Language Models for Unsupervised Neural Machine Translation
Alexandra Chronopoulou
Dario Stojanovski
Alexander Fraser
SSL
37
26
0
18 Mar 2021
Predicting the Behavior of Dealers in Over-The-Counter Corporate Bond Markets
Yusen Lin
Jinming Xue
L. Raschid
8
3
0
12 Mar 2021
Train your classifier first: Cascade Neural Networks Training from upper layers to lower layers
Shucong Zhang
Cong-Thanh Do
R. Doddipatla
Erfan Loweimi
P. Bell
Steve Renals
24
2
0
09 Feb 2021
Unifying Vision-and-Language Tasks via Text Generation
Jaemin Cho
Jie Lei
Hao Tan
Mohit Bansal
MLLM
277
525
0
04 Feb 2021
Shortformer: Better Language Modeling using Shorter Inputs
Ofir Press
Noah A. Smith
M. Lewis
230
89
0
31 Dec 2020
Unsupervised Learning of Discourse Structures using a Tree Autoencoder
Patrick Huber
Giuseppe Carenini
32
4
0
17 Dec 2020
Rethinking embedding coupling in pre-trained language models
Hyung Won Chung
Thibault Févry
Henry Tsai
Melvin Johnson
Sebastian Ruder
95
142
0
24 Oct 2020
Controlling the Interaction Between Generation and Inference in Semi-Supervised Variational Autoencoders Using Importance Weighting
G. Felhi
Joseph Leroux
Djamé Seddah
BDL
21
1
0
13 Oct 2020
ChrEn: Cherokee-English Machine Translation for Endangered Language Revitalization
Shiyue Zhang
B. Frey
Mohit Bansal
36
28
0
09 Oct 2020
Fine-Grained Grounding for Multimodal Speech Recognition
Tejas Srinivasan
Ramon Sanabria
Florian Metze
Desmond Elliott
23
11
0
05 Oct 2020
Improving Low Compute Language Modeling with In-Domain Embedding Initialisation
Charles F Welch
Rada Mihalcea
Jonathan K. Kummerfeld
AI4CE
16
4
0
29 Sep 2020
Reusing a Pretrained Language Model on Languages with Limited Corpora for Unsupervised NMT
Alexandra Chronopoulou
Dario Stojanovski
Alexander Fraser
18
33
0
16 Sep 2020
LRSpeech: Extremely Low-Resource Speech Synthesis and Recognition
Jin Xu
Xu Tan
Yi Ren
Tao Qin
Jian Li
Sheng Zhao
Tie-Yan Liu
VLM
18
90
0
09 Aug 2020
A Multilingual Parallel Corpora Collection Effort for Indian Languages
Shashank Siripragada
Jerin Philip
Vinay P. Namboodiri
C. V. Jawahar
VLM
32
47
0
15 Jul 2020
Deep Encoder, Shallow Decoder: Reevaluating Non-autoregressive Machine Translation
Jungo Kasai
Nikolaos Pappas
Hao Peng
James Cross
Noah A. Smith
38
134
0
18 Jun 2020
VirTex: Learning Visual Representations from Textual Annotations
Karan Desai
Justin Johnson
SSL
VLM
30
432
0
11 Jun 2020
rTop-k: A Statistical Estimation Approach to Distributed SGD
L. P. Barnes
Huseyin A. Inan
Berivan Isik
Ayfer Özgür
32
65
0
21 May 2020
Are All Languages Created Equal in Multilingual BERT?
Shijie Wu
Mark Dredze
25
316
0
18 May 2020
Stay Hungry, Stay Focused: Generating Informative and Specific Questions in Information-Seeking Conversations
Peng Qi
Yuhao Zhang
Christopher D. Manning
21
38
0
30 Apr 2020
Unsupervised Domain Clusters in Pretrained Language Models
Roee Aharoni
Yoav Goldberg
35
244
0
05 Apr 2020
Dynamic Sampling and Selective Masking for Communication-Efficient Federated Learning
Shaoxiong Ji
Wenqi Jiang
A. Walid
Xue Li
FedML
28
66
0
21 Mar 2020
ProGen: Language Modeling for Protein Generation
Ali Madani
Bryan McCann
Nikhil Naik
N. Keskar
N. Anand
Raphael R. Eguchi
Po-Ssu Huang
R. Socher
31
275
0
08 Mar 2020
A deep-learning view of chemical space designed to facilitate drug discovery
P. Maragakis
Hunter M. Nisonoff
B. Cole
D. Shaw
38
28
0
07 Feb 2020
Deconstructing and reconstructing word embedding algorithms
Edward Newell
Kian Kenyon-Dean
Jackie C.K. Cheung
39
4
0
29 Nov 2019
Single Headed Attention RNN: Stop Thinking With Your Head
Stephen Merity
21
68
0
26 Nov 2019
Controlling Neural Machine Translation Formality with Synthetic Supervision
Xing Niu
Marine Carpuat
36
35
0
20 Nov 2019
Improving Transformer Models by Reordering their Sublayers
Ofir Press
Noah A. Smith
Omer Levy
16
87
0
10 Nov 2019
Domain Robustness in Neural Machine Translation
Mathias Müller
Annette Rios Gonzales
Rico Sennrich
33
95
0
08 Nov 2019
Unsupervised Opinion Summarization as Copycat-Review Generation
Arthur Brazinskas
Mirella Lapata
Ivan Titov
22
125
0
06 Nov 2019
Generalization through Memorization: Nearest Neighbor Language Models
Urvashi Khandelwal
Omer Levy
Dan Jurafsky
Luke Zettlemoyer
M. Lewis
RALM
53
810
0
01 Nov 2019
Adapting Multilingual Neural Machine Translation to Unseen Languages
Surafel Melaku Lakew
Alina Karakanta
Marcello Federico
Matteo Negri
Marco Turchi
33
20
0
30 Oct 2019
Transformer-based Cascaded Multimodal Speech Translation
Zixiu "Alex" Wu
Ozan Caglayan
Julia Ive
Josiah Wang
Lucia Specia
25
7
0
29 Oct 2019
Federated Evaluation of On-device Personalization
Kangkang Wang
Rajiv Mathews
Chloé Kiddon
Hubert Eichner
F. Beaufays
Daniel Ramage
FedML
13
282
0
22 Oct 2019
Transformers without Tears: Improving the Normalization of Self-Attention
Toan Q. Nguyen
Julian Salazar
38
224
0
14 Oct 2019
Espresso: A Fast End-to-end Neural Speech Recognition Toolkit
Yiming Wang
Tongfei Chen
Hainan Xu
Shuoyang Ding
Hang Lv
Yiwen Shao
Nanyun Peng
Lei Xie
Shinji Watanabe
Sanjeev Khudanpur
VLM
24
73
0
18 Sep 2019
Code-Switched Language Models Using Neural Based Synthetic Data from Parallel Sentences
Genta Indra Winata
Andrea Madotto
Chien-Sheng Wu
Pascale Fung
SyDa
135
92
0
18 Sep 2019
CTRL: A Conditional Transformer Language Model for Controllable Generation
N. Keskar
Bryan McCann
L. Varshney
Caiming Xiong
R. Socher
AI4CE
57
1,236
0
11 Sep 2019
Learn Spelling from Teachers: Transferring Knowledge from Language Models to Sequence-to-Sequence Speech Recognition
Ye Bai
Jiangyan Yi
J. Tao
Zhengkun Tian
Zhengqi Wen
KELM
22
38
0
13 Jul 2019
Federated Learning for Emoji Prediction in a Mobile Keyboard
Swaroop Indra Ramaswamy
Rajiv Mathews
Kanishka Rao
Françoise Beaufays
FedML
18
309
0
11 Jun 2019
Shared-Private Bilingual Word Embeddings for Neural Machine Translation
Xuebo Liu
Derek F. Wong
Yang Liu
Lidia S. Chao
Tong Xiao
Jingbo Zhu
35
37
0
07 Jun 2019