The Bottom-up Evolution of Representations in the Transformer: A Study with Machine Translation and Language Modeling Objectives
Elena Voita, Rico Sennrich, Ivan Titov
arXiv:1909.01380, 3 September 2019

Papers citing "The Bottom-up Evolution of Representations in the Transformer: A Study with Machine Translation and Language Modeling Objectives"

49 papers:

Towards Understanding How Knowledge Evolves in Large Vision-Language Models
Sudong Wang, Yuyao Zhang, Yao Zhu, Jianing Li, Zizhe Wang, Yi Liu, Xiangyang Ji (31 Mar 2025)

Racing Thoughts: Explaining Contextualization Errors in Large Language Models
Michael A. Lepori, Michael Mozer, Asma Ghandeharioun (LRM; 02 Oct 2024)

From Yes-Men to Truth-Tellers: Addressing Sycophancy in Large Language Models with Pinpoint Tuning
Wei Chen, Zhen Huang, Liang Xie, Binbin Lin, Houqiang Li, ..., Deng Cai, Yonggang Zhang, Wenxiao Wang, Xu Shen, Jieping Ye (03 Sep 2024)

Low-Rank Interconnected Adaptation Across Layers
Yibo Zhong, Yao Zhou (OffRL, MoE; 13 Jul 2024)

Investigating the translation capabilities of Large Language Models trained on parallel data only
Javier García Gilabert, Carlos Escolano, Aleix Sant Savall, Francesca de Luca Fornaciari, Audrey Mash, Xixian Liao, Maite Melero (LRM; 13 Jun 2024)

Where does In-context Translation Happen in Large Language Models
Suzanna Sia, David Mueller, Kevin Duh (LRM; 07 Mar 2024)

Disentangling the Linguistic Competence of Privacy-Preserving BERT
Stefan Arnold, Nils Kemmerzell, Annika Schreiner (17 Oct 2023)

Few-Shot Spoken Language Understanding via Joint Speech-Text Models
Chung-Ming Chien, Mingjiamei Zhang, Ju-Chieh Chou, Karen Livescu (09 Oct 2023)

Layer-wise Representation Fusion for Compositional Generalization
Yafang Zheng, Lei Lin, Shantao Liu, Binling Wang, Zhaohong Lai, Wenhao Rao, Biao Fu, Yidong Chen, Xiaodon Shi (AI4CE; 20 Jul 2023)

On Robustness of Finetuned Transformer-based NLP Models
Pavan Kalyan Reddy Neerudu, S. Oota, Mounika Marreddy, Venkateswara Rao Kagita, Manish Gupta (23 May 2023)

Explaining How Transformers Use Context to Build Predictions
Javier Ferrando, Gerard I. Gállego, Ioannis Tsiamas, Marta R. Costa-jussá (21 May 2023)

Learning to Compose Representations of Different Encoder Layers towards Improving Compositional Generalization
Lei Lin, Shuangtao Li, Yafang Zheng, Biao Fu, Shantao Liu, Yidong Chen, Xiaodon Shi (CoGe; 20 May 2023)

Privacy-Preserving Prompt Tuning for Large Language Model Services
Yansong Li, Zhixing Tan, Yang Liu (SILM, VLM; 10 May 2023)

Topics in Contextualised Attention Embeddings
Mozhgan Talebpour, A. G. S. D. Herrera, Shoaib Jameel (11 Jan 2023)

On the Effect of Pre-training for Transformer in Different Modality on Offline Reinforcement Learning
S. Takagi (OffRL; 17 Nov 2022)

Mask More and Mask Later: Efficient Pre-training of Masked Language Models by Disentangling the [MASK] Token
Baohao Liao, David Thulke, Sanjika Hewavitharana, Hermann Ney, Christof Monz (09 Nov 2022)

Hidden State Variability of Pretrained Language Models Can Guide Computation Reduction for Transfer Learning
Shuo Xie, Jiahao Qiu, Ankita Pasad, Li Du, Qing Qu, Hongyuan Mei (18 Oct 2022)

Transparency Helps Reveal When Language Models Learn Meaning
Zhaofeng Wu, William Merrill, Hao Peng, Iz Beltagy, Noah A. Smith (14 Oct 2022)

Analyzing Transformers in Embedding Space
Guy Dar, Mor Geva, Ankit Gupta, Jonathan Berant (06 Sep 2022)

An Interpretability Evaluation Benchmark for Pre-trained Language Models
Ya-Ming Shen, Lijie Wang, Ying Chen, Xinyan Xiao, Jing Liu, Hua-Hong Wu (28 Jul 2022)

How to Dissect a Muppet: The Structure of Transformer Embedding Spaces
Timothee Mickus, Denis Paperno, Mathieu Constant (07 Jun 2022)

Towards Opening the Black Box of Neural Machine Translation: Source and Target Interpretations of the Transformer
Javier Ferrando, Gerard I. Gállego, Belen Alastruey, Carlos Escolano, Marta R. Costa-jussá (23 May 2022)

The Geometry of Multilingual Language Model Representations
Tyler A. Chang, Z. Tu, Benjamin Bergen (22 May 2022)

Self-Supervised Speech Representation Learning: A Review
Abdel-rahman Mohamed, Hung-yi Lee, Lasse Borgholt, Jakob Drachmann Havtorn, Joakim Edin, ..., Shang-Wen Li, Karen Livescu, Lars Maaløe, Tara N. Sainath, Shinji Watanabe (SSL, AI4TS; 21 May 2022)

Transformer Feed-Forward Layers Build Predictions by Promoting Concepts in the Vocabulary Space
Mor Geva, Avi Caciularu, Ke Wang, Yoav Goldberg (KELM; 28 Mar 2022)

Contrastive Visual Semantic Pretraining Magnifies the Semantics of Natural Language Representations
Robert Wolfe, Aylin Caliskan (VLM; 14 Mar 2022)

Representation Topology Divergence: A Method for Comparing Neural Network Representations
S. Barannikov, I. Trofimov, Nikita Balabin, Evgeny Burnaev (3DPC; 31 Dec 2021)

Measuring Context-Word Biases in Lexical Semantic Datasets
Qianchu Liu, Diana McCarthy, Anna Korhonen (13 Dec 2021)

Low Frequency Names Exhibit Bias and Overfitting in Contextualizing Language Models
Robert Wolfe, Aylin Caliskan (01 Oct 2021)

On the Prunability of Attention Heads in Multilingual BERT
Aakriti Budhraja, Madhura Pande, Pratyush Kumar, Mitesh M. Khapra (26 Sep 2021)

Fine-Tuned Transformers Show Clusters of Similar Representations Across Layers
Jason Phang, Haokun Liu, Samuel R. Bowman (17 Sep 2021)

Not All Models Localize Linguistic Knowledge in the Same Place: A Layer-wise Probing on BERToids' Representations
Mohsen Fayyaz, Ehsan Aghazadeh, Ali Modarressi, Hosein Mohebbi, Mohammad Taher Pilehvar (13 Sep 2021)

Automatic Text Evaluation through the Lens of Wasserstein Barycenters
Pierre Colombo, Guillaume Staerman, Chloé Clavel, Pablo Piantanida (27 Aug 2021)

Translation Error Detection as Rationale Extraction
M. Fomicheva, Lucia Specia, Nikolaos Aletras (27 Aug 2021)

Do Vision Transformers See Like Convolutional Neural Networks?
M. Raghu, Thomas Unterthiner, Simon Kornblith, Chiyuan Zhang, Alexey Dosovitskiy (ViT; 19 Aug 2021)

CoBERL: Contrastive BERT for Reinforcement Learning
Andrea Banino, Adria Puidomenech Badia, Jacob Walker, Tim Scholtes, Jovana Mitrović, Charles Blundell (OffRL; 12 Jul 2021)

Layer-wise Analysis of a Self-supervised Speech Representation Model
Ankita Pasad, Ju-Chieh Chou, Karen Livescu (SSL; 10 Jul 2021)

On Compositional Generalization of Neural Machine Translation
Yafu Li, Yongjing Yin, Yulong Chen, Yue Zhang (31 May 2021)

LMMS Reloaded: Transformer-based Sense Embeddings for Disambiguation and Beyond
Daniel Loureiro, A. Jorge, Jose Camacho-Collados (26 May 2021)

DirectQE: Direct Pretraining for Machine Translation Quality Estimation
Qu Cui, Shujian Huang, Jiahuan Li, Xiang Geng, Zaixiang Zheng, Guoping Huang, Jiajun Chen (15 May 2021)

Let's Play Mono-Poly: BERT Can Reveal Words' Polysemy Level and Partitionability into Senses
Aina Garí Soler, Marianna Apidianaki (MILM; 29 Apr 2021)

Editing Factual Knowledge in Language Models
Nicola De Cao, Wilker Aziz, Ivan Titov (KELM; 16 Apr 2021)

Neural Machine Translation: A Review of Methods, Resources, and Tools
Zhixing Tan, Shuo Wang, Zonghan Yang, Gang Chen, Xuancheng Huang, Maosong Sun, Yang Liu (3DV, AI4TS; 31 Dec 2020)

Image Representations Learned With Unsupervised Pre-Training Contain Human-like Biases
Ryan Steed, Aylin Caliskan (SSL; 28 Oct 2020)

SimAlign: High Quality Word Alignments without Parallel Training Data using Static and Contextualized Embeddings
Masoud Jalili Sabet, Philipp Dufter, François Yvon, Hinrich Schütze (18 Apr 2020)

Information-Theoretic Probing with Minimum Description Length
Elena Voita, Ivan Titov (27 Mar 2020)

A Survey of Deep Learning for Scientific Discovery
M. Raghu, Erica Schmidt (OOD, AI4CE; 26 Mar 2020)

Fixed Encoder Self-Attention Patterns in Transformer-Based Machine Translation
Alessandro Raganato, Yves Scherrer, Jörg Tiedemann (24 Feb 2020)

Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
Colin Raffel, Noam M. Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, Peter J. Liu (AIMat; 23 Oct 2019)