The Bottom-up Evolution of Representations in the Transformer: A Study with Machine Translation and Language Modeling Objectives
Elena Voita, Rico Sennrich, Ivan Titov
arXiv:1909.01380, 3 September 2019

Papers citing "The Bottom-up Evolution of Representations in the Transformer: A Study with Machine Translation and Language Modeling Objectives"

49 papers:

Towards Understanding How Knowledge Evolves in Large Vision-Language Models
Sudong Wang, Yuyao Zhang, Yao Zhu, Jianing Li, Zizhe Wang, Yi Liu, Xiangyang Ji (31 Mar 2025)

Racing Thoughts: Explaining Contextualization Errors in Large Language Models
Michael A. Lepori, Michael Mozer, Asma Ghandeharioun (LRM; 02 Oct 2024)

From Yes-Men to Truth-Tellers: Addressing Sycophancy in Large Language Models with Pinpoint Tuning
Wei Chen, Zhen Huang, Liang Xie, Binbin Lin, Houqiang Li, ..., Deng Cai, Yonggang Zhang, Wenxiao Wang, Xu Shen, Jieping Ye (03 Sep 2024)

Low-Rank Interconnected Adaptation Across Layers
Yibo Zhong, Yao Zhou (OffRL, MoE; 13 Jul 2024)

Investigating the translation capabilities of Large Language Models trained on parallel data only
Javier García Gilabert, Carlos Escolano, Aleix Sant Savall, Francesca de Luca Fornaciari, Audrey Mash, Xixian Liao, Maite Melero (LRM; 13 Jun 2024)

Where does In-context Translation Happen in Large Language Models
Suzanna Sia, David Mueller, Kevin Duh (LRM; 07 Mar 2024)

Disentangling the Linguistic Competence of Privacy-Preserving BERT
Stefan Arnold, Nils Kemmerzell, Annika Schreiner (17 Oct 2023)

Few-Shot Spoken Language Understanding via Joint Speech-Text Models
Chung-Ming Chien, Mingjiamei Zhang, Ju-Chieh Chou, Karen Livescu (09 Oct 2023)

Layer-wise Representation Fusion for Compositional Generalization
Yafang Zheng, Lei Lin, Shantao Liu, Binling Wang, Zhaohong Lai, Wenhao Rao, Biao Fu, Yidong Chen, Xiaodon Shi (AI4CE; 20 Jul 2023)

On Robustness of Finetuned Transformer-based NLP Models
Pavan Kalyan Reddy Neerudu, S. Oota, Mounika Marreddy, Venkateswara Rao Kagita, Manish Gupta (23 May 2023)

Explaining How Transformers Use Context to Build Predictions
Javier Ferrando, Gerard I. Gállego, Ioannis Tsiamas, Marta R. Costa-jussá (21 May 2023)

Learning to Compose Representations of Different Encoder Layers towards Improving Compositional Generalization
Lei Lin, Shuangtao Li, Yafang Zheng, Biao Fu, Shantao Liu, Yidong Chen, Xiaodon Shi (CoGe; 20 May 2023)

Privacy-Preserving Prompt Tuning for Large Language Model Services
Yansong Li, Zhixing Tan, Yang Liu (SILM, VLM; 10 May 2023)

Topics in Contextualised Attention Embeddings
Mozhgan Talebpour, A. G. S. D. Herrera, Shoaib Jameel (11 Jan 2023)

On the Effect of Pre-training for Transformer in Different Modality on Offline Reinforcement Learning
S. Takagi (OffRL; 17 Nov 2022)

Mask More and Mask Later: Efficient Pre-training of Masked Language Models by Disentangling the [MASK] Token
Baohao Liao, David Thulke, Sanjika Hewavitharana, Hermann Ney, Christof Monz (09 Nov 2022)

Hidden State Variability of Pretrained Language Models Can Guide Computation Reduction for Transfer Learning
Shuo Xie, Jiahao Qiu, Ankita Pasad, Li Du, Qing Qu, Hongyuan Mei (18 Oct 2022)

Transparency Helps Reveal When Language Models Learn Meaning
Zhaofeng Wu, William Merrill, Hao Peng, Iz Beltagy, Noah A. Smith (14 Oct 2022)

Analyzing Transformers in Embedding Space
Guy Dar, Mor Geva, Ankit Gupta, Jonathan Berant (06 Sep 2022)

An Interpretability Evaluation Benchmark for Pre-trained Language Models
Ya-Ming Shen, Lijie Wang, Ying Chen, Xinyan Xiao, Jing Liu, Hua-Hong Wu (28 Jul 2022)

How to Dissect a Muppet: The Structure of Transformer Embedding Spaces
Timothee Mickus, Denis Paperno, Mathieu Constant (07 Jun 2022)

Towards Opening the Black Box of Neural Machine Translation: Source and Target Interpretations of the Transformer
Javier Ferrando, Gerard I. Gállego, Belen Alastruey, Carlos Escolano, Marta R. Costa-jussá (23 May 2022)

The Geometry of Multilingual Language Model Representations
Tyler A. Chang, Z. Tu, Benjamin Bergen (22 May 2022)

Self-Supervised Speech Representation Learning: A Review
Abdel-rahman Mohamed, Hung-yi Lee, Lasse Borgholt, Jakob Drachmann Havtorn, Joakim Edin, ..., Shang-Wen Li, Karen Livescu, Lars Maaløe, Tara N. Sainath, Shinji Watanabe (SSL, AI4TS; 21 May 2022)

Transformer Feed-Forward Layers Build Predictions by Promoting Concepts in the Vocabulary Space
Mor Geva, Avi Caciularu, Ke Wang, Yoav Goldberg (KELM; 28 Mar 2022)

Contrastive Visual Semantic Pretraining Magnifies the Semantics of Natural Language Representations
Robert Wolfe, Aylin Caliskan (VLM; 14 Mar 2022)

Representation Topology Divergence: A Method for Comparing Neural Network Representations
S. Barannikov, I. Trofimov, Nikita Balabin, Evgeny Burnaev (3DPC; 31 Dec 2021)

Measuring Context-Word Biases in Lexical Semantic Datasets
Qianchu Liu, Diana McCarthy, Anna Korhonen (13 Dec 2021)

Low Frequency Names Exhibit Bias and Overfitting in Contextualizing Language Models
Robert Wolfe, Aylin Caliskan (01 Oct 2021)

On the Prunability of Attention Heads in Multilingual BERT
Aakriti Budhraja, Madhura Pande, Pratyush Kumar, Mitesh M. Khapra (26 Sep 2021)

Fine-Tuned Transformers Show Clusters of Similar Representations Across Layers
Jason Phang, Haokun Liu, Samuel R. Bowman (17 Sep 2021)

Not All Models Localize Linguistic Knowledge in the Same Place: A Layer-wise Probing on BERToids' Representations
Mohsen Fayyaz, Ehsan Aghazadeh, Ali Modarressi, Hosein Mohebbi, Mohammad Taher Pilehvar (13 Sep 2021)

Automatic Text Evaluation through the Lens of Wasserstein Barycenters
Pierre Colombo, Guillaume Staerman, Chloé Clavel, Pablo Piantanida (27 Aug 2021)

Translation Error Detection as Rationale Extraction
M. Fomicheva, Lucia Specia, Nikolaos Aletras (27 Aug 2021)

Do Vision Transformers See Like Convolutional Neural Networks?
M. Raghu, Thomas Unterthiner, Simon Kornblith, Chiyuan Zhang, Alexey Dosovitskiy (ViT; 19 Aug 2021)

CoBERL: Contrastive BERT for Reinforcement Learning
Andrea Banino, Adria Puidomenech Badia, Jacob Walker, Tim Scholtes, Jovana Mitrović, Charles Blundell (OffRL; 12 Jul 2021)

Layer-wise Analysis of a Self-supervised Speech Representation Model
Ankita Pasad, Ju-Chieh Chou, Karen Livescu (SSL; 10 Jul 2021)

On Compositional Generalization of Neural Machine Translation
Yafu Li, Yongjing Yin, Yulong Chen, Yue Zhang (31 May 2021)

LMMS Reloaded: Transformer-based Sense Embeddings for Disambiguation and Beyond
Daniel Loureiro, A. Jorge, Jose Camacho-Collados (26 May 2021)

DirectQE: Direct Pretraining for Machine Translation Quality Estimation
Qu Cui, Shujian Huang, Jiahuan Li, Xiang Geng, Zaixiang Zheng, Guoping Huang, Jiajun Chen (15 May 2021)

Let's Play Mono-Poly: BERT Can Reveal Words' Polysemy Level and Partitionability into Senses
Aina Garí Soler, Marianna Apidianaki (MILM; 29 Apr 2021)

Editing Factual Knowledge in Language Models
Nicola De Cao, Wilker Aziz, Ivan Titov (KELM; 16 Apr 2021)

Neural Machine Translation: A Review of Methods, Resources, and Tools
Zhixing Tan, Shuo Wang, Zonghan Yang, Gang Chen, Xuancheng Huang, Maosong Sun, Yang Liu (3DV, AI4TS; 31 Dec 2020)

Image Representations Learned With Unsupervised Pre-Training Contain Human-like Biases
Ryan Steed, Aylin Caliskan (SSL; 28 Oct 2020)

SimAlign: High Quality Word Alignments without Parallel Training Data using Static and Contextualized Embeddings
Masoud Jalili Sabet, Philipp Dufter, François Yvon, Hinrich Schütze (18 Apr 2020)

Information-Theoretic Probing with Minimum Description Length
Elena Voita, Ivan Titov (27 Mar 2020)

A Survey of Deep Learning for Scientific Discovery
M. Raghu, Erica Schmidt (OOD, AI4CE; 26 Mar 2020)

Fixed Encoder Self-Attention Patterns in Transformer-Based Machine Translation
Alessandro Raganato, Yves Scherrer, Jörg Tiedemann (24 Feb 2020)

Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
Colin Raffel, Noam M. Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, Peter J. Liu (AIMat; 23 Oct 2019)