Understanding Learning Dynamics Of Language Models with SVCCA

1 November 2018

Papers citing "Understanding Learning Dynamics Of Language Models with SVCCA"

Languages Transferred Within the Encoder: On Representation Transfer in Zero-Shot Multilingual Translation Zhi Qu Chenchen Ding Taro Watanabe 85 1 0 12 Jun 2024
Disentangling the Linguistic Competence of Privacy-Preserving BERT Stefan Arnold Nils Kemmerzell Annika Schreiner 35 0 0 17 Oct 2023
DecoderLens: Layerwise Interpretation of Encoder-Decoder Transformers Anna Langedijk Hosein Mohebbi Gabriele Sarti Willem H. Zuidema Jaap Jumelet 32 10 0 05 Oct 2023
Gaussian Process Probes (GPP) for Uncertainty-Aware Probing Zehao Wang Alexander Ku Jason Baldridge Thomas Griffiths Been Kim UQCV 34 11 0 29 May 2023
Towards domain generalisation in ASR with elitist sampling and ensemble knowledge distillation Rehan Ahmad Md. Asif Jalal Muhammad Umar Farooq A. Ollerenshaw Thomas Hain 18 2 0 01 Mar 2023
The Role of Interactive Visualization in Explaining (Large) NLP Models: from Data to Inference R. Brath Daniel A. Keim Johannes Knittel Shimei Pan Pia Sommerauer Hendrik Strobelt 19 11 0 11 Jan 2023
Understanding Domain Learning in Language Models Through Subpopulation Analysis Zheng Zhao Yftah Ziser Shay B. Cohen 34 6 0 22 Oct 2022
Analyzing Text Representations under Tight Annotation Budgets: Measuring Structural Alignment César González-Gutiérrez Audi Primadhanty Francesco Cazzaro A. Quattoni 33 0 0 11 Oct 2022
Causal Proxy Models for Concept-Based Model Explanations Zhengxuan Wu Karel D'Oosterlinck Atticus Geiger Amir Zur Christopher Potts MILM 83 35 0 28 Sep 2022
The Geometry of Multilingual Language Model Representations Tyler A. Chang Zhuowen Tu Benjamin Bergen 23 56 0 22 May 2022
Memorization Without Overfitting: Analyzing the Training Dynamics of Large Language Models Kushal Tirumala Aram H. Markosyan Luke Zettlemoyer Armen Aghajanyan TDI 29 187 0 22 May 2022
Transformer Feed-Forward Layers Build Predictions by Promoting Concepts in the Vocabulary Space Mor Geva Avi Caciularu Ke Wang Yoav Goldberg KELM 69 336 0 28 Mar 2022
Conditional probing: measuring usable information beyond a baseline John Hewitt Kawin Ethayarajh Percy Liang Christopher D. Manning 39 55 0 19 Sep 2021
The Grammar-Learning Trajectories of Neural Language Models Leshem Choshen Guy Hacohen D. Weinshall Omri Abend 29 28 0 13 Sep 2021
The MultiBERTs: BERT Reproductions for Robustness Analysis Thibault Sellam Steve Yadlowsky Jason W. Wei Naomi Saphra Alexander D'Amour ... Iulia Turc Jacob Eisenstein Dipanjan Das Ian Tenney Ellie Pavlick 24 93 0 30 Jun 2021
Probing Across Time: What Does RoBERTa Know and When? Leo Z. Liu Yizhong Wang Jungo Kasai Hannaneh Hajishirzi Noah A. Smith KELM 13 80 0 16 Apr 2021
When Do You Need Billions of Words of Pretraining Data? Yian Zhang Alex Warstadt Haau-Sing Li Samuel R. Bowman 29 136 0 10 Nov 2020
Linguistic Profiling of a Neural Language Model Alessio Miaschi D. Brunato F. Dell’Orletta Giulia Venturi 36 46 0 05 Oct 2020
Analysis and Evaluation of Language Models for Word Sense Disambiguation Daniel Loureiro Kiamehr Rezaee Mohammad Taher Pilehvar Jose Camacho-Collados 33 13 0 26 Aug 2020
Information-Theoretic Probing with Minimum Description Length Elena Voita Ivan Titov 23 270 0 27 Mar 2020
Rapid Learning or Feature Reuse? Towards Understanding the Effectiveness of MAML Aniruddh Raghu M. Raghu Samy Bengio Oriol Vinyals 186 640 0 19 Sep 2019
Zero-shot Reading Comprehension by Cross-lingual Transfer Learning with Multi-lingual Language Representation Model Tsung-Yuan Hsu Chi-Liang Liu Hung-yi Lee 26 60 0 15 Sep 2019
Designing and Interpreting Probes with Control Tasks John Hewitt Percy Liang 32 523 0 08 Sep 2019
Investigating Multilingual NMT Representations at Scale Sneha Kudugunta Ankur Bapna Isaac Caswell N. Arivazhagan Orhan Firat LRM 144 120 0 05 Sep 2019
The Bottom-up Evolution of Representations in the Transformer: A Study with Machine Translation and Language Modeling Objectives Elena Voita Rico Sennrich Ivan Titov 207 181 0 03 Sep 2019
What you can cram into a single vector: Probing sentence embeddings for linguistic properties Alexis Conneau Germán Kruszewski Guillaume Lample Loïc Barrault Marco Baroni 201 882 0 03 May 2018