Transformer Feed-Forward Layers Build Predictions by Promoting Concepts in the Vocabulary Space

28 March 2022

Papers citing "Transformer Feed-Forward Layers Build Predictions by Promoting Concepts in the Vocabulary Space"

22 / 272 papers shown

Title | Authors | Tags | Counts | Date
What Are You Token About? Dense Retrieval as Distributions Over the Vocabulary | Ori Ram, L. Bezalel, Adi Zicher, Yonatan Belinkov, Jonathan Berant, Amir Globerson | - | 39 35 0 | 20 Dec 2022
You Need Multiple Exiting: Dynamic Early Exiting for Accelerating Unified Vision Language Model | Sheng Tang, Yaqing Wang, Zhenglun Kong, Tianchi Zhang, Yao Li, Caiwen Ding, Yanzhi Wang, Yi Liang, Dongkuan Xu | - | 30 31 0 | 21 Nov 2022
Convexifying Transformers: Improving optimization and understanding of transformer networks | Tolga Ergen, Behnam Neyshabur, Harsh Mehta | MLT | 44 15 0 | 20 Nov 2022
Language Generation Models Can Cause Harm: So What Can We Do About It? An Actionable Survey | Sachin Kumar, Vidhisha Balachandran, Lucille Njoo, Antonios Anastasopoulos, Yulia Tsvetkov | ELM | 77 85 0 | 14 Oct 2022
Mass-Editing Memory in a Transformer | Kevin Meng, Arnab Sen Sharma, A. Andonian, Yonatan Belinkov, David Bau | KELM, VLM | 38 525 0 | 13 Oct 2022
Formal Semantic Geometry over Transformer-based Variational AutoEncoder | Yingji Zhang, Danilo S. Carvalho, Ian Pratt-Hartmann, André Freitas | - | 26 4 0 | 12 Oct 2022
Unified Detoxifying and Debiasing in Language Generation via Inference-time Adaptive Optimization | Zonghan Yang, Xiaoyuan Yi, Peng Li, Yang Liu, Xing Xie | - | 33 33 0 | 10 Oct 2022
Understanding Transformer Memorization Recall Through Idioms | Adi Haviv, Ido Cohen, Jacob Gidron, R. Schuster, Yoav Goldberg, Mor Geva | - | 28 48 0 | 07 Oct 2022
Calibrating Factual Knowledge in Pretrained Language Models | Qingxiu Dong, Damai Dai, Yifan Song, Jingjing Xu, Zhifang Sui, Lei Li | KELM | 238 82 0 | 07 Oct 2022
Analyzing Transformers in Embedding Space | Guy Dar, Mor Geva, Ankit Gupta, Jonathan Berant | - | 24 83 0 | 06 Sep 2022
Neural Knowledge Bank for Pretrained Transformers | Damai Dai, Wen-Jie Jiang, Qingxiu Dong, Yajuan Lyu, Qiaoqiao She, Zhifang Sui | KELM | 26 21 0 | 31 Jul 2022
Toward Transparent AI: A Survey on Interpreting the Inner Structures of Deep Neural Networks | Tilman Räuker, A. Ho, Stephen Casper, Dylan Hadfield-Menell | AAML, AI4CE | 23 124 0 | 27 Jul 2022
Confident Adaptive Language Modeling | Tal Schuster, Adam Fisch, Jai Gupta, Mostafa Dehghani, Dara Bahri, Vinh Q. Tran, Yi Tay, Donald Metzler | - | 43 160 0 | 14 Jul 2022
How to Dissect a Muppet: The Structure of Transformer Embedding Spaces | Timothee Mickus, Denis Paperno, Mathieu Constant | - | 27 19 0 | 07 Jun 2022
LM-Debugger: An Interactive Tool for Inspection and Intervention in Transformer-Based Language Models | Mor Geva, Avi Caciularu, Guy Dar, Paul Roit, Shoval Sadde, Micah Shlain, Bar Tamir, Yoav Goldberg | KELM | 27 27 0 | 26 Apr 2022
A Latent-Variable Model for Intrinsic Probing | Karolina Stańczak, Lucas Torroba Hennigen, Adina Williams, Ryan Cotterell, Isabelle Augenstein | - | 26 4 0 | 20 Jan 2022
Analyzing the Limits of Self-Supervision in Handling Bias in Language | Lisa Bauer, Karthik Gopalakrishnan, Spandana Gella, Yang Liu, Joey Tianyi Zhou, Dilek Z. Hakkani-Tür | ELM | 22 1 0 | 16 Dec 2021
A Survey on Green Deep Learning | Jingjing Xu, Wangchunshu Zhou, Zhiyi Fu, Hao Zhou, Lei Li | VLM | 73 83 0 | 08 Nov 2021
Consistent Accelerated Inference via Confident Adaptive Transformers | Tal Schuster, Adam Fisch, Tommi Jaakkola, Regina Barzilay | AI4TS | 186 69 0 | 18 Apr 2021
Self-Diagnosis and Self-Debiasing: A Proposal for Reducing Corpus-Based Bias in NLP | Timo Schick, Sahana Udupa, Hinrich Schütze | - | 259 374 0 | 28 Feb 2021
Analyzing Commonsense Emergence in Few-shot Knowledge Models | Jeff Da, Ronan Le Bras, Ximing Lu, Yejin Choi, Antoine Bosselut | AI4MH, KELM | 64 40 0 | 01 Jan 2021
The Bottom-up Evolution of Representations in the Transformer: A Study with Machine Translation and Language Modeling Objectives | Elena Voita, Rico Sennrich, Ivan Titov | - | 198 181 0 | 03 Sep 2019