
BERT Rediscovers the Classical NLP Pipeline
Ian Tenney, Dipanjan Das, Ellie Pavlick
arXiv 1905.05950 · 15 May 2019 · MILM, SSeg

Papers citing "BERT Rediscovers the Classical NLP Pipeline"

50 / 296 papers shown

Layer-wise Analysis of a Self-supervised Speech Representation Model
Ankita Pasad, Ju-Chieh Chou, Karen Livescu
SSL · 26 · 288 · 0 · 10 Jul 2021

A Novel Deep Reinforcement Learning Based Stock Direction Prediction using Knowledge Graph and Community Aware Sentiments
Anil Berk Altuner, Zeynep Hilal Kilimci
AIFin · 17 · 15 · 0 · 02 Jul 2021

The MultiBERTs: BERT Reproductions for Robustness Analysis
Thibault Sellam, Steve Yadlowsky, Jason W. Wei, Naomi Saphra, Alexander D'Amour, ..., Iulia Turc, Jacob Eisenstein, Dipanjan Das, Ian Tenney, Ellie Pavlick
24 · 93 · 0 · 30 Jun 2021

Why Do Pretrained Language Models Help in Downstream Tasks? An Analysis of Head and Prompt Tuning
Colin Wei, Sang Michael Xie, Tengyu Ma
24 · 97 · 0 · 17 Jun 2021

Pre-Trained Models: Past, Present and Future
Xu Han, Zhengyan Zhang, Ning Ding, Yuxian Gu, Xiao Liu, ..., Jie Tang, Ji-Rong Wen, Jinhui Yuan, Wayne Xin Zhao, Jun Zhu
AIFin, MQ, AI4MH · 43 · 815 · 0 · 14 Jun 2021

Causal Abstractions of Neural Networks
Atticus Geiger, Hanson Lu, Thomas Icard, Christopher Potts
NAI, CML · 17 · 218 · 0 · 06 Jun 2021

LMMS Reloaded: Transformer-based Sense Embeddings for Disambiguation and Beyond
Daniel Loureiro, A. Jorge, Jose Camacho-Collados
33 · 26 · 0 · 26 May 2021

A comparative evaluation and analysis of three generations of Distributional Semantic Models
Alessandro Lenci, Magnus Sahlgren, Patrick Jeuniaux, Amaru Cuba Gyllensten, Martina Miliani
29 · 50 · 0 · 20 May 2021

Compositional Processing Emerges in Neural Networks Solving Math Problems
Jacob Russin, Roland Fernandez, Hamid Palangi, Eric Rosen, Nebojsa Jojic, P. Smolensky, Jianfeng Gao
14 · 13 · 0 · 19 May 2021

How is BERT surprised? Layerwise detection of linguistic anomalies
Bai Li, Zining Zhu, Guillaume Thomas, Yang Xu, Frank Rudzicz
27 · 31 · 0 · 16 May 2021

FNet: Mixing Tokens with Fourier Transforms
James Lee-Thorp, Joshua Ainslie, Ilya Eckstein, Santiago Ontanon
47 · 517 · 0 · 09 May 2021

Understanding by Understanding Not: Modeling Negation in Language Models
Arian Hosseini, Siva Reddy, Dzmitry Bahdanau, R. Devon Hjelm, Alessandro Sordoni, Rameswar Panda
22 · 87 · 0 · 07 May 2021

Empirical Evaluation of Pre-trained Transformers for Human-Level NLP: The Role of Sample Size and Dimensionality
Adithya V Ganesan, Matthew Matero, Aravind Reddy Ravula, Huy-Hien Vu, H. A. Schwartz
27 · 35 · 0 · 07 May 2021

Let's Play Mono-Poly: BERT Can Reveal Words' Polysemy Level and Partitionability into Senses
Aina Garí Soler, Marianna Apidianaki
MILM · 211 · 68 · 0 · 29 Apr 2021

Provable Limitations of Acquiring Meaning from Ungrounded Form: What Will Future Language Models Understand?
William Merrill, Yoav Goldberg, Roy Schwartz, Noah A. Smith
25 · 67 · 0 · 22 Apr 2021

Knowledge Neurons in Pretrained Transformers
Damai Dai, Li Dong, Y. Hao, Zhifang Sui, Baobao Chang, Furu Wei
KELM, MU · 19 · 417 · 0 · 18 Apr 2021

A multilabel approach to morphosyntactic probing
Naomi Tachikawa Shapiro, Amandalynne Paullada, Shane Steinert-Threlkeld
37 · 10 · 0 · 17 Apr 2021

Syntactic Perturbations Reveal Representational Correlates of Hierarchical Phrase Structure in Pretrained Language Models
Matteo Alleman, J. Mamou, Miguel Rio, Hanlin Tang, Yoon Kim, SueYeon Chung
NAI · 35 · 17 · 0 · 15 Apr 2021

Masked Language Modeling and the Distributional Hypothesis: Order Word Matters Pre-training for Little
Koustuv Sinha, Robin Jia, Dieuwke Hupkes, J. Pineau, Adina Williams, Douwe Kiela
45 · 243 · 0 · 14 Apr 2021

What's in your Head? Emergent Behaviour in Multi-Task Transformer Models
Mor Geva, Uri Katz, Aviv Ben-Arie, Jonathan Berant
LRM · 43 · 11 · 0 · 13 Apr 2021

DirectProbe: Studying Representations without Classifiers
Yichu Zhou, Vivek Srikumar
32 · 27 · 0 · 13 Apr 2021

CANINE: Pre-training an Efficient Tokenization-Free Encoder for Language Representation
J. Clark, Dan Garrette, Iulia Turc, John Wieting
36 · 210 · 0 · 11 Mar 2021

The Interplay of Variant, Size, and Task Type in Arabic Pre-trained Language Models
Go Inoue, Bashar Alhafni, Nurpeiis Baimukan, Houda Bouamor, Nizar Habash
35 · 224 · 0 · 11 Mar 2021

The Rediscovery Hypothesis: Language Models Need to Meet Linguistics
Vassilina Nikoulina, Maxat Tezekbayev, Nuradil Kozhakhmet, Madina Babazhanova, Matthias Gallé, Z. Assylbekov
34 · 8 · 0 · 02 Mar 2021

Investigating the Limitations of Transformers with Simple Arithmetic Tasks
Rodrigo Nogueira, Zhiying Jiang, Jimmy J. Lin
LRM · 24 · 122 · 0 · 25 Feb 2021

Measuring and Improving Consistency in Pretrained Language Models
Yanai Elazar, Nora Kassner, Shauli Ravfogel, Abhilasha Ravichander, Eduard H. Hovy, Hinrich Schütze, Yoav Goldberg
HILM · 269 · 346 · 0 · 01 Feb 2021

Language Modelling as a Multi-Task Problem
Leon Weber, Jaap Jumelet, Elia Bruni, Dieuwke Hupkes
26 · 13 · 0 · 27 Jan 2021

First Align, then Predict: Understanding the Cross-Lingual Ability of Multilingual BERT
Benjamin Muller, Yanai Elazar, Benoît Sagot, Djamé Seddah
LRM · 29 · 71 · 0 · 26 Jan 2021

Learning to Augment for Data-Scarce Domain BERT Knowledge Distillation
Lingyun Feng, Minghui Qiu, Yaliang Li, Haitao Zheng, Ying Shen
43 · 10 · 0 · 20 Jan 2021

Trankit: A Light-Weight Transformer-based Toolkit for Multilingual Natural Language Processing
Minh Nguyen, Viet Dac Lai, Amir Pouran Ben Veyseh, Thien Huu Nguyen
52 · 132 · 0 · 09 Jan 2021

Reservoir Transformers
Sheng Shen, Alexei Baevski, Ari S. Morcos, Kurt Keutzer, Michael Auli, Douwe Kiela
35 · 17 · 0 · 30 Dec 2020

Transformer Feed-Forward Layers Are Key-Value Memories
Mor Geva, R. Schuster, Jonathan Berant, Omer Levy
KELM · 39 · 745 · 0 · 29 Dec 2020

CascadeBERT: Accelerating Inference of Pre-trained Language Models via Calibrated Complete Models Cascade
Lei Li, Yankai Lin, Deli Chen, Shuhuai Ren, Peng Li, Jie Zhou, Xu Sun
29 · 51 · 0 · 29 Dec 2020

Parameter-Efficient Transfer Learning with Diff Pruning
Demi Guo, Alexander M. Rush, Yoon Kim
13 · 385 · 0 · 14 Dec 2020

Self-Explaining Structures Improve NLP Models
Zijun Sun, Chun Fan, Qinghong Han, Xiaofei Sun, Yuxian Meng, Fei Wu, Jiwei Li
MILM, XAI, LRM, FAtt · 43 · 38 · 0 · 03 Dec 2020

Identifying Depressive Symptoms from Tweets: Figurative Language Enabled Multitask Learning Framework
S. Yadav, Jainish Chauhan, Joy Prakash Sain, K. Thirunarayan, A. Sheth, Jeremiah A. Schumm
21 · 42 · 0 · 12 Nov 2020

When Do You Need Billions of Words of Pretraining Data?
Yian Zhang, Alex Warstadt, Haau-Sing Li, Samuel R. Bowman
29 · 136 · 0 · 10 Nov 2020

CxGBERT: BERT meets Construction Grammar
Harish Tayyar Madabushi, Laurence Romain, Dagmar Divjak, P. Milin
19 · 39 · 0 · 09 Nov 2020

Semantic and Relational Spaces in Science of Science: Deep Learning Models for Article Vectorisation
Diego Kozlowski, Jennifer Dusdal, Jun Pang, A. Zilian
18 · 13 · 0 · 05 Nov 2020

Introducing various Semantic Models for Amharic: Experimentation and Evaluation with multiple Tasks and Datasets
Seid Muhie Yimam, A. Ayele, Gopalakrishnan Venkatesh, Ibrahim Gashaw, Christian Biemann
30 · 27 · 0 · 02 Nov 2020

ABNIRML: Analyzing the Behavior of Neural IR Models
Sean MacAvaney, Sergey Feldman, Nazli Goharian, Doug Downey, Arman Cohan
17 · 49 · 0 · 02 Nov 2020

Rethinking embedding coupling in pre-trained language models
Hyung Won Chung, Thibault Févry, Henry Tsai, Melvin Johnson, Sebastian Ruder
95 · 142 · 0 · 24 Oct 2020

Long Document Ranking with Query-Directed Sparse Transformer
Jyun-Yu Jiang, Chenyan Xiong, Chia-Jung Lee, Wei Wang
30 · 25 · 0 · 23 Oct 2020

Document-Level Relation Extraction with Adaptive Thresholding and Localized Context Pooling
Wenxuan Zhou, Kevin Huang, Tengyu Ma, Jing Huang
24 · 273 · 0 · 21 Oct 2020

Towards Interpreting BERT for Reading Comprehension Based QA
Sahana Ramnath, Preksha Nema, Deep Sahni, Mitesh M. Khapra
42 · 30 · 0 · 18 Oct 2020

Pretrained Transformers for Text Ranking: BERT and Beyond
Jimmy J. Lin, Rodrigo Nogueira, Andrew Yates
VLM · 242 · 611 · 0 · 13 Oct 2020

Probing Pretrained Language Models for Lexical Semantics
Ivan Vulić, E. Ponti, Robert Litschko, Goran Glavas, Anna Korhonen
KELM · 33 · 233 · 0 · 12 Oct 2020

On the Sub-Layer Functionalities of Transformer Decoder
Yilin Yang, Longyue Wang, Shuming Shi, Prasad Tadepalli, Stefan Lee, Zhaopeng Tu
26 · 27 · 0 · 06 Oct 2020

Linguistic Profiling of a Neural Language Model
Alessio Miaschi, D. Brunato, F. Dell'Orletta, Giulia Venturi
36 · 46 · 0 · 05 Oct 2020

Beyond The Text: Analysis of Privacy Statements through Syntactic and Semantic Role Labeling
Yan Shvartzshnaider, Ananth Balashankar, Vikas Patidar, Thomas Wies, L. Subramanian
19 · 4 · 0 · 01 Oct 2020