BERT Rediscovers the Classical NLP Pipeline

15 May 2019

Papers citing "BERT Rediscovers the Classical NLP Pipeline"

50 / 821 papers shown

Title
Learning Interpretable Concepts: Unifying Causal Representation Learning and Foundation Models Goutham Rajendran Simon Buchholz Bryon Aragam Bernhard Schölkopf Pradeep Ravikumar AI4CE 175 23 0 14 Feb 2024
RA-Rec: An Efficient ID Representation Alignment Framework for LLM-based Recommendation Xiaohan Yu Li Zhang Xin Zhao Yue Wang Zhongrui Ma 72 11 0 07 Feb 2024
Vision-Language Models Provide Promptable Representations for Reinforcement Learning William Chen Oier Mees Aviral Kumar Sergey Levine VLM LM&Ro 130 27 0 05 Feb 2024
Dive into the Chasm: Probing the Gap between In- and Cross-Topic Generalization Andreas Waldis Yufang Hou Iryna Gurevych ELM 75 8 0 02 Feb 2024
Contextual Feature Extraction Hierarchies Converge in Large Language Models and the Brain Gavin Mischler Yinghao Aaron Li Stephan Bickel A. Mehta N. Mesgarani 86 31 0 31 Jan 2024
Document Structure in Long Document Transformers Jan Buchmann Max Eichler Jan-Micha Bodensohn Ilia Kuznetsov Iryna Gurevych 54 3 0 31 Jan 2024
What the Weight?! A Unified Framework for Zero-Shot Knowledge Composition Carolin Holtermann Markus Frohmann Navid Rekabsaz Anne Lauscher MoMe 64 5 0 23 Jan 2024
Patchscopes: A Unifying Framework for Inspecting Hidden Representations of Language Models Asma Ghandeharioun Avi Caciularu Adam Pearce Lucas Dixon Mor Geva 142 114 0 11 Jan 2024
Model Editing Harms General Abilities of Large Language Models: Regularization to the Rescue Jia-Chen Gu Haoyang Xu Jun-Yu Ma Pan Lu Zhen-Hua Ling Kai-Wei Chang Nanyun Peng KELM 118 55 0 09 Jan 2024
On The Potential of The Fractal Geometry and The CNNs Ability to Encode it Julia El Zini Bassel Musharrafieh M. Awad AI4CE 35 2 0 07 Jan 2024
LLaMA Pro: Progressive LLaMA with Block Expansion Chengyue Wu Yukang Gan Yixiao Ge Zeyu Lu Jiahao Wang Ye Feng Ying Shan Ping Luo CLL 88 72 0 04 Jan 2024
A Mechanistic Understanding of Alignment Algorithms: A Case Study on DPO and Toxicity Andrew Lee Xiaoyan Bai Itamar Pres Martin Wattenberg Jonathan K. Kummerfeld Rada Mihalcea 147 121 0 03 Jan 2024
MLPs Compass: What is learned when MLPs are combined with PLMs? Li Zhou Wenyu Chen Yong Cao DingYi Zeng Wanlong Liu Hong Qu 90 0 0 03 Jan 2024
Unsupervised Learning of Phylogenetic Trees via Split-Weight Embedding Yibo Kong G. Tiley Claudia Solís-Lemus 45 2 0 26 Dec 2023
Towards Probing Contact Center Large Language Models Varun Nathan Ayush Kumar Digvijay Ingle Jithendra Vepa 43 0 0 26 Dec 2023
Alleviating Hallucinations of Large Language Models through Induced Hallucinations Yue Zhang Leyang Cui Wei Bi Shuming Shi HILM 108 57 0 25 Dec 2023
Reducing LLM Hallucinations using Epistemic Neural Networks Shreyas Verma Kien Tran Yusuf Ali Guangyu Min 101 8 0 25 Dec 2023
Assessing Logical Reasoning Capabilities of Encoder-Only Transformer Models Paulo Pirozelli M. M. José Paulo de Tarso P. Filho A. Brandão Fabio Gagliardi Cozman LRM ELM 103 2 0 18 Dec 2023
Dynamic Syntax Mapping: A New Approach to Unsupervised Syntax Parsing Buvarp Gohsh Woods Ali Michael Anders 83 0 0 18 Dec 2023
Weight subcloning: direct initialization of transformers using larger pretrained ones Mohammad Samragh Mehrdad Farajtabar Sachin Mehta Raviteja Vemulapalli Fartash Faghri Devang Naik Oncel Tuzel Mohammad Rastegari 112 30 0 14 Dec 2023
Large Language Models for Mathematicians Simon Frieder Julius Berner P. Petersen Thomas Lukasiewicz 56 7 0 07 Dec 2023
Transformers are uninterpretable with myopic methods: a case study with bounded Dyck grammars Kaiyue Wen Yuchen Li Bing Liu Andrej Risteski 88 24 0 03 Dec 2023
Mitigating Over-smoothing in Transformers via Regularized Nonlocal Functionals Tam Nguyen Tan-Minh Nguyen Richard G. Baraniuk 76 14 0 01 Dec 2023
Unnatural Error Correction: GPT-4 Can Almost Perfectly Handle Unnatural Scrambled Text Qi Cao Takeshi Kojima Yutaka Matsuo Yusuke Iwasawa 102 19 0 30 Nov 2023
Is This the Subspace You Are Looking for? An Interpretability Illusion for Subspace Activation Patching Aleksandar Makelov Georg Lange Neel Nanda 61 22 0 28 Nov 2023
Physical Reasoning and Object Planning for Household Embodied Agents Ayush Agrawal Raghav Prabhakar Anirudh Goyal Dianbo Liu LM&Ro LRM 32 2 0 22 Nov 2023
Compositional Capabilities of Autoregressive Transformers: A Study on Synthetic, Interpretable Tasks Rahul Ramesh Ekdeep Singh Lubana Mikail Khona Robert P. Dick Hidenori Tanaka CoGe 87 12 0 21 Nov 2023
Do Localization Methods Actually Localize Memorized Data in LLMs? A Tale of Two Benchmarks Ting-Yun Chang Jesse Thomason Robin Jia 81 19 0 15 Nov 2023
MELA: Multilingual Evaluation of Linguistic Acceptability Ziyin Zhang Yikang Liu Wei-Ping Huang Junyu Mao Rui Wang Hai Hu 74 3 0 15 Nov 2023
Examining Modularity in Multilingual LMs via Language-Specialized Subnetworks Rochelle Choenni Ekaterina Shutova Daniel H Garrette 85 8 0 14 Nov 2023
Multilingual Nonce Dependency Treebanks: Understanding how Language Models represent and process syntactic structure David Arps Laura Kallmeyer Younes Samih Hassan Sajjad 68 2 0 13 Nov 2023
Legal-HNet: Mixing Legal Long-Context Tokens with Hartley Transform Daniele Giofré Sneha Ghantasala AILaw 70 0 0 09 Nov 2023
How Abstract Is Linguistic Generalization in Large Language Models? Experiments with Argument Structure Michael Wilson Jackson Petty Robert Frank 115 15 0 08 Nov 2023
Exploring Jiu-Jitsu Argumentation for Writing Peer Review Rebuttals Sukannya Purkayastha Anne Lauscher Iryna Gurevych 88 9 0 07 Nov 2023
Uncovering Intermediate Variables in Transformers using Circuit Probing Michael A. Lepori Thomas Serre Ellie Pavlick 165 7 0 07 Nov 2023
Not all layers are equally as important: Every Layer Counts BERT Lucas Georges Gabriel Charpentier David Samuel 100 18 0 03 Nov 2023
Evaluating Neural Language Models as Cognitive Models of Language Acquisition Héctor Javier Vázquez Martínez Annika Lea Heuser Charles D. Yang Jordan Kodner 102 10 0 31 Oct 2023
Probing LLMs for Joint Encoding of Linguistic Categories Giulio Starace Konstantinos Papakostas Rochelle Choenni Apostolos Panagiotopoulos Matteo Rosati Alina Leidinger Ekaterina Shutova 78 7 0 28 Oct 2023
How do Language Models Bind Entities in Context? Jiahai Feng Jacob Steinhardt 134 40 0 26 Oct 2023
Subspace Chronicles: How Linguistic Information Emerges, Shifts and Interacts during Language Model Training Max Müller-Eberstein Rob van der Goot Barbara Plank Ivan Titov 129 10 0 25 Oct 2023
The Distributional Hypothesis Does Not Fully Explain the Benefits of Masked Language Model Pretraining Ting-Rui Chiang Dani Yogatama 59 1 0 25 Oct 2023
Mixture-of-Linguistic-Experts Adapters for Improving and Interpreting Pre-trained Language Models Raymond Li Gabriel Murray Giuseppe Carenini MoE 81 2 0 24 Oct 2023
Unveiling Multilinguality in Transformer Models: Exploring Language Specificity in Feed-Forward Networks Sunit Bhattacharya Ondrej Bojar 49 12 0 24 Oct 2023
A Joint Matrix Factorization Analysis of Multilingual Representations Zheng Zhao Yftah Ziser Bonnie Webber Shay B. Cohen 87 4 0 24 Oct 2023
Probing Representations for Document-level Event Extraction Barry Wang Xinya Du Claire Cardie 34 1 0 23 Oct 2023
GradSim: Gradient-Based Language Grouping for Effective Multilingual Training Mingyang Wang Heike Adel Lukas Lange Jannik Strötgen Hinrich Schütze 86 4 0 23 Oct 2023
Modality Dropout for Multimodal Device Directed Speech Detection using Verbal and Non-Verbal Features G. Krishna Sameer Dharur Oggi Rudovic Pranay Dighe Saurabh N. Adya Ahmed Hussen Abdelaziz Ahmed H. Tewfik 67 3 0 23 Oct 2023
Transparency at the Source: Evaluating and Interpreting Language Models With Access to the True Distribution Jaap Jumelet Willem H. Zuidema 84 6 0 23 Oct 2023
Implications of Annotation Artifacts in Edge Probing Test Datasets Sagnik Ray Choudhury Jushaan Kalra 48 0 0 20 Oct 2023
Make Your Decision Convincing! A Unified Two-Stage Framework: Self-Attribution and Decision-Making Yanrui Du Sendong Zhao Hao Wang Yuhan Chen Rui Bai Zewen Qiang Muzhen Cai Bing Qin 61 0 0 20 Oct 2023