Probing the Probing Paradigm: Does Probing Accuracy Entail Task Relevance?

2 May 2020

Papers citing "Probing the Probing Paradigm: Does Probing Accuracy Entail Task Relevance?"

42 / 42 papers shown

Title
Probing Network Decisions: Capturing Uncertainties and Unveiling Vulnerabilities Without Label Information Youngju Joung Sehyun Lee Jaesik Choi AAML 48 1 0 12 Mar 2025
The Semantic Hub Hypothesis: Language Models Share Semantic Representations Across Languages and Modalities Zhaofeng Wu Xinyan Velocity Yu Dani Yogatama Jiasen Lu Yoon Kim AIFin 54 10 0 07 Nov 2024
What Features in Prompts Jailbreak LLMs? Investigating the Mechanisms Behind Attacks Nathalie Maria Kirch Constantin Weisser Severin Field Helen Yannakoudakis Stephen Casper 39 2 0 02 Nov 2024
Pixology: Probing the Linguistic and Visual Capabilities of Pixel-based Language Models Kushal Tatariya Vladimir Araujo Thomas Bauwens Miryam de Lhoneux VLM 35 0 0 15 Oct 2024
On Behalf of the Stakeholders: Trends in NLP Model Interpretability in the Era of LLMs Nitay Calderon Roi Reichart 40 10 0 27 Jul 2024
On Linearizing Structured Data in Encoder-Decoder Language Models: Insights from Text-to-SQL Yutong Shao N. Nakashole 32 1 0 03 Apr 2024
Black-Box Access is Insufficient for Rigorous AI Audits Stephen Casper Carson Ezell Charlotte Siegmann Noam Kolt Taylor Lynn Curtis ... Michael Gerovitch David Bau Max Tegmark David M. Krueger Dylan Hadfield-Menell AAML 34 78 0 25 Jan 2024
A Glitch in the Matrix? Locating and Detecting Language Model Grounding with Fakepedia Giovanni Monea Maxime Peyrard Martin Josifoski Vishrav Chaudhary Jason Eisner Emre Kiciman Hamid Palangi Barun Patra Robert West KELM 51 12 0 04 Dec 2023
Towards a Mechanistic Interpretation of Multi-Step Reasoning Capabilities of Language Models Yifan Hou Jiaoda Li Yu Fei Alessandro Stolfo Wangchunshu Zhou Guangtao Zeng Antoine Bosselut Mrinmaya Sachan LRM 30 40 0 23 Oct 2023
Implications of Annotation Artifacts in Edge Probing Test Datasets Sagnik Ray Choudhury Jushaan Kalra 16 0 0 20 Oct 2023
Identifying and Adapting Transformer-Components Responsible for Gender Bias in an English Language Model Abhijith Chintam Rahel Beloch Willem H. Zuidema Michael Hanna Oskar van der Wal 28 16 0 19 Oct 2023
Disentangling the Linguistic Competence of Privacy-Preserving BERT Stefan Arnold Nils Kemmerzell Annika Schreiner 30 0 0 17 Oct 2023
Language Models Represent Space and Time Wes Gurnee Max Tegmark 47 142 0 03 Oct 2023
Morphosyntactic probing of multilingual BERT models Judit Ács Endre Hamerlik Roy Schwartz Noah A. Smith András Kornai 32 9 0 09 Jun 2023
Trustworthy Social Bias Measurement Rishi Bommasani Percy Liang 27 10 0 20 Dec 2022
SocioProbe: What, When, and Where Language Models Learn about Sociodemographics Anne Lauscher Federico Bianchi Samuel R. Bowman Dirk Hovy 29 7 0 08 Nov 2022
Emergent Linguistic Structures in Neural Networks are Fragile Emanuele La Malfa Matthew Wicker Marta Kiatkowska 22 1 0 31 Oct 2022
Do Charge Prediction Models Learn Legal Theory? Zhenwei An Quzhe Huang Cong Jiang Yansong Feng Dongyan Zhao ELM AILaw 27 6 0 31 Oct 2022
Hidden State Variability of Pretrained Language Models Can Guide Computation Reduction for Transfer Learning Shuo Xie Jiahao Qiu Ankita Pasad Li Du Qing Qu Hongyuan Mei 35 16 0 18 Oct 2022
GULP: a prediction-based metric between representations Enric Boix Adserà Hannah Lawrence George Stepaniants Philippe Rigollet 46 11 0 12 Oct 2022
Probing via Prompting Jiaoda Li Ryan Cotterell Mrinmaya Sachan 34 13 0 04 Jul 2022
On the Usefulness of Embeddings, Clusters and Strings for Text Generator Evaluation Tiago Pimentel Clara Meister Ryan Cotterell 48 7 0 31 May 2022
Interpretation of Black Box NLP Models: A Survey Shivani Choudhary N. Chatterjee S. K. Saha FAtt 34 10 0 31 Mar 2022
Visualizing the Relationship Between Encoded Linguistic Information and Task Performance Jiannan Xiang Huayang Li Defu Lian Guoping Huang Taro Watanabe Lemao Liu 42 0 0 29 Mar 2022
Diagnosing AI Explanation Methods with Folk Concepts of Behavior Alon Jacovi Jasmijn Bastings Sebastian Gehrmann Yoav Goldberg Katja Filippova 36 15 0 27 Jan 2022
Inducing Causal Structure for Interpretable Neural Networks Atticus Geiger Zhengxuan Wu Hanson Lu J. Rozner Elisa Kreiss Thomas F. Icard Noah D. Goodman Christopher Potts CML OOD 35 70 0 01 Dec 2021
Global Explainability of BERT-Based Evaluation Metrics by Disentangling along Linguistic Factors Marvin Kaster Wei-Ye Zhao Steffen Eger 27 24 0 08 Oct 2021
A Closer Look at How Fine-tuning Changes BERT Yichu Zhou Vivek Srikumar 26 63 0 27 Jun 2021
Causal Abstractions of Neural Networks Atticus Geiger Hanson Lu Thomas F. Icard Christopher Potts NAI CML 17 218 0 06 Jun 2021
Probing artificial neural networks: insights from neuroscience Anna A. Ivanova John Hewitt Noga Zaslavsky 6 16 0 16 Apr 2021
DirectProbe: Studying Representations without Classifiers Yichu Zhou Vivek Srikumar 32 27 0 13 Apr 2021
Local Interpretations for Explainable Natural Language Processing: A Survey Siwen Luo Hamish Ivison S. Han Josiah Poon MILM 33 48 0 20 Mar 2021
The Rediscovery Hypothesis: Language Models Need to Meet Linguistics Vassilina Nikoulina Maxat Tezekbayev Nuradil Kozhakhmet Madina Babazhanova Matthias Gallé Z. Assylbekov 34 8 0 02 Mar 2021
Contrastive Explanations for Model Interpretability Alon Jacovi Swabha Swayamdipta Shauli Ravfogel Yanai Elazar Yejin Choi Yoav Goldberg 44 95 0 02 Mar 2021
Probing Classifiers: Promises, Shortcomings, and Advances Yonatan Belinkov 226 405 0 24 Feb 2021
First Align, then Predict: Understanding the Cross-Lingual Ability of Multilingual BERT Benjamin Muller Yanai Elazar Benoît Sagot Djamé Seddah LRM 29 71 0 26 Jan 2021
Discovering the Compositional Structure of Vector Representations with Role Learning Networks Paul Soulos R. Thomas McCoy Tal Linzen P. Smolensky CoGe 29 43 0 21 Oct 2019
The Bottom-up Evolution of Representations in the Transformer: A Study with Machine Translation and Language Modeling Objectives Elena Voita Rico Sennrich Ivan Titov 198 181 0 03 Sep 2019
What you can cram into a single vector: Probing sentence embeddings for linguistic properties Alexis Conneau Germán Kruszewski Guillaume Lample Loïc Barrault Marco Baroni 201 882 0 03 May 2018
Hypothesis Only Baselines in Natural Language Inference Adam Poliak Jason Naradowsky Aparajita Haldar Rachel Rudinger Benjamin Van Durme 190 576 0 02 May 2018
A Decomposable Attention Model for Natural Language Inference Ankur P. Parikh Oscar Täckström Dipanjan Das Jakob Uszkoreit 210 1,367 0 06 Jun 2016
Convolutional Neural Networks for Sentence Classification Yoon Kim AILaw VLM 255 13,364 0 25 Aug 2014