Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2005.00719
Cited By
Probing the Probing Paradigm: Does Probing Accuracy Entail Task Relevance?
2 May 2020
Abhilasha Ravichander
Yonatan Belinkov
Eduard H. Hovy
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Probing the Probing Paradigm: Does Probing Accuracy Entail Task Relevance?"
42 / 42 papers shown
Title
Probing Network Decisions: Capturing Uncertainties and Unveiling Vulnerabilities Without Label Information
Youngju Joung
Sehyun Lee
Jaesik Choi
AAML
48
1
0
12 Mar 2025
The Semantic Hub Hypothesis: Language Models Share Semantic Representations Across Languages and Modalities
Zhaofeng Wu
Xinyan Velocity Yu
Dani Yogatama
Jiasen Lu
Yoon Kim
AIFin
54
10
0
07 Nov 2024
What Features in Prompts Jailbreak LLMs? Investigating the Mechanisms Behind Attacks
Nathalie Maria Kirch
Constantin Weisser
Severin Field
Helen Yannakoudakis
Stephen Casper
39
2
0
02 Nov 2024
Pixology: Probing the Linguistic and Visual Capabilities of Pixel-based Language Models
Kushal Tatariya
Vladimir Araujo
Thomas Bauwens
Miryam de Lhoneux
VLM
35
0
0
15 Oct 2024
On Behalf of the Stakeholders: Trends in NLP Model Interpretability in the Era of LLMs
Nitay Calderon
Roi Reichart
40
10
0
27 Jul 2024
On Linearizing Structured Data in Encoder-Decoder Language Models: Insights from Text-to-SQL
Yutong Shao
N. Nakashole
32
1
0
03 Apr 2024
Black-Box Access is Insufficient for Rigorous AI Audits
Stephen Casper
Carson Ezell
Charlotte Siegmann
Noam Kolt
Taylor Lynn Curtis
...
Michael Gerovitch
David Bau
Max Tegmark
David M. Krueger
Dylan Hadfield-Menell
AAML
34
78
0
25 Jan 2024
A Glitch in the Matrix? Locating and Detecting Language Model Grounding with Fakepedia
Giovanni Monea
Maxime Peyrard
Martin Josifoski
Vishrav Chaudhary
Jason Eisner
Emre Kiciman
Hamid Palangi
Barun Patra
Robert West
KELM
51
12
0
04 Dec 2023
Towards a Mechanistic Interpretation of Multi-Step Reasoning Capabilities of Language Models
Yifan Hou
Jiaoda Li
Yu Fei
Alessandro Stolfo
Wangchunshu Zhou
Guangtao Zeng
Antoine Bosselut
Mrinmaya Sachan
LRM
30
40
0
23 Oct 2023
Implications of Annotation Artifacts in Edge Probing Test Datasets
Sagnik Ray Choudhury
Jushaan Kalra
16
0
0
20 Oct 2023
Identifying and Adapting Transformer-Components Responsible for Gender Bias in an English Language Model
Abhijith Chintam
Rahel Beloch
Willem H. Zuidema
Michael Hanna
Oskar van der Wal
28
16
0
19 Oct 2023
Disentangling the Linguistic Competence of Privacy-Preserving BERT
Stefan Arnold
Nils Kemmerzell
Annika Schreiner
30
0
0
17 Oct 2023
Language Models Represent Space and Time
Wes Gurnee
Max Tegmark
47
142
0
03 Oct 2023
Morphosyntactic probing of multilingual BERT models
Judit Ács
Endre Hamerlik
Roy Schwartz
Noah A. Smith
András Kornai
32
9
0
09 Jun 2023
Trustworthy Social Bias Measurement
Rishi Bommasani
Percy Liang
27
10
0
20 Dec 2022
SocioProbe: What, When, and Where Language Models Learn about Sociodemographics
Anne Lauscher
Federico Bianchi
Samuel R. Bowman
Dirk Hovy
29
7
0
08 Nov 2022
Emergent Linguistic Structures in Neural Networks are Fragile
Emanuele La Malfa
Matthew Wicker
Marta Kiatkowska
22
1
0
31 Oct 2022
Do Charge Prediction Models Learn Legal Theory?
Zhenwei An
Quzhe Huang
Cong Jiang
Yansong Feng
Dongyan Zhao
ELM
AILaw
27
6
0
31 Oct 2022
Hidden State Variability of Pretrained Language Models Can Guide Computation Reduction for Transfer Learning
Shuo Xie
Jiahao Qiu
Ankita Pasad
Li Du
Qing Qu
Hongyuan Mei
35
16
0
18 Oct 2022
GULP: a prediction-based metric between representations
Enric Boix Adserà
Hannah Lawrence
George Stepaniants
Philippe Rigollet
46
11
0
12 Oct 2022
Probing via Prompting
Jiaoda Li
Ryan Cotterell
Mrinmaya Sachan
34
13
0
04 Jul 2022
On the Usefulness of Embeddings, Clusters and Strings for Text Generator Evaluation
Tiago Pimentel
Clara Meister
Ryan Cotterell
48
7
0
31 May 2022
Interpretation of Black Box NLP Models: A Survey
Shivani Choudhary
N. Chatterjee
S. K. Saha
FAtt
34
10
0
31 Mar 2022
Visualizing the Relationship Between Encoded Linguistic Information and Task Performance
Jiannan Xiang
Huayang Li
Defu Lian
Guoping Huang
Taro Watanabe
Lemao Liu
42
0
0
29 Mar 2022
Diagnosing AI Explanation Methods with Folk Concepts of Behavior
Alon Jacovi
Jasmijn Bastings
Sebastian Gehrmann
Yoav Goldberg
Katja Filippova
36
15
0
27 Jan 2022
Inducing Causal Structure for Interpretable Neural Networks
Atticus Geiger
Zhengxuan Wu
Hanson Lu
J. Rozner
Elisa Kreiss
Thomas F. Icard
Noah D. Goodman
Christopher Potts
CML
OOD
35
70
0
01 Dec 2021
Global Explainability of BERT-Based Evaluation Metrics by Disentangling along Linguistic Factors
Marvin Kaster
Wei-Ye Zhao
Steffen Eger
27
24
0
08 Oct 2021
A Closer Look at How Fine-tuning Changes BERT
Yichu Zhou
Vivek Srikumar
26
63
0
27 Jun 2021
Causal Abstractions of Neural Networks
Atticus Geiger
Hanson Lu
Thomas F. Icard
Christopher Potts
NAI
CML
17
218
0
06 Jun 2021
Probing artificial neural networks: insights from neuroscience
Anna A. Ivanova
John Hewitt
Noga Zaslavsky
6
16
0
16 Apr 2021
DirectProbe: Studying Representations without Classifiers
Yichu Zhou
Vivek Srikumar
32
27
0
13 Apr 2021
Local Interpretations for Explainable Natural Language Processing: A Survey
Siwen Luo
Hamish Ivison
S. Han
Josiah Poon
MILM
33
48
0
20 Mar 2021
The Rediscovery Hypothesis: Language Models Need to Meet Linguistics
Vassilina Nikoulina
Maxat Tezekbayev
Nuradil Kozhakhmet
Madina Babazhanova
Matthias Gallé
Z. Assylbekov
34
8
0
02 Mar 2021
Contrastive Explanations for Model Interpretability
Alon Jacovi
Swabha Swayamdipta
Shauli Ravfogel
Yanai Elazar
Yejin Choi
Yoav Goldberg
44
95
0
02 Mar 2021
Probing Classifiers: Promises, Shortcomings, and Advances
Yonatan Belinkov
226
405
0
24 Feb 2021
First Align, then Predict: Understanding the Cross-Lingual Ability of Multilingual BERT
Benjamin Muller
Yanai Elazar
Benoît Sagot
Djamé Seddah
LRM
29
71
0
26 Jan 2021
Discovering the Compositional Structure of Vector Representations with Role Learning Networks
Paul Soulos
R. Thomas McCoy
Tal Linzen
P. Smolensky
CoGe
29
43
0
21 Oct 2019
The Bottom-up Evolution of Representations in the Transformer: A Study with Machine Translation and Language Modeling Objectives
Elena Voita
Rico Sennrich
Ivan Titov
198
181
0
03 Sep 2019
What you can cram into a single vector: Probing sentence embeddings for linguistic properties
Alexis Conneau
Germán Kruszewski
Guillaume Lample
Loïc Barrault
Marco Baroni
201
882
0
03 May 2018
Hypothesis Only Baselines in Natural Language Inference
Adam Poliak
Jason Naradowsky
Aparajita Haldar
Rachel Rudinger
Benjamin Van Durme
190
576
0
02 May 2018
A Decomposable Attention Model for Natural Language Inference
Ankur P. Parikh
Oscar Täckström
Dipanjan Das
Jakob Uszkoreit
210
1,367
0
06 Jun 2016
Convolutional Neural Networks for Sentence Classification
Yoon Kim
AILaw
VLM
255
13,364
0
25 Aug 2014
1