2310.18360
Guiding LLM to Fool Itself: Automatically Manipulating Machine Reading Comprehension Shortcut Triggers
24 October 2023
Mosh Levy
Shauli Ravfogel
Yoav Goldberg
Papers citing
"Guiding LLM to Fool Itself: Automatically Manipulating Machine Reading Comprehension Shortcut Triggers"
4 / 4 papers shown
ChatGPT as an Attack Tool: Stealthy Textual Backdoor Attack via Blackbox Generative Model Trigger
Jiazhao Li
Yijin Yang
Zhuofeng Wu
V. Vydiswaran
Chaowei Xiao
SILM
55
42
0
27 Apr 2023
Lexical Generalization Improves with Larger Models and Longer Training
Elron Bandel
Yoav Goldberg
Yanai Elazar
55
6
0
23 Oct 2022
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
339
12,003
0
04 Mar 2022
Identifying and Mitigating Spurious Correlations for Improving Robustness in NLP Models
Tianlu Wang
Rohit Sridhar
Diyi Yang
Xuezhi Wang
AAML
120
72
0
14 Oct 2021