arXiv: 2407.12043
The Art of Saying No: Contextual Noncompliance in Language Models
2 July 2024
Faeze Brahman
Sachin Kumar
Vidhisha Balachandran
Pradeep Dasigi
Valentina Pyatkin
Abhilasha Ravichander
Sarah Wiegreffe
Nouha Dziri
Khyathi Raghavi Chandu
Jack Hessel
Yulia Tsvetkov
Noah A. Smith
Yejin Choi
Hannaneh Hajishirzi
Papers citing "The Art of Saying No: Contextual Noncompliance in Language Models"
11 / 11 papers shown
FalseReject: A Resource for Improving Contextual Safety and Mitigating Over-Refusals in LLMs via Structured Reasoning
Zhehao Zhang
Weijie Xu
Fanyou Wu
Chandan K. Reddy
29
0
0
12 May 2025
Programming Refusal with Conditional Activation Steering
Bruce W. Lee
Inkit Padhi
K. Ramamurthy
Erik Miehling
Pierre L. Dognin
Manish Nagireddy
Amit Dhurandhar
LLMSV
93
13
0
06 Sep 2024
Acceptable Use Policies for Foundation Models
Kevin Klyman
31
14
0
29 Aug 2024
WildChat: 1M ChatGPT Interaction Logs in the Wild
Wenting Zhao
Xiang Ren
Jack Hessel
Claire Cardie
Yejin Choi
Yuntian Deng
42
174
0
02 May 2024
Clarify When Necessary: Resolving Ambiguity Through Interaction with LMs
Michael J.Q. Zhang
Eunsol Choi
32
26
0
16 Nov 2023
Language Generation Models Can Cause Harm: So What Can We Do About It? An Actionable Survey
Sachin Kumar
Vidhisha Balachandran
Lucille Njoo
Antonios Anastasopoulos
Yulia Tsvetkov
ELM
68
85
0
14 Oct 2022
Reducing conversational agents' overconfidence through linguistic calibration
Sabrina J. Mielke
Arthur Szlam
Emily Dinan
Y-Lan Boureau
209
153
0
30 Dec 2020
Extracting Training Data from Large Language Models
Nicholas Carlini
Florian Tramèr
Eric Wallace
Matthew Jagielski
Ariel Herbert-Voss
...
Tom B. Brown
D. Song
Ulfar Erlingsson
Alina Oprea
Colin Raffel
MLAU
SILM
290
1,814
0
14 Dec 2020
Formalizing Trust in Artificial Intelligence: Prerequisites, Causes and Goals of Human Trust in AI
Alon Jacovi
Ana Marasović
Tim Miller
Yoav Goldberg
249
425
0
15 Oct 2020
Calibration of Pre-trained Transformers
Shrey Desai
Greg Durrett
UQLM
243
289
0
17 Mar 2020
Scaling Laws for Neural Language Models
Jared Kaplan
Sam McCandlish
T. Henighan
Tom B. Brown
B. Chess
R. Child
Scott Gray
Alec Radford
Jeff Wu
Dario Amodei
231
4,460
0
23 Jan 2020