2108.11830
Just Say No: Analyzing the Stance of Neural Dialogue Generation in Offensive Contexts
26 August 2021
Ashutosh Baheti
Maarten Sap
Alan Ritter
Mark O. Riedl
Papers citing
"Just Say No: Analyzing the Stance of Neural Dialogue Generation in Offensive Contexts"
45 / 45 papers shown
Title
Evaluating Implicit Bias in Large Language Models by Attacking From a Psychometric Perspective
Yuchen Wen
Keping Bi
Wei Chen
J. Guo
Xueqi Cheng
89
1
0
20 Feb 2025
FarExStance: Explainable Stance Detection for Farsi
Majid Zarharan
Maryam Hashemi
Malika Behroozrazegh
Sauleh Eetemadi
Mohammad Taher Pilehvar
Jennifer Foster
85
0
0
18 Dec 2024
Knowledge-Enhanced Conversational Recommendation via Transformer-based Sequential Modelling
Jie Zou
Aixin Sun
Cheng Long
Evangelos Kanoulas
LMTD
99
4
0
03 Dec 2024
Mitigating Biases to Embrace Diversity: A Comprehensive Annotation Benchmark for Toxic Language
Xinmeng Hou
24
1
0
17 Oct 2024
Granular Privacy Control for Geolocation with Vision Language Models
Ethan Mendes
Yang Chen
James Hays
Sauvik Das
Wei-ping Xu
Alan Ritter
45
3
0
06 Jul 2024
Stark: Social Long-Term Multi-Modal Conversation with Persona Commonsense Knowledge
Young-Jun Lee
Dokyong Lee
Junyoung Youn
Kyeongjin Oh
ByungSoo Ko
Jonghwan Hyeon
Ho-Jin Choi
33
2
0
04 Jul 2024
Purple-teaming LLMs with Adversarial Defender Training
Jingyan Zhou
Kun Li
Junan Li
Jiawen Kang
Minda Hu
Xixin Wu
Helen Meng
AAML
36
1
0
01 Jul 2024
MentalManip: A Dataset For Fine-grained Analysis of Mental Manipulation in Conversations
Yuxin Wang
Ivory Yang
Saeed Hassanpour
Soroush Vosoughi
AAML
41
7
0
26 May 2024
The Unseen Targets of Hate -- A Systematic Review of Hateful Communication Datasets
Zehui Yu
Indira Sen
Dennis Assenmacher
Mattia Samory
Leon Fröhling
Christina Dahn
Debora Nozza
Claudia Wagner
35
5
0
14 May 2024
"They are uncultured": Unveiling Covert Harms and Social Threats in LLM Generated Conversations
Preetam Prabhu Srikar Dammu
Hayoung Jung
Anjali Singh
Monojit Choudhury
Tanushree Mitra
32
8
0
08 May 2024
Stanceosaurus 2.0: Classifying Stance Towards Russian and Spanish Misinformation
Anton Lavrouk
Ian Ligon
Tarek Naous
Jonathan Zheng
Alan Ritter
Wei-ping Xu
24
0
0
06 Feb 2024
Improving Dialog Safety using Socially Aware Contrastive Learning
Souvik Das
R. Srihari
11
1
0
01 Feb 2024
A Block Metropolis-Hastings Sampler for Controllable Energy-based Text Generation
Jarad Forristal
Niloofar Mireshghallah
Greg Durrett
Taylor Berg-Kirkpatrick
110
4
0
07 Dec 2023
CESAR: Automatic Induction of Compositional Instructions for Multi-turn Dialogs
Taha İbrahim Aksu
Devamanyu Hazarika
Shikib Mehri
Seokhwan Kim
Dilek Z. Hakkani-Tür
Yang Liu
Mahdi Namazifar
28
2
0
29 Nov 2023
Large Language Models can Share Images, Too!
Young-Jun Lee
Dokyong Lee
Joo Won Sung
Jonghwan Hyeon
Ho-Jin Choi
MLLM
24
2
0
23 Oct 2023
EXMODD: An EXplanatory Multimodal Open-Domain Dialogue dataset
Hang Yin
Pinren Lu
Ziang Li
Bin Sun
Kan Li
34
0
0
17 Oct 2023
Through the Lens of Core Competency: Survey on Evaluation of Large Language Models
Ziyu Zhuang
Qiguang Chen
Longxuan Ma
Mingda Li
Yi Han
Yushan Qian
Haopeng Bai
Zixian Feng
Weinan Zhang
Ting Liu
ELM
26
9
0
15 Aug 2023
A Benchmark for Understanding Dialogue Safety in Mental Health Support
Huachuan Qiu
Tong Zhao
Anqi Li
Shuai Zhang
Hongliang He
Zhenzhong Lan
27
9
0
31 Jul 2023
Improved Instruction Ordering in Recipe-Grounded Conversation
Duong Minh Le
Ruohao Guo
Wei-ping Xu
Alan Ritter
28
8
0
26 May 2023
Healing Unsafe Dialogue Responses with Weak Supervision Signals
Zi Liang
Pinghui Wang
Ruofei Zhang
Shuo Zhang
Xiaofan Ye
Yi Huang
Junlan Feng
29
1
0
25 May 2023
Leftover Lunch: Advantage-based Offline Reinforcement Learning for Language Models
Ashutosh Baheti
Ximing Lu
Faeze Brahman
Ronan Le Bras
Maarten Sap
Mark O. Riedl
30
9
0
24 May 2023
Reducing Sensitivity on Speaker Names for Text Generation from Dialogues
Qi Jia
Haifeng Tang
Kenny Q. Zhu
16
2
0
23 May 2023
BiasAsker: Measuring the Bias in Conversational AI System
Yuxuan Wan
Wenxuan Wang
Pinjia He
Jiazhen Gu
Haonan Bai
Michael Lyu
27
67
0
21 May 2023
A Survey on Proactive Dialogue Systems: Problems, Methods, and Prospects
Yang Deng
Wenqiang Lei
W. Lam
Tat-Seng Chua
73
44
0
04 May 2023
Transcending the "Male Code": Implicit Masculine Biases in NLP Contexts
Katie Seaborn
Shruti Chandra
Thibault Fabre
23
11
0
22 Apr 2023
Unsupervised Layer-wise Score Aggregation for Textual OOD Detection
Maxime Darrin
Guillaume Staerman
Eduardo Dadalto Camara Gomes
Jackie CK Cheung
Pablo Piantanida
Pierre Colombo
OODD
52
11
0
20 Feb 2023
Using In-Context Learning to Improve Dialogue Safety
Nicholas Meade
Spandana Gella
Devamanyu Hazarika
Prakhar Gupta
Di Jin
Siva Reddy
Yang Liu
Dilek Z. Hakkani-Tür
25
38
0
02 Feb 2023
Language Model Detoxification in Dialogue with Contextualized Stance Control
Jingu Qian
Xifeng Yan
16
1
0
25 Jan 2023
DialGuide: Aligning Dialogue Model Behavior with Developer Guidelines
Prakhar Gupta
Yang Liu
Di Jin
Behnam Hedayatnia
Spandana Gella
Sijia Liu
P. Lange
Julia Hirschberg
Dilek Z. Hakkani-Tür
30
5
0
20 Dec 2022
SODA: Million-scale Dialogue Distillation with Social Commonsense Contextualization
Hyunwoo J. Kim
Jack Hessel
Liwei Jiang
Peter West
Ximing Lu
...
Ronan Le Bras
Malihe Alikhani
Gunhee Kim
Maarten Sap
Yejin Choi
HILM
32
154
0
20 Dec 2022
Stanceosaurus: Classifying Stance Towards Multilingual Misinformation
Jonathan Zheng
Ashutosh Baheti
Tarek Naous
Wei-ping Xu
Alan Ritter
25
12
0
28 Oct 2022
Language Detoxification with Attribute-Discriminative Latent Space
Jin Myung Kwak
Minseon Kim
Sung Ju Hwang
25
12
0
19 Oct 2022
The State of Profanity Obfuscation in Natural Language Processing
Debora Nozza
Dirk Hovy
36
7
0
14 Oct 2022
ProsocialDialog: A Prosocial Backbone for Conversational Agents
Hyunwoo J. Kim
Youngjae Yu
Liwei Jiang
Ximing Lu
Daniel Khashabi
Gunhee Kim
Yejin Choi
Maarten Sap
20
117
0
25 May 2022
InstructDial: Improving Zero and Few-shot Generalization in Dialogue through Instruction Tuning
Prakhar Gupta
Cathy Jiao
Yi-Ting Yeh
Shikib Mehri
M. Eskénazi
Jeffrey P. Bigham
ALM
36
47
0
25 May 2022
Target-Guided Dialogue Response Generation Using Commonsense and Data Augmentation
Prakhar Gupta
Harsh Jhamtani
Jeffrey P. Bigham
46
12
0
19 May 2022
"I'm sorry to hear that": Finding New Biases in Language Models with a Holistic Descriptor Dataset
Eric Michael Smith
Melissa Hall
Melanie Kambadur
Eleonora Presani
Adina Williams
73
129
0
18 May 2022
Robust Conversational Agents against Imperceptible Toxicity Triggers
Ninareh Mehrabi
Ahmad Beirami
Fred Morstatter
Aram Galstyan
AAML
18
32
0
05 May 2022
PanGu-Bot: Efficient Generative Dialogue Pre-training from Pre-trained Language Model
Fei Mi
Yitong Li
Yulong Zeng
Jingyan Zhou
Yasheng Wang
Chuanfei Xu
Lifeng Shang
Xin Jiang
Shiqi Zhao
Qun Liu
ALM
37
18
0
31 Mar 2022
Mix and Match: Learning-free Controllable Text Generation using Energy Language Models
Fatemehsadat Mireshghallah
Kartik Goyal
Taylor Berg-Kirkpatrick
36
78
0
24 Mar 2022
Towards Identifying Social Bias in Dialog Systems: Frame, Datasets, and Benchmarks
Jingyan Zhou
Deng Jiawen
Fei Mi
Yitong Li
Yasheng Wang
Minlie Huang
Xin Jiang
Qun Liu
H. Meng
25
31
0
16 Feb 2022
Exploring the Limits of Domain-Adaptive Training for Detoxifying Large-Scale Language Models
Boxin Wang
Wei Ping
Chaowei Xiao
P. Xu
M. Patwary
M. Shoeybi
Bo-wen Li
Anima Anandkumar
Bryan Catanzaro
20
64
0
08 Feb 2022
COLD: A Benchmark for Chinese Offensive Language Detection
Deng Jiawen
Jingyan Zhou
Hao-Lun Sun
Chujie Zheng
Fei Mi
Helen M. Meng
Minlie Huang
19
98
0
16 Jan 2022
Partner Personas Generation for Diverse Dialogue Generation
Hongyuan Lu
W. Lam
Hong Cheng
H. Meng
20
1
0
27 Nov 2021
Revealing Persona Biases in Dialogue Systems
Emily Sheng
Josh Arnold
Zhou Yu
Kai-Wei Chang
Nanyun Peng
17
37
0
18 Apr 2021