2209.03463
Why So Toxic? Measuring and Triggering Toxic Behavior in Open-Domain Chatbots
7 September 2022
Wai Man Si
Michael Backes
Jeremy Blackburn
Emiliano De Cristofaro
Gianluca Stringhini
Savvas Zannettou
Yang Zhang
Papers citing "Why So Toxic? Measuring and Triggering Toxic Behavior in Open-Domain Chatbots"
SaLoRA: Safety-Alignment Preserved Low-Rank Adaptation
Mingjie Li
Wai Man Si
Michael Backes
Yang Zhang
Yisen Wang
118
19
0
03 Jan 2025
GenderCARE: A Comprehensive Framework for Assessing and Reducing Gender Bias in Large Language Models
Kunsheng Tang
Wenbo Zhou
Jie Zhang
Aishan Liu
Gelei Deng
Shuai Li
Peigui Qi
Weiming Zhang
Tianwei Zhang
Nenghai Yu
135
4
0
22 Aug 2024
A Map of Exploring Human Interaction patterns with LLM: Insights into Collaboration and Creativity
Jiayang Li
Jiale Li
109
8
0
06 Apr 2024
SA-Attack: Improving Adversarial Transferability of Vision-Language Pre-training Models via Self-Augmentation
Bangyan He
Xiaojun Jia
Siyuan Liang
Tianrui Lou
Yang Liu
Xiaochun Cao
AAML
VLM
109
29
0
08 Dec 2023
GRASP: A Disagreement Analysis Framework to Assess Group Associations in Perspectives
Vinodkumar Prabhakaran
Christopher Homan
Lora Aroyo
Aida Mostafazadeh Davani
Alicia Parrish
Alex S. Taylor
Mark Díaz
Ding Wang
Greg Serapio-García
99
9
0
09 Nov 2023
MasterKey: Automated Jailbreak Across Multiple Large Language Model Chatbots
Gelei Deng
Yi Liu
Yuekang Li
Kailong Wang
Ying Zhang
Zefeng Li
Haoyu Wang
Tianwei Zhang
Yang Liu
SILM
99
136
0
16 Jul 2023
Intersectionality in Conversational AI Safety: How Bayesian Multilevel Models Help Understand Diverse Perceptions of Safety
Christopher Homan
Greg Serapio-García
Lora Aroyo
Mark Díaz
Alicia Parrish
Vinodkumar Prabhakaran
Alex S. Taylor
Ding Wang
86
9
0
20 Jun 2023
Safer Conversational AI as a Source of User Delight
Xiaoding Lu
Aleksey Korshuk
Z. Liu
W. Beauchamp
Chai Research
70
3
0
18 Apr 2023
Talking Abortion (Mis)information with ChatGPT on TikTok
Filipo Sharevski
J. Loop
Peter Jachim
Amy Devine
Emma Pieroni
84
6
0
23 Feb 2023
Beam Search Strategies for Neural Machine Translation
Markus Freitag
Yaser Al-Onaizan
129
396
0
06 Feb 2017