Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2305.08005
Cited By
Beyond the Safeguards: Exploring the Security Risks of ChatGPT
13 May 2023
Erik Derner
Kristina Batistic
SILM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Beyond the Safeguards: Exploring the Security Risks of ChatGPT"
39 / 39 papers shown
Title
AI Ethics and Social Norms: Exploring ChatGPT's Capabilities From What to How
Omid Veisi
Sasan Bahrami
Roman Englert
Claudia Müller
132
0
0
25 Apr 2025
SOK: Exploring Hallucinations and Security Risks in AI-Assisted Software Development with Insights for LLM Deployment
Ariful Haque
Sunzida Siddique
M. Rahman
Ahmed Rafi Hasan
Laxmi Rani Das
Marufa Kamal
Tasnim Masura
Kishor Datta Gupta
53
1
0
31 Jan 2025
AI Safety in Generative AI Large Language Models: A Survey
Jaymari Chua
Yun Yvonna Li
Shiyi Yang
Chen Wang
Lina Yao
LM&MA
42
12
0
06 Jul 2024
The Art of Saying No: Contextual Noncompliance in Language Models
Faeze Brahman
Sachin Kumar
Vidhisha Balachandran
Pradeep Dasigi
Valentina Pyatkin
...
Jack Hessel
Yulia Tsvetkov
Noah A. Smith
Yejin Choi
Hannaneh Hajishirzi
75
21
0
02 Jul 2024
A Complete Survey on LLM-based AI Chatbots
Sumit Kumar Dam
Choong Seon Hong
Yu Qiao
Chaoning Zhang
59
52
0
17 Jun 2024
Is On-Device AI Broken and Exploitable? Assessing the Trust and Ethics in Small Language Models
Kalyan Nakka
Jimmy Dani
Nitesh Saxena
48
1
0
08 Jun 2024
Measure-Observe-Remeasure: An Interactive Paradigm for Differentially-Private Exploratory Analysis
Priyanka Nanayakkara
Hyeok Kim
Yifan Wu
Ali Sarvghad
Narges Mahyar
G. Miklau
Jessica Hullman
31
17
0
04 Jun 2024
Towards Trustworthy AI: A Review of Ethical and Robust Large Language Models
Meftahul Ferdaus
Mahdi Abdelguerfi
Elias Ioup
Kendall N. Niles
Ken Pathak
Steve Sloan
39
11
0
01 Jun 2024
FreezeAsGuard: Mitigating Illegal Adaptation of Diffusion Models via Selective Tensor Freezing
Kai Huang
Wei Gao
42
2
0
24 May 2024
Tagengo: A Multilingual Chat Dataset
P. Devine
42
3
0
21 May 2024
Risks of Practicing Large Language Models in Smart Grid: Threat Modeling and Validation
Jiangnan Li
Yingyuan Yang
Jinyuan Stella Sun
62
4
0
10 May 2024
Large Language Models for Cyber Security: A Systematic Literature Review
HanXiang Xu
Shenao Wang
Ningke Li
Kaidi Wang
Yanjie Zhao
Kai Chen
Ting Yu
Yang Liu
Haoyu Wang
37
23
0
08 May 2024
SmartMem: Layout Transformation Elimination and Adaptation for Efficient DNN Execution on Mobile
Wei Niu
Md. Musfiqur Rahman Sanim
Zhihao Shu
Jiexiong Guan
Xipeng Shen
Miao Yin
Gagan Agrawal
Bin Ren
32
6
0
21 Apr 2024
Risk and Response in Large Language Models: Evaluating Key Threat Categories
Bahareh Harandizadeh
A. Salinas
Fred Morstatter
25
3
0
22 Mar 2024
On Protecting the Data Privacy of Large Language Models (LLMs): A Survey
Biwei Yan
Kun Li
Minghui Xu
Yueyan Dong
Yue Zhang
Zhaochun Ren
Xiuzhen Cheng
AILaw
PILM
78
76
0
08 Mar 2024
Exploring the Potential of Large Language Models for Improving Digital Forensic Investigation Efficiency
Akila Wickramasekara
F. Breitinger
Mark Scanlon
52
8
0
29 Feb 2024
Making Them Ask and Answer: Jailbreaking Large Language Models in Few Queries via Disguise and Reconstruction
Tong Liu
Yingjie Zhang
Zhe Zhao
Yinpeng Dong
Guozhu Meng
Kai Chen
AAML
51
44
0
28 Feb 2024
Farsight: Fostering Responsible AI Awareness During AI Application Prototyping
Zijie J. Wang
Chinmay Kulkarni
Lauren Wilcox
Michael Terry
Michael A. Madaio
40
43
0
23 Feb 2024
Mapping the Ethics of Generative AI: A Comprehensive Scoping Review
Thilo Hagendorff
21
35
0
13 Feb 2024
Whispers in the Machine: Confidentiality in LLM-integrated Systems
Jonathan Evertz
Merlin Chlosta
Lea Schonherr
Thorsten Eisenhofer
74
17
0
10 Feb 2024
Improving Dialog Safety using Socially Aware Contrastive Learning
Souvik Das
Rohini Srihari
16
1
0
01 Feb 2024
The Ethics of Interaction: Mitigating Security Threats in LLMs
Ashutosh Kumar
Shiv Vignesh Murty
Sagarika Singh
Swathy Ragupathy
16
34
0
22 Jan 2024
A Survey on Large Language Model (LLM) Security and Privacy: The Good, the Bad, and the Ugly
Yifan Yao
Jinhao Duan
Kaidi Xu
Yuanfang Cai
Eric Sun
Yue Zhang
PILM
ELM
44
475
0
04 Dec 2023
From Chatbots to PhishBots? -- Preventing Phishing scams created using ChatGPT, Google Bard and Claude
Sayak Saha Roy
Poojitha Thota
Krishna Vamsi Naragam
Shirin Nilizadeh
SILM
51
17
0
29 Oct 2023
Ask Again, Then Fail: Large Language Models' Vacillations in Judgment
Qiming Xie
Zengzhi Wang
Yi Feng
Rui Xia
AAML
HILM
35
9
0
03 Oct 2023
Can LLM-Generated Misinformation Be Detected?
Canyu Chen
Kai Shu
DeLMO
39
158
0
25 Sep 2023
Efficient Avoidance of Vulnerabilities in Auto-completed Smart Contract Code Using Vulnerability-constrained Decoding
André Storhaug
Jingyue Li
Tianyuan Hu
AAML
34
14
0
18 Sep 2023
Distilled GPT for Source Code Summarization
Chia-Yi Su
Collin McMillan
30
36
0
28 Aug 2023
GPTEval: A Survey on Assessments of ChatGPT and GPT-4
Rui Mao
Guanyi Chen
Xulang Zhang
Frank Guerin
Min Zhang
ELM
LM&MA
38
101
0
24 Aug 2023
Using Large Language Models for Cybersecurity Capture-The-Flag Challenges and Certification Questions
W. Tann
Yuancheng Liu
Jun Heng Sim
C. Seah
E. Chang
ELM
30
31
0
21 Aug 2023
RatGPT: Turning online LLMs into Proxies for Malware Attacks
Mika Beckerich
L. Plein
Sergio Coronado
SILM
30
18
0
17 Aug 2023
Learning to Prompt in the Classroom to Understand AI Limits: A pilot study
Emily Theophilou
Cansu Koyuturk
Mona Yavari
Sathya Bursic
Gregor Donabauer
...
Davinia Hernández Leo
Martin Ruskov
D. Taibi
A. Gabbiadini
D. Ognibene
19
31
0
04 Jul 2023
On the Detectability of ChatGPT Content: Benchmarking, Methodology, and Evaluation through the Lens of Academic Writing
Zeyan Liu
Zijun Yao
Fengjun Li
Bo Luo
DeLMO
22
17
0
07 Jun 2023
From Text to MITRE Techniques: Exploring the Malicious Use of Large Language Models for Generating Cyber Attack Payloads
P. Charan
Hrushikesh Chunduri
P. Anand
S. Shukla
22
40
0
24 May 2023
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
363
12,003
0
04 Mar 2022
Adaptive Sampling Strategies to Construct Equitable Training Datasets
William Cai
R. Encarnación
Bobbie Chern
S. Corbett-Davies
Miranda Bogen
Stevie Bergman
Sharad Goel
89
30
0
31 Jan 2022
Self-Diagnosis and Self-Debiasing: A Proposal for Reducing Corpus-Based Bias in NLP
Timo Schick
Sahana Udupa
Hinrich Schütze
262
374
0
28 Feb 2021
Understanding the Capabilities, Limitations, and Societal Impact of Large Language Models
Alex Tamkin
Miles Brundage
Jack Clark
Deep Ganguli
AILaw
ELM
200
259
0
04 Feb 2021
Extracting Training Data from Large Language Models
Nicholas Carlini
Florian Tramèr
Eric Wallace
Matthew Jagielski
Ariel Herbert-Voss
...
Tom B. Brown
D. Song
Ulfar Erlingsson
Alina Oprea
Colin Raffel
MLAU
SILM
290
1,824
0
14 Dec 2020
1