2302.02337
Cited By
Regulating ChatGPT and other Large Generative AI Models
5 February 2023
P. Hacker
A. Engel
M. Mauer
AILaw
Papers citing
"Regulating ChatGPT and other Large Generative AI Models"
26 / 26 papers shown
Title
Benign Samples Matter! Fine-tuning On Outlier Benign Samples Severely Breaks Safety
Zihan Guan
Mengxuan Hu
Ronghang Zhu
Sheng Li
Anil Vullikanti
AAML
31
0
0
11 May 2025
Transferable Adversarial Attacks on Black-Box Vision-Language Models
Kai Hu
Weichen Yu
Lefei Zhang
Alexander Robey
Andy Zou
Chengming Xu
Haoqi Hu
Matt Fredrikson
AAML
VLM
64
1
0
02 May 2025
A General Framework to Enhance Fine-tuning-based LLM Unlearning
J. Ren
Zhenwei Dai
Xianfeng Tang
Hui Liu
Jingying Zeng
...
R. Goutam
Suhang Wang
Yue Xing
Qi He
Hui Liu
MU
163
1
0
25 Feb 2025
Single-pass Detection of Jailbreaking Input in Large Language Models
Leyla Naz Candogan
Yongtao Wu
Elias Abad Rocamora
Grigorios G. Chrysos
V. Cevher
AAML
51
0
0
24 Feb 2025
From Cool Demos to Production-Ready FMware: Core Challenges and a Technology Roadmap
Gopi Krishnan Rajbahadur
G. Oliva
Dayi Lin
Ahmed E. Hassan
46
1
0
28 Jan 2025
Targeted Vaccine: Safety Alignment for Large Language Models against Harmful Fine-Tuning via Layer-wise Perturbation
Guozhi Liu
Weiwei Lin
Tiansheng Huang
Ruichao Mo
Qi Mu
Li Shen
AAML
66
10
0
13 Oct 2024
Democratising Artificial Intelligence for Pandemic Preparedness and Global Governance in Latin American and Caribbean Countries
Andre de Carvalho
R. Bonidia
Jude Dzevela Kong
Mariana Dauhajre
C. Struchiner
...
Edian F. Franco
Cesar Ugarte-Gil
Patricia Espinoza-Lopez
Gabriel Carrasco-Escobar
Ulisses Rocha
62
0
0
21 Sep 2024
AI Horizon Scanning, White Paper p3395, IEEE-SA. Part I: Areas of Attention
Marina Cortês
Andrew R. Liddle
Christos Emmanouilidis
Anthony E. Kelly
Ken Matusow
Ragu Ragunathan
Jayne M. Suess
George Tambouratzis
Janusz Zalewski
David A. Bray
29
1
0
13 Sep 2024
Thorns and Algorithms: Navigating Generative AI Challenges Inspired by Giraffes and Acacias
Waqar Hussain
43
0
0
16 Jul 2024
Memorizing Documents with Guidance in Large Language Models
Bumjin Park
Jaesik Choi
KELM
RALM
36
1
0
23 Jun 2024
RU-AI: A Large Multimodal Dataset for Machine Generated Content Detection
Liting Huang
Zhihao Zhang
Yiran Zhang
Xiyue Zhou
Shoujin Wang
NoLa
46
2
0
07 Jun 2024
What is it for a Machine Learning Model to Have a Capability?
Jacqueline Harding
Nathaniel Sharadin
ELM
40
3
0
14 May 2024
Generative AI in the Wild: Prospects, Challenges, and Strategies
Yuan Sun
Eunchae Jang
Fenglong Ma
Ting Wang
34
21
0
03 Apr 2024
The Opaque Law of Artificial Intelligence
Vincenzo Calderonio
AILaw
29
1
0
19 Oct 2023
Embarrassingly Simple Text Watermarks
Ryoma Sato
Yuki Takezawa
Han Bao
Kenta Niwa
Makoto Yamada
WaLM
29
14
0
13 Oct 2023
ChatGPT-Crawler: Find out if ChatGPT really knows what it's talking about
Aman Rangapur
Haoran Wang
AI4MH
39
3
0
06 Apr 2023
Sparks of Artificial General Intelligence: Early experiments with GPT-4
Sébastien Bubeck
Varun Chandrasekaran
Ronen Eldan
J. Gehrke
Eric Horvitz
...
Scott M. Lundberg
Harsha Nori
Hamid Palangi
Marco Tulio Ribeiro
Yi Zhang
ELM
AI4MH
AI4CE
ALM
304
2,232
0
22 Mar 2023
Auditing large language models: a three-layered approach
Jakob Mökander
Jonas Schuett
Hannah Rose Kirk
Luciano Floridi
AILaw
MLAU
48
194
0
16 Feb 2023
Regulating Gatekeeper AI and Data: Transparency, Access, and Fairness under the DMA, the GDPR, and beyond
P. Hacker
Johann Cordes
Janina Rochon
15
11
0
09 Dec 2022
Self-Destructing Models: Increasing the Costs of Harmful Dual Uses of Foundation Models
Peter Henderson
E. Mitchell
Christopher D. Manning
Dan Jurafsky
Chelsea Finn
23
47
0
27 Nov 2022
The European AI Liability Directives -- Critique of a Half-Hearted Approach and Lessons for the Future
P. Hacker
AILaw
26
59
0
25 Nov 2022
Improving alignment of dialogue agents via targeted human judgements
Amelia Glaese
Nat McAleese
Maja Trębacz
John Aslanides
Vlad Firoiu
...
John F. J. Mellor
Demis Hassabis
Koray Kavukcuoglu
Lisa Anne Hendricks
G. Irving
ALM
AAML
227
502
0
28 Sep 2022
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
333
11,953
0
04 Mar 2022
What Changes Can Large-scale Language Models Bring? Intensive Study on HyperCLOVA: Billions-scale Korean Generative Pretrained Transformers
Boseop Kim
Hyoungseok Kim
Sang-Woo Lee
Gichang Lee
Donghyun Kwak
...
Jaewook Kang
Inho Kang
Jung-Woo Ha
W. Park
Nako Sung
VLM
249
121
0
10 Sep 2021
Carbon Emissions and Large Neural Network Training
David A. Patterson
Joseph E. Gonzalez
Quoc V. Le
Chen Liang
Lluís-Miquel Munguía
D. Rothchild
David R. So
Maud Texier
J. Dean
AI4CE
253
644
0
21 Apr 2021
Fine-Tuning Language Models from Human Preferences
Daniel M. Ziegler
Nisan Stiennon
Jeff Wu
Tom B. Brown
Alec Radford
Dario Amodei
Paul Christiano
G. Irving
ALM
292
1,595
0
18 Sep 2019