Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2403.14988
Cited By
Risk and Response in Large Language Models: Evaluating Key Threat Categories
22 March 2024
Bahareh Harandizadeh
A. Salinas
Fred Morstatter
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Risk and Response in Large Language Models: Evaluating Key Threat Categories"
7 / 7 papers shown
Title
Operationalizing a Threat Model for Red-Teaming Large Language Models (LLMs)
Apurv Verma
Satyapriya Krishna
Sebastian Gehrmann
Madhavan Seshadri
Anu Pradhan
Tom Ault
Leslie Barrett
David Rabinowitz
John Doucette
Nhathai Phan
54
10
0
20 Jul 2024
AI Safety in Generative AI Large Language Models: A Survey
Jaymari Chua
Yun Yvonna Li
Shiyi Yang
Chen Wang
Lina Yao
LM&MA
36
12
0
06 Jul 2024
Metaheuristics and Large Language Models Join Forces: Toward an Integrated Optimization Approach
Camilo Chacón Sartori
Christian Blum
Filippo Bistaffa
Guillem Rodríguez Corominas
AIFin
61
3
0
28 May 2024
Survey of Vulnerabilities in Large Language Models Revealed by Adversarial Attacks
Erfan Shayegani
Md Abdullah Al Mamun
Yu Fu
Pedram Zaree
Yue Dong
Nael B. Abu-Ghazaleh
AAML
147
146
0
16 Oct 2023
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
319
11,953
0
04 Mar 2022
Scaling Laws for Neural Language Models
Jared Kaplan
Sam McCandlish
T. Henighan
Tom B. Brown
B. Chess
R. Child
Scott Gray
Alec Radford
Jeff Wu
Dario Amodei
258
4,489
0
23 Jan 2020
Fine-Tuning Language Models from Human Preferences
Daniel M. Ziegler
Nisan Stiennon
Jeff Wu
Tom B. Brown
Alec Radford
Dario Amodei
Paul Christiano
G. Irving
ALM
280
1,595
0
18 Sep 2019
1