Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2307.16680
Cited By
On the Trustworthiness Landscape of State-of-the-art Generative Models: A Survey and Outlook
31 July 2023
Mingyuan Fan
Chengyu Wang
Cen Chen
Yang Liu
Jun Huang
HILM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"On the Trustworthiness Landscape of State-of-the-art Generative Models: A Survey and Outlook"
14 / 14 papers shown
Title
Bridging the Safety Gap: A Guardrail Pipeline for Trustworthy LLM Inferences
Shanshan Han
Salman Avestimehr
Chaoyang He
76
0
0
12 Feb 2025
TorchOpera: A Compound AI System for LLM Safety
Shanshan Han
Yuhang Yao
Zijian Hu
Dimitris Stripelis
Zhaozhuo Xu
Chaoyang He
LLMAG
41
0
0
16 Jun 2024
Towards Trustworthy AI: A Review of Ethical and Robust Large Language Models
Meftahul Ferdaus
Mahdi Abdelguerfi
Elias Ioup
Kendall N. Niles
Ken Pathak
Steve Sloan
39
11
0
01 Jun 2024
ChatGPT as an Attack Tool: Stealthy Textual Backdoor Attack via Blackbox Generative Model Trigger
Jiazhao Li
Yijin Yang
Zhuofeng Wu
V. Vydiswaran
Chaowei Xiao
SILM
49
42
0
27 Apr 2023
The Internal State of an LLM Knows When It's Lying
A. Azaria
Tom Michael Mitchell
HILM
218
299
0
26 Apr 2023
Self-Consistency Improves Chain of Thought Reasoning in Language Models
Xuezhi Wang
Jason W. Wei
Dale Schuurmans
Quoc Le
Ed H. Chi
Sharan Narang
Aakanksha Chowdhery
Denny Zhou
ReLM
BDL
LRM
AI4CE
314
3,248
0
21 Mar 2022
Palette: Image-to-Image Diffusion Models
Chitwan Saharia
William Chan
Huiwen Chang
Chris A. Lee
Jonathan Ho
Tim Salimans
David J. Fleet
Mohammad Norouzi
DiffM
VLM
342
1,591
0
10 Nov 2021
Gradient-based Adversarial Attacks against Text Transformers
Chuan Guo
Alexandre Sablayrolles
Hervé Jégou
Douwe Kiela
SILM
98
227
0
15 Apr 2021
Self-Diagnosis and Self-Debiasing: A Proposal for Reducing Corpus-Based Bias in NLP
Timo Schick
Sahana Udupa
Hinrich Schütze
259
374
0
28 Feb 2021
Zero-Shot Text-to-Image Generation
Aditya A. Ramesh
Mikhail Pavlov
Gabriel Goh
Scott Gray
Chelsea Voss
Alec Radford
Mark Chen
Ilya Sutskever
VLM
255
4,781
0
24 Feb 2021
Understanding the Capabilities, Limitations, and Societal Impact of Large Language Models
Alex Tamkin
Miles Brundage
Jack Clark
Deep Ganguli
AILaw
ELM
200
259
0
04 Feb 2021
Extracting Training Data from Large Language Models
Nicholas Carlini
Florian Tramèr
Eric Wallace
Matthew Jagielski
Ariel Herbert-Voss
...
Tom B. Brown
D. Song
Ulfar Erlingsson
Alina Oprea
Colin Raffel
MLAU
SILM
290
1,815
0
14 Dec 2020
Exploiting Cloze Questions for Few Shot Text Classification and Natural Language Inference
Timo Schick
Hinrich Schütze
258
1,588
0
21 Jan 2020
Generating Natural Language Adversarial Examples
M. Alzantot
Yash Sharma
Ahmed Elgohary
Bo-Jhang Ho
Mani B. Srivastava
Kai-Wei Chang
AAML
245
914
0
21 Apr 2018
1