ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2307.16680
  4. Cited By
On the Trustworthiness Landscape of State-of-the-art Generative Models:
  A Survey and Outlook

On the Trustworthiness Landscape of State-of-the-art Generative Models: A Survey and Outlook

31 July 2023
Mingyuan Fan
Chengyu Wang
Cen Chen
Yang Liu
Jun Huang
    HILM
ArXivPDFHTML

Papers citing "On the Trustworthiness Landscape of State-of-the-art Generative Models: A Survey and Outlook"

14 / 14 papers shown
Title
Bridging the Safety Gap: A Guardrail Pipeline for Trustworthy LLM Inferences
Bridging the Safety Gap: A Guardrail Pipeline for Trustworthy LLM Inferences
Shanshan Han
Salman Avestimehr
Chaoyang He
76
0
0
12 Feb 2025
TorchOpera: A Compound AI System for LLM Safety
TorchOpera: A Compound AI System for LLM Safety
Shanshan Han
Yuhang Yao
Zijian Hu
Dimitris Stripelis
Zhaozhuo Xu
Chaoyang He
LLMAG
41
0
0
16 Jun 2024
Towards Trustworthy AI: A Review of Ethical and Robust Large Language
  Models
Towards Trustworthy AI: A Review of Ethical and Robust Large Language Models
Meftahul Ferdaus
Mahdi Abdelguerfi
Elias Ioup
Kendall N. Niles
Ken Pathak
Steve Sloan
39
11
0
01 Jun 2024
ChatGPT as an Attack Tool: Stealthy Textual Backdoor Attack via Blackbox
  Generative Model Trigger
ChatGPT as an Attack Tool: Stealthy Textual Backdoor Attack via Blackbox Generative Model Trigger
Jiazhao Li
Yijin Yang
Zhuofeng Wu
V. Vydiswaran
Chaowei Xiao
SILM
49
42
0
27 Apr 2023
The Internal State of an LLM Knows When It's Lying
The Internal State of an LLM Knows When It's Lying
A. Azaria
Tom Michael Mitchell
HILM
218
299
0
26 Apr 2023
Self-Consistency Improves Chain of Thought Reasoning in Language Models
Self-Consistency Improves Chain of Thought Reasoning in Language Models
Xuezhi Wang
Jason W. Wei
Dale Schuurmans
Quoc Le
Ed H. Chi
Sharan Narang
Aakanksha Chowdhery
Denny Zhou
ReLM
BDL
LRM
AI4CE
314
3,248
0
21 Mar 2022
Palette: Image-to-Image Diffusion Models
Palette: Image-to-Image Diffusion Models
Chitwan Saharia
William Chan
Huiwen Chang
Chris A. Lee
Jonathan Ho
Tim Salimans
David J. Fleet
Mohammad Norouzi
DiffM
VLM
342
1,591
0
10 Nov 2021
Gradient-based Adversarial Attacks against Text Transformers
Gradient-based Adversarial Attacks against Text Transformers
Chuan Guo
Alexandre Sablayrolles
Hervé Jégou
Douwe Kiela
SILM
98
227
0
15 Apr 2021
Self-Diagnosis and Self-Debiasing: A Proposal for Reducing Corpus-Based
  Bias in NLP
Self-Diagnosis and Self-Debiasing: A Proposal for Reducing Corpus-Based Bias in NLP
Timo Schick
Sahana Udupa
Hinrich Schütze
259
374
0
28 Feb 2021
Zero-Shot Text-to-Image Generation
Zero-Shot Text-to-Image Generation
Aditya A. Ramesh
Mikhail Pavlov
Gabriel Goh
Scott Gray
Chelsea Voss
Alec Radford
Mark Chen
Ilya Sutskever
VLM
255
4,781
0
24 Feb 2021
Understanding the Capabilities, Limitations, and Societal Impact of
  Large Language Models
Understanding the Capabilities, Limitations, and Societal Impact of Large Language Models
Alex Tamkin
Miles Brundage
Jack Clark
Deep Ganguli
AILaw
ELM
200
259
0
04 Feb 2021
Extracting Training Data from Large Language Models
Extracting Training Data from Large Language Models
Nicholas Carlini
Florian Tramèr
Eric Wallace
Matthew Jagielski
Ariel Herbert-Voss
...
Tom B. Brown
D. Song
Ulfar Erlingsson
Alina Oprea
Colin Raffel
MLAU
SILM
290
1,815
0
14 Dec 2020
Exploiting Cloze Questions for Few Shot Text Classification and Natural
  Language Inference
Exploiting Cloze Questions for Few Shot Text Classification and Natural Language Inference
Timo Schick
Hinrich Schütze
258
1,588
0
21 Jan 2020
Generating Natural Language Adversarial Examples
Generating Natural Language Adversarial Examples
M. Alzantot
Yash Sharma
Ahmed Elgohary
Bo-Jhang Ho
Mani B. Srivastava
Kai-Wei Chang
AAML
245
914
0
21 Apr 2018
1