Ethical and social risks of harm from Language Models

8 December 2021
Laura Weidinger
John F. J. Mellor
Maribeth Rauh
Conor Griffin
J. Uesato
Po-Sen Huang
Myra Cheng
Mia Glaese
Borja Balle
Atoosa Kasirzadeh
Zachary Kenton
S. Brown
Will Hawkins
T. Stepleton
Courtney Biles
Abeba Birhane
Julia Haas
Laura Rimell
Lisa Anne Hendricks
William S. Isaac
Sean Legassick
G. Irving
Iason Gabriel
    PILM

Papers citing "Ethical and social risks of harm from Language Models"

50 / 634 papers shown
Title
A Pathway Towards Responsible AI Generated Content
Chen Chen
Jie Fu
Lingjuan Lyu
106
72
0
02 Mar 2023
Interactive Text Generation
Felix Faltings
Michel Galley
Baolin Peng
Kianté Brantley
Weixin Cai
Yizhe Zhang
Jianfeng Gao
Bill Dolan
78
0
0
02 Mar 2023
Can ChatGPT Assess Human Personalities? A General Evaluation Framework
Haocong Rao
Cyril Leung
Chunyan Miao
LLMAG LM&MA
79
81
0
01 Mar 2023
Spacerini: Plug-and-play Search Engines with Pyserini and Hugging Face
Christopher Akiki
Odunayo Ogundepo
Aleksandra Piktus
Xinyu Crystina Zhang
Akintunde Oladipo
Jimmy J. Lin
Martin Potthast
69
5
0
28 Feb 2023
Comparing Sentence-Level Suggestions to Message-Level Suggestions in AI-Mediated Communication
Liye Fu
Benjamin Newman
Maurice Jakesch
S. Kreps
31
22
0
26 Feb 2023
On pitfalls (and advantages) of sophisticated large language models
A. Strasser
77
14
0
25 Feb 2023
Check Your Facts and Try Again: Improving Large Language Models with External Knowledge and Automated Feedback
Baolin Peng
Michel Galley
Pengcheng He
Hao Cheng
Yujia Xie
...
Qiuyuan Huang
Lars Liden
Zhou Yu
Weizhu Chen
Jianfeng Gao
KELM HILM LRM
108
402
0
24 Feb 2023
Not what you've signed up for: Compromising Real-World LLM-Integrated Applications with Indirect Prompt Injection
Kai Greshake
Sahar Abdelnabi
Shailesh Mishra
C. Endres
Thorsten Holz
Mario Fritz
SILM
174
503
0
23 Feb 2023
Towards Safer Generative Language Models: A Survey on Safety Risks, Evaluations, and Improvements
Jiawen Deng
Jiale Cheng
Hao Sun
Zhexin Zhang
Minlie Huang
LM&MA ELM
95
17
0
18 Feb 2023
Auditing large language models: a three-layered approach
Jakob Mokander
Jonas Schuett
Hannah Rose Kirk
Luciano Floridi
AILaw MLAU
152
215
0
16 Feb 2023
Retrieval-augmented Image Captioning
R. Ramos
Desmond Elliott
Bruno Martins
VLM
80
29
0
16 Feb 2023
Aligning Language Models with Preferences through f-divergence Minimization
Dongyoung Go
Tomasz Korbak
Germán Kruszewski
Jos Rozen
Nahyeon Ryu
Marc Dymetman
104
76
0
16 Feb 2023
The Capacity for Moral Self-Correction in Large Language Models
Deep Ganguli
Amanda Askell
Nicholas Schiefer
Thomas I. Liao
Kamilė Lukošiūtė
...
Tom B. Brown
C. Olah
Jack Clark
Sam Bowman
Jared Kaplan
LRM ReLM
92
171
0
15 Feb 2023
Machine Learning Model Attribution Challenge
Elizabeth Merkhofer
Deepesh Chaudhari
Hyrum S. Anderson
Keith Manville
Lily Wong
João Gante
57
4
0
13 Feb 2023
Exploiting Programmatic Behavior of LLMs: Dual-Use Through Standard Security Attacks
Daniel Kang
Xuechen Li
Ion Stoica
Carlos Guestrin
Matei A. Zaharia
Tatsunori Hashimoto
AAML
105
253
0
11 Feb 2023
The Wisdom of Hindsight Makes Language Models Better Instruction Followers
Tianjun Zhang
Fangchen Liu
Justin Wong
Pieter Abbeel
Joseph E. Gonzalez
103
47
0
10 Feb 2023
Machine Learning for Synthetic Data Generation: A Review
Ying-Cheng Lu
Minjie Shen
Huazheng Wang
Xiao Wang
Capucine Van Rechem
Tianfan Fu
Wenqi Wei
SyDa
220
150
0
08 Feb 2023
The Gradient of Generative AI Release: Methods and Considerations
Irene Solaiman
85
104
0
05 Feb 2023
Mnemosyne: Learning to Train Transformers with Transformers
Deepali Jain
K. Choromanski
Kumar Avinava Dubey
Sumeet Singh
Vikas Sindhwani
Tingnan Zhang
Jie Tan
OffRL
134
9
0
02 Feb 2023
Co-Writing with Opinionated Language Models Affects Users' Views
Maurice Jakesch
Advait Bhat
Daniel Buschek
Lior Zalmanson
Mor Naaman
ELM
100
227
0
01 Feb 2023
Debiasing Vision-Language Models via Biased Prompts
Ching-Yao Chuang
Varun Jampani
Yuanzhen Li
Antonio Torralba
Stefanie Jegelka
VLM
119
107
0
31 Jan 2023
Red teaming ChatGPT via Jailbreaking: Bias, Robustness, Reliability and Toxicity
Terry Yue Zhuo
Yujin Huang
Chunyang Chen
Zhenchang Xing
SILM
105
107
0
30 Jan 2023
Toward General Design Principles for Generative AI Applications
Justin D. Weisz
Michael J. Muller
Jessica He
Stephanie Houde
AI4CE
92
59
0
13 Jan 2023
Removing Non-Stationary Knowledge From Pre-Trained Language Models for Entity-Level Sentiment Classification in Finance
Guijin Son
Hanwool Albert Lee
Nahyeon Kang
Moonjeong Hahm
64
8
0
09 Jan 2023
Second Thoughts are Best: Learning to Re-Align With Human Values from Text Edits
Ruibo Liu
Chenyan Jia
Ge Zhang
Ziyu Zhuang
Tony X. Liu
Soroush Vosoughi
189
36
0
01 Jan 2023
Large Language Models Encode Clinical Knowledge
K. Singhal
Shekoofeh Azizi
T. Tu
S. S. Mahdavi
Jason W. Wei
...
A. Rajkomar
Joelle Barral
Christopher Semturs
Alan Karthikesalingam
Vivek Natarajan
LM&MA ELM AI4MH
257
2,408
0
26 Dec 2022
Real or Fake Text?: Investigating Human Ability to Detect Boundaries Between Human-Written and Machine-Generated Text
Liam Dugan
Daphne Ippolito
Arun Kirubarajan
Sherry Shi
Chris Callison-Burch
DeLMO
110
74
0
24 Dec 2022
Inclusive Artificial Intelligence
Dilip Arumugam
Shi Dong
Benjamin Van Roy
64
1
0
24 Dec 2022
Methodological reflections for AI alignment research using human feedback
Thilo Hagendorff
Sarah Fabi
73
6
0
22 Dec 2022
JASMINE: Arabic GPT Models for Few-Shot Learning
El Moatez Billah Nagoudi
Muhammad Abdul-Mageed
AbdelRahim Elmadany
Alcides Alcoba Inciarte
Md. Tawkat Islam Khondaker
72
8
0
21 Dec 2022
SODA: Million-scale Dialogue Distillation with Social Commonsense Contextualization
Hyunwoo J. Kim
Jack Hessel
Liwei Jiang
Peter West
Ximing Lu
...
Ronan Le Bras
Malihe Alikhani
Gunhee Kim
Maarten Sap
Yejin Choi
HILM
132
169
0
20 Dec 2022
Foveate, Attribute, and Rationalize: Towards Physically Safe and Trustworthy AI
Alex Mei
Sharon Levy
William Yang Wang
66
7
0
19 Dec 2022
Improving Cross-task Generalization of Unified Table-to-text Models with Compositional Task Configurations
Jifan Chen
Yuhao Zhang
Lan Liu
Rui Dong
Xinchi Chen
Patrick Ng
William Yang Wang
Zhiheng Huang
AI4CE
64
4
0
17 Dec 2022
MURMUR: Modular Multi-Step Reasoning for Semi-Structured Data-to-Text Generation
Swarnadeep Saha
Xinyan Velocity Yu
Joey Tianyi Zhou
Ramakanth Pasunuru
Asli Celikyilmaz
ReLM LRM
49
11
0
16 Dec 2022
ALERT: Adapting Language Models to Reasoning Tasks
Ping Yu
Tianlu Wang
O. Yu. Golovneva
Badr AlKhamissi
Siddharth Verma
Zhijing Jin
Gargi Ghosh
Mona T. Diab
Asli Celikyilmaz
ReLM LRM
75
19
0
16 Dec 2022
Manifestations of Xenophobia in AI Systems
Nenad Tomašev
J. L. Maynard
Iason Gabriel
102
9
0
15 Dec 2022
Thinking Fast and Slow in Large Language Models
Thilo Hagendorff
Sarah Fabi
Michal Kosinski
LLMAG LRM
76
159
0
10 Dec 2022
Implicit causality in GPT-2: a case study
H. Huynh
T. Lentz
Emiel van Miltenburg
LRM
91
3
0
08 Dec 2022
Discovering Latent Knowledge in Language Models Without Supervision
Collin Burns
Haotian Ye
Dan Klein
Jacob Steinhardt
163
386
0
07 Dec 2022
Talking About Large Language Models
Murray Shanahan
AI4CE
115
273
0
07 Dec 2022
Fine-tuning language models to find agreement among humans with diverse preferences
Michiel A. Bakker
Martin Chadwick
Hannah R. Sheahan
Michael Henry Tessler
Lucy Campbell-Gillingham
...
Nat McAleese
Amelia Glaese
John Aslanides
M. Botvinick
Christopher Summerfield
ALM
110
236
0
28 Nov 2022
Melting Pot 2.0
J. Agapiou
A. Vezhnevets
Edgar A. Duéñez-Guzmán
Jayd Matyas
Yiran Mao
...
Sukhdeep Singh
Julia Haas
Igor Mordatch
D. Mobbs
Joel Z Leibo
117
34
0
24 Nov 2022
AutoReply: Detecting Nonsense in Dialogue Introspectively with Discriminative Replies
Weiyan Shi
Emily Dinan
Adithya Renduchintala
Daniel Fried
Athul Paul Jacob
Zhou Yu
M. Lewis
AAML
108
2
0
22 Nov 2022
Can You Label Less by Using Out-of-Domain Data? Active & Transfer Learning with Few-shot Instructions
Rafal Kocielnik
Sara Kangaslahti
Shrimai Prabhumoye
M. Hari
R. Alvarez
Anima Anandkumar
54
8
0
21 Nov 2022
Ignore Previous Prompt: Attack Techniques For Language Models
Fábio Perez
Ian Ribeiro
SILM
106
452
0
17 Nov 2022
kogito: A Commonsense Knowledge Inference Toolkit
Mete Ismayilzada
Antoine Bosselut
71
7
0
15 Nov 2022
Imagination is All You Need! Curved Contrastive Learning for Abstract Sequence Modeling Utilized on Long Short-Term Dialogue Planning
Justus-Jonas Erker
Stefan Schaffer
Gerasimos Spanakis
77
1
0
14 Nov 2022
FormLM: Recommending Creation Ideas for Online Forms by Modelling Semantic and Structural Information
Yijia Shao
Mengyu Zhou
Yifan Zhong
Tao Wu
Hongwei Han
Shi Han
Gideon Huang
Dongmei Zhang
3DV
62
2
0
10 Nov 2022
Easily Accessible Text-to-Image Generation Amplifies Demographic Stereotypes at Large Scale
Federico Bianchi
Pratyusha Kalluri
Esin Durmus
Faisal Ladhak
Myra Cheng
Debora Nozza
Tatsunori Hashimoto
Dan Jurafsky
James Zou
Aylin Caliskan
DiffM VLM
138
320
0
07 Nov 2022
SSD-LM: Semi-autoregressive Simplex-based Diffusion Language Model for Text Generation and Modular Control
Xiaochuang Han
Sachin Kumar
Yulia Tsvetkov
165
91
0
31 Oct 2022