Ethical and social risks of harm from Language Models

8 December 2021
Laura Weidinger
John F. J. Mellor
Maribeth Rauh
Conor Griffin
J. Uesato
Po-Sen Huang
Myra Cheng
Mia Glaese
Borja Balle
Atoosa Kasirzadeh
Zachary Kenton
S. Brown
Will Hawkins
T. Stepleton
Courtney Biles
Abeba Birhane
Julia Haas
Laura Rimell
Lisa Anne Hendricks
William S. Isaac
Sean Legassick
G. Irving
Iason Gabriel
    PILM
arXiv: 2112.04359

Papers citing "Ethical and social risks of harm from Language Models"

50 / 634 papers shown
Planning with Logical Graph-based Language Model for Instruction Generation
Fan Zhang
Kebing Jin
H. Zhuo
LRM
96
3
0
26 Aug 2023
Do-Not-Answer: A Dataset for Evaluating Safeguards in LLMs
Yuxia Wang
Haonan Li
Xudong Han
Preslav Nakov
Timothy Baldwin
101
117
0
25 Aug 2023
From Instructions to Intrinsic Human Values -- A Survey of Alignment Goals for Big Models
Jing Yao
Xiaoyuan Yi
Xiting Wang
Jindong Wang
Xing Xie
ALM
93
44
0
23 Aug 2023
FairMonitor: A Four-Stage Automatic Framework for Detecting Stereotypes and Biases in Large Language Models
Yanhong Bai
Jiabao Zhao
Jinxin Shi
Tingjiang Wei
Xingjiao Wu
Liangbo He
58
0
0
21 Aug 2023
A Methodology for Generative Spelling Correction via Natural Spelling Errors Emulation across Multiple Domains and Languages
Nikita Martynov
Mark Baushenko
Anastasia Kozlova
Katerina Kolomeytseva
Aleksandr Abramov
Alena Fenogenova
66
4
0
18 Aug 2023
ChatGPT-HealthPrompt. Harnessing the Power of XAI in Prompt-Based Healthcare Decision Support using ChatGPT
Fatemeh Nazary
Yashar Deldjoo
Tommaso Di Noia
LM&MA AI4MH
98
16
0
17 Aug 2023
Self-Deception: Reverse Penetrating the Semantic Firewall of Large Language Models
Zhenhua Wang
Wei Xie
Kai Chen
Baosheng Wang
Zhiwen Gui
Enze Wang
AAML SILM
102
6
0
16 Aug 2023
Through the Lens of Core Competency: Survey on Evaluation of Large Language Models
Ziyu Zhuang
Qiguang Chen
Longxuan Ma
Mingda Li
Yi Han
Yushan Qian
Haopeng Bai
Zixian Feng
Weinan Zhang
Ting Liu
ELM
80
13
0
15 Aug 2023
Building Trust in Conversational AI: A Comprehensive Review and Solution Architecture for Explainable, Privacy-Aware Systems using LLMs and Knowledge Graph
Ahtsham Zafar
V. Parthasarathy
Chan Le Van
Saad Shahid
A. Khan
Arsalan Shahid
79
14
0
13 Aug 2023
ZYN: Zero-Shot Reward Models with Yes-No Questions for RLAIF
Víctor Gallego
SyDa
70
4
0
11 Aug 2023
Metacognitive Prompting Improves Understanding in Large Language Models
Yuqing Wang
Yun Zhao
ReLM LRM
95
34
0
10 Aug 2023
On the Unexpected Abilities of Large Language Models
S. Nolfi
LRM
72
11
0
09 Aug 2023
A Cost Analysis of Generative Language Models and Influence Operations
Micah Musser
64
20
0
07 Aug 2023
Specious Sites: Tracking the Spread and Sway of Spurious News Stories at Scale
Hans W. A. Hanley
Deepak Kumar
Zakir Durumeric
80
11
0
03 Aug 2023
ALE: A Simulation-Based Active Learning Evaluation Framework for the Parameter-Driven Comparison of Query Strategies for NLP
Philipp Kohl
Nils Freyer
Yoka Krämer
H. Werth
Steffen Wolf
Bodo Kraft
Matthias Meinecke
Albert Zündorf
118
1
0
01 Aug 2023
The Ethics of AI Value Chains
Blair Attard-Frost
D. Widder
81
1
0
31 Jul 2023
Okapi: Instruction-tuned Large Language Models in Multiple Languages with Reinforcement Learning from Human Feedback
Viet Dac Lai
Chien Van Nguyen
Nghia Trung Ngo
Thuat Nguyen
Franck Dernoncourt
Ryan Rossi
Thien Huu Nguyen
ALM
133
150
0
29 Jul 2023
Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback
Stephen Casper
Xander Davies
Claudia Shi
T. Gilbert
Jérémy Scheurer
...
Erdem Biyik
Anca Dragan
David M. Krueger
Dorsa Sadigh
Dylan Hadfield-Menell
ALM OffRL
155
533
0
27 Jul 2023
How User Language Affects Conflict Fatality Estimates in ChatGPT
Daniel Kazenwadel
C. Steinert
44
1
0
26 Jul 2023
Towards Automatic Boundary Detection for Human-AI Collaborative Hybrid Essay in Education
Zijie Zeng
Lele Sha
Yuheng Li
Kaixun Yang
D. Gašević
Guanliang Chen
DeLMO
124
17
0
23 Jul 2023
Exploring Perspectives on the Impact of Artificial Intelligence on the Creativity of Knowledge Work: Beyond Mechanised Plagiarism and Stochastic Parrots
Advait Sarkar
91
40
0
20 Jul 2023
Can Instruction Fine-Tuned Language Models Identify Social Bias through Prompting?
O. Dige
Jacob-Junqi Tian
David B. Emerson
Faiza Khan Khattak
ALM
57
5
0
19 Jul 2023
Llama 2: Open Foundation and Fine-Tuned Chat Models
Hugo Touvron
Louis Martin
Kevin R. Stone
Peter Albert
Amjad Almahairi
...
Sharan Narang
Aurelien Rodriguez
Robert Stojnic
Sergey Edunov
Thomas Scialom
AI4MH ALM
498
12,128
0
18 Jul 2023
BeaverTails: Towards Improved Safety Alignment of LLM via a Human-Preference Dataset
Jiaming Ji
Mickel Liu
Juntao Dai
Xuehai Pan
Chi Zhang
Ce Bian
Chi Zhang
Ruiyang Sun
Yizhou Wang
Yaodong Yang
ALM
98
506
0
10 Jul 2023
On the Challenges of Deploying Privacy-Preserving Synthetic Data in the Enterprise
L. Arthur
Jason W Costello
Jonathan Hardy
Will O'Brien
J. Rea
Gareth Rees
Georgi Ganev
68
2
0
09 Jul 2023
The Ethical Implications of Generative Audio Models: A Systematic Literature Review
J. Barnett
86
32
0
07 Jul 2023
Improving Language Plasticity via Pretraining with Active Forgetting
Yihong Chen
Kelly Marchisio
Roberta Raileanu
David Ifeoluwa Adelani
Pontus Stenetorp
Sebastian Riedel
Mikel Artetxe
KELM AI4CE CLL
116
27
0
03 Jul 2023
Minimum Levels of Interpretability for Artificial Moral Agents
Avish Vijayaraghavan
C. Badea
AI4CE
64
5
0
02 Jul 2023
Provable Robust Watermarking for AI-Generated Text
Xuandong Zhao
P. Ananth
Lei Li
Yu-Xiang Wang
WaLM
130
187
0
30 Jun 2023
Towards Measuring the Representation of Subjective Global Opinions in Language Models
Esin Durmus
Karina Nguyen
Thomas I. Liao
Nicholas Schiefer
Amanda Askell
...
Alex Tamkin
Janel Thamkul
Jared Kaplan
Jack Clark
Deep Ganguli
147
245
0
28 Jun 2023
VisText: A Benchmark for Semantically Rich Chart Captioning
Benny J. Tang
Angie Boggust
Arvind Satyanarayan
92
87
0
28 Jun 2023
CBBQ: A Chinese Bias Benchmark Dataset Curated with Human-AI Collaboration for Large Language Models
Yufei Huang
Deyi Xiong
ALM
134
19
0
28 Jun 2023
ToolQA: A Dataset for LLM Question Answering with External Tools
Yuchen Zhuang
Yue Yu
Kuan-Chieh Wang
Haotian Sun
Chao Zhang
ELM LLMAG
101
251
0
23 Jun 2023
Public Attitudes Toward ChatGPT on Twitter: Sentiments, Topics, and Occupations
Ratanond Koonchanok
Ya-Chen Pan
Hyeju Jang
AI4MH
40
13
0
22 Jun 2023
VisoGender: A dataset for benchmarking gender bias in image-text pronoun resolution
Elizaveta Semenova
F. G. Abrantes
Hanwen Zhu
Grace A. Sodunke
Aleksandar Shtedritski
Hannah Rose Kirk
CoGe
125
46
0
21 Jun 2023
Mass-Producing Failures of Multimodal Systems with Language Models
Shengbang Tong
Erik Jones
Jacob Steinhardt
106
36
0
21 Jun 2023
Opportunities and Risks of LLMs for Scalable Deliberation with Polis
Christopher T. Small
Ivan Vendrov
Esin Durmus
Hadjar Homaei
Elizabeth Barry
Julien Cornebise
Ted Suzman
Deep Ganguli
Colin Megill
92
30
0
20 Jun 2023
Explicit Syntactic Guidance for Neural Text Generation
Yafu Li
Leyang Cui
Jianhao Yan
Yongjing Yin
Wei Bi
Shuming Shi
Yue Zhang
94
9
0
20 Jun 2023
The Importance of Human-Labeled Data in the Era of LLMs
Yang Liu
ALM
77
10
0
18 Jun 2023
Deceptive AI Ecosystems: The Case of ChatGPT
Xiao Zhan
Yifan Xu
Stefan Sarkadi
SILM
99
24
0
18 Jun 2023
Matching Pairs: Attributing Fine-Tuned Models to their Pre-Trained Large Language Models
Myles Foley
Ambrish Rawat
Taesung Lee
Yufang Hou
Gabriele Picco
Giulio Zizzo
DeLMO
138
6
0
15 Jun 2023
Can Language Models Teach Weaker Agents? Teacher Explanations Improve Students via Personalization
Swarnadeep Saha
Peter Hase
Mohit Bansal
LRM
80
11
0
15 Jun 2023
AraMUS: Pushing the Limits of Data and Model Scale for Arabic Natural Language Processing
Asaad Alghamdi
Xinyu Duan
Wei Jiang
Zhenhai Wang
Yimeng Wu
...
Yifei Zheng
Mehdi Rezagholizadeh
Baoxing Huai
Peilun Cheng
Abbas Ghaddar
VLM
52
8
0
11 Jun 2023
Improving Knowledge Extraction from LLMs for Task Learning through Agent Analysis
James R. Kirk
R. Wray
Peter Lindes
John E. Laird
LLMAG
51
3
0
11 Jun 2023
Evaluating the Social Impact of Generative AI Systems in Systems and Society
Irene Solaiman
Zeerak Talat
William Agnew
Lama Ahmad
Dylan K. Baker
...
Marie-Therese Png
Shubham Singh
A. Strait
Lukas Struppek
Arjun Subramonian
ELM EGVM
139
117
0
09 Jun 2023
Towards a Robust Detection of Language Model Generated Text: Is ChatGPT that Easy to Detect?
Wissam Antoun
Virginie Mouilleron
Benoît Sagot
Djamé Seddah
DeLMO
85
33
0
09 Jun 2023
Improving Open Language Models by Learning from Organic Interactions
Jing Xu
Da Ju
Joshua Lane
M. Komeili
Eric Michael Smith
...
Rashel Moritz
Sainbayar Sukhbaatar
Y-Lan Boureau
Jason Weston
Kurt Shuster
59
9
0
07 Jun 2023
Rewarded soups: towards Pareto-optimal alignment by interpolating weights fine-tuned on diverse rewards
Alexandre Ramé
Guillaume Couairon
Mustafa Shukor
Corentin Dancette
Jean-Baptiste Gaya
Laure Soulier
Matthieu Cord
MoMe
120
157
0
07 Jun 2023
Applying Standards to Advance Upstream & Downstream Ethics in Large Language Models
Jose Berengueres
Marybeth Sandell
66
0
0
06 Jun 2023
SpQR: A Sparse-Quantized Representation for Near-Lossless LLM Weight Compression
Tim Dettmers
Ruslan Svirschevski
Vage Egiazarian
Denis Kuznedelev
Elias Frantar
Saleh Ashkboos
Alexander Borzunov
Torsten Hoefler
Dan Alistarh
MQ
78
257
0
05 Jun 2023