2110.07574
Cited By
Can Machines Learn Morality? The Delphi Experiment
14 October 2021
Liwei Jiang
Jena D. Hwang
Chandra Bhagavatula
Ronan Le Bras
Jenny T Liang
Jesse Dodge
Keisuke Sakaguchi
Maxwell Forbes
Jon Borchardt
Saadia Gabriel
Yulia Tsvetkov
Oren Etzioni
Maarten Sap
Regina A. Rini
Yejin Choi
FaML
Papers citing
"Can Machines Learn Morality? The Delphi Experiment"
24 / 24 papers shown
Title
The Convergent Ethics of AI? Analyzing Moral Foundation Priorities in Large Language Models with a Multi-Framework Approach
Chad Coleman
W. Russell Neuman
Ali Dasdan
Safinah Ali
Manan Shah
ELM
LRM
38
0
0
27 Apr 2025
CLASH: Evaluating Language Models on Judging High-Stakes Dilemmas from Multiple Perspectives
Ayoung Lee
Ryan Sungmo Kwon
Peter Railton
Lu Wang
ELM
49
0
0
15 Apr 2025
Persona Dynamics: Unveiling the Impact of Personality Traits on Agents in Text-Based Games
Seungwon Lim
Seungbeen Lee
Dongjun Min
Youngjae Yu
AI4CE
44
0
0
09 Apr 2025
The Goofus & Gallant Story Corpus for Practical Value Alignment
Md Sultan al Nahian
Tasmia Tasrin
Spencer Frazier
Mark O. Riedl
Brent Harrison
45
0
0
17 Jan 2025
ClarityEthic: Explainable Moral Judgment Utilizing Contrastive Ethical Insights from Large Language Models
Yuxi Sun
Wei Gao
Jing Ma
Hongzhan Lin
Ziyang Luo
Wenxuan Zhang
ELM
74
0
0
17 Dec 2024
DailyDilemmas: Revealing Value Preferences of LLMs with Quandaries of Daily Life
Yu Ying Chiu
Liwei Jiang
Yejin Choi
51
3
0
03 Oct 2024
AI-LieDar: Examine the Trade-off Between Utility and Truthfulness in LLM Agents
Zhe Su
Xuhui Zhou
Sanketh Rangreji
Anubha Kabra
Julia Mendelsohn
Faeze Brahman
Maarten Sap
LLMAG
95
2
0
13 Sep 2024
Does Cross-Cultural Alignment Change the Commonsense Morality of Language Models?
Yuu Jinnai
49
1
0
24 Jun 2024
Ask LLMs Directly, "What shapes your bias?": Measuring Social Bias in Large Language Models
Jisu Shin
Hoyun Song
Huije Lee
Soyeong Jeong
Jong C. Park
28
6
0
06 Jun 2024
Attributions toward Artificial Agents in a modified Moral Turing Test
Eyal Aharoni
Sharlene Fernandes
Daniel J Brady
Caelan Alexander
Michael Criner
Kara Queen
Javier Rando
Eddy Nahmias
Victor Crespo
ELM
33
12
0
03 Apr 2024
COBIAS: Contextual Reliability in Bias Assessment
Priyanshul Govil
Hemang Jain
Vamshi Bonagiri
Aman Chadha
Ponnurangam Kumaraguru
Manas Gaur
S. Dey
43
2
0
22 Feb 2024
SaGE: Evaluating Moral Consistency in Large Language Models
Vamshi Bonagiri
Sreeram Vennam
Priyanshul Govil
Ponnurangam Kumaraguru
Manas Gaur
ELM
46
0
0
21 Feb 2024
EmoBench: Evaluating the Emotional Intelligence of Large Language Models
Sahand Sabour
Siyang Liu
Zheyuan Zhang
June M. Liu
Jinfeng Zhou
Alvionna S. Sunaryo
Juanzi Li
Tatia M.C. Lee
Rada Mihalcea
Minlie Huang
27
11
0
19 Feb 2024
Do Moral Judgment and Reasoning Capability of LLMs Change with Language? A Study using the Multilingual Defining Issues Test
Aditi Khandelwal
Utkarsh Agarwal
Kumar Tanmay
Monojit Choudhury
ELM
LRM
22
6
0
03 Feb 2024
Compositional Capabilities of Autoregressive Transformers: A Study on Synthetic, Interpretable Tasks
Rahul Ramesh
Ekdeep Singh Lubana
Mikail Khona
Robert P. Dick
Hidenori Tanaka
CoGe
27
6
0
21 Nov 2023
MOKA: Moral Knowledge Augmentation for Moral Event Extraction
Xinliang Frederick Zhang
Winston Wu
Nick Beauchamp
Lu Wang
35
7
0
16 Nov 2023
STREAM: Social data and knowledge collective intelligence platform for TRaining Ethical AI Models
Yuwei Wang
Enmeng Lu
Zizhe Ruan
Yao Liang
Yi Zeng
AI4TS
24
4
0
09 Oct 2023
Gesture-Informed Robot Assistance via Foundation Models
Li-Heng Lin
Yuchen Cui
Yilun Hao
Fei Xia
Dorsa Sadigh
LM&Ro
SLR
13
19
0
06 Sep 2023
Towards Theory-based Moral AI: Moral AI with Aggregating Models Based on Normative Ethical Theory
Masashi Takeshita
Rafal Rzepka
K. Araki
13
8
0
20 Jun 2023
Revision Transformers: Instructing Language Models to Change their Values
Felix Friedrich
Wolfgang Stammer
P. Schramowski
Kristian Kersting
KELM
21
6
0
19 Oct 2022
Does Moral Code Have a Moral Code? Probing Delphi's Moral Philosophy
Kathleen C. Fraser
S. Kiritchenko
Esma Balkir
107
37
0
25 May 2022
A Word on Machine Ethics: A Response to Jiang et al. (2021)
Zeerak Talat
Hagen Blix
Josef Valvoda
M. I. Ganesh
Ryan Cotterell
Adina Williams
SyDa
FaML
88
39
0
07 Nov 2021
Latent Hatred: A Benchmark for Understanding Implicit Hate Speech
Mai Elsherief
Caleb Ziems
D. Muchlinski
Vaishnavi Anupindi
Jordyn Seybolt
M. D. Choudhury
Diyi Yang
92
236
0
11 Sep 2021
The Woman Worked as a Babysitter: On Biases in Language Generation
Emily Sheng
Kai-Wei Chang
Premkumar Natarajan
Nanyun Peng
206
616
0
03 Sep 2019