When Ethics and Payoffs Diverge: LLM Agents in Morally Charged Social Dilemmas

25 May 2025
Steffen Backmann, David Guzman Piedrahita, Emanuel Tewolde, Rada Mihalcea, Bernhard Schölkopf, Zhijing Jin
Author contacts: sbackmann@ethz.ch, zjin@cs.toronto.edu
ArXiv (abs) · PDF · HTML

Papers citing "When Ethics and Payoffs Diverge: LLM Agents in Morally Charged Social Dilemmas"

32 papers shown

Spontaneous Giving and Calculated Greed in Language Models
Yuxuan Li, Hirokazu Shirado
ReLM, LRM, AI4CE
72 · 2 · 0 · 24 Feb 2025

Will Systems of LLM Agents Cooperate: An Investigation into a Social Dilemma
Richard Willis, Yali Du, Joel Z. Leibo, Michael Luck
122 · 2 · 0 · 28 Jan 2025

DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
DeepSeek-AI: Daya Guo, Dejian Yang, Haowei Zhang, Junxiao Song, ..., Shiyu Wang, S. Yu, Shunfeng Zhou, Shuting Pan, S.S. Li
ReLM, VLM, OffRL, AI4TS, LRM
370 · 1,692 · 0 · 22 Jan 2025

Game-theoretic LLM: Agent Workflow for Negotiation Games
Wenyue Hua, Ollie Liu, Lingyao Li, Alfonso Amayuelas, Julie Chen, ..., Lizhou Fan, Fei Sun, William Yang Wang, Xinze Wang, Yongfeng Zhang
77 · 19 · 0 · 08 Nov 2024

GPT-4o System Card
OpenAI: Aaron Hurst, Adam Lerer, Adam P. Goucher, ..., Yuchen He, Yuchen Zhang, Yujia Jin, Yunxing Dai, Yury Malkov
MLLM
184 · 901 · 0 · 25 Oct 2024

Moral Alignment for LLM Agents
Elizaveta Tennant, Stephen Hailes, Mirco Musolesi
87 · 5 · 0 · 02 Oct 2024

Paraphrase Types Elicit Prompt Engineering Capabilities
Jan Philip Wahle, Terry Ruas, Yang Xu, Bela Gipp
96 · 9 · 0 · 28 Jun 2024

MoralBench: Moral Evaluation of LLMs
Jianchao Ji, Yutong Chen, Mingyu Jin, Wujiang Xu, Wenyue Hua, Yongfeng Zhang
ELM
58 · 10 · 0 · 06 Jun 2024

Cooperate or Collapse: Emergence of Sustainable Cooperation in a Society of LLM Agents
Giorgio Piatti, Zhijing Jin, Max Kleiman-Weiner, Bernhard Schölkopf, Mrinmaya Sachan, Rada Mihalcea
LLMAG
79 · 24 · 0 · 25 Apr 2024

Personal LLM Agents: Insights and Survey about the Capability, Efficiency and Security
Yuanchun Li, Hao Wen, Weijun Wang, Xiangyu Li, Yizhen Yuan, ..., Zhijun Li, Peng Li, Yang Liu, Yaqiong Zhang, Yunxin Liu
LLMAG
81 · 184 · 0 · 10 Jan 2024

Can Large Language Models Serve as Rational Players in Game Theory? A Systematic Analysis
Caoyun Fan, Jindou Chen, Yaohui Jin, Hao He
70 · 66 · 0 · 09 Dec 2023

Large Language Models can Strategically Deceive their Users when Put Under Pressure
Jérémy Scheurer, Mikita Balesni, Marius Hobbhahn
LLMAG
82 · 58 · 0 · 09 Nov 2023

Identifying the Risks of LM Agents with an LM-Emulated Sandbox
Yangjun Ruan, Honghua Dong, Andrew Wang, Silviu Pitis, Yongchao Zhou, Jimmy Ba, Yann Dubois, Chris J. Maddison, Tatsunori Hashimoto
LLMAG, ELM
58 · 115 · 0 · 25 Sep 2023

Strategic Behavior of Large Language Models: Game Structure vs. Contextual Framing
Nunzio Lorè, Babak Heydari
48 · 39 · 0 · 12 Sep 2023

Rethinking Machine Ethics -- Can LLMs Perform Moral Reasoning through the Lens of Moral Theories?
Jingyan Zhou, Minda Hu, Junan Li, Xiaoying Zhang, Xixin Wu, Irwin King, Helen M. Meng
LRM
71 · 27 · 0 · 29 Aug 2023

A Survey on Large Language Model based Autonomous Agents
Lei Wang, Chengbang Ma, Xueyang Feng, Zeyu Zhang, Hao-ran Yang, ..., Xu Chen, Yankai Lin, Wayne Xin Zhao, Zhewei Wei, Ji-Rong Wen
LLMAG, AI4CE, LM&Ro
89 · 1,275 · 0 · 22 Aug 2023

Evaluating the Moral Beliefs Encoded in LLMs
Nino Scherrer, Claudia Shi, Amir Feder, David M. Blei
75 · 135 · 0 · 26 Jul 2023

Llama 2: Open Foundation and Fine-Tuned Chat Models
Hugo Touvron, Louis Martin, Kevin R. Stone, Peter Albert, Amjad Almahairi, ..., Sharan Narang, Aurelien Rodriguez, Robert Stojnic, Sergey Edunov, Thomas Scialom
AI4MH, ALM
299 · 11,894 · 0 · 18 Jul 2023

TrustGPT: A Benchmark for Trustworthy and Responsible Large Language Models
Yue Huang, Qihui Zhang, Philip S. Yu, Lichao Sun
50 · 52 · 0 · 20 Jun 2023

Strategic Reasoning with Language Models
Kanishk Gandhi, Dorsa Sadigh, Noah D. Goodman
LM&Ro, LRM
62 · 41 · 0 · 30 May 2023

Direct Preference Optimization: Your Language Model is Secretly a Reward Model
Rafael Rafailov, Archit Sharma, E. Mitchell, Stefano Ermon, Christopher D. Manning, Chelsea Finn
ALM
385 · 3,981 · 0 · 29 May 2023

Playing repeated games with Large Language Models
Elif Akata, Lion Schulz, Julian Coda-Forno, Seong Joon Oh, Matthias Bethge, Eric Schulz
541 · 134 · 0 · 26 May 2023

Voyager: An Open-Ended Embodied Agent with Large Language Models
Guanzhi Wang, Yuqi Xie, Yunfan Jiang, Ajay Mandlekar, Chaowei Xiao, Yuke Zhu, Linxi Fan, Anima Anandkumar
LM&Ro, SyDa
145 · 813 · 0 · 25 May 2023

Generative Agents: Interactive Simulacra of Human Behavior
J. Park, Joseph C. O'Brien, Carrie J. Cai, Meredith Ringel Morris, Percy Liang, Michael S. Bernstein
LM&Ro, AI4CE
392 · 1,936 · 0 · 07 Apr 2023

Do the Rewards Justify the Means? Measuring Trade-Offs Between Rewards and Ethical Behavior in the MACHIAVELLI Benchmark
Alexander Pan, Chan Jun Shern, Andy Zou, Nathaniel Li, Steven Basart, Thomas Woodside, Jonathan Ng, Hanlin Zhang, Scott Emmons, Dan Hendrycks
56 · 132 · 0 · 06 Apr 2023

The Capacity for Moral Self-Correction in Large Language Models
Deep Ganguli, Amanda Askell, Nicholas Schiefer, Thomas I. Liao, Kamilė Lukošiūtė, ..., Tom B. Brown, C. Olah, Jack Clark, Sam Bowman, Jared Kaplan
LRM, ReLM
77 · 168 · 0 · 15 Feb 2023

Modeling Moral Choices in Social Dilemmas with Multi-Agent Reinforcement Learning
Elizaveta Tennant, Stephen Hailes, Mirco Musolesi
58 · 19 · 0 · 20 Jan 2023

When to Make Exceptions: Exploring Language Models as Accounts of Human Moral Judgment
Zhijing Jin, Sydney Levine, Fernando Gonzalez, Ojasv Kamal, Maarten Sap, Mrinmaya Sachan, Rada Mihalcea, J. Tenenbaum, Bernhard Schölkopf
ELM, LRM
65 · 99 · 0 · 04 Oct 2022

Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback
Yuntao Bai, Andy Jones, Kamal Ndousse, Amanda Askell, Anna Chen, ..., Jack Clark, Sam McCandlish, C. Olah, Benjamin Mann, Jared Kaplan
249 · 2,561 · 0 · 12 Apr 2022

Training language models to follow instructions with human feedback
Long Ouyang, Jeff Wu, Xu Jiang, Diogo Almeida, Carroll L. Wainwright, ..., Amanda Askell, Peter Welinder, Paul Christiano, Jan Leike, Ryan J. Lowe
OSLM, ALM
874 · 12,973 · 0 · 04 Mar 2022

A General Language Assistant as a Laboratory for Alignment
Amanda Askell, Yuntao Bai, Anna Chen, Dawn Drain, Deep Ganguli, ..., Tom B. Brown, Jack Clark, Sam McCandlish, C. Olah, Jared Kaplan
ALM
118 · 779 · 0 · 01 Dec 2021

Deep reinforcement learning from human preferences
Paul Christiano, Jan Leike, Tom B. Brown, Miljan Martic, Shane Legg, Dario Amodei
190 · 3,318 · 0 · 12 Jun 2017