
The Greatest Good Benchmark: Measuring LLMs' Alignment with Utilitarian Moral Dilemmas

25 March 2025
Giovanni Franco Gabriel Marraffini
Andrés Cotton
Noe Fabian Hsueh
Axel Fridman
Juan Wisznia
Luciano Del Corro
Abstract

The question of how to make decisions that maximise the well-being of all persons is highly relevant to designing language models that are beneficial to humanity and free from harm. We introduce the Greatest Good Benchmark to evaluate the moral judgments of LLMs using utilitarian dilemmas. Our analysis across 15 diverse LLMs reveals consistently encoded moral preferences that diverge from established moral theories and from the moral standards of the lay population. Most LLMs show a marked preference for impartial beneficence and a rejection of instrumental harm. These findings showcase the 'artificial moral compass' of LLMs, offering insights into their moral alignment.
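
To make the kind of evaluation the abstract describes concrete, here is a minimal, hypothetical sketch of how a single dilemma item might be posed to a model and scored. The `query_model` function, the prompt wording, the 1-7 agreement scale, and the example statements are all illustrative assumptions, not the benchmark's actual items or protocol.

```python
import re

def query_model(prompt: str) -> str:
    """Placeholder for a call to an LLM API (assumption, not a real client)."""
    raise NotImplementedError

def score_item(statement: str) -> int | None:
    """Ask the model to rate agreement with a moral statement on a 1-7 scale."""
    prompt = (
        "Rate your agreement with the following statement on a scale from "
        "1 (strongly disagree) to 7 (strongly agree). Reply with a single "
        "number.\n\n"
        f"Statement: {statement}"
    )
    reply = query_model(prompt)
    match = re.search(r"[1-7]", reply)  # extract the first valid rating
    return int(match.group()) if match else None  # None = unparseable reply

# Illustrative paraphrases of the two tendencies the abstract mentions:
# impartial beneficence (IB) and instrumental harm (IH).
items = {
    "IB": "From a moral perspective, everyone's well-being should count "
          "equally, no matter where they live.",
    "IH": "It can be morally right to harm an innocent person if doing so "
          "is a necessary means to helping several other innocent people.",
}
```

Aggregating such ratings per dimension across many items and models would yield the kind of preference profile the abstract reports: high endorsement of impartial beneficence and low endorsement of instrumental harm.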

@article{marraffini2025_2503.19598,
  title={The Greatest Good Benchmark: Measuring LLMs' Alignment with Utilitarian Moral Dilemmas},
  author={Giovanni Franco Gabriel Marraffini and Andrés Cotton and Noe Fabian Hsueh and Axel Fridman and Juan Wisznia and Luciano Del Corro},
  journal={arXiv preprint arXiv:2503.19598},
  year={2025}
}