Beyond Preferences in AI Alignment

30 August 2024

Tan Zhi-Xuan

Papers citing "Beyond Preferences in AI Alignment"


Title
Expressing stigma and inappropriate responses prevents LLMs from safely replacing mental health providers
Jared Moore, Declan Grabb, William Agnew, Kevin Klyman, Stevie Chancellor, Desmond C. Ong, Nick Haber
AI4MH · 44 · 0 · 0 · 25 Apr 2025

Benchmarking the rationality of AI decision making using the transitivity axiom
Kiwon Song, James M. Jennings III, Clintin P. Davis-Stober
38 · 0 · 0 · 14 Feb 2025

AI safety via debate
G. Irving, Paul Christiano, Dario Amodei
204 · 199 · 0 · 02 May 2018