2505.23749
Distortion of AI Alignment: Does Preference Optimization Optimize for Preferences?
29 May 2025
Paul Gölz, Nika Haghtalab, Kunhe Yang
Papers citing "Distortion of AI Alignment: Does Preference Optimization Optimize for Preferences?"
2 / 2 papers shown
Title
Jackpot! Alignment as a Maximal Lottery
Roberto-Rafael Maura-Rivero, Marc Lanctot, Francesco Visin, Kate Larson
113
7
0
31 Jan 2025
Clone-Robust AI Alignment
Ariel D. Procaccia, Benjamin G. Schiffer, Shirley Zhang
48
3
0
17 Jan 2025