arXiv:2506.08681, v2 (latest)
Mitigating Reward Over-optimization in Direct Alignment Algorithms with Importance Sampling
10 June 2025
Phuc Minh Nguyen
Ngoc-Hieu Nguyen
Duy Nguyen
Anji Liu
An Mai
Binh T. Nguyen
Daniel Sonntag
Khoa D. Doan
Papers citing "Mitigating Reward Over-optimization in Direct Alignment Algorithms with Importance Sampling": none listed.