2510.23049
Advantage Shaping as Surrogate Reward Maximization: Unifying Pass@K Policy Gradients
27 October 2025
Christos Thrampoulidis
Sadegh Mahdavi
Wenlong Deng
OffRL
Papers citing "Advantage Shaping as Surrogate Reward Maximization: Unifying Pass@K Policy Gradients"
0 / 0 papers shown
No papers found