Beyond Bradley-Terry Models: A General Preference Model for Language Model Alignment
arXiv:2410.02197

3 October 2024
Yifan Zhang
Ge Zhang
Yue Wu
Kangping Xu
Quanquan Gu
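The listing gives only titles and authors, so as brief context: the Bradley-Terry model that the headline paper generalizes scores a pairwise preference as the logistic of a reward difference. A minimal illustrative sketch (not code from the paper; the function name is hypothetical):

```python
import math

def bradley_terry_prob(r_i: float, r_j: float) -> float:
    """P(response i is preferred over response j) under the
    Bradley-Terry model, given scalar reward scores r_i and r_j."""
    return 1.0 / (1.0 + math.exp(-(r_i - r_j)))

# Equal reward scores yield an even 50/50 preference.
print(bradley_terry_prob(1.0, 1.0))  # 0.5
```

Because each response is reduced to a single scalar score, Bradley-Terry preferences are always transitive; modeling cyclic (intransitive) preferences is the kind of limitation the paper's general preference model targets.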

Papers citing "Beyond Bradley-Terry Models: A General Preference Model for Language Model Alignment"

2 papers shown

1. Robust Reinforcement Learning from Human Feedback for Large Language Models Fine-Tuning
   Kai Ye, Hongyi Zhou, Jin Zhu, Francesco Quinzan, C. Shi
   03 Apr 2025

2. Reinforcement Learning Enhanced LLMs: A Survey
   Shuhe Wang, Shengyu Zhang, J. Zhang, Runyi Hu, Xiaoya Li, Tianwei Zhang, Jiwei Li, Fei Wu, G. Wang, Eduard H. Hovy
   05 Dec 2024