ResearchTrend.AI
Self-Consistency of the Internal Reward Models Improves Self-Rewarding Language Models
arXiv: 2502.08922

13 February 2025
Xin Zhou
Yiwen Guo
Ruotian Ma
Tao Gui
Qi Zhang
Xuanjing Huang
    LRM

Papers citing "Self-Consistency of the Internal Reward Models Improves Self-Rewarding Language Models"

1 / 1 papers shown
Sentient Agent as a Judge: Evaluating Higher-Order Social Cognition in Large Language Models
Bang Zhang
Ruotian Ma
Qingxuan Jiang
Peisong Wang
Jiaqi Chen
...
Fanghua Ye
Jian Li
Yifan Yang
Zhaopeng Tu
Xiaolong Li
LLMAG
ELM
ALM
01 May 2025