ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2506.14175
  4. Cited By
GRAM: A Generative Foundation Reward Model for Reward Generalization
v1v2 (latest)

GRAM: A Generative Foundation Reward Model for Reward Generalization

17 June 2025
Chenglong Wang
Yang Gan
Yifu Huo
Yongyu Mu
Qiaozhi He
Murun Yang
Bei Li
Tong Xiao
Chunliang Zhang
Tongran Liu
Jingbo Zhu
    ALMOffRLLRM
ArXiv (abs)PDFHTML

Papers citing "GRAM: A Generative Foundation Reward Model for Reward Generalization"

13 / 13 papers shown
Title
RM-R1: Reward Modeling as Reasoning
RM-R1: Reward Modeling as Reasoning
Xiusi Chen
Gaotang Li
Zehua Wang
Bowen Jin
Cheng Qian
...
Yu Zhang
D. Zhang
Tong Zhang
Hanghang Tong
Heng Ji
ReLMOffRLLRM
359
20
0
05 May 2025
Skywork-Reward: Bag of Tricks for Reward Modeling in LLMs
Skywork-Reward: Bag of Tricks for Reward Modeling in LLMs
Chris Yuhao Liu
Liang Zeng
Qingbin Liu
Rui Yan
Jujie He
Chaojie Wang
Shuicheng Yan
Yang Liu
Yahui Zhou
AI4TS
105
109
0
24 Oct 2024
RoVRM: A Robust Visual Reward Model Optimized via Auxiliary Textual Preference Data
RoVRM: A Robust Visual Reward Model Optimized via Auxiliary Textual Preference Data
Chenglong Wang
Yang Gan
Yifu Huo
Yongyu Mu
Murun Yang
...
Chunliang Zhang
Tongran Liu
Quan Du
Di Yang
Jingbo Zhu
VLM
113
6
0
22 Aug 2024
RewardBench: Evaluating Reward Models for Language Modeling
RewardBench: Evaluating Reward Models for Language Modeling
Nathan Lambert
Valentina Pyatkin
Jacob Morrison
Lester James V. Miranda
Bill Yuchen Lin
...
Sachin Kumar
Tom Zick
Yejin Choi
Noah A. Smith
Hanna Hajishirzi
ALM
157
261
0
20 Mar 2024
Reward Model Ensembles Help Mitigate Overoptimization
Reward Model Ensembles Help Mitigate Overoptimization
Thomas Coste
Usman Anwar
Robert Kirk
David M. Krueger
NoLaALM
77
138
0
04 Oct 2023
Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena
Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena
Lianmin Zheng
Wei-Lin Chiang
Ying Sheng
Siyuan Zhuang
Zhanghao Wu
...
Dacheng Li
Eric Xing
Haotong Zhang
Joseph E. Gonzalez
Ion Stoica
ALMOSLMELM
391
4,422
0
09 Jun 2023
AlpacaFarm: A Simulation Framework for Methods that Learn from Human
  Feedback
AlpacaFarm: A Simulation Framework for Methods that Learn from Human Feedback
Yann Dubois
Xuechen Li
Rohan Taori
Tianyi Zhang
Ishaan Gulrajani
Jimmy Ba
Carlos Guestrin
Percy Liang
Tatsunori B. Hashimoto
ALM
132
605
0
22 May 2023
A Survey on Aspect-Based Sentiment Classification
A Survey on Aspect-Based Sentiment Classification
Gianni Brauwers
Flavius Frasincar
LLMAG
69
118
0
27 Mar 2022
Training language models to follow instructions with human feedback
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLMALM
880
13,148
0
04 Mar 2022
A General Language Assistant as a Laboratory for Alignment
A General Language Assistant as a Laboratory for Alignment
Amanda Askell
Yuntao Bai
Anna Chen
Dawn Drain
Deep Ganguli
...
Tom B. Brown
Jack Clark
Sam McCandlish
C. Olah
Jared Kaplan
ALM
118
789
0
01 Dec 2021
Does label smoothing mitigate label noise?
Does label smoothing mitigate label noise?
Michal Lukasik
Srinadh Bhojanapalli
A. Menon
Surinder Kumar
NoLa
187
351
0
05 Mar 2020
Fine-grained Sentiment Classification using BERT
Fine-grained Sentiment Classification using BERT
Manish Munikar
Sushil Shakya
Aakash Shrestha
SSeg
56
204
0
04 Oct 2019
When Does Label Smoothing Help?
When Does Label Smoothing Help?
Rafael Müller
Simon Kornblith
Geoffrey E. Hinton
UQCV
207
1,953
0
06 Jun 2019
1