Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2503.07806
Cited By
Towards Large Language Models that Benefit for All: Benchmarking Group Fairness in Reward Models
10 March 2025
Kefan Song
Jin Yao
Runnan Jiang
Rohan Chandra
Shangtong Zhang
ALM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Towards Large Language Models that Benefit for All: Benchmarking Group Fairness in Reward Models"
8 / 8 papers shown
Title
CEB: Compositional Evaluation Benchmark for Fairness in Large Language Models
Song Wang
Peng Wang
Tong Zhou
Yushun Dong
Zhen Tan
Jundong Li
CoGe
103
8
0
02 Jul 2024
Arithmetic Control of LLMs for Diverse User Preferences: Directional Preference Alignment with Multi-Objective Rewards
Haoxiang Wang
Yong Lin
Wei Xiong
Rui Yang
Shizhe Diao
Shuang Qiu
Han Zhao
Tong Zhang
87
83
0
28 Feb 2024
Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena
Lianmin Zheng
Wei-Lin Chiang
Ying Sheng
Siyuan Zhuang
Zhanghao Wu
...
Dacheng Li
Eric Xing
Haotong Zhang
Joseph E. Gonzalez
Ion Stoica
ALM
OSLM
ELM
351
4,312
0
09 Jun 2023
RAFT: Reward rAnked FineTuning for Generative Foundation Model Alignment
Hanze Dong
Wei Xiong
Deepanshu Goyal
Yihan Zhang
Winnie Chow
Boyao Wang
Shizhe Diao
Jipeng Zhang
Kashun Shum
Tong Zhang
ALM
74
455
0
13 Apr 2023
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
874
12,973
0
04 Mar 2022
Measuring and Reducing Gendered Correlations in Pre-trained Models
Kellie Webster
Xuezhi Wang
Ian Tenney
Alex Beutel
Emily Pitler
Ellie Pavlick
Jilin Chen
Ed Chi
Slav Petrov
FaML
77
259
0
12 Oct 2020
BERT has a Mouth, and It Must Speak: BERT as a Markov Random Field Language Model
Alex Jinpeng Wang
Kyunghyun Cho
VLM
79
356
0
11 Feb 2019
Gender Bias in Neural Natural Language Processing
Kaiji Lu
Piotr (Peter) Mardziel
Fangjing Wu
Preetam Amancharla
Anupam Datta
116
355
0
31 Jul 2018
1