arXiv: 2403.06754
ALaRM: Align Language Models via Hierarchical Rewards Modeling
11 March 2024
Yuhang Lai, Siyuan Wang, Shujun Liu, Xuanjing Huang, Zhongyu Wei
Papers citing "ALaRM: Align Language Models via Hierarchical Rewards Modeling" (5 papers)
HAF-RM: A Hybrid Alignment Framework for Reward Model Training
Shujun Liu, Xiaoyu Shen, Yuhang Lai, Siyuan Wang, Shengbin Yue, Zengfeng Huang, Xuanjing Huang, Zhongyu Wei
04 Jul 2024
MetaAligner: Towards Generalizable Multi-Objective Alignment of Language Models
Kailai Yang, Zhiwei Liu, Qianqian Xie, Jimin Huang, Tianlin Zhang, Sophia Ananiadou
25 Mar 2024
RLHF-V: Towards Trustworthy MLLMs via Behavior Alignment from Fine-grained Correctional Human Feedback
M. Steyvers, Yuan Yao, Haoye Zhang, Taiwen He, Yifeng Han, ..., Xinyue Hu, Zhiyuan Liu, Hai-Tao Zheng, Maosong Sun, Tat-Seng Chua
Tags: MLLM, VLM
01 Dec 2023
Training language models to follow instructions with human feedback
Long Ouyang, Jeff Wu, Xu Jiang, Diogo Almeida, Carroll L. Wainwright, ..., Amanda Askell, Peter Welinder, Paul Christiano, Jan Leike, Ryan J. Lowe
Tags: OSLM, ALM
04 Mar 2022
AI safety via debate
G. Irving, Paul Christiano, Dario Amodei
02 May 2018