2203.02155
Cited By
Training language models to follow instructions with human feedback
4 March 2022
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
Pamela Mishkin
Chong Zhang
Sandhini Agarwal
Katarina Slama
Alex Ray
John Schulman
Jacob Hilton
Fraser Kelton
Luke E. Miller
Maddie Simens
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
Papers citing "Training language models to follow instructions with human feedback"
50 / 6,392 papers shown
Title
WorldModelBench: Judging Video Generation Models As World Models
Dacheng Li
Yunhao Fang
Yukang Chen
Shuo Yang
Shiyi Cao
...
Hongxu Yin
Joseph E. Gonzalez
Ion Stoica
Enze Xie
Yaojie Lu
VGen
110
7
0
28 Feb 2025
Re-evaluating Theory of Mind evaluation in large language models
Jennifer Hu
Felix Sosa
T. Ullman
156
2
0
28 Feb 2025
A Survey of Uncertainty Estimation Methods on Large Language Models
Zhiqiu Xia
Jinxuan Xu
Yuqian Zhang
Hang Liu
109
3
0
28 Feb 2025
Beware of Your Po! Measuring and Mitigating AI Safety Risks in Role-Play Fine-Tuning of LLMs
Weixiang Zhao
Yulin Hu
Yang Deng
Jiahe Guo
Xingyu Sui
...
An Zhang
Yanyan Zhao
Bing Qin
Tat-Seng Chua
Ting Liu
183
7
0
28 Feb 2025
Learning to Substitute Components for Compositional Generalization
Hao Sun
Gangwei Jiang
Chenwang Wu
Ying Wei
Defu Lian
Enhong Chen
116
0
0
28 Feb 2025
Palm: A Culturally Inclusive and Linguistically Diverse Dataset for Arabic LLMs
Fakhraddin Alwajih
Abdellah El Mekki
Samar Magdy
AbdelRahim Elmadany
Omer Nacar
...
Anis Koubaa
Ismail Berrada
Mustafa Jarrar
Shady Shehata
Muhammad Abdul-Mageed
146
3
0
28 Feb 2025
CS-PaperSum: A Large-Scale Dataset of AI-Generated Summaries for Scientific Papers
Javin Liu
Aryan Vats
Zihao He
113
0
0
27 Feb 2025
Beneath the Surface: How Large Language Models Reflect Hidden Bias
Jinhao Pan
Chahat Raj
Ziyu Yao
Ziwei Zhu
79
0
0
27 Feb 2025
Multi-Agent Verification: Scaling Test-Time Compute with Multiple Verifiers
Shalev Lifshitz
Sheila A. McIlraith
Yilun Du
LRM
138
8
0
27 Feb 2025
Bridging the Creativity Understanding Gap: Small-Scale Human Alignment Enables Expert-Level Humor Ranking in LLMs
Kuan Lok Zhou
Jiayi Chen
Siddharth Suresh
Reuben Narad
Timothy T. Rogers
Lalit K Jain
R. Nowak
Bob Mankoff
Jifan Zhang
71
1
0
27 Feb 2025
R1-T1: Fully Incentivizing Translation Capability in LLMs via Reasoning Learning
Minggui He
Yilun Liu
Shimin Tao
Yuanchang Luo
Hongyong Zeng
...
Daimeng Wei
Weibin Meng
Hao Yang
Boxing Chen
Osamu Yoshie
LRM
164
8
0
27 Feb 2025
Preference Learning Unlocks LLMs' Psycho-Counseling Skills
Mian Zhang
S. Eack
Zhiyu Zoey Chen
146
2
0
27 Feb 2025
Societal Alignment Frameworks Can Improve LLM Alignment
Karolina Stańczak
Nicholas Meade
Mehar Bhatia
Hattie Zhou
Konstantin Böttinger
...
Timothy P. Lillicrap
Ana Marasović
Sylvie Delacroix
Gillian K. Hadfield
Siva Reddy
489
1
0
27 Feb 2025
HuAMR: A Hungarian AMR Parser and Dataset
Botond Barta
Endre Hamerlik
Milán Konor Nyist
Judit Ács
78
0
0
27 Feb 2025
CarPlanner: Consistent Auto-regressive Trajectory Planning for Large-scale Reinforcement Learning in Autonomous Driving
Dongkun Zhang
Jiaming Liang
Ke Guo
Sha Lu
Qi Wang
R. Xiong
Zhenwei Miao
Yue Wang
270
6
0
27 Feb 2025
XCOMPS: A Multilingual Benchmark of Conceptual Minimal Pairs
Linyang He
Ercong Nie
Sukru Samet Dindar
Arsalan Firoozi
Adrian Nicolas Florea
...
Haotian Ye
Jonathan R. Brennan
Helmut Schmid
Hinrich Schütze
Nima Mesgarani
107
1
0
27 Feb 2025
InterMimic: Towards Universal Whole-Body Control for Physics-Based Human-Object Interactions
Sirui Xu
Hung Yu Ling
Yu-Xiong Wang
Liang-Yan Gui
150
10
0
27 Feb 2025
Recommendations from Sparse Comparison Data: Provably Fast Convergence for Nonconvex Matrix Factorization
Suryanarayana Sankagiri
Jalal Etesami
Matthias Grossglauser
72
0
0
27 Feb 2025
From Retrieval to Generation: Comparing Different Approaches
Abdelrahman Abdallah
Jamshid Mozafari
Bhawna Piryani
Mohammed Ali
Adam Jatowt
RALM
106
0
0
27 Feb 2025
Med-RLVR: Emerging Medical Reasoning from a 3B base model via reinforcement Learning
Sheng Zhang
Qianchu Liu
Guanghui Qin
Tristan Naumann
Hoifung Poon
ReLM
OffRL
LRM
141
9
0
27 Feb 2025
Alleviating Distribution Shift in Synthetic Data for Machine Translation Quality Estimation
Xiang Geng
Zhejian Lai
Jiajun Chen
Hao Yang
Shujian Huang
89
0
0
27 Feb 2025
Collab-Overcooked: Benchmarking and Evaluating Large Language Models as Collaborative Agents
Haochen Sun
Shuwen Zhang
Lujie Niu
Lei Ren
Hao Xu
Hao Fu
Fangkun Zhao
Caixia Yuan
Xiaojie Wang
LLMAG
ELM
146
2
0
27 Feb 2025
Foot-In-The-Door: A Multi-turn Jailbreak for LLMs
Zixuan Weng
Xiaolong Jin
Jinyuan Jia
Xinsong Zhang
AAML
387
1
0
27 Feb 2025
Amulet: ReAlignment During Test Time for Personalized Preference Adaptation of LLMs
Zhaowei Zhang
Fengshuo Bai
Qizhi Chen
Chengdong Ma
Mingzhi Wang
Haoran Sun
Zilong Zheng
Yaodong Yang
182
5
0
26 Feb 2025
MEBench: Benchmarking Large Language Models for Cross-Document Multi-Entity Question Answering
Teng Lin
RALM
115
2
0
26 Feb 2025
Drop-Upcycling: Training Sparse Mixture of Experts with Partial Re-initialization
Taishi Nakamura
Takuya Akiba
Kazuki Fujii
Yusuke Oda
Rio Yokota
Jun Suzuki
MoMe
MoE
142
2
0
26 Feb 2025
MathClean: A Benchmark for Synthetic Mathematical Data Cleaning
Hao Liang
Meiyi Qiang
Yongbin Li
Zefeng He
Yongzhen Guo
Z. Zhu
Wentao Zhang
Tengjiao Wang
69
0
0
26 Feb 2025
ANPMI: Assessing the True Comprehension Capabilities of LLMs for Multiple Choice Questions
Gyeongje Cho
Yeonkyoung So
Jaejin Lee
ELM
126
0
0
26 Feb 2025
Two Heads Are Better Than One: Dual-Model Verbal Reflection at Inference-Time
Jiazheng Li
Yuxiang Zhou
Junru Lu
Gladys Tyen
Lin Gui
Cesare Aloisi
Yulan He
LRM
104
3
0
26 Feb 2025
Reward Shaping to Mitigate Reward Hacking in RLHF
Jiayi Fu
Xuandong Zhao
Chengyuan Yao
Han Wang
Qi Han
Yanghua Xiao
205
14
0
26 Feb 2025
MedVLM-R1: Incentivizing Medical Reasoning Capability of Vision-Language Models (VLMs) via Reinforcement Learning
Jiazhen Pan
Che Liu
Junde Wu
Fenglin Liu
Jiayuan Zhu
Hongwei Bran Li
Chen Chen
Cheng Ouyang
Daniel Rueckert
LRM
LM&MA
VLM
151
42
0
26 Feb 2025
OneRec: Unifying Retrieve and Rank with Generative Recommender and Iterative Preference Alignment
Jiaxin Deng
Shiyao Wang
Kuo Cai
Lejian Ren
Qigen Hu
Weifeng Ding
Qiang Luo
Guorui Zhou
129
12
0
26 Feb 2025
Rethinking LLM Unlearning Objectives: A Gradient Perspective and Go Beyond
Qizhou Wang
Jin Peng Zhou
Zhanke Zhou
Saebyeol Shin
Bo Han
Kilian Q. Weinberger
AILaw
ELM
MU
145
10
0
26 Feb 2025
Self-rewarding correction for mathematical reasoning
Wei Xiong
Hanning Zhang
Chenlu Ye
Lichang Chen
Nan Jiang
Tong Zhang
ReLM
KELM
LRM
166
22
0
26 Feb 2025
Between Circuits and Chomsky: Pre-pretraining on Formal Languages Imparts Linguistic Biases
Michael Y. Hu
Jackson Petty
Chuan Shi
William Merrill
Tal Linzen
AI4CE
138
2
0
26 Feb 2025
A Survey on Foundation-Model-Based Industrial Defect Detection
Tianle Yang
Luyao Chang
Jiadong Yan
Jiajian Li
Zhi Wang
Ke Zhang
AI4CE
169
3
0
26 Feb 2025
Simulation of Language Evolution under Regulated Social Media Platforms: A Synergistic Approach of Large Language Models and Genetic Algorithms
Jinyu Cai
Yusei Ishimizu
Mingyue Zhang
Munan Li
Jialong Li
Kenji Tei
LLMAG
123
1
1
26 Feb 2025
Can RLHF be More Efficient with Imperfect Reward Models? A Policy Coverage Perspective
Jiawei Huang
Bingcong Li
Christoph Dann
Niao He
OffRL
275
3
0
26 Feb 2025
Code to Think, Think to Code: A Survey on Code-Enhanced Reasoning and Reasoning-Driven Code Intelligence in LLMs
Dayu Yang
Tianyang Liu
Daoan Zhang
Antoine Simoulin
Xiaoyi Liu
...
Zhaopu Teng
Xin Qian
Grey Yang
Jiebo Luo
Julian McAuley
ReLM
OffRL
LRM
158
12
0
26 Feb 2025
Shh, don't say that! Domain Certification in LLMs
Cornelius Emde
Alasdair Paren
Preetham Arvind
Maxime Kayser
Tom Rainforth
Thomas Lukasiewicz
Guohao Li
Philip Torr
Adel Bibi
122
2
0
26 Feb 2025
VEM: Environment-Free Exploration for Training GUI Agent with Value Environment Model
Jiani Zheng
Lu Wang
Fangkai Yang
Chen Zhang
Lingrui Mei
Wenjie Yin
Qingwei Lin
Dongmei Zhang
Saravan Rajmohan
Qi Zhang
OffRL
115
8
0
26 Feb 2025
Towards Label-Only Membership Inference Attack against Pre-trained Large Language Models
Yu He
Boheng Li
Lu Liu
Zhongjie Ba
Wei Dong
Yiming Li
Zhan Qin
Kui Ren
Chong Chen
MIALM
180
3
0
26 Feb 2025
MathTutorBench: A Benchmark for Measuring Open-ended Pedagogical Capabilities of LLM Tutors
Jakub Macina
Nico Daheim
Ido Hakimi
Manu Kapur
Iryna Gurevych
Mrinmaya Sachan
ELM
129
4
0
26 Feb 2025
FSPO: Few-Shot Preference Optimization of Synthetic Preference Data in LLMs Elicits Effective Personalization to Real Users
Anikait Singh
Sheryl Hsu
Kyle Hsu
E. Mitchell
Stefano Ermon
Tatsunori Hashimoto
Archit Sharma
Chelsea Finn
SyDa
OffRL
135
3
0
26 Feb 2025
Kanana: Compute-efficient Bilingual Language Models
Kanana LLM Team
Yunju Bak
Hojin Lee
Minho Ryu
Jiyeon Ham
...
Daniel Lee
Minchul Lee
MinHyung Lee
Shinbok Lee
Gaeun Seo
192
1
0
26 Feb 2025
Controlled Diversity: Length-optimized Natural Language Generation
Diana Marie Schenke
Timo Baumann
71
0
0
26 Feb 2025
CryptoPulse: Short-Term Cryptocurrency Forecasting with Dual-Prediction and Cross-Correlated Market Indicators
Amit Kumar
Taoran Ji
168
0
0
26 Feb 2025
Exploring Graph Tasks with Pure LLMs: A Comprehensive Benchmark and Investigation
Yansen Wang
Xinnan Dai
Wenqi Fan
Yao Ma
144
2
0
26 Feb 2025
Conformal Linguistic Calibration: Trading-off between Factuality and Specificity
Zhengping Jiang
Anqi Liu
Benjamin Van Durme
178
3
0
26 Feb 2025
Uncertainty Quantification in Retrieval Augmented Question Answering
Laura Perez-Beltrachini
Mirella Lapata
RALM
163
0
0
25 Feb 2025