2203.02155
Training language models to follow instructions with human feedback
4 March 2022
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
Pamela Mishkin
Chong Zhang
Sandhini Agarwal
Katarina Slama
Alex Ray
John Schulman
Jacob Hilton
Fraser Kelton
Luke E. Miller
Maddie Simens
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
Papers citing
"Training language models to follow instructions with human feedback"
50 / 6,381 papers shown
Title
VLM Q-Learning: Aligning Vision-Language Models for Interactive Decision-Making
Jake Grigsby
Yuke Zhu
Michael S Ryoo
Juan Carlos Niebles
OffRL
VLM
96
1
0
06 May 2025
PosterO: Structuring Layout Trees to Enable Language Models in Generalized Content-Aware Layout Generation
HsiaoYuan Hsu
Yuxin Peng
95
0
0
06 May 2025
PARM: Multi-Objective Test-Time Alignment via Preference-Aware Autoregressive Reward Model
Baijiong Lin
Weisen Jiang
Yuancheng Xu
Hao Chen
Ying-Cong Chen
88
1
0
06 May 2025
Unified Multimodal Chain-of-Thought Reward Model through Reinforcement Fine-Tuning
Yibin Wang
Zhimin Li
Yuhang Zang
Chunyu Wang
Qinglin Lu
Cheng Jin
Jinqiao Wang
LRM
151
11
0
06 May 2025
Geospatial Mechanistic Interpretability of Large Language Models
Stef De Sabbata
Stefano Mizzaro
Kevin Roitero
AI4CE
136
0
0
06 May 2025
Optimizing Chain-of-Thought Reasoners via Gradient Variance Minimization in Rejection Sampling and RL
Jiarui Yao
Yifan Hao
Hanning Zhang
Hanze Dong
Wei Xiong
Nan Jiang
Tong Zhang
LRM
165
2
0
05 May 2025
Incentivizing Inclusive Contributions in Model Sharing Markets
Enpei Zhang
Jingyi Chai
Guangyi Liu
Yanfeng Wang
Siheng Chen
TDI
FedML
426
0
0
05 May 2025
Knowing You Don't Know: Learning When to Continue Search in Multi-round RAG through Self-Practicing
Diji Yang
Linda Zeng
Jinmeng Rao
Yize Zhang
80
0
0
05 May 2025
SuperEdit: Rectifying and Facilitating Supervision for Instruction-Based Image Editing
Ming Li
Xin Gu
Fan Chen
X. Xing
Longyin Wen
Chong Chen
Sijie Zhu
DiffM
270
2
0
05 May 2025
Bye-bye, Bluebook? Automating Legal Procedure with Large Language Models
Matthew Dahl
AILaw
ELM
101
0
0
05 May 2025
SIMPLEMIX: Frustratingly Simple Mixing of Off- and On-policy Data in Language Model Preference Learning
Tianjian Li
Daniel Khashabi
144
0
0
05 May 2025
A Survey on Progress in LLM Alignment from the Perspective of Reward Design
Miaomiao Ji
Yanqiu Wu
Zhibin Wu
Shoujin Wang
Jian Yang
Mark Dras
Usman Naseem
78
2
0
05 May 2025
Improving Model Alignment Through Collective Intelligence of Open-Source LLMs
Junlin Wang
Roy Xie
Shang Zhu
Jue Wang
Ben Athiwaratkun
Bhuwan Dhingra
Shuaiwen Leon Song
Ce Zhang
James Zou
ALM
75
0
0
05 May 2025
FairPO: Robust Preference Optimization for Fair Multi-Label Learning
Soumen Kumar Mondal
Akshit Varmora
Prateek Chanda
Ganesh Ramakrishnan
100
0
0
05 May 2025
RM-R1: Reward Modeling as Reasoning
Xiusi Chen
Gaotang Li
Zehua Wang
Bowen Jin
Cheng Qian
...
Yu Zhang
D. Zhang
Tong Zhang
Hanghang Tong
Heng Ji
ReLM
OffRL
LRM
396
21
0
05 May 2025
R1-Reward: Training Multimodal Reward Model Through Stable Reinforcement Learning
Yi-Fan Zhang
Xingyu Lu
X. Hu
Chaoyou Fu
Bin Wen
...
Jianfei Chen
Fan Yang
Zheng Zhang
Yan Li
Liang Wang
OffRL
LRM
135
6
0
05 May 2025
El Agente: An Autonomous Agent for Quantum Chemistry
Yunheng Zou
Austin H. Cheng
Abdulrahman Aldossary
Jiaru Bai
Shi Xuan Leong
...
Ilya Yakavets
Han Hao
Chris Crebolder
Varinia Bernales
Alán Aspuru-Guzik
41
0
0
05 May 2025
Voila: Voice-Language Foundation Models for Real-Time Autonomous Interaction and Voice Role-Play
Yemin Shi
Yu Shu
Siwei Dong
Guangyi Liu
Jaward Sesay
Jingwen Li
Zhiting Hu
AuLLM
VLM
98
0
0
05 May 2025
Semantic Probabilistic Control of Language Models
Kareem Ahmed
Catarina G Belém
Padhraic Smyth
Sameer Singh
119
1
0
04 May 2025
R-Bench: Graduate-level Multi-disciplinary Benchmarks for LLM & MLLM Complex Reasoning Evaluation
Meng-Hao Guo
Jiajun Xu
Yi Zhang
Jiaxi Song
Haoyang Peng
...
Yongming Rao
Houwen Peng
Han Hu
Gordon Wetzstein
Shi-Min Hu
ELM
LRM
129
4
0
04 May 2025
A Survey on Privacy Risks and Protection in Large Language Models
Kang Chen
Xiuze Zhou
Yuanguo Lin
Shibo Feng
Li Shen
Pengcheng Wu
AILaw
PILM
450
0
0
04 May 2025
From Mind to Machine: The Rise of Manus AI as a Fully Autonomous Digital Agent
Minjie Shen
Qikai Yang
LLMAG
AI4CE
89
8
0
04 May 2025
Demystifying optimized prompts in language models
Rimon Melamed
Lucas H. McCabe
H. H. Huang
77
0
0
04 May 2025
Exploring the Potential of Offline RL for Reasoning in LLMs: A Preliminary Study
Xiaoyu Tian
Sitong Zhao
Haotian Wang
Shuaiting Chen
Yiping Peng
Yunjie Ji
Han Zhao
Xiangang Li
OffRL
LRM
81
0
0
04 May 2025
Attention Mechanisms Perspective: Exploring LLM Processing of Graph-Structured Data
Zhong Guan
Likang Wu
Hongke Zhao
Ming He
Jianpin Fan
GNN
73
0
0
04 May 2025
Cannot See the Forest for the Trees: Invoking Heuristics and Biases to Elicit Irrational Choices of LLMs
Haoming Yang
Ke Ma
Xiaojun Jia
Yingfei Sun
Qianqian Xu
Qingming Huang
AAML
442
0
0
03 May 2025
LookAlike: Consistent Distractor Generation in Math MCQs
Nisarg Parikh
Nigel Fernandez
Alexander Scarlatos
Simon Woodhead
Andrew Lan
125
0
0
03 May 2025
Multi-agents based User Values Mining for Recommendation
Lawrence Yunliang Chen
Wei Yuan
Tong Chen
Xiangyu Zhao
Nguyen Quoc Viet Hung
Hongzhi Yin
OffRL
135
0
0
02 May 2025
Transferable Adversarial Attacks on Black-Box Vision-Language Models
Kai Hu
Weichen Yu
Lefei Zhang
Alexander Robey
Andy Zou
Chengming Xu
Haoqi Hu
Matt Fredrikson
AAML
VLM
141
2
0
02 May 2025
Aligning Large Language Models with Healthcare Stakeholders: A Pathway to Trustworthy AI Integration
Kexin Ding
Mu Zhou
Akshay Chaudhari
Shaoting Zhang
Dimitris N. Metaxas
LM&MA
87
0
0
02 May 2025
Value Portrait: Assessing Language Models' Values through Psychometrically and Ecologically Valid Items
Jongwook Han
Dongmin Choi
Woojung Song
Eun-Ju Lee
Yohan Jo
PILM
107
0
0
02 May 2025
Contextures: Representations from Contexts
Runtian Zhai
Kai Yang
Che-Ping Tsai
Burak Varici
Zico Kolter
Pradeep Ravikumar
449
0
0
02 May 2025
DeepCritic: Deliberate Critique with Large Language Models
Wenkai Yang
Jingwen Chen
Yankai Lin
Ji-Rong Wen
ALM
LRM
109
1
0
01 May 2025
FineScope: Precision Pruning for Domain-Specialized Large Language Models Using SAE-Guided Self-Data Cultivation
Chaitali Bhattacharyya
Yeseong Kim
121
0
0
01 May 2025
Steering Large Language Models with Register Analysis for Arbitrary Style Transfer
Xinchen Yang
Marine Carpuat
LLMSV
567
0
0
01 May 2025
Sentient Agent as a Judge: Evaluating Higher-Order Social Cognition in Large Language Models
Bang Zhang
Ruotian Ma
Qingxuan Jiang
Peisong Wang
Jiaqi Chen
...
Fanghua Ye
Jian Li
Yifan Yang
Zhaopeng Tu
Xiaolong Li
LLMAG
ELM
ALM
265
0
1
01 May 2025
Towards Efficient Online Tuning of VLM Agents via Counterfactual Soft Reinforcement Learning
Lang Feng
Weihao Tan
Zhiyi Lyu
Longtao Zheng
Haiyang Xu
Ming Yan
Fei Huang
Jingyi Wang
66
0
0
01 May 2025
Unlearning Sensitive Information in Multimodal LLMs: Benchmark and Attack-Defense Evaluation
Vaidehi Patil
Yi-Lin Sung
Peter Hase
Jie Peng
Jen-tse Huang
Joey Tianyi Zhou
AAML
MU
287
4
0
01 May 2025
Base Models Beat Aligned Models at Randomness and Creativity
Peter West
Christopher Potts
476
4
0
30 Apr 2025
An Empirical Study on the Effectiveness of Large Language Models for Binary Code Understanding
Xiuwei Shang
Zhenkan Fu
Shaoyin Cheng
Guoqiang Chen
Gangyang Li
Li Hu
Weinan Zhang
N. Yu
98
0
0
30 Apr 2025
ShorterBetter: Guiding Reasoning Models to Find Optimal Inference Length for Efficient Reasoning
Jingyang Yi
Jiazheng Wang
Sida Li
ReLM
OODD
LRM
435
8
0
30 Apr 2025
Confidence in Large Language Model Evaluation: A Bayesian Approach to Limited-Sample Challenges
Xiao Xiao
Yu Su
Sijing Zhang
Zhang Chen
Yadong Chen
Tian Liu
101
0
0
30 Apr 2025
Investigating Zero-Shot Diagnostic Pathology in Vision-Language Models with Efficient Prompt Design
Vasudev Sharma
Ahmed Alagha
Abdelhakim Khellaf
Vincent Quoc-Huy Trinh
Mahdi S. Hosseini
156
0
0
30 Apr 2025
Real-World Gaps in AI Governance Research
Ilan Strauss
Isobel Moure
Tim O'Reilly
Sruly Rosenblat
164
1
0
30 Apr 2025
Humanizing LLMs: A Survey of Psychological Measurements with Tools, Datasets, and Human-Agent Applications
Wenhan Dong
Yuemeng Zhao
Zhen Sun
Yule Liu
Zifan Peng
...
Jun Wu
Ruiming Wang
Shengmin Xu
Xinyi Huang
Xinlei He
LLMAG
189
1
0
30 Apr 2025
XBreaking: Explainable Artificial Intelligence for Jailbreaking LLMs
Marco Arazzi
Vignesh Kumar Kembu
Antonino Nocera
V. P.
175
0
0
30 Apr 2025
The Coral Protocol: Open Infrastructure Connecting The Internet of Agents
Roman J. Georgio
Caelum Forder
Suman Deb
Peter Carroll
Önder Gürcan
144
0
0
30 Apr 2025
Reinforcement Learning for Reasoning in Large Language Models with One Training Example
Yiping Wang
Qing Yang
Zhiyuan Zeng
Liliang Ren
Liu Liu
...
Jianfeng Gao
Weizhu Chen
Shuaiqiang Wang
Simon Shaolei Du
Yelong Shen
OffRL
ReLM
LRM
349
47
0
29 Apr 2025
NeuRel-Attack: Neuron Relearning for Safety Disalignment in Large Language Models
Yi Zhou
Wenpeng Xing
Dezhang Kong
Changting Lin
Meng Han
MU
KELM
LLMSV
68
0
0
29 Apr 2025
Multimodal Large Language Models for Medicine: A Comprehensive Survey
Jiarui Ye
Hao Tang
LM&MA
191
0
0
29 Apr 2025