arXiv:2203.02155
Training language models to follow instructions with human feedback
4 March 2022
Long Ouyang, Jeff Wu, Xu Jiang, Diogo Almeida, Carroll L. Wainwright, Pamela Mishkin, Chong Zhang, Sandhini Agarwal, Katarina Slama, Alex Ray, John Schulman, Jacob Hilton, Fraser Kelton, Luke E. Miller, Maddie Simens, Amanda Askell, Peter Welinder, Paul Christiano, Jan Leike, Ryan J. Lowe
Topics: OSLM, ALM
Papers citing "Training language models to follow instructions with human feedback" (50 of 4,601 shown)
Visual Planning: Let's Think Only with Images — Yi Xu, Chengzu Li, Han Zhou, Xingchen Wan, Caiqi Zhang, Anna Korhonen, Ivan Vulić [LM&Ro, LRM] (16 May 2025)
Attention-Based Reward Shaping for Sparse and Delayed Rewards — Ian Holmes, Min Chi [OffRL] (16 May 2025)
LARGO: Latent Adversarial Reflection through Gradient Optimization for Jailbreaking LLMs — Ran Li, Hao Wang, Chengzhi Mao [AAML] (16 May 2025)
GuideBench: Benchmarking Domain-Oriented Guideline Following for LLM Agents — Lingxiao Diao, Xinyue Xu, Wanxuan Sun, Cheng Yang, Zhuosheng Zhang [LLMAG, ALM, ELM] (16 May 2025)
Unifying Segment Anything in Microscopy with Multimodal Large Language Model — Manyu Li, Ruian He, Zixian Zhang, Weimin Tan, Bo Yan [VLM] (16 May 2025)
Unveiling the Potential of Vision-Language-Action Models with Open-Ended Multimodal Instructions — Wei Zhao, Gongsheng Li, Zhefei Gong, Pengxiang Ding, H. Zhao, Donglin Wang [LM&Ro] (16 May 2025)
Can Global XAI Methods Reveal Injected Bias in LLMs? SHAP vs Rule Extraction vs RuleSHAP — Francesco Sovrano (16 May 2025)
BLEUBERI: BLEU is a surprisingly effective reward for instruction following — Yapei Chang, Yekyung Kim, Michael Krumdick, Amir Zadeh, Chuan Li, Chris Tanner, Mohit Iyyer [ALM] (16 May 2025)
MergeBench: A Benchmark for Merging Domain-Specialized LLMs — Yifei He, Siqi Zeng, Yuzheng Hu, Rui Yang, Tong Zhang, Han Zhao [MoMe, ALM] (16 May 2025)
Ranked Voting based Self-Consistency of Large Language Models — Weiqin Wang, Yile Wang, Hui Huang [LRM] (16 May 2025)
SoLoPO: Unlocking Long-Context Capabilities in LLMs via Short-to-Long Preference Optimization — Huashan Sun, Shengyi Liao, Yansen Han, Yu Bai, Yang Gao, ..., Weizhou Shen, Fanqi Wan, Ming Yan, Junzhe Zhang, Fei Huang (16 May 2025)
Feasibility with Language Models for Open-World Compositional Zero-Shot Learning — Jae Myung Kim, Stephan Alaniz, Cordelia Schmid, Zeynep Akata (16 May 2025)
Review-Instruct: A Review-Driven Multi-Turn Conversations Generation Method for Large Language Models — Jian Wu, Cong Wang, TianHuang Su, Jun Yang, Haozhi Lin, ..., Steve Yang, BinQing Pan, Zehan Li, Ni Yang, ZhenYu Yang [ALM] (16 May 2025)
ShiQ: Bringing back Bellman to LLMs — Pierre Clavier, Nathan Grinsztajn, Raphaël Avalos, Yannis Flet-Berliac, Irem Ergun, ..., Eugene Tarassov, Olivier Pietquin, Pierre Harvey Richemond, Florian Strub, Matthieu Geist [OffRL] (16 May 2025)
Diffusion-NPO: Negative Preference Optimization for Better Preference Aligned Generation of Diffusion Models — Fu-Yun Wang, Yunhao Shui, Jingtan Piao, Keqiang Sun, Hongsheng Li (16 May 2025)
Finetune-RAG: Fine-Tuning Language Models to Resist Hallucination in Retrieval-Augmented Generation — Zhan Peng Lee, Andre Lin, Calvin Tan [RALM, HILM] (16 May 2025)
A Systematic Analysis of Base Model Choice for Reward Modeling — Kian Ahrabian, Pegah Jandaghi, Negar Mokhberian, Sai Praneeth Karimireddy, Jay Pujara (16 May 2025)
GuardReasoner-VL: Safeguarding VLMs via Reinforced Reasoning — Yong-Jin Liu, Shengfang Zhai, Mingzhe Du, Yulin Chen, Tri Cao, ..., Xuzhao Li, Kun Wang, Junfeng Fang, Jiaheng Zhang, Bryan Hooi [OffRL, LRM] (16 May 2025)
When Thinking Fails: The Pitfalls of Reasoning for Instruction-Following in LLMs — Xiaomin Li, Zhou Yu, Zhiwei Zhang, Xupeng Chen, Ziji Zhang, Yingying Zhuang, Narayanan Sadagopan, Anurag Beniwal [LRM] (16 May 2025)
Search and Refine During Think: Autonomous Retrieval-Augmented Reasoning of LLMs — Yaorui Shi, Shihan Li, Chang Wu, Zhiyuan Liu, Fan Zhang, Hengxing Cai, An Zhang, Xinbing Wang [ReLM, LRM] (16 May 2025)
WorldPM: Scaling Human Preference Modeling — Binghui Wang, Runji Lin, K. Lu, L. Yu, Z. Zhang, ..., Xuanjing Huang, Yu-Gang Jiang, Bowen Yu, J. Zhou, Junyang Lin (15 May 2025)
Demystifying AI Agents: The Final Generation of Intelligence — Kevin J McNamara, Rhea Pritham Marpu (15 May 2025)
Reinforcing the Diffusion Chain of Lateral Thought with Diffusion Language Models — Zemin Huang, Zhiyang Chen, Zijun Wang, Tiancheng Li, Guo-Jun Qi [DiffM, LRM, AI4CE] (15 May 2025)
J1: Incentivizing Thinking in LLM-as-a-Judge via Reinforcement Learning — Chenxi Whitehouse, Tianlu Wang, Ping Yu, Xian Li, Jason Weston, Ilia Kulikov, Swarnadeep Saha [ALM, ELM, LRM] (15 May 2025)
Towards a Deeper Understanding of Reasoning Capabilities in Large Language Models — Annie Wong, Thomas Bäck, Aske Plaat, Niki van Stein, Anna V. Kononova [ReLM, ELM, LRM] (15 May 2025)
PIG: Privacy Jailbreak Attack on LLMs via Gradient-based Iterative In-Context Optimization — Yidan Wang, Yanan Cao, Yubing Ren, Fang Fang, Zheng-Shen Lin, Binxing Fang [PILM] (15 May 2025)
Interpretable Risk Mitigation in LLM Agent Systems — Jan Chojnacki [LLMAG] (15 May 2025)
T2A-Feedback: Improving Basic Capabilities of Text-to-Audio Generation via Fine-grained AI Feedback — Zehan Wang, Ke Lei, Chen Zhu, Jiawei Huang, Sashuai Zhou, ..., Xize Cheng, Shengpeng Ji, Zhenhui Ye, Tao Jin, Zhou Zhao (15 May 2025)
Pre-Act: Multi-Step Planning and Reasoning Improves Acting in LLM Agents — Mrinal Rawat, Ambuje Gupta, Rushil Goomer, Alessandro Di Bari, Neha Gupta, Roberto Pieraccini [LLMAG, LRM] (15 May 2025)
Atomic Consistency Preference Optimization for Long-Form Question Answering — Jingfeng Chen, Raghuveer Thirukovalluru, Junlin Wang, Kaiwei Luo, Bhuwan Dhingra [KELM, HILM] (14 May 2025)
Ethics and Persuasion in Reinforcement Learning from Human Feedback: A Procedural Rhetorical Approach — Shannon Lodoen, Alexi Orchard (14 May 2025)
System Prompt Optimization with Meta-Learning — Yumin Choi, Jinheon Baek, Sung Ju Hwang [LLMAG] (14 May 2025)
Language Agents Mirror Human Causal Reasoning Biases. How Can We Help Them Think Like Scientists? — Anthony GX-Chen, Dongyan Lin, Mandana Samiei, Doina Precup, Blake A. Richards, Rob Fergus, Kenneth Marino [CML, LRM] (14 May 2025)
WorldView-Bench: A Benchmark for Evaluating Global Cultural Perspectives in Large Language Models — Abdullah Mushtaq, Imran Taj, Rafay Naeem, Ibrahim Ghaznavi, Junaid Qadir (14 May 2025)
TUMS: Enhancing Tool-use Abilities of LLMs with Multi-structure Handlers — Aiyao He, Sijia Cui, Shuai Xu, Yanna Wang, Bo Xu (13 May 2025)
Memorization-Compression Cycles Improve Generalization — Fangyuan Yu (13 May 2025)
Evaluating LLM Metrics Through Real-World Capabilities — Justin K Miller, Wenjia Tang [ELM, ALM] (13 May 2025)
Large Language Models for Computer-Aided Design: A Survey — Licheng Zhang, Bach Le, Naveed Akhtar, Siew-Kei Lam, Tuan Ngo [3DV, AI4CE] (13 May 2025)
Improved Algorithms for Differentially Private Language Model Alignment — Keyu Chen, Hao Tang, Qinglin Liu, Yizhao Xu (13 May 2025)
Visually Guided Decoding: Gradient-Free Hard Prompt Inversion with Language Models — Donghoon Kim, Minji Bae, Kyuhong Shim, B. Shim (13 May 2025)
Large Language Models Meet Stance Detection: A Survey of Tasks, Methods, Applications, Challenges and Future Directions — Lata Pangtey, Anukriti Bhatnagar, Shubhi Bansal, Shahid Shafi Dar, Nagendra Kumar (13 May 2025)
Direct Density Ratio Optimization: A Statistically Consistent Approach to Aligning Large Language Models — Rei Higuchi, Taiji Suzuki (12 May 2025)
You Only Look One Step: Accelerating Backpropagation in Diffusion Sampling with Gradient Shortcuts — Hongkun Dou, Zeyu Li, Xingyu Jiang, Hao Li, Lijun Yang, Wen Yao, Yue Deng [DiffM] (12 May 2025)
A Multi-Dimensional Constraint Framework for Evaluating and Improving Instruction Following in Large Language Models — Junjie Ye, Caishuang Huang, Zhe Chen, Wenjie Fu, Chenyuan Yang, ..., Tao Gui, Qi Zhang, Zhongchao Shi, Jianping Fan, Xuanjing Huang [ALM] (12 May 2025)
DynamicRAG: Leveraging Outputs of Large Language Model as Feedback for Dynamic Reranking in Retrieval-Augmented Generation — Jiashuo Sun, Xianrui Zhong, Sizhe Zhou, Jiawei Han [RALM] (12 May 2025)
On the Robustness of Reward Models for Language Model Alignment — Jiwoo Hong, Noah Lee, Eunki Kim, Guijin Son, Woojin Chung, Aman Gupta, Shao Tang, James Thorne (12 May 2025)
Skywork-VL Reward: An Effective Reward Model for Multimodal Understanding and Reasoning — Xiaokun Wang, Chris, Jiangbo Pei, Wei Shen, Yi Peng, ..., Ai Jian, Tianyidan Xie, Xuchen Song, Yang Liu, Yahui Zhou [OffRL, LRM] (12 May 2025)
Assessing and Mitigating Medical Knowledge Drift and Conflicts in Large Language Models — Weiyi Wu, Xinwen Xu, Chongyang Gao, Xingjian Diao, Siting Li, Lucas A. Salas, Jiang Gui (12 May 2025)
DanceGRPO: Unleashing GRPO on Visual Generation — Zeyue Xue, Jie Wu, Yu Gao, Fangyuan Kong, Lingting Zhu, ..., Zhiheng Liu, Wei Liu, Qiushan Guo, Weilin Huang, Ping Luo [EGVM, VGen] (12 May 2025)
Towards Artificial General or Personalized Intelligence? A Survey on Foundation Models for Personalized Federated Intelligence — Yu Qiao, Huy Q. Le, Avi Deb Raha, Phuong-Nam Tran, Apurba Adhikary, Mengchun Zhang, Loc X. Nguyen, Eui-nam Huh, Dusit Niyato, Choong Seon Hong [AI4CE] (11 May 2025)