ResearchTrend.AI

Training language models to follow instructions with human feedback
4 March 2022
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
Pamela Mishkin
Chong Zhang
Sandhini Agarwal
Katarina Slama
Alex Ray
John Schulman
Jacob Hilton
Fraser Kelton
Luke E. Miller
Maddie Simens
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM, ALM

Papers citing "Training language models to follow instructions with human feedback"

50 / 6,380 papers shown
Data-Driven Breakthroughs and Future Directions in AI Infrastructure: A Comprehensive Review
Beyazit Bestami Yuksel, Ayse Yilmazer Metin
25 · 0 · 0 · 22 May 2025

ConciseRL: Conciseness-Guided Reinforcement Learning for Efficient Reasoning Models
Razvan-Gabriel Dumitru, Darius Peteleaza, Vikas Yadav, Liangming Pan
ReLM, LRM
115 · 1 · 0 · 22 May 2025

Data Doping or True Intelligence? Evaluating the Transferability of Injected Knowledge in LLMs
Essa Jan, Moiz Ali, Muhammad Saram Hassan, Fareed Zaffar, Yasir Zaki
KELM
43 · 0 · 0 · 22 May 2025

SWE-Dev: Evaluating and Training Autonomous Feature-Driven Software Development
Yaxin Du, Yuzhu Cai, Yifan Zhou, Cheng-Yu Wang, Yu Qian, Xianghe Pang, Qian Liu, Yue Hu, Siheng Chen
65 · 0 · 0 · 22 May 2025

SC4ANM: Identifying Optimal Section Combinations for Automated Novelty Prediction in Academic Papers
Wenqing Wu, Chengzhi Zhang, Tong Bao, Yi Zhao
221 · 1 · 0 · 22 May 2025

MPL: Multiple Programming Languages with Large Language Models for Information Extraction
Bo Li, Gexiang Fang, Wei Ye, Zhenghua Xu, Jinglei Zhang, Hao Cheng, Shikun Zhang
54 · 0 · 0 · 22 May 2025
ReCopilot: Reverse Engineering Copilot in Binary Analysis
Guoqiang Chen, Huiqi Sun, Daguang Liu, Zhiqi Wang, Qiang Wang, Bin Yin, Lu Liu, Lingyun Ying
45 · 0 · 0 · 22 May 2025

Humans Hallucinate Too: Language Models Identify and Correct Subjective Annotation Errors With Label-in-a-Haystack Prompts
Georgios Chochlakis, Peter Wu, Arjun Bedi, Marcus Ma, Kristina Lerman, Shrikanth Narayanan
198 · 0 · 0 · 22 May 2025

Locate-then-Merge: Neuron-Level Parameter Fusion for Mitigating Catastrophic Forgetting in Multimodal LLMs
Zeping Yu, Sophia Ananiadou
MoMe, KELM, CLL
109 · 0 · 0 · 22 May 2025
SAE-SSV: Supervised Steering in Sparse Representation Spaces for Reliable Control of Language Models
Zirui He, Mingyu Jin, Bo Shen, Ali Payani, Yongfeng Zhang, Mengnan Du
LLMSV
76 · 0 · 0 · 22 May 2025

LightRouter: Towards Efficient LLM Collaboration with Minimal Overhead
Yifan Zhang, Xinkui Zhao, Zuxin Wang, Guanjie Cheng, Yueshen Xu, Shuiguang Deng, Yuxiang Cai
95 · 0 · 0 · 22 May 2025

Shape it Up! Restoring LLM Safety during Finetuning
ShengYun Peng, Pin-Yu Chen, Jianfeng Chi, Seongmin Lee, Duen Horng Chau
70 · 0 · 0 · 22 May 2025

Sudoku-Bench: Evaluating creative reasoning with Sudoku variants
Jeffrey Seely, Yuki Imajuku, Tianyu Zhao, Edoardo Cetin, Llion Jones
LRM
82 · 1 · 0 · 22 May 2025

Aligning Explanations with Human Communication
Jacopo Teneggi, Zhenzhen Wang, Paul H. Yi, Tianmin Shu, Jeremias Sulam
179 · 0 · 0 · 21 May 2025

The Atlas of In-Context Learning: How Attention Heads Shape In-Context Retrieval Augmentation
Patrick Kahardipraja, Reduan Achtibat, Thomas Wiegand, Wojciech Samek, Sebastian Lapuschkin
151 · 0 · 0 · 21 May 2025
A Unified Theoretical Analysis of Private and Robust Offline Alignment: from RLHF to DPO
Xingyu Zhou, Yulian Wu, Francesco Orabona
OffRL
105 · 1 · 0 · 21 May 2025

Learning from Algorithm Feedback: One-Shot SAT Solver Guidance with GNNs
Jan Tönshoff, Martin Grohe
94 · 0 · 0 · 21 May 2025

Prototypical Human-AI Collaboration Behaviors from LLM-Assisted Writing in the Wild
Sheshera Mysore, Debarati Das, Hancheng Cao, Bahareh Sarrafzadeh
123 · 0 · 0 · 21 May 2025

Aligning Dialogue Agents with Global Feedback via Large Language Model Reward Decomposition
Dong Won Lee, Hae Won Park, C. Breazeal, Louis-Philippe Morency
47 · 0 · 0 · 21 May 2025

Ranking Free RAG: Replacing Re-ranking with Selection in RAG for Sensitive Domains
Yash Saxena, Ankur Padia, Mandar S Chaudhary, Kalpa Gunaratna, Srinivasan Parthasarathy, Manas Gaur
271 · 0 · 0 · 21 May 2025

Emotional Supporters often Use Multiple Strategies in a Single Turn
Xin Bai, Guanyi Chen, Tingting He, Chenlian Zhou, Yu Liu
72 · 0 · 0 · 21 May 2025

Exploring the Limits of Vision-Language-Action Manipulations in Cross-task Generalization
Jiaming Zhou, Ke Ye, Jiayi Liu, Teli Ma, Zifang Wang, Ronghe Qiu, Kun-Yu Lin, Zhilin Zhao, Junwei Liang
127 · 2 · 0 · 21 May 2025
LFTF: Locating First and Then Fine-Tuning for Mitigating Gender Bias in Large Language Models
Zhanyue Qin, Yue Ding, Deyuan Liu, Qingbin Liu, Junxian Cai, Xi Chen, Zhiying Tu, Dianhui Chu, Cuiyun Gao, Dianbo Sui
84 · 0 · 0 · 21 May 2025

Teaching Language Models to Evolve with Users: Dynamic Profile Modeling for Personalized Alignment
Weixiang Zhao, Xingyu Sui, Yulin Hu, Jiahe Guo, Haixiao Liu, Biye Li, Yanyan Zhao, Bing Qin, Ting Liu
OffRL
115 · 1 · 0 · 21 May 2025

CoT Information: Improved Sample Complexity under Chain-of-Thought Supervision
Awni Altabaa, Omar Montasser, John Lafferty
LRM
58 · 0 · 0 · 21 May 2025

DISCO Balances the Scales: Adaptive Domain- and Difficulty-Aware Reinforcement Learning on Imbalanced Data
Yuhang Zhou, Jing Zhu, Shengyi Qian, Zhuokai Zhao, Xiyao Wang, Xiaoyu Liu, Ming Li, Paiheng Xu, Wei Ai, Furong Huang
99 · 1 · 0 · 21 May 2025
An Empirical Study on Reinforcement Learning for Reasoning-Search Interleaved LLM Agents
Bowen Jin, Jinsung Yoon, Priyanka Kargupta, Sercan O. Arik, Jiawei Han
LRM
150 · 2 · 0 · 21 May 2025

Joint Flashback Adaptation for Forgetting-Resistant Instruction Tuning
Yukun Zhao, Lingyong Yan, Zhenyang Li, Shuaiqiang Wang, Zhumin Chen, Zhaochun Ren, Dawei Yin
CLL, KELM, VLM, LRM
77 · 0 · 0 · 21 May 2025

Explaining Puzzle Solutions in Natural Language: An Exploratory Study on 6x6 Sudoku
Anirudh Maiya, Razan Alghamdi, Maria Leonor Pacheco, Ashutosh Trivedi, Fabio Somenzi
ReLM, LRM
54 · 0 · 0 · 21 May 2025

From Problem-Solving to Teaching Problem-Solving: Aligning LLMs with Pedagogy using Reinforcement Learning
David Dinucu-Jianu, Jakub Macina, Nico Daheim, Ido Hakimi, Iryna Gurevych, Mrinmaya Sachan
KELM, LRM
104 · 0 · 0 · 21 May 2025

VerifyBench: Benchmarking Reference-based Reward Systems for Large Language Models
Yuchen Yan, Jin Jiang, Zhenbang Ren, Yijun Li, Xudong Cai, ..., Mengdi Zhang, Jian Shao, Yongliang Shen, Jun Xiao, Yueting Zhuang
OffRL, ALM, LRM
139 · 0 · 0 · 21 May 2025
Trajectory Bellman Residual Minimization: A Simple Value-Based Method for LLM Reasoning
Yurun Yuan, Fan Chen, Zeyu Jia, Alexander Rakhlin, Tengyang Xie
OffRL
135 · 1 · 0 · 21 May 2025

KaFT: Knowledge-aware Fine-tuning for Boosting LLMs' Domain-specific Question-Answering Performance
Qihuang Zhong, Liang Ding, Xiantao Cai, Juhua Liu, Bo Du, Dacheng Tao
100 · 0 · 0 · 21 May 2025

Generalised Probabilistic Modelling and Improved Uncertainty Estimation in Comparative LLM-as-a-judge
Yassir Fathullah, Mark Gales
ELM
81 · 0 · 0 · 21 May 2025

Learning to Rank Chain-of-Thought: An Energy-Based Approach with Outcome Supervision
Eric Hanchen Jiang, Haozheng Luo, Shengyuan Pang, Xiaomin Li, Zhenting Qi, ..., Zongyu Lin, Xinfeng Li, Hao Xu, Kai-Wei Chang, Ying Nian Wu
LRM
123 · 0 · 0 · 21 May 2025

Alignment Under Pressure: The Case for Informed Adversaries When Evaluating LLM Defenses
Xiaoxue Yang, Bozhidar Stevanoski, Matthieu Meeus, Yves-Alexandre de Montjoye
AAML
59 · 0 · 0 · 21 May 2025

Reward Is Enough: LLMs Are In-Context Reinforcement Learners
Kefan Song, Amir Moeini, Peng Wang, Lei Gong, Rohan Chandra, Yanjun Qi, Shangtong Zhang
ReLM, LRM
37 · 3 · 0 · 21 May 2025
Improving LLM First-Token Predictions in Multiple-Choice Question Answering via Prefilling Attack
Silvia Cappelletti, Tobia Poppi, Samuele Poppi, Zheng-Xin Yong, Diego Garcia-Olano, Marcella Cornia, Lorenzo Baraldi, Rita Cucchiara
KELM, AAML
61 · 0 · 0 · 21 May 2025

sudoLLM: On Multi-role Alignment of Language Models
Soumadeep Saha, Akshay Chaturvedi, Joy Mahapatra, Utpal Garain
45 · 0 · 0 · 20 May 2025

Self-Evolving Curriculum for LLM Reasoning
Xiaoyin Chen, Jiarui Lu, Minsu Kim, Dinghuai Zhang, Jian Tang, Alexandre Piché, Nicolas Angelard-Gontier, Yoshua Bengio, Ehsan Kamalloo
ReLM, LRM
126 · 0 · 0 · 20 May 2025

AAPO: Enhance the Reasoning Capabilities of LLMs with Advantage Momentum
Jian Xiong, Jingbo Zhou, Jingyong Ye, Dejing Dou
LRM
97 · 0 · 0 · 20 May 2025
SAFEPATH: Preventing Harmful Reasoning in Chain-of-Thought via Early Alignment
Wonje Jeung, Sangyeon Yoon, Minsuk Kahng, Albert No
LRM, LLMSV
206 · 1 · 0 · 20 May 2025

QA-prompting: Improving Summarization with Large Language Models using Question-Answering
Neelabh Sinha
RALM, LRM
110 · 0 · 0 · 20 May 2025

YESciEval: Robust LLM-as-a-Judge for Scientific Question Answering
Jennifer D'Souza, Hamed Babaei Giglou, Quentin Münch
ELM
109 · 0 · 0 · 20 May 2025

FisherSFT: Data-Efficient Supervised Fine-Tuning of Language Models Using Information Gain
Rohan Deb, Kiran Thekumparampil, Kousha Kalantari, Gaurush Hiranandani, Shoham Sabach, Branislav Kveton
62 · 0 · 0 · 20 May 2025

Domain Gating Ensemble Networks for AI-Generated Text Detection
Arihant Tripathi, Liam Dugan, Charis Gao, Maggie Huan, Emma Jin, Peter Zhang, David Zhang, Julia Zhao, Chris Callison-Burch
VLM
63 · 0 · 0 · 20 May 2025
Towards eliciting latent knowledge from LLMs with mechanistic interpretability
Bartosz Cywiński, Emil Ryd, Senthooran Rajamanoharan, Neel Nanda
71 · 0 · 0 · 20 May 2025

Will AI Tell Lies to Save Sick Children? Litmus-Testing AI Values Prioritization with AIRiskDilemmas
Yu Ying Chiu, Zhilin Wang, Sharan Maiya, Yejin Choi, Kyle Fish, Sydney Levine, Evan Hubinger
89 · 0 · 0 · 20 May 2025

Investigating and Enhancing the Robustness of Large Multimodal Models Against Temporal Inconsistency
Jiafeng Liang, Shixin Jiang, Xuan Dong, Ning Wang, Zheng Chu, Hui Su, Jinlan Fu, Ming-Yuan Liu, See-Kiong Ng, Bing Qin
113 · 0 · 0 · 20 May 2025

Kaleidoscope Gallery: Exploring Ethics and Generative AI Through Art
Alayt Issak, Uttkarsh Narayan, Ramya Srinivasan, Erica Kleinman, Casper Harteveld
66 · 0 · 0 · 20 May 2025