Training language models to follow instructions with human feedback

4 March 2022
Long Ouyang, Jeff Wu, Xu Jiang, Diogo Almeida, Carroll L. Wainwright, Pamela Mishkin, Chong Zhang, Sandhini Agarwal, Katarina Slama, Alex Ray, John Schulman, Jacob Hilton, Fraser Kelton, Luke E. Miller, Maddie Simens, Amanda Askell, Peter Welinder, Paul Christiano, Jan Leike, Ryan J. Lowe
Tags: OSLM, ALM

Papers citing "Training language models to follow instructions with human feedback"

50 / 7,311 papers shown
Shh, don't say that! Domain Certification in LLMs
  Cornelius Emde, Alasdair Paren, Preetham Arvind, Maxime Kayser, Tom Rainforth, Thomas Lukasiewicz, Guohao Li, Philip Torr, Adel Bibi
  26 Feb 2025

MEBench: Benchmarking Large Language Models for Cross-Document Multi-Entity Question Answering
  Teng Lin
  Tags: RALM
  26 Feb 2025

Controlled Diversity: Length-optimized Natural Language Generation
  Diana Marie Schenke, Timo Baumann
  26 Feb 2025

Self-rewarding correction for mathematical reasoning
  Wei Xiong, Hanning Zhang, Chenlu Ye, Lichang Chen, Nan Jiang, Tong Zhang
  Tags: ReLM, KELM, LRM
  26 Feb 2025

Code to Think, Think to Code: A Survey on Code-Enhanced Reasoning and Reasoning-Driven Code Intelligence in LLMs
  Dayu Yang, Tianyang Liu, Daoan Zhang, Antoine Simoulin, Xiaoyi Liu, ..., Zhaopu Teng, Xin Qian, Grey Yang, Jiebo Luo, Julian McAuley
  Tags: ReLM, OffRL, LRM
  26 Feb 2025

OneRec: Unifying Retrieve and Rank with Generative Recommender and Iterative Preference Alignment
  Jiaxin Deng, Shiyao Wang, Kuo Cai, Lejian Ren, Qigen Hu, Weifeng Ding, Qiang Luo, Guorui Zhou
  26 Feb 2025

MathTutorBench: A Benchmark for Measuring Open-ended Pedagogical Capabilities of LLM Tutors
  Jakub Macina, Nico Daheim, Ido Hakimi, Manu Kapur, Iryna Gurevych, Mrinmaya Sachan
  Tags: ELM
  26 Feb 2025

Can RLHF be More Efficient with Imperfect Reward Models? A Policy Coverage Perspective
  Jiawei Huang, Bingcong Li, Christoph Dann, Niao He
  Tags: OffRL
  26 Feb 2025

ZEBRA: Leveraging Model-Behavioral Knowledge for Zero-Annotation Preference Dataset Construction
  Jeesu Jung, Chanjun Park, Sangkeun Jung
  26 Feb 2025

MathClean: A Benchmark for Synthetic Mathematical Data Cleaning
  Hao Liang, Meiyi Qiang, Heng Chang, Zefeng He, Yongzhen Guo, Z. Zhu, Wentao Zhang, Bin Cui
  26 Feb 2025

Language Models' Factuality Depends on the Language of Inquiry
  Tushar Aggarwal, Kumar Tanmay, Ayush Agrawal, Kumar Ayush, Hamid Palangi, Paul Pu Liang
  Tags: HILM, KELM
  25 Feb 2025

Beyond In-Distribution Success: Scaling Curves of CoT Granularity for Language Model Generalization
  Ru Wang, Wei Huang, Selena Song, Haoyu Zhang, Yusuke Iwasawa, Y. Matsuo, Jiaxian Guo
  Tags: OODD, LRM
  25 Feb 2025

Your Language Model May Think Too Rigidly: Achieving Reasoning Consistency with Symmetry-Enhanced Training
  Yihang Yao, Zhepeng Cen, Miao Li, William Jongwon Han, Yuyou Zhang, Emerson Liu, Zuxin Liu, Chuang Gan, Ding Zhao
  Tags: ReLM, LRM
  25 Feb 2025

MPO: An Efficient Post-Processing Framework for Mixing Diverse Preference Alignment
  Tianze Wang, Dongnan Gui, Yifan Hu, Shuhang Lin, Linjun Zhang
  25 Feb 2025

LevelRAG: Enhancing Retrieval-Augmented Generation with Multi-hop Logic Planning over Rewriting Augmented Searchers
  Zhuocheng Zhang, Yang Feng, Min Zhang
  25 Feb 2025

Larger or Smaller Reward Margins to Select Preferences for Alignment?
  Kexin Huang, Junkang Wu, Ziqian Chen, Xue Wang, Jinyang Gao, Bolin Ding, Jiancan Wu, Xiangnan He, Xuben Wang
  25 Feb 2025

Can LLMs Explain Themselves Counterfactually?
  Zahra Dehghanighobadi, Asja Fischer, Muhammad Bilal Zafar
  Tags: LRM
  25 Feb 2025

On Synthetic Data Strategies for Domain-Specific Generative Retrieval
  Haoyang Wen, Jiang Guo, Yi Zhang, Jiarong Jiang, Ziyi Wang
  Tags: SyDa
  25 Feb 2025

Stackelberg Game Preference Optimization for Data-Efficient Alignment of Language Models
  Xu Chu, Zhixin Zhang, Tianyu Jia, Yujie Jin
  25 Feb 2025

Uncertainty Quantification in Retrieval Augmented Question Answering
  Laura Perez-Beltrachini, Mirella Lapata
  Tags: RALM
  25 Feb 2025

What is the Alignment Objective of GRPO?
  Milan Vojnovic, Se-Young Yun
  25 Feb 2025

Comparing Native and Non-native English Speakers' Behaviors in Collaborative Writing through Visual Analytics
  Yuexi Chen, Yimin Xiao, Kazi Tasnim Zinat, Naomi Yamashita, G. Gao, Zhicheng Liu
  25 Feb 2025

Grandes modelos de lenguaje: de la predicción de palabras a la comprensión? (Large language models: from word prediction to understanding?)
  Carlos Gómez-Rodríguez
  Tags: SyDa, AILaw, ELM, VLM
  25 Feb 2025

AMPO: Active Multi-Preference Optimization
  Taneesh Gupta, Rahul Madhavan, Xuchao Zhang, Chetan Bansal, Saravan Rajmohan
  25 Feb 2025

Advantage-Guided Distillation for Preference Alignment in Small Language Models
  Shiping Gao, Fanqi Wan, Jiajian Guo, Xiaojun Quan, Qifan Wang
  Tags: ALM
  25 Feb 2025

IMPROVE: Iterative Model Pipeline Refinement and Optimization Leveraging LLM Agents
  Eric Xue, Zeyi Huang, Yuyang Ji, Haohan Wang
  25 Feb 2025

Scalable Best-of-N Selection for Large Language Models via Self-Certainty
  Zhewei Kang, Xuandong Zhao, Dawn Song
  Tags: LRM
  25 Feb 2025

Language Model Fine-Tuning on Scaled Survey Data for Predicting Distributions of Public Opinions
  Joseph Suh, Erfan Jahanparast, Suhong Moon, Minwoo Kang, Serina Chang
  Tags: ALM, LM&MA
  24 Feb 2025

GuidedBench: Equipping Jailbreak Evaluation with Guidelines
  Ruixuan Huang, Xunguang Wang, Zongjie Li, Daoyuan Wu, Shuai Wang
  Tags: ALM, ELM
  24 Feb 2025

Adversarial Prompt Evaluation: Systematic Benchmarking of Guardrails Against Prompt Input Attacks on LLMs
  Giulio Zizzo, Giandomenico Cornacchia, Kieran Fraser, Muhammad Zaid Hameed, Ambrish Rawat, Beat Buesser, Mark Purcell, Pin-Yu Chen, P. Sattigeri, Kush R. Varshney
  Tags: AAML
  24 Feb 2025

Fully automatic extraction of morphological traits from the Web: utopia or reality?
  Diego Marcos, Robert van de Vlasakker, Ioannis Athanasiadis, P. Bonnet, Hervé Goëau, Alexis Joly, W. Daniel Kissling, César Leblanc, André S. J. van Proosdij, Konstantinos P. Panousis
  24 Feb 2025

Is Free Self-Alignment Possible?
  Dyah Adila, Changho Shin, Yijing Zhang, Frederic Sala
  Tags: MoMe
  24 Feb 2025

Policy Learning with a Natural Language Action Space: A Causal Approach
  Bohan Zhang, Yixin Wang, Paramveer S. Dhillon
  Tags: CML
  24 Feb 2025

RLTHF: Targeted Human Feedback for LLM Alignment
  Yifei Xu, Tusher Chakraborty, Emre Kıcıman, Bibek Aryal, Eduardo Rodrigues, ..., Rafael Padilha, Leonardo Nunes, Shobana Balakrishnan, Songwu Lu, Ranveer Chandra
  24 Feb 2025

Model Lakes
  Koyena Pal, David Bau, Renée J. Miller
  24 Feb 2025

CHBench: A Chinese Dataset for Evaluating Health in Large Language Models
  Chenlu Guo, Nuo Xu, Yi-Ju Chang, Yuan Wu
  Tags: AI4MH, LM&MA
  24 Feb 2025

Proactive Privacy Amnesia for Large Language Models: Safeguarding PII with Negligible Impact on Model Utility
  Martin Kuo, Jingyang Zhang, Jianyi Zhang, Minxue Tang, Louis DiValentin, ..., William Chen, Amin Hass, Tianlong Chen, Yuxiao Chen, Haoyang Li
  Tags: MU, KELM
  24 Feb 2025

Correlating and Predicting Human Evaluations of Language Models from Natural Language Processing Benchmarks
  Rylan Schaeffer, Punit Singh Koura, Binh Tang, R. Subramanian, Aaditya K. Singh, ..., Vedanuj Goswami, Sergey Edunov, Dieuwke Hupkes, Sanmi Koyejo, Sharan Narang
  Tags: ALM
  24 Feb 2025

Post-edits Are Preferences Too
  Nathaniel Berger, Stefan Riezler, M. Exel, Matthias Huck
  24 Feb 2025

Hyperspherical Normalization for Scalable Deep Reinforcement Learning
  Hojoon Lee, Youngdo Lee, Takuma Seno, Donghu Kim, Peter Stone, Jaegul Choo
  24 Feb 2025

On the Robustness of Transformers against Context Hijacking for Linear Classification
  Tianle Li, Chenyang Zhang, Xingwu Chen, Yuan Cao, Difan Zou
  24 Feb 2025

Mixup Model Merge: Enhancing Model Merging Performance through Randomized Linear Interpolation
  Yue Zhou, Yi-Ju Chang, Yuan Wu
  Tags: MoMe
  24 Feb 2025

Scale-Free Graph-Language Models
  Jianglin Lu, Yixuan Liu, Yitian Zhang, Y. Fu
  24 Feb 2025

TituLLMs: A Family of Bangla LLMs with Comprehensive Benchmarking
  Shahriar Kabir Nahin, R. N. Nandi, Sagor Sarker, Quazi Sarwar Muhtaseem, Md. Kowsher, Apu Chandraw Shill, Md Ibrahim, Mehadi Hasan Menon, Tareq Al Muntasir, Firoj Alam
  24 Feb 2025

CoT-UQ: Improving Response-wise Uncertainty Quantification in LLMs with Chain-of-Thought
  Boxuan Zhang, Ruqi Zhang
  Tags: LRM
  24 Feb 2025

ATEB: Evaluating and Improving Advanced NLP Tasks for Text Embedding Models
  Simeng Han, Frank Palma Gomez, Tu Vu, Zefei Li, Daniel Cer, Hansi Zeng, Chris Tar, Arman Cohan, Gustavo Hernández Ábrego
  24 Feb 2025

DistRL: An Asynchronous Distributed Reinforcement Learning Framework for On-Device Control Agents
  Taiyi Wang, Zhihao Wu, Jianheng Liu, Jianye Hao, Jun Wang, Kun Shao
  Tags: OffRL
  24 Feb 2025

Improving LLM General Preference Alignment via Optimistic Online Mirror Descent
  Yuheng Zhang, Dian Yu, Tao Ge, Linfeng Song, Zhichen Zeng, Haitao Mi, Nan Jiang, Dong Yu
  24 Feb 2025

Large Language Models and Mathematical Reasoning Failures
  Johan Boye, Birger Moell
  Tags: ELM, LRM
  24 Feb 2025

Streaming Looking Ahead with Token-level Self-reward
  Han Zhang, Ruixin Hong, Dong Yu
  24 Feb 2025