ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2203.02155
  4. Cited By
Training language models to follow instructions with human feedback

Training language models to follow instructions with human feedback

4 March 2022
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
Pamela Mishkin
Chong Zhang
Sandhini Agarwal
Katarina Slama
Alex Ray
John Schulman
Jacob Hilton
Fraser Kelton
Luke E. Miller
Maddie Simens
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
    OSLM
    ALM
ArXivPDFHTML

Papers citing "Training language models to follow instructions with human feedback"

50 / 7,311 papers shown
Title
CoT-UQ: Improving Response-wise Uncertainty Quantification in LLMs with Chain-of-Thought
CoT-UQ: Improving Response-wise Uncertainty Quantification in LLMs with Chain-of-Thought
Boxuan Zhang
Ruqi Zhang
LRM
37
2
0
24 Feb 2025
Streaming Looking Ahead with Token-level Self-reward
Han Zhang
Ruixin Hong
Dong Yu
49
1
0
24 Feb 2025
Language Model Fine-Tuning on Scaled Survey Data for Predicting Distributions of Public Opinions
Language Model Fine-Tuning on Scaled Survey Data for Predicting Distributions of Public Opinions
Joseph Suh
Erfan Jahanparast
Suhong Moon
Minwoo Kang
Serina Chang
ALM
LM&MA
59
1
0
24 Feb 2025
Model Lakes
Model Lakes
Koyena Pal
David Bau
Renée J. Miller
67
0
0
24 Feb 2025
Adversarial Prompt Evaluation: Systematic Benchmarking of Guardrails Against Prompt Input Attacks on LLMs
Adversarial Prompt Evaluation: Systematic Benchmarking of Guardrails Against Prompt Input Attacks on LLMs
Giulio Zizzo
Giandomenico Cornacchia
Kieran Fraser
Muhammad Zaid Hameed
Ambrish Rawat
Beat Buesser
Mark Purcell
Pin-Yu Chen
P. Sattigeri
Kush R. Varshney
AAML
45
2
0
24 Feb 2025
GuidedBench: Equipping Jailbreak Evaluation with Guidelines
GuidedBench: Equipping Jailbreak Evaluation with Guidelines
Ruixuan Huang
Xunguang Wang
Zongjie Li
Daoyuan Wu
Shuai Wang
ALM
ELM
66
0
0
24 Feb 2025
Lean and Mean: Decoupled Value Policy Optimization with Global Value Guidance
Lean and Mean: Decoupled Value Policy Optimization with Global Value Guidance
Chenghua Huang
Lu Wang
Fangkai Yang
Pu Zhao
Zechao Li
Qingwei Lin
Dongmei Zhang
Saravan Rajmohan
Qi Zhang
OffRL
57
1
0
24 Feb 2025
Multimodal Large Language Models for Text-rich Image Understanding: A Comprehensive Review
Multimodal Large Language Models for Text-rich Image Understanding: A Comprehensive Review
Pei Fu
Tongkun Guan
Zining Wang
Zhentao Guo
Chen Duan
...
Boming Chen
Jiayao Ma
Qianyi Jiang
Kai Zhou
Junfeng Luo
VLM
69
0
0
23 Feb 2025
Sequence-level Large Language Model Training with Contrastive Preference Optimization
Sequence-level Large Language Model Training with Contrastive Preference Optimization
Zhili Feng
Dhananjay Ram
Cole Hawkins
Aditya Rawal
Jinman Zhao
Sheng Zha
64
0
0
23 Feb 2025
Audio-FLAN: A Preliminary Release
Audio-FLAN: A Preliminary Release
Liumeng Xue
Ziya Zhou
J. Pan
Zhiyu Li
Shuai Fan
...
Haohe Liu
Emmanouil Benetos
Ge Zhang
Yike Guo
Wei Xue
MLLM
AuLLM
CLIP
VLM
57
1
0
23 Feb 2025
Keeping up with dynamic attackers: Certifying robustness to adaptive online data poisoning
Avinandan Bose
Laurent Lessard
Maryam Fazel
Krishnamurthy Dvijotham
AAML
41
0
0
23 Feb 2025
Navigation-GPT: A Robust and Adaptive Framework Utilizing Large Language Models for Navigation Applications
Navigation-GPT: A Robust and Adaptive Framework Utilizing Large Language Models for Navigation Applications
Feng Ma
Xuben Wang
Chen Chen
Xiao-bin Xu
Xin-ping Yan
220
0
0
23 Feb 2025
Towards Fully-Automated Materials Discovery via Large-Scale Synthesis Dataset and Expert-Level LLM-as-a-Judge
Towards Fully-Automated Materials Discovery via Large-Scale Synthesis Dataset and Expert-Level LLM-as-a-Judge
Heegyu Kim
Taeyang Jeon
Seungtaek Choi
Jihoon Hong
Dongwon Jeon
...
Jisu Bae
Chihoon Lee
Yunseo Kim
Jinsung Park
Hyunsouk Cho
ELM
65
0
1
23 Feb 2025
Guardians of the Agentic System: Preventing Many Shots Jailbreak with Agentic System
Guardians of the Agentic System: Preventing Many Shots Jailbreak with Agentic System
Saikat Barua
Mostafizur Rahman
Md Jafor Sadek
Rafiul Islam
Shehnaz Khaled
Ahmedul Kabir
LLMAG
65
1
0
23 Feb 2025
Intrinsic Model Weaknesses: How Priming Attacks Unveil Vulnerabilities in Large Language Models
Intrinsic Model Weaknesses: How Priming Attacks Unveil Vulnerabilities in Large Language Models
Yuyi Huang
Runzhe Zhan
Derek F. Wong
Lidia S. Chao
Ailin Tao
AAML
SyDa
ELM
55
0
0
23 Feb 2025
Be a Multitude to Itself: A Prompt Evolution Framework for Red Teaming
Be a Multitude to Itself: A Prompt Evolution Framework for Red Teaming
Rui Li
Peiyi Wang
Jingyuan Ma
Di Zhang
Lei Sha
Zhifang Sui
LLMAG
55
0
0
22 Feb 2025
Towards User-level Private Reinforcement Learning with Human Feedback
Towards User-level Private Reinforcement Learning with Human Feedback
Jingyang Zhang
Mingxi Lei
Meng Ding
Mengdi Li
Zihang Xiang
Difei Xu
Jinhui Xu
Di Wang
52
0
0
22 Feb 2025
Statistical Inference in Reinforcement Learning: A Selective Survey
Statistical Inference in Reinforcement Learning: A Selective Survey
Chengchun Shi
OffRL
74
1
0
22 Feb 2025
BiDeV: Bilateral Defusing Verification for Complex Claim Fact-Checking
BiDeV: Bilateral Defusing Verification for Complex Claim Fact-Checking
Yuxuan Liu
Hongda Sun
Wenya Guo
Xinyan Xiao
Cunli Mao
Zhengtao Yu
Rui Yan
79
2
0
22 Feb 2025
A generative approach to LLM harmfulness detection with special red flag tokens
A generative approach to LLM harmfulness detection with special red flag tokens
Sophie Xhonneux
David Dobre
Mehrnaz Mohfakhami
Leo Schwinn
Gauthier Gidel
58
1
0
22 Feb 2025
IPO: Your Language Model is Secretly a Preference Classifier
IPO: Your Language Model is Secretly a Preference Classifier
Shivank Garg
Ayush Singh
Shweta Singh
Paras Chopra
244
1
0
22 Feb 2025
Dynamic Parallel Tree Search for Efficient LLM Reasoning
Dynamic Parallel Tree Search for Efficient LLM Reasoning
Yifu Ding
Wentao Jiang
Shunyu Liu
Yongcheng Jing
J. Guo
...
Zengmao Wang
Ziqiang Liu
Bo Du
Xianglong Liu
Dacheng Tao
LRM
51
8
0
22 Feb 2025
Maybe I Should Not Answer That, but... Do LLMs Understand The Safety of Their Inputs?
Maybe I Should Not Answer That, but... Do LLMs Understand The Safety of Their Inputs?
Maciej Chrabąszcz
Filip Szatkowski
Bartosz Wójcik
Jan Dubiñski
Tomasz Trzciñski
62
0
0
22 Feb 2025
Moving Beyond Medical Exam Questions: A Clinician-Annotated Dataset of Real-World Tasks and Ambiguity in Mental Healthcare
Moving Beyond Medical Exam Questions: A Clinician-Annotated Dataset of Real-World Tasks and Ambiguity in Mental Healthcare
Max Lamparth
Declan Grabb
Amy Franks
Scott Gershan
Kaitlyn N. Kunstman
...
Monika Drummond Roots
Manu Sharma
Aryan Shrivastava
N. Vasan
Colleen Waickman
36
2
0
22 Feb 2025
Forecasting Frontier Language Model Agent Capabilities
Forecasting Frontier Language Model Agent Capabilities
Govind Pimpale
Axel Højmark
Jérémy Scheurer
Marius Hobbhahn
LLMAG
ELM
54
1
0
21 Feb 2025
Sparsity May Be All You Need: Sparse Random Parameter Adaptation
Sparsity May Be All You Need: Sparse Random Parameter Adaptation
Jesus Rios
Pierre Dognin
Ronny Luss
Karthikeyan N. Ramamurthy
37
1
0
21 Feb 2025
PAPI: Exploiting Dynamic Parallelism in Large Language Model Decoding with a Processing-In-Memory-Enabled Computing System
PAPI: Exploiting Dynamic Parallelism in Large Language Model Decoding with a Processing-In-Memory-Enabled Computing System
Yintao He
Haiyu Mao
Christina Giannoula
Mohammad Sadrosadati
Juan Gómez Luna
Huawei Li
Xiaowei Li
Ying Wang
O. Mutlu
45
6
0
21 Feb 2025
C3AI: Crafting and Evaluating Constitutions for Constitutional AI
Yara Kyrychenko
Ke Zhou
Edyta Bogucka
Daniele Quercia
ELM
55
3
0
21 Feb 2025
Position: Standard Benchmarks Fail -- LLM Agents Present Overlooked Risks for Financial Applications
Position: Standard Benchmarks Fail -- LLM Agents Present Overlooked Risks for Financial Applications
Zichen Chen
Jiaao Chen
Jianda Chen
Misha Sra
ELM
43
1
0
21 Feb 2025
SimPER: A Minimalist Approach to Preference Alignment without Hyperparameters
SimPER: A Minimalist Approach to Preference Alignment without Hyperparameters
Teng Xiao
Yige Yuan
Ziyang Chen
Mingxiao Li
Shangsong Liang
Zhaochun Ren
V. Honavar
110
6
0
21 Feb 2025
Making Them a Malicious Database: Exploiting Query Code to Jailbreak Aligned Large Language Models
Making Them a Malicious Database: Exploiting Query Code to Jailbreak Aligned Large Language Models
Qingsong Zou
Jingyu Xiao
Qing Li
Zhi Yan
Yansen Wang
Li Xu
Wenxuan Wang
Kuofeng Gao
Ruoyu Li
Yong-jia Jiang
AAML
262
0
0
21 Feb 2025
MILE: Model-based Intervention Learning
MILE: Model-based Intervention Learning
Yigit Korkmaz
Erdem Bıyık
93
2
0
21 Feb 2025
WorldCraft: Photo-Realistic 3D World Creation and Customization via LLM Agents
WorldCraft: Photo-Realistic 3D World Creation and Customization via LLM Agents
Xinhang Liu
Chi-Keung Tang
Yu-Wing Tai
VGen
75
0
0
21 Feb 2025
Problem-Solving Logic Guided Curriculum In-Context Learning for LLMs Complex Reasoning
Problem-Solving Logic Guided Curriculum In-Context Learning for LLMs Complex Reasoning
Xuetao Ma
Wenbin Jiang
Hua Huang
LRM
73
1
0
21 Feb 2025
IPAD: Inverse Prompt for AI Detection -- A Robust and Explainable LLM-Generated Text Detector
IPAD: Inverse Prompt for AI Detection -- A Robust and Explainable LLM-Generated Text Detector
Zheng Chen
Yushi Feng
Changyang He
Yue Deng
Hongxi Pu
Yue Liu
DeLMO
49
1
0
21 Feb 2025
Evaluating Implicit Bias in Large Language Models by Attacking From a Psychometric Perspective
Evaluating Implicit Bias in Large Language Models by Attacking From a Psychometric Perspective
Yuchen Wen
Keping Bi
Wei Chen
Jiafeng Guo
Xueqi Cheng
91
1
0
20 Feb 2025
Mixture of insighTful Experts (MoTE): The Synergy of Thought Chains and Expert Mixtures in Self-Alignment
Mixture of insighTful Experts (MoTE): The Synergy of Thought Chains and Expert Mixtures in Self-Alignment
Zhili Liu
Yunhao Gou
Kai Chen
Lanqing Hong
Jiahui Gao
...
Yu Zhang
Zhenguo Li
Xin Jiang
Qiang Liu
James T. Kwok
MoE
116
9
0
20 Feb 2025
Can a Single Model Master Both Multi-turn Conversations and Tool Use? CoALM: A Unified Conversational Agentic Language Model
Can a Single Model Master Both Multi-turn Conversations and Tool Use? CoALM: A Unified Conversational Agentic Language Model
Emre Can Acikgoz
Jeremiah Greer
Akul Datta
Ze Yang
William Zeng
Oussama Elachqar
Emmanouil Koukoumidis
Dilek Hakkani-Tur
Gokhan Tur
LLMAG
111
3
0
20 Feb 2025
DeepRTL: Bridging Verilog Understanding and Generation with a Unified Representation Model
DeepRTL: Bridging Verilog Understanding and Generation with a Unified Representation Model
Yi Liu
Changran Xu
Yunhao Zhou
Zhiyu Li
Qiang Xu
VLM
56
4
0
20 Feb 2025
Varco Arena: A Tournament Approach to Reference-Free Benchmarking Large Language Models
Varco Arena: A Tournament Approach to Reference-Free Benchmarking Large Language Models
Seonil Son
Ju-Min Oh
Heegon Jin
Cheolhun Jang
Jeongbeom Jeong
Kuntae Kim
51
0
0
20 Feb 2025
Pragmatic Reasoning improves LLM Code Generation
Pragmatic Reasoning improves LLM Code Generation
Zhuchen Cao
Sven Apel
Adish Singla
Vera Demberg
LRM
47
0
0
20 Feb 2025
Tabular Embeddings for Tables with Bi-Dimensional Hierarchical Metadata and Nesting
Tabular Embeddings for Tables with Bi-Dimensional Hierarchical Metadata and Nesting
Gyanendra Shrestha
Chutain Jiang
Sai Akula
Vivek Yannam
Anna Pyayt
Michael Gubanov
LMTD
100
0
0
20 Feb 2025
Reward-Guided Iterative Refinement in Diffusion Models at Test-Time with Applications to Protein and DNA Design
Reward-Guided Iterative Refinement in Diffusion Models at Test-Time with Applications to Protein and DNA Design
Masatoshi Uehara
Xingyu Su
Yulai Zhao
Xiner Li
Aviv Regev
Shuiwang Ji
Sergey Levine
Tommaso Biancalani
63
2
0
20 Feb 2025
Faster WIND: Accelerating Iterative Best-of-$N$ Distillation for LLM Alignment
Faster WIND: Accelerating Iterative Best-of-NNN Distillation for LLM Alignment
Tong Yang
Jincheng Mei
H. Dai
Zixin Wen
Shicong Cen
Dale Schuurmans
Yuejie Chi
Bo Dai
47
4
0
20 Feb 2025
UPCORE: Utility-Preserving Coreset Selection for Balanced Unlearning
UPCORE: Utility-Preserving Coreset Selection for Balanced Unlearning
Vaidehi Patil
Elias Stengel-Eskin
Joey Tianyi Zhou
MU
CLL
81
2
0
20 Feb 2025
Value-Incentivized Preference Optimization: A Unified Approach to Online and Offline RLHF
Value-Incentivized Preference Optimization: A Unified Approach to Online and Offline RLHF
Shicong Cen
Jincheng Mei
Katayoon Goshvadi
Hanjun Dai
Tong Yang
Sherry Yang
Dale Schuurmans
Yuejie Chi
Bo Dai
OffRL
70
24
0
20 Feb 2025
Slamming: Training a Speech Language Model on One GPU in a Day
Slamming: Training a Speech Language Model on One GPU in a Day
Gallil Maimon
Avishai Elmakies
Yossi Adi
40
3
0
19 Feb 2025
Local Differences, Global Lessons: Insights from Organisation Policies for International Legislation
Lucie-Aimée Kaffee
Pepa Atanasova
Anna Rogers
49
0
0
19 Feb 2025
SafeRoute: Adaptive Model Selection for Efficient and Accurate Safety Guardrails in Large Language Models
SafeRoute: Adaptive Model Selection for Efficient and Accurate Safety Guardrails in Large Language Models
Seanie Lee
Dong Bok Lee
Dominik Wagner
Minki Kang
Haebin Seong
Tobias Bocklet
Juho Lee
Sung Ju Hwang
24
1
0
18 Feb 2025
EDGE: Efficient Data Selection for LLM Agents via Guideline Effectiveness
EDGE: Efficient Data Selection for LLM Agents via Guideline Effectiveness
Yunxiao Zhang
Guanming Xiong
Haochen Li
Wen Zhao
LLMAG
74
0
0
18 Feb 2025
Previous
123...181920...145146147
Next