Training language models to follow instructions with human feedback

4 March 2022
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
Pamela Mishkin
Chong Zhang
Sandhini Agarwal
Katarina Slama
Alex Ray
John Schulman
Jacob Hilton
Fraser Kelton
Luke E. Miller
Maddie Simens
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM ALM

Papers citing "Training language models to follow instructions with human feedback"

50 / 6,398 papers shown
Scalable Differential Privacy Mechanisms for Real-Time Machine Learning Applications
Jessica Smith
David Williams
Emily Brown
78
0
0
16 Sep 2024
Model-in-the-Loop (MILO): Accelerating Multimodal AI Data Annotation with LLMs
Yifan Wang
David Stevens
Pranay Shah
Wenwen Jiang
Miao Liu
...
Boying Gong
Daniel Lee
Jiabo Hu
Ning Zhang
Bob Kamma
103
1
0
16 Sep 2024
DRIVE: Dependable Robust Interpretable Visionary Ensemble Framework in Autonomous Driving
Songning Lai
Tianlang Xue
Hongru Xiao
Lijie Hu
Jiemin Wu
Ninghui Feng
Runwei Guan
Haicheng Liao
Zhenning Li
Yutao Yue
90
4
0
16 Sep 2024
Quantile Regression for Distributional Reward Models in RLHF
Nicolai Dorka
104
26
0
16 Sep 2024
Emo-DPO: Controllable Emotional Speech Synthesis through Direct Preference Optimization
Xiaoxue Gao
Chen Zhang
Yiming Chen
Huayun Zhang
Nancy F. Chen
114
11
0
16 Sep 2024
Householder Pseudo-Rotation: A Novel Approach to Activation Editing in LLMs with Direction-Magnitude Perspective
Van-Cuong Pham
Thien Huu Nguyen
LLMSV
98
3
0
16 Sep 2024
AI Conversational Interviewing: Transforming Surveys with LLMs as Adaptive Interviewers
Alexander Wuttke
Matthias Aßenmacher
Christopher Klamm
Max M. Lang
Quirin Würschinger
Frauke Kreuter
98
4
0
16 Sep 2024
GP-GPT: Large Language Model for Gene-Phenotype Mapping
Yanjun Lyu
Zihao Wu
Lu Zhang
Jing Zhang
Yiwei Li
...
Rongjie Liu
Chao Huang
Wentao Li
Tianming Liu
Dajiang Zhu
LM&MA
52
4
0
15 Sep 2024
Towards Data-Centric RLHF: Simple Metrics for Preference Dataset Comparison
Judy Hanwen Shen
Archit Sharma
Jun Qin
70
5
0
15 Sep 2024
A Survey of Foundation Models for Music Understanding
Wenjun Li
Ying Cai
Ziyang Wu
Wenyi Zhang
Yifan Chen
...
Junwei Han
Bao Ge
Tianming Liu
Lin Gan
Tuo Zhang
120
2
0
15 Sep 2024
Thesis proposal: Are We Losing Textual Diversity to Natural Language Processing?
Josef Jon
75
0
0
15 Sep 2024
Estimating Wage Disparities Using Foundation Models
Keyon Vafa
Susan Athey
David M. Blei
190
3
0
15 Sep 2024
ExploreSelf: Fostering User-driven Exploration and Reflection on Personal Challenges with Adaptive Guidance by Large Language Models
Inhwa Song
SoHyun Park
Sachin R. Pendse
J. Schleider
Munmun De Choudhury
Young-Ho Kim
157
0
0
15 Sep 2024
Causal Inference with Large Language Model: A Survey
Jing Ma
CML LRM
262
9
0
15 Sep 2024
ASFT: Aligned Supervised Fine-Tuning through Absolute Likelihood
Ruoyu Wang
Jiachen Sun
Shaowei Hua
Quan Fang
25
2
0
14 Sep 2024
MacST: Multi-Accent Speech Synthesis via Text Transliteration for Accent Conversion
Sho Inoue
Shuai Wang
Wanxing Wang
Pengcheng Zhu
Mengxiao Bi
Haizhou Li
122
2
0
14 Sep 2024
Seed-Music: A Unified Framework for High Quality and Controlled Music Generation
Ye Bai
Haonan Chen
Jitong Chen
Zhuo Chen
Yi Deng
...
Hang Zhao
Ziyi Zhao
Dejian Zhong
Shicen Zhou
Pei Zou
DiffM
110
8
0
13 Sep 2024
KodeXv0.1: A Family of State-of-the-Art Financial Large Language Models
Neel Rajani
Lilli Kiessling
Aleksandr Ogaltsov
Claus Lang
ALM
63
0
0
13 Sep 2024
Affective Computing Has Changed: The Foundation Model Disruption
Björn Schuller
Adria Mallol-Ragolta
Alejandro Pena Almansa
Iosif Tsangko
Mostafa M. Amin
A. Semertzidou
Lukas Christ
Shahin Amiriparian
115
1
0
13 Sep 2024
AIPO: Improving Training Objective for Iterative Preference Optimization
Yaojie Shen
Xinyao Wang
Yulei Niu
Ying Zhou
Lexin Tang
Libo Zhang
Fan Chen
Longyin Wen
89
2
0
13 Sep 2024
Policy Prototyping for LLMs: Pluralistic Alignment via Interactive and Collaborative Policymaking
K. J. Kevin Feng
Inyoung Cheong
Quan Ze Chen
Amy X. Zhang
126
3
0
13 Sep 2024
Sub-graph Based Diffusion Model for Link Prediction
Hang Li
Wei Jin
Geri Skenderi
Harry Shomer
Wenzhuo Tang
Wenqi Fan
Jiliang Tang
DiffM
65
0
0
13 Sep 2024
Towards Unified Facial Action Unit Recognition Framework by Large Language Models
Guohong Hu
Xing Lan
Hanyu Jiang
Jiayi Lyu
Jian Xue
CVBM
81
1
0
13 Sep 2024
AI-LieDar: Examine the Trade-off Between Utility and Truthfulness in LLM Agents
Zhe Su
Xuhui Zhou
Sanketh Rangreji
Anubha Kabra
Julia Mendelsohn
Faeze Brahman
Maarten Sap
LLMAG
190
7
0
13 Sep 2024
AnyBipe: An End-to-End Framework for Training and Deploying Bipedal Robots Guided by Large Language Models
Yifei Yao
Wentao He
Chenyu Gu
Jiaheng Du
Fuwei Tan
Zhen Zhu
Junguo Lu
OffRL
131
2
0
13 Sep 2024
Your Weak LLM is Secretly a Strong Teacher for Alignment
Leitian Tao
Yixuan Li
156
9
0
13 Sep 2024
Scores as Actions: a framework of fine-tuning diffusion models by continuous-time reinforcement learning
Hanyang Zhao
Haoxian Chen
Ji Zhang
David D. Yao
Wenpin Tang
138
5
0
12 Sep 2024
SimMAT: Exploring Transferability from Vision Foundation Models to Any Image Modality
Chenyang Lei
Liyi Chen
Jun Cen
Xiao Chen
Zhen Lei
Felix Heide
Ziwei Liu
Qifeng Chen
Zhaoxiang Zhang
97
0
0
12 Sep 2024
Alignment with Preference Optimization Is All You Need for LLM Safety
Réda Alami
Ali Khalifa Almansoori
Ahmed Alzubaidi
M. Seddik
Mugariya Farooq
Hakim Hacid
73
1
0
12 Sep 2024
Synthetic continued pretraining
Zitong Yang
Neil Band
Shuangping Li
Emmanuel Candès
Tatsunori Hashimoto
CLL SyDa
110
16
0
11 Sep 2024
Awaking the Slides: A Tuning-free and Knowledge-regulated AI Tutoring System via Language Model Coordination
Daniel Zhang-Li
Zheyuan Zhang
Jifan Yu
Joy Lim Jia Yin
Shangqing Tu
...
Hongru Wang
Zhiyuan Liu
Huiqin Liu
Lei Hou
Juanzi Li
VLM
45
0
0
11 Sep 2024
Securing Vision-Language Models with a Robust Encoder Against Jailbreak and Adversarial Attacks
Md Zarif Hossain
Ahmed Imteaj
AAML VLM
81
6
0
11 Sep 2024
Alignment of Diffusion Models: Fundamentals, Challenges, and Future
Buhua Liu
Shitong Shao
Bao Li
Lichen Bai
Zhiqiang Xu
Haoyi Xiong
James Kwok
Sumi Helal
Bo Han
122
14
0
11 Sep 2024
PiTe: Pixel-Temporal Alignment for Large Video-Language Model
Yang Liu
Pengxiang Ding
Siteng Huang
Min Zhang
Han Zhao
Donglin Wang
99
7
0
11 Sep 2024
Leveraging Unstructured Text Data for Federated Instruction Tuning of Large Language Models
Rui Ye
Rui Ge
Yuchi Fengting
Jingyi Chai
Yanfeng Wang
Siheng Chen
FedML
114
2
0
11 Sep 2024
Beyond IID: Optimizing Instruction Learning from the Perspective of Instruction Interaction and Dependency
Hanyu Zhao
Li Du
Yiming Ju
Chengwei Wu
Tengfei Pan
75
6
0
11 Sep 2024
Semi-Supervised Reward Modeling via Iterative Self-Training
Yifei He
Haoxiang Wang
Ziyan Jiang
Alexandros Papangelis
Han Zhao
OffRL
120
4
0
10 Sep 2024
Keyword-Aware ASR Error Augmentation for Robust Dialogue State Tracking
Jihyun Lee
Solee Im
Wonjun Lee
Gary Geunbae Lee
75
0
0
10 Sep 2024
RNR: Teaching Large Language Models to Follow Roles and Rules
Kuan-Chieh Wang
Alexander Bukharin
Haoming Jiang
Qingyu Yin
Zhengyang Wang
...
Chao Zhang
Bing Yin
Xian Li
Jianshu Chen
Shiyang Li
ALM
84
2
0
10 Sep 2024
Larger Language Models Don't Care How You Think: Why Chain-of-Thought Prompting Fails in Subjective Tasks
Georgios Chochlakis
Niyantha Maruthu Pandiyan
Kristina Lerman
Shrikanth Narayanan
ReLM KELM LRM
102
5
0
10 Sep 2024
KAG: Boosting LLMs in Professional Domains via Knowledge Augmented Generation
Lei Liang
Mengshu Sun
Zhengke Gui
Zhongshu Zhu
Zhouyu Jiang
...
Qing Cui
Wen Zhang
Huajun Chen
Wenguang Chen
Jun Zhou
99
19
0
10 Sep 2024
MathGLM-Vision: Solving Mathematical Problems with Multi-Modal Large Language Model
Zhen Yang
Jinhao Chen
Zhengxiao Du
Wenmeng Yu
Weihan Wang
Wenyi Hong
Zhihuan Jiang
Bin Xu
Yuxiao Dong
Jie Tang
VLM LRM
90
11
0
10 Sep 2024
PingPong: A Benchmark for Role-Playing Language Models with User Emulation and Multi-Model Evaluation
Ilya Gusev
LLMAG
136
3
0
10 Sep 2024
Elucidating Optimal Reward-Diversity Tradeoffs in Text-to-Image Diffusion Models
Rohit Jena
Ali Taghibakhshi
Sahil Jain
Gerald Shen
Nima Tajbakhsh
Arash Vahdat
105
5
0
09 Sep 2024
Forward KL Regularized Preference Optimization for Aligning Diffusion Policies
Zhao Shan
Chenyou Fan
Shuang Qiu
Jiyuan Shi
Chenjia Bai
112
4
0
09 Sep 2024
Towards Building a Robust Knowledge Intensive Question Answering Model with Large Language Models
Xingyun Hong
Yan Shao
Zhilin Wang
Manni Duan
Jin Xiongnan
88
0
0
09 Sep 2024
Interactive Machine Teaching by Labeling Rules and Instances
Giannis Karamanolakis
Daniel J. Hsu
Luis Gravano
91
1
0
08 Sep 2024
RAGent: Retrieval-based Access Control Policy Generation
Sakuna Jayasundara
N. Arachchilage
Giovanni Russello
88
4
0
08 Sep 2024
AGR: Age Group fairness Reward for Bias Mitigation in LLMs
Shuirong Cao
Ruoxi Cheng
Zhiqiang Wang
82
5
0
06 Sep 2024
Large Language Models in Drug Discovery and Development: From Disease Mechanisms to Clinical Trials
Yizhen Zheng
Huan Yee Koh
M. Yang
Li Li
Lauren T. May
Geoffrey I. Webb
Shirui Pan
George Church
LM&MA
106
14
0
06 Sep 2024