arXiv: 2203.02155
Training language models to follow instructions with human feedback


4 March 2022
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
Pamela Mishkin
Chong Zhang
Sandhini Agarwal
Katarina Slama
Alex Ray
John Schulman
Jacob Hilton
Fraser Kelton
Luke E. Miller
Maddie Simens
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM, ALM

Papers citing "Training language models to follow instructions with human feedback"

50 / 6,392 papers shown
Large Language Models for Human-like Autonomous Driving: A Survey
Yun Li
Kai Katsumata
Ehsan Javanmardi
Manabu Tsukada
LM&MA
88
11
0
27 Jul 2024
LocalValueBench: A Collaboratively Built and Extensible Benchmark for Evaluating Localized Value Alignment and Ethical Safety in Large Language Models
Achintya Gopal
Nicholas Wai Long Lau
Eva Adelina Susanto
Chi Lok Yu
Aditya Paul
ELM
90
10
0
27 Jul 2024
Power-LLaVA: Large Language and Vision Assistant for Power Transmission Line Inspection
Jiahao Wang
Mingxuan Li
Haichen Luo
Jinguo Zhu
A. Yang
M. Rong
Xiaohua Wang
54
3
0
27 Jul 2024
Large Language Models as Co-Pilots for Causal Inference in Medical Studies
Ahmed Alaa
Rachael V. Phillips
Emre Kiciman
Laura B. Balzer
Mark van der Laan
Maya L Petersen
CML, ELM, LM&MA
81
1
0
26 Jul 2024
Modular RAG: Transforming RAG Systems into LEGO-like Reconfigurable Frameworks
Yunfan Gao
Yun Xiong
Meng Wang
Haofen Wang
113
21
0
26 Jul 2024
Conversational Dueling Bandits in Generalized Linear Models
Shuhua Yang
Hui Yuan
Xiaoying Zhang
Mengdi Wang
Kuanqi Cai
Huazheng Wang
65
1
0
26 Jul 2024
Fairness Definitions in Language Models Explained
Thang Viet Doan
Zhibo Chu
Zichong Wang
Wenbin Zhang
ALM
113
10
0
26 Jul 2024
Guidance-Based Prompt Data Augmentation in Specialized Domains for Named Entity Recognition
Hyeonseok Kang
H. Seo
Jeesu Jung
Sangkeun Jung
Du-Seong Chang
Riwoo Chung
63
1
0
26 Jul 2024
Right Now, Wrong Then: Non-Stationary Direct Preference Optimization under Preference Drift
Seongho Son
William Bankes
Sayak Ray Chowdhury
Brooks Paige
Ilija Bogunovic
131
4
0
26 Jul 2024
Self-Directed Synthetic Dialogues and Revisions Technical Report
Nathan Lambert
Hailey Schoelkopf
Aaron Gokaslan
Luca Soldaini
Valentina Pyatkin
Louis Castricato
SyDa
85
3
0
25 Jul 2024
Self-Training with Direct Preference Optimization Improves Chain-of-Thought Reasoning
Tianduo Wang
Shichen Li
Wei Lu
LRM, AI4CE
88
20
1
25 Jul 2024
Efficient Inference of Vision Instruction-Following Models with Elastic Cache
Zuyan Liu
Benlin Liu
Jiahui Wang
Yuhao Dong
Guangyi Chen
Yongming Rao
Ranjay Krishna
Jiwen Lu
VLM
89
14
0
25 Jul 2024
The Geometry of Queries: Query-Based Innovations in Retrieval-Augmented Generation
Eric Yang
Jonathan Amar
Jong Ha Lee
Bhawesh Kumar
Yugang Jia
45
0
0
25 Jul 2024
The Dark Side of Function Calling: Pathways to Jailbreaking Large Language Models
Zihui Wu
Haichang Gao
Jianping He
Ping Wang
112
10
0
25 Jul 2024
The Power of Combining Data and Knowledge: GPT-4o is an Effective Interpreter of Machine Learning Models in Predicting Lymph Node Metastasis of Lung Cancer
Danqing Hu
Bing Liu
Xiaofeng Zhu
Nan Wu
AI4CE, LM&MA
59
1
0
25 Jul 2024
Enhancing Model Performance: Another Approach to Vision-Language Instruction Tuning
Vedanshu
M. M. Tripathi
Bhavnesh Jaint
MLLM, VLM
59
0
0
25 Jul 2024
Closing the gap between open-source and commercial large language models for medical evidence summarization
Gongbo Zhang
Qiao Jin
Yiliang Zhou
Song Wang
B. Idnay
...
Ali Soroush
Thomas Campion
Zhiyong Lu
Chunhua Weng
Yifan Peng
ELM, LM&MA
89
19
0
25 Jul 2024
BotEval: Facilitating Interactive Human Evaluation
Hyundong Justin Cho
Thamme Gowda
Yuyang Huang
Zixun Lu
Tianli Tong
Jonathan May
ALM
65
1
0
25 Jul 2024
Enhancing Agent Learning through World Dynamics Modeling
Zhiyuan Sun
Haochen Shi
Marc-Alexandre Côté
Glen Berseth
Xingdi Yuan
Bang Liu
109
4
0
25 Jul 2024
Learn while Unlearn: An Iterative Unlearning Framework for Generative Language Models
Haoyu Tang
Ye Liu
Xukai Liu
Yanghai Zhang
Kai Zhang
Xiaofang Zhou
Enhong Chen
MU
160
3
0
25 Jul 2024
Exploring Domain Robust Lightweight Reward Models based on Router Mechanism
Hyuk Namgoong
Jeesu Jung
Sangkeun Jung
Yoonhyung Roh
74
1
0
24 Jul 2024
Grammar-based Game Description Generation using Large Language Models
Tsunehiko Tanaka
Edgar Simo-Serra
137
2
0
24 Jul 2024
Large Language Models for Anomaly Detection in Computational Workflows: from Supervised Fine-Tuning to In-Context Learning
Hongwei Jin
George Papadimitriou
Krishnan Raghavan
Pawel Zuk
Prasanna Balaprakash
Cong Wang
A. Mandal
Ewa Deelman
73
2
0
24 Jul 2024
Bailicai: A Domain-Optimized Retrieval-Augmented Generation Framework for Medical Applications
Cui Long
Yongbin Liu
Chunping Ouyang
Ying Yu
88
5
0
24 Jul 2024
SimCT: A Simple Consistency Test Protocol in LLMs Development Lifecycle
Fufangchen Zhao
Guoqiang Jin
Rui Zhao
Jiangheng Huang
Fei Tan
76
1
0
24 Jul 2024
SAFETY-J: Evaluating Safety with Critique
Yixiu Liu
Yuxiang Zheng
Shijie Xia
Jiajun Li
Yi Tu
Chaoling Song
Pengfei Liu
ELM
60
2
0
24 Jul 2024
Towards Aligning Language Models with Textual Feedback
Sauc Abadal Lloret
Shehzaad Dhuliawala
K. Murugesan
Mrinmaya Sachan
VLM
120
1
0
24 Jul 2024
Adapting Image-based RL Policies via Predicted Rewards
Weiyao Wang
Xinyuan Fang
Gregory D. Hager
117
0
0
23 Jul 2024
Can Large Language Models Automatically Jailbreak GPT-4V?
Yuanwei Wu
Yue Huang
Yixin Liu
Xiang Li
Pan Zhou
Lichao Sun
SILM
72
2
0
23 Jul 2024
RedAgent: Red Teaming Large Language Models with Context-aware Autonomous Language Agent
Huiyu Xu
Wenhui Zhang
Peng Kuang
Feng Xiao
Rui Zheng
Yunhe Feng
Zhongjie Ba
Kui Ren
AAML, LLMAG
87
16
0
23 Jul 2024
Course-Correction: Safety Alignment Using Synthetic Preferences
Rongwu Xu
Yishuo Cai
Zhenhong Zhou
Renjie Gu
Haiqin Weng
Yan Liu
Tianwei Zhang
Wei Xu
Han Qiu
76
7
0
23 Jul 2024
Knowledge-driven AI-generated data for accurate and interpretable breast ultrasound diagnoses
Haojun Yu
Youcheng Li
Nan Zhang
Zihan Niu
Xuantong Gong
...
Binghui Tang
Ling Huo
Qingli Zhu
Yong Wang
Liwei Wang
MedIm
69
3
0
23 Jul 2024
Shared Imagination: LLMs Hallucinate Alike
Yilun Zhou
Caiming Xiong
Silvio Savarese
Chien-Sheng Wu
HILM
60
2
0
23 Jul 2024
Exploring Automatic Cryptographic API Misuse Detection in the Era of LLMs
Yifan Xia
Zichen Xie
Peiyu Liu
Kangjie Lu
Yan Liu
Wenhai Wang
Shouling Ji
91
3
0
23 Jul 2024
TLCR: Token-Level Continuous Reward for Fine-grained Reinforcement Learning from Human Feedback
Eunseop Yoon
Hee Suk Yoon
Soohwan Eom
Gunsoo Han
D. W. Nam
DaeJin Jo
Kyoung-Woon On
M. Hasegawa-Johnson
Sungwoong Kim
C. Yoo
ALM
109
21
0
23 Jul 2024
Assessing In-context Learning and Fine-tuning for Topic Classification of German Web Data
Julian Schelb
Roberto Ulloa
Andreas Spitz
79
3
0
23 Jul 2024
A Comprehensive Survey of LLM Alignment Techniques: RLHF, RLAIF, PPO, DPO and More
Zhichao Wang
Bin Bi
Shiva K. Pentyala
Kiran Ramnath
Sougata Chaudhuri
...
Z. Zhu
Xiang-Bo Mao
S. Asur
Na Cheng
OffRL
97
58
0
23 Jul 2024
INF-LLaVA: Dual-perspective Perception for High-Resolution Multimodal Large Language Model
Yiwei Ma
Zhibin Wang
Xiaoshuai Sun
Weihuang Lin
Qiang-feng Zhou
Jiayi Ji
Rongrong Ji
MLLM, VLM
110
2
0
23 Jul 2024
UniMEL: A Unified Framework for Multimodal Entity Linking with Large Language Models
Liu Qi
Yongyi He
Lian Defu
Zhi Zheng
Tong Xu
Liu Che
Chen Enhong
MLLM
82
2
0
23 Jul 2024
DDK: Distilling Domain Knowledge for Efficient Large Language Models
Jiaheng Liu
Chenchen Zhang
Jinyang Guo
Yuanxing Zhang
Haoran Que
...
Congnan Liu
Wenbo Su
Jiamang Wang
Lin Qu
Bo Zheng
110
6
0
23 Jul 2024
MMInstruct: A High-Quality Multi-Modal Instruction Tuning Dataset with Extensive Diversity
Yangzhou Liu
Yue Cao
Zhangwei Gao
Weiyun Wang
Zhe Chen
...
Lewei Lu
Xizhou Zhu
Tong Lu
Yu Qiao
Jifeng Dai
VLM, MLLM
116
29
0
22 Jul 2024
Do Large Language Models Have Compositional Ability? An Investigation into Limitations and Scalability
Zhuoyan Xu
Zhenmei Shi
Yingyu Liang
CoGe, LRM
81
38
0
22 Jul 2024
CrashEventLLM: Predicting System Crashes with Large Language Models
Priyanka Mudgal
Bijan Arbab
Swaathi Sampath Kumar
80
2
0
22 Jul 2024
WebRPG: Automatic Web Rendering Parameters Generation for Visual Presentation
Zirui Shao
Feiyu Gao
Hangdi Xing
Zepeng Zhu
Zhi Yu
Jiajun Bu
Qi Zheng
Cong Yao
56
3
0
22 Jul 2024
Knowledge Mechanisms in Large Language Models: A Survey and Perspective
Meng Wang
Yunzhi Yao
Ziwen Xu
Shuofei Qiao
Shumin Deng
...
Yong Jiang
Pengjun Xie
Fei Huang
Huajun Chen
Ningyu Zhang
143
39
0
22 Jul 2024
Is user feedback always informative? Retrieval Latent Defending for Semi-Supervised Domain Adaptation without Source Data
Junha Song
Tae Soo Kim
Junha Kim
Gunhee Nam
Thijs Kooi
Jaegul Choo
110
1
0
22 Jul 2024
Improving Minimum Bayes Risk Decoding with Multi-Prompt
David Heineman
Yao Dou
Wei Xu
90
8
0
22 Jul 2024
Building Machines that Learn and Think with People
Katherine M. Collins
Ilia Sucholutsky
Umang Bhatt
Kartik Chandra
Lionel Wong
...
Mark K. Ho
Vikash K. Mansinghka
Adrian Weller
Joshua B. Tenenbaum
Thomas Griffiths
138
39
0
22 Jul 2024
Boosting Reward Model with Preference-Conditional Multi-Aspect Synthetic Data Generation
Jiaming Shen
Ran Xu
Yennie Jun
Zhen Qin
Tianqi Liu
Carl Yang
Yi Liang
Simon Baumgartner
Michael Bendersky
SyDa
145
5
0
22 Jul 2024
VideoGameBunny: Towards vision assistants for video games
Mohammad Reza Taesiri
Cor-Paul Bezemer
VLM, MLLM
81
2
0
21 Jul 2024