ResearchTrend.AI
arXiv: 2203.02155
Training language models to follow instructions with human feedback

4 March 2022
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
Pamela Mishkin
Chong Zhang
Sandhini Agarwal
Katarina Slama
Alex Ray
John Schulman
Jacob Hilton
Fraser Kelton
Luke E. Miller
Maddie Simens
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM, ALM

Papers citing "Training language models to follow instructions with human feedback"

50 / 6,392 papers shown
Threshold Filtering Packing for Supervised Fine-Tuning: Training Related Samples within Packs
Jiancheng Dong
Lei Jiang
Wei Jin
Lu Cheng
110
1
0
18 Aug 2024
MergeRepair: An Exploratory Study on Merging Task-Specific Adapters in Code LLMs for Automated Program Repair
Meghdad Dehghan
Jie JW Wu
Fatemeh H. Fard
Ali Ouni
MoMe
99
2
0
18 Aug 2024
CyberPal.AI: Empowering LLMs with Expert-Driven Cybersecurity Instructions
Matan Levi
Yair Alluouche
Daniel Ohayon
Anton Puzanov
88
6
0
17 Aug 2024
TableBench: A Comprehensive and Complex Benchmark for Table Question Answering
Xianjie Wu
Jian Yang
Linzheng Chai
Ge Zhang
Jiaheng Liu
...
Xianfu Cheng
Tianzhen Sun
Guanglin Niu
Tongliang Li
Zhoujun Li
LMTD, ELM
112
41
0
17 Aug 2024
BaThe: Defense against the Jailbreak Attack in Multimodal Large Language Models by Treating Harmful Instruction as Backdoor Trigger
Yulin Chen
Haoran Li
Zihao Zheng
Yangqiu Song
Bryan Hooi
190
7
0
17 Aug 2024
SEAL: Systematic Error Analysis for Value ALignment
Manon Revel
Matteo Cargnelutti
Tyna Eloundou
Greg Leppert
106
5
0
16 Aug 2024
DAC: Decomposed Automation Correction for Text-to-SQL
Dingzirui Wang
Longxu Dou
Xuanliang Zhang
Qingfu Zhu
Wanxiang Che
112
3
0
16 Aug 2024
Math-PUMA: Progressive Upward Multimodal Alignment to Enhance Mathematical Reasoning
Wenwen Zhuang
Xin Huang
Xiantao Zhang
Jin Zeng
LRM
126
31
0
16 Aug 2024
Learning A Low-Level Vision Generalist via Visual Task Prompt
Xiangyu Chen
Yihao Liu
Yuandong Pu
Wenlong Zhang
Jiantao Zhou
Yu Qiao
Chao Dong
VLM
104
7
0
16 Aug 2024
Context-Aware Assistant Selection for Improved Inference Acceleration with Large Language Models
Jerry Huang
Prasanna Parthasarathi
Mehdi Rezagholizadeh
Sarath Chandar
108
2
0
16 Aug 2024
Adaptive Uncertainty Quantification for Generative AI
Jungeum Kim
Sean O'Hagan
Veronika Rockova
MedIm
482
4
0
16 Aug 2024
An End-to-End Model for Photo-Sharing Multi-modal Dialogue Generation
Peiming Guo
Sinuo Liu
Yanzhao Zhang
Dingkun Long
Pengjun Xie
Meishan Zhang
Hao Fei
DiffM
158
1
0
16 Aug 2024
Visual Agents as Fast and Slow Thinkers
Guangyan Sun
Mingyu Jin
Zhenting Wang
Cheng-Long Wang
Siqi Ma
Qifan Wang
Ying Nian Wu
Dongfang Liu
LLMAG, LRM
237
19
0
16 Aug 2024
Large Language Models Might Not Care What You Are Saying: Prompt Format Beats Descriptions
Chenming Tang
Zhixiang Wang
Hao Sun
Yunfang Wu
LRM
117
0
0
16 Aug 2024
Rater Cohesion and Quality from a Vicarious Perspective
Deepak Pandita
Tharindu Cyril Weerasooriya
Sujan Dutta
Sarah K. K. Luger
Tharindu Ranasinghe
Ashiqur R. KhudaBukhsh
Marcos Zampieri
Christopher M. Homan
63
1
0
15 Aug 2024
The Future of Open Human Feedback
Shachar Don-Yehiya
Ben Burtenshaw
Ramon Fernandez Astudillo
Cailean Osborne
Mimansa Jaiswal
...
Omri Abend
Jennifer Ding
Sara Hooker
Hannah Rose Kirk
Leshem Choshen
VLM, ALM
92
4
0
15 Aug 2024
mhGPT: A Lightweight Generative Pre-Trained Transformer for Mental Health Text Analysis
Dae-young Kim
Rebecca Hwa
Muhammad Mahbubur Rahman
LM&MA, AI4MH
41
2
0
15 Aug 2024
Graph Retrieval-Augmented Generation: A Survey
Boci Peng
Yun Zhu
Yongchao Liu
Xiaohe Bo
Haizhou Shi
Chuntao Hong
Yan Zhang
Siliang Tang
3DV
113
113
0
15 Aug 2024
ArabLegalEval: A Multitask Benchmark for Assessing Arabic Legal Knowledge in Large Language Models
Faris Hijazi
Somayah Alharbi
Abdulaziz AlHussein
Harethah Shairah
Reem Alzahrani
Hebah Alshamlan
Omar Knio
G. Turkiyyah
AILaw, ELM
87
4
0
15 Aug 2024
Instruct Large Language Models to Generate Scientific Literature Survey Step by Step
Yuxuan Lai
Yupeng Wu
Yidan Wang
Wenpeng Hu
Chen Zheng
120
3
0
15 Aug 2024
The ShareLM Collection and Plugin: Contributing Human-Model Chats for the Benefit of the Community
Shachar Don-Yehiya
Leshem Choshen
Omri Abend
78
2
0
15 Aug 2024
TurboEdit: Instant text-based image editing
Zongze Wu
Nicholas I. Kolkin
Jonathan Brandt
Richard Zhang
Eli Shechtman
DiffM
100
13
0
14 Aug 2024
Large Language Models Know What Makes Exemplary Contexts
Quanyu Long
Jianda Chen
Wenya Wang
Sinno Jialin Pan
102
0
0
14 Aug 2024
Bridging and Modeling Correlations in Pairwise Data for Direct Preference Optimization
Yuxin Jiang
Bo Huang
Yufei Wang
Xingshan Zeng
Liangyou Li
Yasheng Wang
Xin Jiang
Lifeng Shang
Ruiming Tang
Wei Wang
131
7
0
14 Aug 2024
CodeMirage: Hallucinations in Code Generated by Large Language Models
Vibhor Agarwal
Yulong Pei
Salwa Alamir
Xiaomo Liu
86
5
0
14 Aug 2024
Problem Solving Through Human-AI Preference-Based Cooperation
Subhabrata Dutta
Timo Kaufmann
Goran Glavaš
Ivan Habernal
Kristian Kersting
Frauke Kreuter
Mira Mezini
Iryna Gurevych
Eyke Hüllermeier
Hinrich Schuetze
237
2
0
14 Aug 2024
Agent Q: Advanced Reasoning and Learning for Autonomous AI Agents
Pranav Putta
Edmund Mills
Naman Garg
S. Motwani
Chelsea Finn
Divyansh Garg
Rafael Rafailov
LLMAG, LRM
103
88
0
13 Aug 2024
LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs
Yushi Bai
Jiajie Zhang
Xin Lv
Linzhi Zheng
Siqi Zhu
Lei Hou
Yuxiao Dong
Jie Tang
Juanzi Li
VGen, LLMAG, ALM
100
56
0
13 Aug 2024
Evaluating Cultural Adaptability of a Large Language Model via Simulation of Synthetic Personas
Louis Kwok
Michal Bravansky
Lewis D. Griffin
99
15
0
13 Aug 2024
Large language models can consistently generate high-quality content for election disinformation operations
Angus R. Williams
Liam Burke-Moore
Ryan Sze-Yin Chan
Florence E. Enock
Federico Nanni
Tvesha Sippy
Yi-Ling Chung
Evelina Gabasova
Kobi Hackenburg
Jonathan Bright
69
5
0
13 Aug 2024
SparkRA: A Retrieval-Augmented Knowledge Service System Based on Spark Large Language Model
Dayong Wu
Jiaqi Li
Baoxin Wang
Honghong Zhao
Siyuan Xue
...
Li Qian
Bo Wang
Shijin Wang
Zhixiong Zhang
Guoping Hu
RALM
86
1
0
13 Aug 2024
Layerwise Recurrent Router for Mixture-of-Experts
Zihan Qiu
Zeyu Huang
Shuang Cheng
Yizhi Zhou
Zili Wang
Ivan Titov
Jie Fu
MoE
155
2
0
13 Aug 2024
MAQA: Evaluating Uncertainty Quantification in LLMs Regarding Data Uncertainty
Yongjin Yang
Haneul Yoo
Hwaran Lee
165
4
0
13 Aug 2024
Animate, or Inanimate, That is the Question for Large Language Models
Leonardo Ranaldi
Giulia Pucci
Fabio Massimo Zanzotto
61
0
0
12 Aug 2024
FuxiTranyu: A Multilingual Large Language Model Trained with Balanced Data
Haoran Sun
Renren Jin
Shaoyang Xu
Leiyu Pan
Supryadi
...
Lei Yang
Ling Shi
Juesi Xiao
Shaolin Zhu
Deyi Xiong
98
4
0
12 Aug 2024
Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment
Karel D'Oosterlinck
Winnie Xu
Chris Develder
Thomas Demeester
A. Singh
Christopher Potts
Douwe Kiela
Shikib Mehri
80
17
0
12 Aug 2024
Med42-v2: A Suite of Clinical LLMs
Clément Christophe
Praveen K Kanithi
Tathagata Raha
Shadab Khan
Marco AF Pimentel
ELM, LM&MA, AI4MH
86
27
0
12 Aug 2024
Building Decision Making Models Through Language Model Regime
Yu Zhang
Haoxiang Liu
Feijun Jiang
Weihua Luo
Kaifu Zhang
81
0
0
12 Aug 2024
A New Pipeline For Generating Instruction Dataset via RAG and Self Fine-Tuning
Chih-Wei Song
Yu-Kai Lee
Yin-Te Tsai
SyDa, ALM
70
4
0
12 Aug 2024
GFlowNet Training by Policy Gradients
Puhua Niu
Shili Wu
Mingzhou Fan
Xiaoning Qian
145
3
0
12 Aug 2024
Defining Boundaries: A Spectrum of Task Feasibility for Large Language Models
Wenbo Zhang
Zihang Xu
Hengrui Cai
79
1
0
11 Aug 2024
SAGA: A Participant-specific Examination of Story Alternatives and Goal Applicability for a Deeper Understanding of Complex Events
Sai Vallurupalli
Katrin Erk
Francis Ferraro
65
2
0
11 Aug 2024
A Training-Free Framework for Video License Plate Tracking and Recognition with Only One-Shot
Haoxuan Ding
Qi Wang
Junyu Gao
Qiang Li
VLM
76
0
0
11 Aug 2024
Representation Alignment from Human Feedback for Cross-Embodiment Reward Learning from Mixed-Quality Demonstrations
Connor Mattson
Anurag Aribandi
Daniel S. Brown
90
0
0
10 Aug 2024
Your Context Is Not an Array: Unveiling Random Access Limitations in Transformers
MohammadReza Ebrahimi
Sunny Panchal
Roland Memisevic
99
7
0
10 Aug 2024
Trajectory Planning for Teleoperated Space Manipulators Using Deep Reinforcement Learning
Bo Xia
Xianru Tian
Bo Yuan
Zhiheng Li
Bin Liang
Xueqian Wang
77
0
0
10 Aug 2024
Investigating Instruction Tuning Large Language Models on Graphs
Kerui Zhu
Bo-Wei Huang
Bowen Jin
Yizhu Jiao
Ming Zhong
Kevin Chang
Shou-De Lin
Jiawei Han
93
3
0
10 Aug 2024
Path-LLM: A Shortest-Path-based LLM Learning for Unified Graph Representation
Wenbo Shang
Xuliang Zhu
Xin Huang
102
5
0
10 Aug 2024
A Hybrid RAG System with Comprehensive Enhancement on Complex Reasoning
Ye Yuan
Chengwu Liu
Jingyang Yuan
Gongbo Sun
Siqi Li
Ming Zhang
LRM
130
5
0
09 Aug 2024
Node Level Graph Autoencoder: Unified Pretraining for Textual Graph Learning
Wenbin Hu
Huihao Jing
Qi Hu
Haoran Li
Yangqiu Song
SSL, AI4CE
91
0
0
09 Aug 2024