Training language models to follow instructions with human feedback

4 March 2022
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
Pamela Mishkin
Chong Zhang
Sandhini Agarwal
Katarina Slama
Alex Ray
John Schulman
Jacob Hilton
Fraser Kelton
Luke E. Miller
Maddie Simens
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
    OSLM
    ALM
arXiv: 2203.02155

Papers citing "Training language models to follow instructions with human feedback"

50 / 7,311 papers shown
Reliable, Adaptable, and Attributable Language Models with Retrieval
Akari Asai
Zexuan Zhong
Danqi Chen
Pang Wei Koh
Luke Zettlemoyer
Hanna Hajishirzi
Wen-tau Yih
KELM
RALM
49
55
0
05 Mar 2024
"In Dialogues We Learn": Towards Personalized Dialogue Without Pre-defined Profiles through In-Dialogue Learning
Chuanqi Cheng
Quan Tu
Wei Wu
Shuo Shang
Cunli Mao
Zhengtao Yu
Rui Yan
49
2
0
05 Mar 2024
Localized Zeroth-Order Prompt Optimization
Wenyang Hu
Yao Shu
Zongmin Yu
Zhaoxuan Wu
Xiangqiang Lin
Zhongxiang Dai
See-Kiong Ng
Bryan Kian Hsiang Low
35
6
0
05 Mar 2024
Demonstrating Mutual Reinforcement Effect through Information Flow
Chengguang Gan
Xuzheng He
Qinghao Zhang
Tatsunori Mori
24
0
0
05 Mar 2024
Zero-Shot Cross-Lingual Document-Level Event Causality Identification with Heterogeneous Graph Contrastive Transfer Learning
Zhitao He
Pengfei Cao
Zhuoran Jin
Yubo Chen
Kang Liu
Qing Cui
Mengshu Sun
Jun Zhao
34
2
0
05 Mar 2024
Role Prompting Guided Domain Adaptation with General Capability Preserve for Large Language Models
Rui Wang
Fei Mi
Yi Chen
Boyang Xue
Hongru Wang
Qi Zhu
Kam-Fai Wong
Rui-Lan Xu
CLL
43
6
0
05 Mar 2024
InjecAgent: Benchmarking Indirect Prompt Injections in Tool-Integrated Large Language Model Agents
Qiusi Zhan
Zhixiang Liang
Zifan Ying
Daniel Kang
LLMAG
57
76
0
05 Mar 2024
Finetuned Multimodal Language Models Are High-Quality Image-Text Data Filters
Weizhi Wang
Khalil Mrini
Linjie Yang
Sateesh Kumar
Yu Tian
Xifeng Yan
Heng Wang
46
16
0
05 Mar 2024
Modeling Collaborator: Enabling Subjective Vision Classification With Minimal Human Effort via LLM Tool-Use
Imad Eddine Toubal
Aditya Avinash
N. Alldrin
Jan Dlabal
Wenlei Zhou
...
Chun-Ta Lu
Howard Zhou
Ranjay Krishna
Ariel Fuxman
Tom Duerig
VLM
85
7
0
05 Mar 2024
Exploring the Limitations of Large Language Models in Compositional Relation Reasoning
Jinman Zhao
Xueyan Zhang
BDL
LRM
38
4
0
05 Mar 2024
ChatCite: LLM Agent with Human Workflow Guidance for Comparative Literature Summary
Yutong Li
Lu Chen
Aiwei Liu
Kai Yu
Lijie Wen
34
19
0
05 Mar 2024
Alpaca against Vicuna: Using LLMs to Uncover Memorization of LLMs
Aly M. Kassem
Omar Mahmoud
Niloofar Mireshghallah
Hyunwoo J. Kim
Yulia Tsvetkov
Yejin Choi
Sherif Saad
Santu Rana
50
19
0
05 Mar 2024
Enhancing Vision-Language Pre-training with Rich Supervisions
Yuan Gao
Kunyu Shi
Pengkai Zhu
Edouard Belval
Oren Nuriel
Srikar Appalaraju
Shabnam Ghadar
Vijay Mahadevan
Zhuowen Tu
Stefano Soatto
VLM
CLIP
69
12
0
05 Mar 2024
DACO: Towards Application-Driven and Comprehensive Data Analysis via Code Generation
Xueqing Wu
Rui Zheng
Jingzhen Sha
Te-Lin Wu
Hanyu Zhou
Mohan Tang
Kai-Wei Chang
Nanyun Peng
Haoran Huang
55
2
0
04 Mar 2024
Trial and Error: Exploration-Based Trajectory Optimization for LLM Agents
Yifan Song
Da Yin
Xiang Yue
Jie Huang
Sujian Li
Bill Yuchen Lin
45
68
0
04 Mar 2024
Vision-Language Models for Medical Report Generation and Visual Question Answering: A Review
Iryna Hartsock
Ghulam Rasool
54
64
0
04 Mar 2024
RegionGPT: Towards Region Understanding Vision Language Model
Qiushan Guo
Shalini De Mello
Hongxu Yin
Wonmin Byeon
Ka Chun Cheung
Yizhou Yu
Ping Luo
Sifei Liu
VLM
49
35
0
04 Mar 2024
RIFF: Learning to Rephrase Inputs for Few-shot Fine-tuning of Language Models
Saeed Najafi
Alona Fyshe
37
1
0
04 Mar 2024
Vanilla Transformers are Transfer Capability Teachers
Xin Lu
Yanyan Zhao
Bing Qin
MoE
43
0
0
04 Mar 2024
Online Training of Large Language Models: Learn while chatting
Juhao Liang
Ziwei Wang
Zhuoheng Ma
Jianquan Li
Zhiyi Zhang
Xiangbo Wu
Benyou Wang
KELM
44
3
0
04 Mar 2024
CatCode: A Comprehensive Evaluation Framework for LLMs On the Mixture of Code and Text
Zhenru Lin
Yiqun Yao
Yang Yuan
ELM
31
0
0
04 Mar 2024
WebCiteS: Attributed Query-Focused Summarization on Chinese Web Search Results with Citations
Haolin Deng
Chang Wang
Xin Li
Dezhang Yuan
Junlang Zhan
Tianhua Zhou
Jin Ma
Jun Gao
Ruifeng Xu
HILM
66
2
0
04 Mar 2024
Towards Self-Contained Answers: Entity-Based Answer Rewriting in Conversational Search
Ivan Sekulić
K. Balog
Fabio Crestani
KELM
44
4
0
04 Mar 2024
Theoretical Insights for Diffusion Guidance: A Case Study for Gaussian Mixture Models
Yuchen Wu
Minshuo Chen
Zihao Li
Mengdi Wang
Yuting Wei
61
23
0
03 Mar 2024
Leveraging Biomolecule and Natural Language through Multi-Modal Learning: A Survey
Qizhi Pei
Lijun Wu
Kaiyuan Gao
Jinhua Zhu
Yue Wang
Zun Wang
Tao Qin
Rui Yan
AI4CE
62
19
0
03 Mar 2024
Ever-Evolving Memory by Blending and Refining the Past
Seo Hyun Kim
Keummin Ka
Yohan Jo
Seung-won Hwang
Dongha Lee
Jinyoung Yeo
KELM
39
1
0
03 Mar 2024
Barrier Functions Inspired Reward Shaping for Reinforcement Learning
Nilaksh Nilaksh
Abhishek Ranjan
Shreenabh Agrawal
Aayush Jain
Pushpak Jagtap
Shishir Kolathaya
OffRL
49
4
0
03 Mar 2024
What Is Missing in Multilingual Visual Reasoning and How to Fix It
Yueqi Song
Simran Khanuja
Graham Neubig
VLM
LRM
100
6
0
03 Mar 2024
AutoDefense: Multi-Agent LLM Defense against Jailbreak Attacks
Yifan Zeng
Yiran Wu
Xiao Zhang
Huazheng Wang
Qingyun Wu
LLMAG
AAML
42
64
0
02 Mar 2024
DMoERM: Recipes of Mixture-of-Experts for Effective Reward Modeling
Shanghaoran Quan
MoE
OffRL
52
9
0
02 Mar 2024
Balancing Exploration and Exploitation in LLM using Soft RLLF for Enhanced Negation Understanding
Ha-Thanh Nguyen
Ken Satoh
55
3
0
02 Mar 2024
Learn Suspected Anomalies from Event Prompts for Video Anomaly Detection
Chenchen Tao
Chong Wang
Yuexian Zou
Xiaohao Peng
Xiaogang Xu
Jiangbo Qian
42
2
0
02 Mar 2024
DINER: Debiasing Aspect-based Sentiment Analysis with Multi-variable Causal Inference
Jialong Wu
Linhai Zhang
Deyu Zhou
Guoqiang Xu
CML
32
3
0
02 Mar 2024
STAR: Constraint LoRA with Dynamic Active Learning for Data-Efficient Fine-Tuning of Large Language Models
Linhai Zhang
Jialong Wu
Deyu Zhou
Guoqiang Xu
30
4
0
02 Mar 2024
HeteGen: Heterogeneous Parallel Inference for Large Language Models on Resource-Constrained Devices
Xuanlei Zhao
Bin Jia
Hao Zhou
Ziming Liu
Shenggan Cheng
Yang You
34
4
0
02 Mar 2024
MuseGraph: Graph-oriented Instruction Tuning of Large Language Models for Generic Graph Mining
Yanchao Tan
Hang Lv
Xin Huang
Jiawei Zhang
Shiping Wang
Carl Yang
50
11
0
02 Mar 2024
LLaMoCo: Instruction Tuning of Large Language Models for Optimization Code Generation
Zeyuan Ma
Hongshu Guo
Jiacheng Chen
Guojun Peng
Zhiguang Cao
Yining Ma
Yue-jiao Gong
SyDa
ALM
37
17
0
02 Mar 2024
Distilling Text Style Transfer With Self-Explanation From LLMs
Chiyu Zhang
Honglong Cai
Yuezhang Li
Li
Yuexin Wu
Le Hou
Muhammad Abdul-Mageed
52
10
0
02 Mar 2024
LAB: Large-Scale Alignment for ChatBots
Shivchander Sudalairaj
Abhishek Bhandwaldar
Aldo Pareja
Kai Xu
David D. Cox
Akash Srivastava
OSLM
41
29
0
02 Mar 2024
Peacock: A Family of Arabic Multimodal Large Language Models and Benchmarks
Fakhraddin Alwajih
El Moatez Billah Nagoudi
Gagan Bhatia
Abdelrahman Mohamed
Muhammad Abdul-Mageed
VLM
LRM
35
11
0
01 Mar 2024
Attribute Structuring Improves LLM-Based Evaluation of Clinical Text Summaries
Zelalem Gero
Chandan Singh
Yiqing Xie
Sheng Zhang
Tristan Naumann
Jianfeng Gao
Hoifung Poon
ELM
ALM
41
4
0
01 Mar 2024
Predictions from language models for multiple-choice tasks are not robust under variation of scoring methods
Polina Tsvilodub
Hening Wang
Sharon Grosch
Michael Franke
43
8
0
01 Mar 2024
Formulation Comparison for Timeline Construction using LLMs
Kimihiro Hasegawa
Nikhil Kandukuri
Susan Holm
Yukari Yamakawa
Teruko Mitamura
51
0
0
01 Mar 2024
LocalRQA: From Generating Data to Locally Training, Testing, and Deploying Retrieval-Augmented QA Systems
Xiao Yu
Yunan Lu
Zhou Yu
RALM
42
6
0
01 Mar 2024
Mitigating Reversal Curse in Large Language Models via Semantic-aware Permutation Training
Qingyan Guo
Rui Wang
Junliang Guo
Xu Tan
Jiang Bian
Yujiu Yang
LRM
24
5
0
01 Mar 2024
LUCID: LLM-Generated Utterances for Complex and Interesting Dialogues
Joe Stacey
Jianpeng Cheng
John Torr
Tristan Guigue
Joris Driesen
Alexandru Coca
Mark Gaynor
Anders Johannsen
45
3
0
01 Mar 2024
Authors' Values and Attitudes Towards AI-bridged Scalable Personalization of Creative Language Arts
Taewook Kim
Hyomin Han
Eytan Adar
Matthew Kay
John Joon Young Chung
AI4CE
49
16
0
01 Mar 2024
Rethinking Tokenization: Crafting Better Tokenizers for Large Language Models
Jinbiao Yang
LLMAG
105
11
0
01 Mar 2024
Provably Robust DPO: Aligning Language Models with Noisy Feedback
Sayak Ray Chowdhury
Anush Kini
Nagarajan Natarajan
45
58
0
01 Mar 2024
Gradient Cuff: Detecting Jailbreak Attacks on Large Language Models by Exploring Refusal Loss Landscapes
Xiaomeng Hu
Pin-Yu Chen
Tsung-Yi Ho
AAML
34
26
0
01 Mar 2024