2203.02155
Cited By
Training language models to follow instructions with human feedback
4 March 2022
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
Pamela Mishkin
Chong Zhang
Sandhini Agarwal
Katarina Slama
Alex Ray
John Schulman
Jacob Hilton
Fraser Kelton
Luke E. Miller
Maddie Simens
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
Papers citing
"Training language models to follow instructions with human feedback"
50 / 6,392 papers shown
Title
RSTeller: Scaling Up Visual Language Modeling in Remote Sensing with Rich Linguistic Semantics from Openly Available Data and Large Language Models
Junyao Ge
Xu Zhang
Yang Zheng
Kaitai Guo
Jimin Liang
183
2
0
27 Aug 2024
Bi-Factorial Preference Optimization: Balancing Safety-Helpfulness in Language Models
Wenxuan Zhang
Philip Torr
Mohamed Elhoseiny
Adel Bibi
216
15
0
27 Aug 2024
Revisiting Image Captioning Training Paradigm via Direct CLIP-based Optimization
Nicholas Moratelli
Davide Caffagni
Marcella Cornia
Lorenzo Baraldi
Rita Cucchiara
CLIP
100
3
0
26 Aug 2024
Probing Causality Manipulation of Large Language Models
Chenyang Zhang
Haibo Tong
Bin Zhang
Dongyu Zhang
LRM
75
0
0
26 Aug 2024
Text3DAug -- Prompted Instance Augmentation for LiDAR Perception
Laurenz Reichardt
Luca Uhr
Oliver Wasenmüller
120
4
0
26 Aug 2024
LMM-VQA: Advancing Video Quality Assessment with Large Multimodal Models
Qihang Ge
Wei Sun
Yu Zhang
Yunhao Li
Zhongpeng Ji
Fengyu Sun
Shangling Jui
Xiongkuo Min
Guangtao Zhai
98
7
0
26 Aug 2024
TF-Attack: Transferable and Fast Adversarial Attacks on Large Language Models
Zelin Li
Kehai Chen
Lemao Liu
Xuefeng Bai
Mingming Yang
Yang Xiang
Min Zhang
AAML
89
1
0
26 Aug 2024
Genetic Approach to Mitigate Hallucination in Generative IR
Hrishikesh Kulkarni
Nazli Goharian
O. Frieder
Sean MacAvaney
HILM
62
2
0
25 Aug 2024
Making Large Language Models Better Planners with Reasoning-Decision Alignment
Zhijian Huang
Tao Tang
Shaoxiang Chen
Sihao Lin
Zequn Jie
Lin Ma
Guangrun Wang
Xiaodan Liang
153
15
0
25 Aug 2024
CodeGraph: Enhancing Graph Reasoning of LLMs with Code
Qiaolong Cai
Zhaowei Wang
Shizhe Diao
James Kwok
Yangqiu Song
LRM
119
4
0
25 Aug 2024
Mask-Encoded Sparsification: Mitigating Biased Gradients in Communication-Efficient Split Learning
Wenxuan Zhou
Zhihao Qu
Shen-Huan Lyu
Miao Cai
Baoliu Ye
115
0
0
25 Aug 2024
A Law of Next-Token Prediction in Large Language Models
Hangfeng He
Weijie J. Su
93
7
0
24 Aug 2024
LlamaDuo: LLMOps Pipeline for Seamless Migration from Service LLMs to Small-Scale Local LLMs
Chansung Park
Juyong Jiang
Fan Wang
Sayak Paul
Jing Tang
121
2
0
24 Aug 2024
A New Era in Computational Pathology: A Survey on Foundation and Vision-Language Models
Dibaloke Chanda
Milan Aryal
Nasim Yahya Soltani
Masoud Ganji
AI4CE
VLM
143
7
0
23 Aug 2024
Internal and External Knowledge Interactive Refinement Framework for Knowledge-Intensive Question Answering
Haowei Du
Dongyan Zhao
KELM
53
0
0
23 Aug 2024
What Do You Want? User-centric Prompt Generation for Text-to-image Synthesis via Multi-turn Guidance
Yilun Liu
Minggui He
Feiyu Yao
Yuhe Ji
Shimin Tao
...
Jian Gao
Li Zhang
Hao Yang
Boxing Chen
Osamu Yoshie
82
5
0
23 Aug 2024
CLLMFS: A Contrastive Learning enhanced Large Language Model Framework for Few-Shot Named Entity Recognition
Yafeng Zhang
Zilan Yu
Yuang Huang
Jing Tang
73
2
0
23 Aug 2024
Systematic Evaluation of LLM-as-a-Judge in LLM Alignment Tasks: Explainable Metrics and Diverse Prompt Templates
Hui Wei
Shenghua He
Tian Xia
Andy H. Wong
Jingyang Lin
Mei Han
ALM
ELM
203
32
0
23 Aug 2024
Can LLMs Understand Social Norms in Autonomous Driving Games?
Boxuan Wang
Haonan Duan
Yanhao Feng
Xu Chen
Yongjie Fu
Zhaobin Mo
Xuan Di
85
4
0
22 Aug 2024
SAM-SP: Self-Prompting Makes SAM Great Again
Chunpeng Zhou
Kangjie Ning
Qianqian Shen
Sheng Zhou
Zhi Yu
Haishuai Wang
VLM
82
3
0
22 Aug 2024
Enhanced Fine-Tuning of Lightweight Domain-Specific Q&A Model Based on Large Language Models
Shenglin Zhang
Pengtian Zhu
Minghua Ma
Jiagang Wang
Yongqian Sun
...
Jingyu Wang
Qianying Guo
Xiaolei Hua
Lin Zhu
Dan Pei
AI4TS
51
0
0
22 Aug 2024
RoVRM: A Robust Visual Reward Model Optimized via Auxiliary Textual Preference Data
Chenglong Wang
Yang Gan
Yifu Huo
Yongyu Mu
Murun Yang
...
Chunliang Zhang
Tongran Liu
Quan Du
Di Yang
Jingbo Zhu
VLM
180
6
0
22 Aug 2024
Critique-out-Loud Reward Models
Zachary Ankner
Mansheej Paul
Brandon Cui
Jonathan D. Chang
Prithviraj Ammanabrolu
ALM
LRM
110
38
0
21 Aug 2024
Macformer: Transformer with Random Maclaurin Feature Attention
Yuhan Guo
Lizhong Ding
Ye Yuan
Guoren Wang
124
0
0
21 Aug 2024
CIPHER: Cybersecurity Intelligent Penetration-testing Helper for Ethical Researcher
Derry Pratama
Naufal Suryanto
Andro Aprila Adiputra
Thi-Thu-Huong Le
Ahmada Yusril Kadiptya
Muhammad Iqbal
Howon Kim
82
9
0
21 Aug 2024
Xinyu: An Efficient LLM-based System for Commentary Generation
Yiquan Wu
Bo Tang
Chenyang Xi
Yu Yu
Pengyu Wang
...
Peng Cheng
Zhonghao Wang
Yi Wang
Yi Luo
Mingchuan Yang
81
3
0
21 Aug 2024
Cause-Aware Empathetic Response Generation via Chain-of-Thought Fine-Tuning
Xinhao Chen
Chong Yang
Man Lan
Li Cai
Yang Chen
Tu Hu
Xinlin Zhuang
Aimin Zhou
LRM
77
3
0
21 Aug 2024
EEG-Defender: Defending against Jailbreak through Early Exit Generation of Large Language Models
Chongwen Zhao
Zhihao Dou
Kaizhu Huang
AAML
69
3
0
21 Aug 2024
Making Large Vision Language Models to be Good Few-shot Learners
Fan Liu
Wenwen Cai
Jian Huo
Chuanyi Zhang
Delong Chen
Jun Zhou
89
0
0
21 Aug 2024
RePair: Automated Program Repair with Process-based Feedback
Yuze Zhao
Zhenya Huang
Yixiao Ma
Rui Li
Kai Zhang
Hao Jiang
Qi Liu
Linbo Zhu
Yu Su
KELM
85
9
0
21 Aug 2024
Beyond Labels: Aligning Large Language Models with Human-like Reasoning
Muhammad Rafsan Kabir
Rafeed Mohammad Sultan
Ihsanul Haque Asif
Jawad Ibn Ahad
Fuad Rahman
Mohammad Ruhul Amin
Nabeel Mohammed
Shafin Rahman
LRM
89
2
0
20 Aug 2024
CHECKWHY: Causal Fact Verification via Argument Structure
Jiasheng Si
Yibo Zhao
Yingjie Zhu
Haiyang Zhu
Wenpeng Lu
Deyu Zhou
CML
HILM
LRM
123
5
0
20 Aug 2024
To Code, or Not To Code? Exploring Impact of Code in Pre-training
Viraat Aryabumi
Yixuan Su
Raymond Ma
Adrien Morisot
Ivan Zhang
Acyr Locatelli
Marzieh Fadaee
Ahmet Üstün
Sara Hooker
SyDa
AI4CE
105
26
0
20 Aug 2024
Soda-Eval: Open-Domain Dialogue Evaluation in the age of LLMs
John Mendonça
Isabel Trancoso
A. Lavie
ALM
91
3
0
20 Aug 2024
HMoE: Heterogeneous Mixture of Experts for Language Modeling
An Wang
Xingwu Sun
Ruobing Xie
Shuaipeng Li
Jiaqi Zhu
...
J. N. Han
Zhanhui Kang
Di Wang
Naoaki Okazaki
Cheng-zhong Xu
MoE
127
18
0
20 Aug 2024
Probing the Safety Response Boundary of Large Language Models via Unsafe Decoding Path Generation
Haoyu Wang
Bingzhe Wu
Yatao Bian
Yongzhe Chang
Xueqian Wang
Peilin Zhao
146
2
0
20 Aug 2024
REInstruct: Building Instruction Data from Unlabeled Corpus
Shu Chen
Xinyan Guan
Yaojie Lu
Hongyu Lin
Xianpei Han
Le Sun
ALM
SyDa
59
3
0
20 Aug 2024
Minor SFT loss for LLM fine-tune to increase performance and reduce model deviation
Shiming Xie
Hong Chen
Fred Yu
Zeye Sun
Xiuyu Wu
58
0
0
20 Aug 2024
SZTU-CMU at MER2024: Improving Emotion-LLaMA with Conv-Attention for Multimodal Emotion Recognition
Zebang Cheng
Shuyuan Tu
Dawei Huang
Minghan Li
Xiaojiang Peng
Zhi-Qi Cheng
Alexander G. Hauptmann
145
2
0
20 Aug 2024
LeCov: Multi-level Testing Criteria for Large Language Models
Xuan Xie
Jiayang Song
Yuheng Huang
Da Song
Fuyuan Zhang
Felix Juefei-Xu
Lei Ma
ELM
99
0
0
20 Aug 2024
QPO: Query-dependent Prompt Optimization via Multi-Loop Offline Reinforcement Learning
Yilun Kong
Hangyu Mao
Qi Zhao
Bin Zhang
Jingqing Ruan
Li Shen
Yongzhe Chang
Xueqian Wang
Rui Zhao
Dacheng Tao
OffRL
137
2
0
20 Aug 2024
Task-level Distributionally Robust Optimization for Large Language Model-based Dense Retrieval
Guangyuan Ma
Yongliang Ma
Xing Wu
Zhenpeng Su
Ming Zhou
Songlin Hu
OOD
203
3
0
20 Aug 2024
CLIP-DPO: Vision-Language Models as a Source of Preference for Fixing Hallucinations in LVLMs
Yassine Ouali
Adrian Bulat
Brais Martínez
Georgios Tzimiropoulos
VLM
MLLM
113
25
0
19 Aug 2024
Value Alignment from Unstructured Text
Inkit Padhi
Karthikeyan N. Ramamurthy
P. Sattigeri
Manish Nagireddy
Pierre Dognin
Kush R. Varshney
93
0
0
19 Aug 2024
GLIMMER: Incorporating Graph and Lexical Features in Unsupervised Multi-Document Summarization
Ran Liu
Ming Liu
Min Yu
Jianguo Jiang
Gang Li
Dan Zhang
Jingyuan Li
Xiang Meng
Weiqing Huang
53
0
0
19 Aug 2024
Minor DPO reject penalty to increase training robustness
Shiming Xie
Hong Chen
Fred Yu
Zeye Sun
Xiuyu Wu
Yingfan Hu
75
4
0
19 Aug 2024
Are Large Language Models More Honest in Their Probabilistic or Verbalized Confidence?
Shiyu Ni
Keping Bi
Lulu Yu
Jiafeng Guo
HILM
90
8
0
19 Aug 2024
Antidote: Post-fine-tuning Safety Alignment for Large Language Models against Harmful Fine-tuning
Tiansheng Huang
Gautam Bhattacharya
Pratik Joshi
Josh Kimball
Ling Liu
AAML
MoMe
111
30
0
18 Aug 2024
Revisiting the Graph Reasoning Ability of Large Language Models: Case Studies in Translation, Connectivity and Shortest Path
Xinnan Dai
Qihao Wen
Yifei Shen
Hongzhi Wen
Dongsheng Li
Jiliang Tang
Caihua Shan
LRM
126
4
0
18 Aug 2024
Characterizing and Evaluating the Reliability of LLMs against Jailbreak Attacks
Kexin Chen
Yi Liu
Donghai Hong
Jiaying Chen
Wenhai Wang
74
3
0
18 Aug 2024