Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2203.02155
Cited By
Training language models to follow instructions with human feedback
4 March 2022
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
Pamela Mishkin
Chong Zhang
Sandhini Agarwal
Katarina Slama
Alex Ray
John Schulman
Jacob Hilton
Fraser Kelton
Luke E. Miller
Maddie Simens
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Training language models to follow instructions with human feedback"
50 / 4,678 papers shown
Title
Enhancing Chart-to-Code Generation in Multimodal Large Language Models via Iterative Dual Preference Learning
Zhihan Zhang
Yixin Cao
Lizi Liao
28
0
0
03 Apr 2025
Prompt Optimization with Logged Bandit Data
Haruka Kiyohara
Daniel Yiming Cao
Yuta Saito
Thorsten Joachims
64
0
0
03 Apr 2025
Inference-Time Scaling for Generalist Reward Modeling
Zijun Liu
P. Wang
Ran Xu
Shirong Ma
Chong Ruan
Peng Li
Yang Liu
Y. Wu
OffRL
LRM
46
13
0
03 Apr 2025
Concept Lancet: Image Editing with Compositional Representation Transplant
Jinqi Luo
Tianjiao Ding
Kwan Ho Ryan Chan
Hancheng Min
Chris Callison-Burch
Rene Vidal
DiffM
KELM
72
0
0
03 Apr 2025
Advancing Semantic Caching for LLMs with Domain-Specific Embeddings and Synthetic Data
Waris Gill
Justin Cechmanek
Tyler Hutcherson
Srijith Rajamohan
Jen Agarwal
Muhammad Ali Gulzar
Manvinder Singh
Benoit Dion
37
0
0
03 Apr 2025
Cultural Learning-Based Culture Adaptation of Language Models
Chen Cecilia Liu
Anna Korhonen
Iryna Gurevych
39
0
0
03 Apr 2025
ANNEXE: Unified Analyzing, Answering, and Pixel Grounding for Egocentric Interaction
Yuejiao Su
Yi Wang
Qiongyang Hu
Chuang Yang
Lap-Pui Chau
47
0
0
02 Apr 2025
Increasing happiness through conversations with artificial intelligence
Joseph Heffner
Chongyu Qin
Martin Chadwick
Chris Knutsen
Christopher Summerfield
Zeb Kurth-Nelson
Robb B. Rutledge
AI4MH
44
0
0
02 Apr 2025
Tasks and Roles in Legal AI: Data Curation, Annotation, and Verification
Allison Koenecke
Jed Stiglitz
David Mimno
Matthew Wilkens
AILaw
ELM
82
0
0
02 Apr 2025
Review, Refine, Repeat: Understanding Iterative Decoding of AI Agents with Dynamic Evaluation and Selection
Souradip Chakraborty
Mohammadreza Pourreza
Ruoxi Sun
Yiwen Song
Nino Scherrer
...
Furong Huang
Amrit Singh Bedi
Ahmad Beirami
Hamid Palangi
Tomas Pfister
53
0
0
02 Apr 2025
Urban Computing in the Era of Large Language Models
Zhonghang Li
Lianghao Xia
Xubin Ren
J. Tang
Tianyi Chen
Yong-mei Xu
Chenyu Huang
83
0
0
02 Apr 2025
Less-to-More Generalization: Unlocking More Controllability by In-Context Generation
Shaojin Wu
Mengqi Huang
Wenxu Wu
Yufeng Cheng
Fei Ding
Qian He
DiffM
58
4
0
02 Apr 2025
PiCo: Jailbreaking Multimodal Large Language Models via
Pi
\textbf{Pi}
Pi
ctorial
Co
\textbf{Co}
Co
de Contextualization
Aofan Liu
Lulu Tang
Ting Pan
Yuguo Yin
Bin Wang
Ao Yang
MLLM
AAML
56
0
0
02 Apr 2025
AdPO: Enhancing the Adversarial Robustness of Large Vision-Language Models with Preference Optimization
Chaohu Liu
Tianyi Gui
Yu Liu
Linli Xu
VLM
AAML
68
1
0
02 Apr 2025
An Illusion of Progress? Assessing the Current State of Web Agents
Tianci Xue
Weijian Qi
Tianneng Shi
Chan Hee Song
Boyu Gou
D. Song
Huan Sun
Yu Su
LLMAG
ELM
Presented at
ResearchTrend Connect | LLMAG
on
21 May 2025
108
4
1
02 Apr 2025
Exploring the Capabilities of LLMs for IMU-based Fine-grained Human Activity Understanding
Lilin Xu
Kaiyuan Hou
Xiaofan Jiang
31
1
0
02 Apr 2025
Misaligned Roles, Misplaced Images: Structural Input Perturbations Expose Multimodal Alignment Blind Spots
Erfan Shayegani
G M Shahariar
Sara Abdali
Lei Yu
Nael B. Abu-Ghazaleh
Yue Dong
AAML
78
0
0
01 Apr 2025
Making Large Language Models Better Reasoners with Orchestrated Streaming Experiences
Xiangyang Liu
Junliang He
Xipeng Qiu
ReLM
LRM
67
0
0
01 Apr 2025
HERA: Hybrid Edge-cloud Resource Allocation for Cost-Efficient AI Agents
Shiyi Liu
Haiying Shen
Shuai Che
Mahdi Ghandi
Mingqin Li
LLMAG
53
0
0
01 Apr 2025
QG-VTC: Question-Guided Visual Token Compression in MLLMs for Efficient VQA
Shuai Li
Jian Xu
Xiao-Hui Li
Chao Deng
Lin-Lin Huang
MQ
43
0
0
01 Apr 2025
Cooper: A Library for Constrained Optimization in Deep Learning
Jose Gallego-Posada
Juan Ramirez
Meraj Hashemizadeh
Simon Lacoste-Julien
ODL
AI4CE
55
0
0
01 Apr 2025
Recitation over Reasoning: How Cutting-Edge Language Models Can Fail on Elementary School-Level Reasoning Problems?
Kai Yan
Yufei Xu
Zhengyin Du
Xuesong Yao
Zihan Wang
Xiaowen Guo
Jiecao Chen
ReLM
ELM
LRM
95
4
0
01 Apr 2025
VNJPTranslate: A comprehensive pipeline for Vietnamese-Japanese translation
Hoang Hai Phan
Nguyen Duc Minh Vu
Nam Dang Phuong
46
0
0
01 Apr 2025
How Difficulty-Aware Staged Reinforcement Learning Enhances LLMs' Reasoning Capabilities: A Preliminary Experimental Study
Yunjie Ji
Sitong Zhao
Xiaoyu Tian
Haotian Wang
Shuaiting Chen
Yiping Peng
Han Zhao
Xiangang Li
LRM
52
2
0
01 Apr 2025
Does "Reasoning" with Large Language Models Improve Recognizing, Generating, and Reframing Unhelpful Thoughts?
Yilin Qi
Dong Won Lee
C. Breazeal
Hae Won Park
LRM
41
0
0
31 Mar 2025
LLMs for Explainable AI: A Comprehensive Survey
Ahsan Bilal
David Ebert
Beiyu Lin
72
1
0
31 Mar 2025
Exploring the Effect of Reinforcement Learning on Video Understanding: Insights from SEED-Bench-R1
Yi Chen
Yuying Ge
Rui Wang
Yixiao Ge
Lu Qiu
Ying Shan
Xihui Liu
ReLM
VLM
OffRL
LRM
64
2
0
31 Mar 2025
Communication-Efficient and Personalized Federated Foundation Model Fine-Tuning via Tri-Matrix Adaptation
Yong Li
Bo Liu
Sheng Huang
Zhe Zhang
Xiaotong Yuan
Richang Hong
46
0
0
31 Mar 2025
Agents
Under
Siege
\textit{Agents Under Siege}
Agents Under Siege
: Breaking Pragmatic Multi-Agent LLM Systems with Optimized Prompt Attacks
Rana Muhammad Shahroz Khan
Zhen Tan
Sukwon Yun
Charles Flemming
Tianlong Chen
AAML
LLMAG
Presented at
ResearchTrend Connect | LLMAG
on
23 Apr 2025
104
3
0
31 Mar 2025
CONGRAD:Conflicting Gradient Filtering for Multilingual Preference Alignment
Jiangnan Li
Thuy-Trang Vu
Christian Herold
Amirhossein Tebbifakhr
Shahram Khadivi
Gholamreza Haffari
33
0
0
31 Mar 2025
Is LLM the Silver Bullet to Low-Resource Languages Machine Translation?
Yewei Song
Lujun Li
Cedric Lothritz
Saad Ezzini
Lama Sleem
Niccolo Gentile
Radu State
Tegawende F. Bissyande
Jacques Klein
52
1
0
31 Mar 2025
Integrating Large Language Models with Human Expertise for Disease Detection in Electronic Health Records
Jie Pan
Seungwon Lee
Cheligeer Cheligeer
Elliot A. Martin
Kiarash Riazi
Hude Quan
Na Li
LM&MA
24
0
0
31 Mar 2025
Consistent Subject Generation via Contrastive Instantiated Concepts
Lee Hsin-Ying
Kelvin Chan
Ming Yang
DiffM
95
0
0
31 Mar 2025
Pay More Attention to the Robustness of Prompt for Instruction Data Mining
Qiang Wang
Dawei Feng
Xu Zhang
Ao Shen
Yang Xu
Bo Ding
H. Wang
AAML
48
0
0
31 Mar 2025
Text2Tracks: Prompt-based Music Recommendation via Generative Retrieval
Enrico Palumbo
Gustavo Penha
Andreas Damianou
José Luis Redondo García
Timothy Christopher Heath
Alice Wang
Hugues Bouchard
M. Lalmas
56
0
0
31 Mar 2025
Order Independence With Finetuning
Katrina Brown
Reid McIlroy
35
0
0
30 Mar 2025
FeRG-LLM : Feature Engineering by Reason Generation Large Language Models
Jeonghyun Ko
Gyeongyun Park
Donghoon Lee
Kyunam Lee
LRM
57
0
0
30 Mar 2025
EagleVision: Object-level Attribute Multimodal LLM for Remote Sensing
Hongxiang Jiang
Jihao Yin
Qixiong Wang
Jiaqi Feng
Guo Chen
53
0
0
30 Mar 2025
An Analysis of Decoding Methods for LLM-based Agents for Faithful Multi-Hop Question Answering
Alexander Murphy
Mohd Sanad Zaki Rizvi
Aden Haussmann
Ping Nie
Guifu Liu
Aryo Pradipta Gema
Pasquale Minervini
52
0
0
30 Mar 2025
When 'YES' Meets 'BUT': Can Large Models Comprehend Contradictory Humor Through Comparative Reasoning?
Tuo Liang
Zhe Hu
Jing Li
Hao Zhang
Yiren Lu
...
Yiran Qiao
Disheng Liu
Jeirui Peng
Jing Ma
Yu Yin
54
0
0
29 Mar 2025
Aurelia: Test-time Reasoning Distillation in Audio-Visual LLMs
Sanjoy Chowdhury
Hanan Gani
Nishit Anand
Sayan Nag
Ruohan Gao
Mohamed Elhoseiny
Salman Khan
Dinesh Manocha
LRM
54
0
0
29 Mar 2025
Learning to Reason for Long-Form Story Generation
Alexander Gurung
Mirella Lapata
ReLM
OffRL
LRM
63
1
0
28 Mar 2025
Preference-based Learning with Retrieval Augmented Generation for Conversational Question Answering
Magdalena Kaiser
Gerhard Weikum
42
0
0
28 Mar 2025
DSO: Aligning 3D Generators with Simulation Feedback for Physical Soundness
Ruining Li
Chuanxia Zheng
Christian Rupprecht
Andrea Vedaldi
39
1
0
28 Mar 2025
RLDBF: Enhancing LLMs Via Reinforcement Learning With DataBase FeedBack
Weichen Dai
Zijie Dai
Zhijie Huang
Yixuan Pan
Xinhe Li
Xi Li
Yi Zhou
Ji Qi
Wu Jiang
24
0
0
28 Mar 2025
Token-Driven GammaTune: Adaptive Calibration for Enhanced Speculative Decoding
Aayush Gautam
Susav Shrestha
Narasimha Annapareddy
53
0
0
28 Mar 2025
The Mind in the Machine: A Survey of Incorporating Psychological Theories in LLMs
Zizhou Liu
Ziwei Gong
Lin Ai
Zheng Hui
Run Chen
Colin Wayne Leach
Michelle R. Greene
Julia Hirschberg
LLMAG
177
0
0
28 Mar 2025
SocialGen: Modeling Multi-Human Social Interaction with Language Models
Heng Yu
Juze Zhang
Changan Chen
Tiange Xiang
Yusu Fang
Juan Carlos Niebles
Ehsan Adeli
VGen
49
0
0
28 Mar 2025
Generating Synthetic Oracle Datasets to Analyze Noise Impact: A Study on Building Function Classification Using Tweets
Shanshan Bai
Anna Kruspe
X. X. Zhu
53
0
0
28 Mar 2025
Exploring Data Scaling Trends and Effects in Reinforcement Learning from Human Feedback
Wei Shen
Guanlin Liu
Zheng Wu
Ruofei Zhu
Qingping Yang
Chao Xin
Yu Yue
Lin Yan
92
8
0
28 Mar 2025
Previous
1
2
3
...
7
8
9
...
92
93
94
Next