Learning to summarize from human feedback
arXiv:2009.01325 (v3, latest) · 2 September 2020
Nisan Stiennon, Long Ouyang, Jeff Wu, Daniel M. Ziegler, Ryan J. Lowe, Chelsea Voss, Alec Radford, Dario Amodei, Paul Christiano
Community: ALM
Papers citing "Learning to summarize from human feedback" (showing 50 of 1,548)
Practical Aspects on Solving Differential Equations Using Deep Learning: A Primer
Georgios Is. Detorakis · 21 Aug 2024

QPO: Query-dependent Prompt Optimization via Multi-Loop Offline Reinforcement Learning
Yilun Kong, Hangyu Mao, Qi Zhao, Bin Zhang, Jingqing Ruan, Li Shen, Yongzhe Chang, Xueqian Wang, Rui Zhao, Dacheng Tao · OffRL · 20 Aug 2024

CLIP-DPO: Vision-Language Models as a Source of Preference for Fixing Hallucinations in LVLMs
Yassine Ouali, Adrian Bulat, Brais Martínez, Georgios Tzimiropoulos · VLM, MLLM · 19 Aug 2024

Value Alignment from Unstructured Text
Inkit Padhi, Karthikeyan N. Ramamurthy, P. Sattigeri, Manish Nagireddy, Pierre Dognin, Kush R. Varshney · 19 Aug 2024

Personalizing Reinforcement Learning from Human Feedback with Variational Preference Learning
S. Poddar, Yanming Wan, Hamish Ivison, Abhishek Gupta, Natasha Jaques · 19 Aug 2024

SEAL: Systematic Error Analysis for Value ALignment
Manon Revel, Matteo Cargnelutti, Tyna Eloundou, Greg Leppert · 16 Aug 2024

The Future of Open Human Feedback
Shachar Don-Yehiya, Ben Burtenshaw, Ramon Fernandez Astudillo, Cailean Osborne, Mimansa Jaiswal, ..., Omri Abend, Jennifer Ding, Sara Hooker, Hannah Rose Kirk, Leshem Choshen · VLM, ALM · 15 Aug 2024

Derivative-Free Guidance in Continuous and Discrete Diffusion Models with Soft Value-Based Decoding
Xiner Li, Yulai Zhao, Chenyu Wang, Gabriele Scalia, Gökçen Eraslan, Surag Nair, Tommaso Biancalani, Aviv Regev, Sergey Levine, Masatoshi Uehara · 15 Aug 2024

Problem Solving Through Human-AI Preference-Based Cooperation
Subhabrata Dutta, Timo Kaufmann, Goran Glavaš, Ivan Habernal, Kristian Kersting, Frauke Kreuter, Mira Mezini, Iryna Gurevych, Eyke Hüllermeier, Hinrich Schuetze · 14 Aug 2024

SAGA: A Participant-specific Examination of Story Alternatives and Goal Applicability for a Deeper Understanding of Complex Events
Sai Vallurupalli, Katrin Erk, Francis Ferraro · 11 Aug 2024

Impacts of Darwinian Evolution on Pre-trained Deep Neural Networks
Guodong DU, Runhua Jiang, Senqiao Yang, HaoYang Li, Wei Chen, Keren Li, Sim Kuan Goh, Jing Li · 10 Aug 2024

Listwise Reward Estimation for Offline Preference-based Reinforcement Learning
Heewoong Choi, Sangwon Jung, Hongjoon Ahn, Taesup Moon · OffRL · 08 Aug 2024

On the Generalization of Preference Learning with DPO
Shawn Im, Yixuan Li · 06 Aug 2024

Intermediate direct preference optimization
Atsushi Kojima · 06 Aug 2024

Body of Her: A Preliminary Study on End-to-End Humanoid Agent
Tenglong Ao · LM&Ro · 06 Aug 2024

Development of REGAI: Rubric Enabled Generative Artificial Intelligence
Zach Johnson, Jeremy Straub · 05 Aug 2024

ARCLE: The Abstraction and Reasoning Corpus Learning Environment for Reinforcement Learning
Hosung Lee, Sejin Kim, Seungpil Lee, Sanha Hwang, Jihwan Lee, Byung-Jun Lee, Sundong Kim · LRM · 30 Jul 2024

Meta-Rewarding Language Models: Self-Improving Alignment with LLM-as-a-Meta-Judge
Tianhao Wu, Weizhe Yuan, O. Yu. Golovneva, Jing Xu, Yuandong Tian, Jiantao Jiao, Jason Weston, Sainbayar Sukhbaatar · ALM, KELM, LRM · 28 Jul 2024

Right Now, Wrong Then: Non-Stationary Direct Preference Optimization under Preference Drift
Seongho Son, William Bankes, Sayak Ray Chowdhury, Brooks Paige, Ilija Bogunovic · 26 Jul 2024

Trust or Escalate: LLM Judges with Provable Guarantees for Human Agreement
Jaehun Jung, Faeze Brahman, Yejin Choi · ALM · 25 Jul 2024

Towards Aligning Language Models with Textual Feedback
Sauc Abadal Lloret, Shehzaad Dhuliawala, K. Murugesan, Mrinmaya Sachan · VLM · 24 Jul 2024

Multilingual Fine-Grained News Headline Hallucination Detection
Jiaming Shen, Tianqi Liu, Jialu Liu, Zhen Qin, Jay Pavagadhi, Simon Baumgartner, Michael Bendersky · 22 Jul 2024

ALLaM: Large Language Models for Arabic and English
M Saiful Bari, Yazeed Alnumay, Norah A. Alzahrani, Nouf M. Alotaibi, H. A. Alyahya, ..., Jeril Kuriakose, Abdalghani Abujabal, Nora Al-Twairesh, Areeb Alowisheq, Haidar Khan · 22 Jul 2024

Is user feedback always informative? Retrieval Latent Defending for Semi-Supervised Domain Adaptation without Source Data
Junha Song, Tae Soo Kim, Junha Kim, Gunhee Nam, Thijs Kooi, Jaegul Choo · 22 Jul 2024

Boosting Reward Model with Preference-Conditional Multi-Aspect Synthetic Data Generation
Jiaming Shen, Ran Xu, Yennie Jun, Zhen Qin, Tianqi Liu, Carl Yang, Yi Liang, Simon Baumgartner, Michael Bendersky · SyDa · 22 Jul 2024

Improving Context-Aware Preference Modeling for Language Models
Silviu Pitis, Ziang Xiao, Nicolas Le Roux, Alessandro Sordoni · 20 Jul 2024

Clinical Reading Comprehension with Encoder-Decoder Models Enhanced by Direct Preference Optimization
Md Sultan al Nahian, R. Kavuluru · MedIm, AI4CE · 19 Jul 2024

Decomposed Direct Preference Optimization for Structure-Based Drug Design
Xiwei Cheng, Xiangxin Zhou, Yuwei Yang, Yu Bao, Quanquan Gu · 19 Jul 2024

Data-Centric Human Preference with Rationales for Direct Preference Alignment
H. Just, Ming Jin, Anit Kumar Sahu, Huy Phan, Ruoxi Jia · 19 Jul 2024

Learning Goal-Conditioned Representations for Language Reward Models
Vaskar Nath, Dylan Slack, Jeff Da, Yuntao Ma, Hugh Zhang, Spencer Whitehead, Sean Hendryx · 18 Jul 2024

LLMs as Function Approximators: Terminology, Taxonomy, and Questions for Evaluation
David Schlangen · 18 Jul 2024

Understanding Reference Policies in Direct Preference Optimization
Yixin Liu, Pengfei Liu, Arman Cohan · 18 Jul 2024

DeepClair: Utilizing Market Forecasts for Effective Portfolio Selection
Donghee Choi, Jinkyu Kim, Mogan Gim, Jinho Lee, Jaewoo Kang · 18 Jul 2024

MERLIN: Multimodal Embedding Refinement via LLM-based Iterative Navigation for Text-Video Retrieval-Rerank Pipeline
D. Han, Eunhwan Park, Gisang Lee, Adam Lee, Nojun Kwak · 17 Jul 2024

Satisficing Exploration for Deep Reinforcement Learning
Dilip Arumugam, Saurabh Kumar, Ramki Gummadi, Benjamin Van Roy · 16 Jul 2024

Exploration Unbound
Dilip Arumugam, Wanqiao Xu, Benjamin Van Roy · 16 Jul 2024

SwitchCIT: Switching for Continual Instruction Tuning of Large Language Models
Xinbo Wu, Max Hartman, Vidhata Arjun Jayaraman, Lav Varshney · CLL, LRM · 16 Jul 2024

New Desiderata for Direct Preference Optimization
Xiangkun Hu, Tong He, David Wipf · 12 Jul 2024

Refuse Whenever You Feel Unsafe: Improving Safety in LLMs via Decoupled Refusal Training
Youliang Yuan, Wenxiang Jiao, Wenxuan Wang, Jen-tse Huang, Jiahao Xu, Tian Liang, Pinjia He, Zhaopeng Tu · 12 Jul 2024

SoupLM: Model Integration in Large Language and Multi-Modal Models
Yue Bai, Zichen Zhang, Jiasen Lu, Yun Fu · MoMe · 11 Jul 2024

Grounding and Evaluation for Large Language Models: Practical Challenges and Lessons Learned (Survey)
K. Kenthapadi, M. Sameki, Ankur Taly · HILM, ELM, AILaw · 10 Jul 2024

Self-Recognition in Language Models
Tim R. Davidson, Viacheslav Surkov, V. Veselovsky, Giuseppe Russo, Robert West, Çağlar Gülçehre · PILM · 09 Jul 2024

LIONs: An Empirically Optimized Approach to Align Language Models
Xiao Yu, Qingyang Wu, Yu Li, Zhou Yu · ALM · 09 Jul 2024

Preference-Guided Reinforcement Learning for Efficient Exploration
Guojian Wang, Faguo Wu, Xiao Zhang, Tianyuan Chen, Xuyang Chen, Lin Zhao · 09 Jul 2024

Variational Best-of-N Alignment
Afra Amini, Tim Vieira, Ryan Cotterell · BDL · 08 Jul 2024

Exposing Privacy Gaps: Membership Inference Attack on Preference Data for LLM Alignment
Qizhang Feng, Siva Rajesh Kasa, Santhosh Kumar Kasa, Hyokun Yun, C. Teo, S. Bodapati · 08 Jul 2024

AI Safety in Generative AI Large Language Models: A Survey
Jaymari Chua, Yun Yvonna Li, Shiyi Yang, Chen Wang, Lina Yao · LM&MA · 06 Jul 2024

Towards Enhancing Coherence in Extractive Summarization: Dataset and Experiments with LLMs
Mihir Parmar, Hanieh Deilamsalehy, Franck Dernoncourt, Seunghyun Yoon, Ryan Rossi, Trung Bui · 05 Jul 2024

MJ-Bench: Is Your Multimodal Reward Model Really a Good Judge for Text-to-Image Generation?
Zhaorun Chen, Yichao Du, Zichen Wen, Yiyang Zhou, Chenhang Cui, ..., Jiawei Zhou, Zhuokai Zhao, Rafael Rafailov, Chelsea Finn, Huaxiu Yao · EGVM, MLLM · 05 Jul 2024

Spontaneous Reward Hacking in Iterative Self-Refinement
Jane Pan, He He, Samuel R. Bowman, Shi Feng · 05 Jul 2024