Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2009.01325
Cited By
Learning to summarize from human feedback
2 September 2020
Nisan Stiennon
Long Ouyang
Jeff Wu
Daniel M. Ziegler
Ryan J. Lowe
Chelsea Voss
Alec Radford
Dario Amodei
Paul Christiano
ALM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Learning to summarize from human feedback"
50 / 1,442 papers shown
Title
Problem Solving Through Human-AI Preference-Based Cooperation
Subhabrata Dutta
Timo Kaufmann
Goran Glavaš
Ivan Habernal
Kristian Kersting
Frauke Kreuter
Mira Mezini
Iryna Gurevych
Eyke Hüllermeier
Hinrich Schuetze
98
1
0
14 Aug 2024
SAGA: A Participant-specific Examination of Story Alternatives and Goal Applicability for a Deeper Understanding of Complex Events
Sai Vallurupalli
Katrin Erk
Francis Ferraro
34
1
0
11 Aug 2024
Impacts of Darwinian Evolution on Pre-trained Deep Neural Networks
Guodong Du
Runhua Jiang
Senqiao Yang
HaoYang Li
Wei Chen
Keren Li
S. Goh
Ho-Kin Tang
39
3
0
10 Aug 2024
Listwise Reward Estimation for Offline Preference-based Reinforcement Learning
Heewoong Choi
Sangwon Jung
Hongjoon Ahn
Taesup Moon
OffRL
47
2
0
08 Aug 2024
On the Generalization of Preference Learning with DPO
Shawn Im
Yixuan Li
52
1
0
06 Aug 2024
Intermediate direct preference optimization
Atsushi Kojima
26
0
0
06 Aug 2024
Body of Her: A Preliminary Study on End-to-End Humanoid Agent
Tenglong Ao
LM&Ro
31
1
0
06 Aug 2024
Development of REGAI: Rubric Enabled Generative Artificial Intelligence
Zach Johnson
Jeremy Straub
41
1
0
05 Aug 2024
ARCLE: The Abstraction and Reasoning Corpus Learning Environment for Reinforcement Learning
Hosung Lee
Sejin Kim
Seungpil Lee
Sanha Hwang
Jihwan Lee
Byung-Jun Lee
Sundong Kim
LRM
41
8
0
30 Jul 2024
Meta-Rewarding Language Models: Self-Improving Alignment with LLM-as-a-Meta-Judge
Tianhao Wu
Weizhe Yuan
O. Yu. Golovneva
Jing Xu
Yuandong Tian
Jiantao Jiao
Jason Weston
Sainbayar Sukhbaatar
ALM
KELM
LRM
64
74
0
28 Jul 2024
Right Now, Wrong Then: Non-Stationary Direct Preference Optimization under Preference Drift
Seongho Son
William Bankes
Sayak Ray Chowdhury
Brooks Paige
Ilija Bogunovic
42
4
0
26 Jul 2024
Trust or Escalate: LLM Judges with Provable Guarantees for Human Agreement
Jaehun Jung
Faeze Brahman
Yejin Choi
ALM
44
12
0
25 Jul 2024
Towards Aligning Language Models with Textual Feedback
Sauc Abadal Lloret
S. Dhuliawala
K. Murugesan
Mrinmaya Sachan
VLM
50
1
0
24 Jul 2024
Multilingual Fine-Grained News Headline Hallucination Detection
Jiaming Shen
Tianqi Liu
Jialu Liu
Zhen Qin
Jay Pavagadhi
Simon Baumgartner
Michael Bendersky
56
0
0
22 Jul 2024
ALLaM: Large Language Models for Arabic and English
M Saiful Bari
Yazeed Alnumay
Norah A. Alzahrani
Nouf M. Alotaibi
H. A. Alyahya
...
Jeril Kuriakose
Abdalghani Abujabal
Nora Al-Twairesh
Areeb Alowisheq
Haidar Khan
42
11
0
22 Jul 2024
Is user feedback always informative? Retrieval Latent Defending for Semi-Supervised Domain Adaptation without Source Data
Junha Song
Tae Soo Kim
Junha Kim
Gunhee Nam
Thijs Kooi
Jaegul Choo
53
1
0
22 Jul 2024
Boosting Reward Model with Preference-Conditional Multi-Aspect Synthetic Data Generation
Jiaming Shen
Ran Xu
Yennie Jun
Zhen Qin
Tianqi Liu
Carl Yang
Yi Liang
Simon Baumgartner
Michael Bendersky
SyDa
67
4
0
22 Jul 2024
Improving Context-Aware Preference Modeling for Language Models
Silviu Pitis
Ziang Xiao
Nicolas Le Roux
Alessandro Sordoni
42
8
0
20 Jul 2024
Data-Centric Human Preference Optimization with Rationales
H. Just
Ming Jin
Anit Kumar Sahu
Huy Phan
Ruoxi Jia
54
3
0
19 Jul 2024
Clinical Reading Comprehension with Encoder-Decoder Models Enhanced by Direct Preference Optimization
Md Sultan al Nahian
R. Kavuluru
MedIm
AI4CE
41
0
0
19 Jul 2024
Decomposed Direct Preference Optimization for Structure-Based Drug Design
Xiwei Cheng
Xiangxin Zhou
Yuwei Yang
Yu Bao
Quanquan Gu
43
3
0
19 Jul 2024
Learning Goal-Conditioned Representations for Language Reward Models
Vaskar Nath
Dylan Slack
Jeff Da
Yuntao Ma
Hugh Zhang
Spencer Whitehead
Sean Hendryx
32
0
0
18 Jul 2024
LLMs as Function Approximators: Terminology, Taxonomy, and Questions for Evaluation
David Schlangen
48
1
0
18 Jul 2024
Understanding Reference Policies in Direct Preference Optimization
Yixin Liu
Pengfei Liu
Arman Cohan
44
7
0
18 Jul 2024
DeepClair: Utilizing Market Forecasts for Effective Portfolio Selection
Donghee Choi
Jinkyu Kim
Mogan Gim
Jinho Lee
Jaewoo Kang
38
0
0
18 Jul 2024
MERLIN: Multimodal Embedding Refinement via LLM-based Iterative Navigation for Text-Video Retrieval-Rerank Pipeline
D. Han
Eunhwan Park
Gisang Lee
Adam Lee
Nojun Kwak
42
2
0
17 Jul 2024
Satisficing Exploration for Deep Reinforcement Learning
Dilip Arumugam
Saurabh Kumar
Ramki Gummadi
Benjamin Van Roy
42
1
0
16 Jul 2024
Exploration Unbound
Dilip Arumugam
Wanqiao Xu
Benjamin Van Roy
44
0
0
16 Jul 2024
SwitchCIT: Switching for Continual Instruction Tuning of Large Language Models
Xinbo Wu
Max Hartman
Vidhata Arjun Jayaraman
Lav Varshney
CLL
LRM
39
1
0
16 Jul 2024
Refuse Whenever You Feel Unsafe: Improving Safety in LLMs via Decoupled Refusal Training
Youliang Yuan
Wenxiang Jiao
Wenxuan Wang
Jen-tse Huang
Jiahao Xu
Tian Liang
Pinjia He
Zhaopeng Tu
45
19
0
12 Jul 2024
New Desiderata for Direct Preference Optimization
Xiangkun Hu
Tong He
David Wipf
59
2
0
12 Jul 2024
SoupLM: Model Integration in Large Language and Multi-Modal Models
Yue Bai
Zichen Zhang
Jiasen Lu
Yun Fu
MoMe
35
1
0
11 Jul 2024
Grounding and Evaluation for Large Language Models: Practical Challenges and Lessons Learned (Survey)
K. Kenthapadi
M. Sameki
Ankur Taly
HILM
ELM
AILaw
44
12
0
10 Jul 2024
Self-Recognition in Language Models
Tim R. Davidson
Viacheslav Surkov
V. Veselovsky
Giuseppe Russo
Robert West
Çağlar Gülçehre
PILM
248
2
0
09 Jul 2024
LIONs: An Empirically Optimized Approach to Align Language Models
Xiao Yu
Qingyang Wu
Yu Li
Zhou Yu
ALM
40
3
0
09 Jul 2024
Preference-Guided Reinforcement Learning for Efficient Exploration
Guojian Wang
Faguo Wu
Xiao Zhang
Tianyuan Chen
Xuyang Chen
Lin Zhao
45
0
0
09 Jul 2024
Variational Best-of-N Alignment
Afra Amini
Tim Vieira
Ryan Cotterell
Ryan Cotterell
BDL
43
19
0
08 Jul 2024
Exposing Privacy Gaps: Membership Inference Attack on Preference Data for LLM Alignment
Qizhang Feng
Siva Rajesh Kasa
Santhosh Kumar Kasa
Hyokun Yun
C. Teo
S. Bodapati
92
7
0
08 Jul 2024
AI Safety in Generative AI Large Language Models: A Survey
Jaymari Chua
Yun Yvonna Li
Shiyi Yang
Chen Wang
Lina Yao
LM&MA
47
12
0
06 Jul 2024
Towards Enhancing Coherence in Extractive Summarization: Dataset and Experiments with LLMs
Mihir Parmar
Hanieh Deilamsalehy
Franck Dernoncourt
Seunghyun Yoon
Ryan A. Rossi
Trung Bui
34
2
0
05 Jul 2024
MJ-Bench: Is Your Multimodal Reward Model Really a Good Judge for Text-to-Image Generation?
Zhaorun Chen
Yichao Du
Zichen Wen
Yiyang Zhou
Chenhang Cui
...
Jiawei Zhou
Zhuokai Zhao
Rafael Rafailov
Chelsea Finn
Huaxiu Yao
EGVM
MLLM
65
29
0
05 Jul 2024
Spontaneous Reward Hacking in Iterative Self-Refinement
Jane Pan
He He
Samuel R. Bowman
Shi Feng
40
8
0
05 Jul 2024
HAF-RM: A Hybrid Alignment Framework for Reward Model Training
Shujun Liu
Xiaoyu Shen
Yuhang Lai
Siyuan Wang
Shengbin Yue
Zengfeng Huang
Xuanjing Huang
Zhongyu Wei
31
1
0
04 Jul 2024
Orchestrating LLMs with Different Personalizations
Jin Peng Zhou
Katie Z Luo
Jingwen Gu
Jason Yuan
Kilian Q. Weinberger
Wen Sun
57
2
0
04 Jul 2024
Uncertainty-Guided Optimization on Large Language Model Search Trees
Julia Grosse
Ruotian Wu
Ahmad Rashid
Philipp Hennig
Pascal Poupart
Agustinus Kristiadi
45
1
0
04 Jul 2024
Warm-up Free Policy Optimization: Improved Regret in Linear Markov Decision Processes
Asaf B. Cassel
Aviv A. Rosenberg
43
1
0
03 Jul 2024
Understanding Alignment in Multimodal LLMs: A Comprehensive Study
Elmira Amirloo
J. Fauconnier
Christoph Roesmann
Christian Kerl
Rinu Boney
...
Zirui Wang
Afshin Dehghan
Yinfei Yang
Zhe Gan
Peter Grasch
43
6
0
02 Jul 2024
RLHF Can Speak Many Languages: Unlocking Multilingual Preference Optimization for LLMs
John Dang
Arash Ahmadian
Kelly Marchisio
Julia Kreutzer
Ahmet Üstün
Sara Hooker
47
23
0
02 Jul 2024
CEB: Compositional Evaluation Benchmark for Fairness in Large Language Models
Song Wang
Peng Wang
Tong Zhou
Yushun Dong
Zhen Tan
Jundong Li
CoGe
63
7
0
02 Jul 2024
LLM See, LLM Do: Guiding Data Generation to Target Non-Differentiable Objectives
Luísa Shimabucoro
Sebastian Ruder
Julia Kreutzer
Marzieh Fadaee
Sara Hooker
SyDa
38
5
0
01 Jul 2024
Previous
1
2
3
...
8
9
10
...
27
28
29
Next