2411.10545
Efficient Alignment of Large Language Models via Data Sampling
15 November 2024
Amrit Khera
Rajat Ghosh
Debojyoti Dutta
Papers cited by
"Efficient Alignment of Large Language Models via Data Sampling"
21 / 21 papers shown
Title
Scaling Laws for Reward Model Overoptimization in Direct Alignment Algorithms
Rafael Rafailov
Yaswanth Chittepu
Ryan Park
Harshit S. Sikchi
Joey Hejna
Bradley Knox
Chelsea Finn
S. Niekum
132
69
0
05 Jun 2024
How to Train Data-Efficient LLMs
Noveen Sachdeva
Benjamin Coleman
Wang-Cheng Kang
Jianmo Ni
Lichan Hong
Ed H. Chi
James Caverlee
Julian McAuley
D. Cheng
104
64
0
15 Feb 2024
KTO: Model Alignment as Prospect Theoretic Optimization
Kawin Ethayarajh
Winnie Xu
Niklas Muennighoff
Dan Jurafsky
Douwe Kiela
319
570
0
02 Feb 2024
What Makes Good Data for Alignment? A Comprehensive Study of Automatic Data Selection in Instruction Tuning
Wei Liu
Weihao Zeng
Keqing He
Yong Jiang
Junxian He
ALM
145
239
0
25 Dec 2023
ULMA: Unified Language Model Alignment with Human Demonstration and Point-wise Preference
Tianchi Cai
Xierui Song
Jiyan Jiang
Fei Teng
Jinjie Gu
Guannan Zhang
ALM
94
5
0
05 Dec 2023
Fine-tuning Language Models for Factuality
Katherine Tian
Eric Mitchell
Huaxiu Yao
Christopher D. Manning
Chelsea Finn
KELM
HILM
SyDa
92
185
0
14 Nov 2023
Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback
Stephen Casper
Xander Davies
Claudia Shi
T. Gilbert
Jérémy Scheurer
...
Erdem Biyik
Anca Dragan
David M. Krueger
Dorsa Sadigh
Dylan Hadfield-Menell
ALM
OffRL
162
535
0
27 Jul 2023
Direct Preference Optimization: Your Language Model is Secretly a Reward Model
Rafael Rafailov
Archit Sharma
E. Mitchell
Stefano Ermon
Christopher D. Manning
Chelsea Finn
ALM
405
4,190
0
29 May 2023
GPT-RE: In-context Learning for Relation Extraction using Large Language Models
Zhen Wan
Fei Cheng
Zhuoyuan Mao
Qianying Liu
Haiyue Song
Jiwei Li
Sadao Kurohashi
LRM
117
94
0
03 May 2023
GPT-NER: Named Entity Recognition via Large Language Models
Shuhe Wang
Xiaofei Sun
Xiaoya Li
Rongbin Ouyang
Leilei Gan
Tianwei Zhang
Jiwei Li
Guoyin Wang
115
202
0
20 Apr 2023
ChatIE: Zero-Shot Information Extraction via Chatting with ChatGPT
Xiang Wei
Xingyu Cui
Ning Cheng
Xiaobin Wang
Xin Zhang
...
Jinan Xu
Meishan Zhang
Yong Jiang
Wenjuan Han
143
345
0
20 Feb 2023
Data pruning and neural scaling laws: fundamental limitations of score-based algorithms
Fadhel Ayed
Soufiane Hayou
111
10
0
14 Feb 2023
Scaling Laws for Reward Model Overoptimization
Leo Gao
John Schulman
Jacob Hilton
ALM
137
569
0
19 Oct 2022
DeepCore: A Comprehensive Library for Coreset Selection in Deep Learning
Chengcheng Guo
B. Zhao
Yanbing Bai
OOD
145
143
0
18 Apr 2022
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
1.3K
13,290
0
04 Mar 2022
Finetuned Language Models Are Zero-Shot Learners
Jason W. Wei
Maarten Bosma
Vincent Zhao
Kelvin Guu
Adams Wei Yu
Brian Lester
Nan Du
Andrew M. Dai
Quoc V. Le
ALM
UQCV
393
3,814
0
03 Sep 2021
Sentence-T5: Scalable Sentence Encoders from Pre-trained Text-to-Text Models
Jianmo Ni
Gustavo Hernández Ábrego
Noah Constant
Ji Ma
Keith B. Hall
Daniel Cer
Yinfei Yang
292
569
0
19 Aug 2021
LoRA: Low-Rank Adaptation of Large Language Models
J. E. Hu
Yelong Shen
Phillip Wallis
Zeyuan Allen-Zhu
Yuanzhi Li
Shean Wang
Lu Wang
Weizhu Chen
OffRL
AI4TS
AI4CE
ALM
AIMat
852
10,661
0
17 Jun 2021
Don't Stop Pretraining: Adapt Language Models to Domains and Tasks
Suchin Gururangan
Ana Marasović
Swabha Swayamdipta
Kyle Lo
Iz Beltagy
Doug Downey
Noah A. Smith
VLM
AI4CE
CLL
230
2,454
0
23 Apr 2020
Proximal Policy Optimization Algorithms
John Schulman
Filip Wolski
Prafulla Dhariwal
Alec Radford
Oleg Klimov
OffRL
715
19,378
0
20 Jul 2017
Deep reinforcement learning from human preferences
Paul Christiano
Jan Leike
Tom B. Brown
Miljan Martic
Shane Legg
Dario Amodei
246
3,390
0
12 Jun 2017