ResearchTrend.AI


Efficient Alignment of Large Language Models via Data Sampling

15 November 2024
Amrit Khera, Rajat Ghosh, Debojyoti Dutta
arXiv: 2411.10545

Papers citing "Efficient Alignment of Large Language Models via Data Sampling"

21 papers shown
Scaling Laws for Reward Model Overoptimization in Direct Alignment Algorithms
Rafael Rafailov, Yaswanth Chittepu, Ryan Park, Harshit S. Sikchi, Joey Hejna, Bradley Knox, Chelsea Finn, S. Niekum
05 Jun 2024

How to Train Data-Efficient LLMs
Noveen Sachdeva, Benjamin Coleman, Wang-Cheng Kang, Jianmo Ni, Lichan Hong, Ed H. Chi, James Caverlee, Julian McAuley, D. Cheng
15 Feb 2024

KTO: Model Alignment as Prospect Theoretic Optimization
Kawin Ethayarajh, Winnie Xu, Niklas Muennighoff, Dan Jurafsky, Douwe Kiela
02 Feb 2024

What Makes Good Data for Alignment? A Comprehensive Study of Automatic Data Selection in Instruction Tuning
Wei Liu, Weihao Zeng, Keqing He, Yong Jiang, Junxian He
25 Dec 2023

ULMA: Unified Language Model Alignment with Human Demonstration and Point-wise Preference
Tianchi Cai, Xierui Song, Jiyan Jiang, Fei Teng, Jinjie Gu, Guannan Zhang
05 Dec 2023

Fine-tuning Language Models for Factuality
Katherine Tian, Eric Mitchell, Huaxiu Yao, Christopher D. Manning, Chelsea Finn
14 Nov 2023

Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback
Stephen Casper, Xander Davies, Claudia Shi, T. Gilbert, Jérémy Scheurer, ..., Erdem Biyik, Anca Dragan, David M. Krueger, Dorsa Sadigh, Dylan Hadfield-Menell
27 Jul 2023

Direct Preference Optimization: Your Language Model is Secretly a Reward Model
Rafael Rafailov, Archit Sharma, E. Mitchell, Stefano Ermon, Christopher D. Manning, Chelsea Finn
29 May 2023

GPT-RE: In-context Learning for Relation Extraction using Large Language Models
Michele Focchi, Fei Cheng, Zhuoyuan Mao, Qianying Liu, Haiyue Song, Jiwei Li, Sadao Kurohashi
03 May 2023

GPT-NER: Named Entity Recognition via Large Language Models
Shuhe Wang, Xiaofei Sun, Xiaoya Li, Rongbin Ouyang, Leilei Gan, Tianwei Zhang, Jiwei Li, Guoyin Wang
20 Apr 2023

ChatIE: Zero-Shot Information Extraction via Chatting with ChatGPT
Xiang Wei, Xingyu Cui, Ning Cheng, Xiaobin Wang, Xin Zhang, ..., Jinan Xu, Meishan Zhang, Yong Jiang, Wenjuan Han
20 Feb 2023

Data pruning and neural scaling laws: fundamental limitations of score-based algorithms
Fadhel Ayed, Soufiane Hayou
14 Feb 2023

Scaling Laws for Reward Model Overoptimization
Leo Gao, John Schulman, Jacob Hilton
19 Oct 2022

DeepCore: A Comprehensive Library for Coreset Selection in Deep Learning
Chengcheng Guo, B. Zhao, Yanbing Bai
18 Apr 2022

Training language models to follow instructions with human feedback
Long Ouyang, Jeff Wu, Xu Jiang, Diogo Almeida, Carroll L. Wainwright, ..., Amanda Askell, Peter Welinder, Paul Christiano, Jan Leike, Ryan J. Lowe
04 Mar 2022

Finetuned Language Models Are Zero-Shot Learners
Jason W. Wei, Maarten Bosma, Vincent Zhao, Kelvin Guu, Adams Wei Yu, Brian Lester, Nan Du, Andrew M. Dai, Quoc V. Le
03 Sep 2021

Sentence-T5: Scalable Sentence Encoders from Pre-trained Text-to-Text Models
Jianmo Ni, Gustavo Hernández Ábrego, Noah Constant, Ji Ma, Keith B. Hall, Daniel Cer, Yinfei Yang
19 Aug 2021

LoRA: Low-Rank Adaptation of Large Language Models
J. E. Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Lu Wang, Weizhu Chen
17 Jun 2021

Don't Stop Pretraining: Adapt Language Models to Domains and Tasks
Suchin Gururangan, Ana Marasović, Swabha Swayamdipta, Kyle Lo, Iz Beltagy, Doug Downey, Noah A. Smith
23 Apr 2020

Proximal Policy Optimization Algorithms
John Schulman, Filip Wolski, Prafulla Dhariwal, Alec Radford, Oleg Klimov
20 Jul 2017

Deep reinforcement learning from human preferences
Paul Christiano, Jan Leike, Tom B. Brown, Miljan Martic, Shane Legg, Dario Amodei
12 Jun 2017