ResearchTrend.AI
What Makes Good Data for Alignment? A Comprehensive Study of Automatic Data Selection in Instruction Tuning

25 December 2023
Wei Liu
Weihao Zeng
Keqing He
Yong Jiang
Junxian He
    ALM
arXiv: 2312.15685

Papers citing "What Makes Good Data for Alignment? A Comprehensive Study of Automatic Data Selection in Instruction Tuning"

22 / 172 papers shown
Title
Your Vision-Language Model Itself Is a Strong Filter: Towards High-Quality Instruction Tuning with Data Selection
Ruibo Chen
Yihan Wu
Lichang Chen
Guodong Liu
Qi He
Tianyi Xiong
Chenxi Liu
Junfeng Guo
Heng-Chiao Huang
VLM
23
17
0
19 Feb 2024
Reformatted Alignment
Run-Ze Fan
Xuefeng Li
Haoyang Zou
Junlong Li
Shwai He
Ethan Chern
Jiewen Hu
Pengfei Liu
65
8
0
19 Feb 2024
Enabling Weak LLMs to Judge Response Reliability via Meta Ranking
Zijun Liu
Boqun Kou
Peng Li
Ming Yan
Ji Zhang
Fei Huang
Yang Liu
32
2
0
19 Feb 2024
Selective Reflection-Tuning: Student-Selected Data Recycling for LLM Instruction-Tuning
Ming Li
Lichang Chen
Jiuhai Chen
Shwai He
Jiuxiang Gu
Dinesh Manocha
29
52
0
15 Feb 2024
API Pack: A Massive Multi-Programming Language Dataset for API Call Generation
Zhen Guo
Adriana Meza Soria
Wei Sun
Songlin Yang
Yikang Shen
ELM
ALM
55
1
0
14 Feb 2024
DolphCoder: Echo-Locating Code Large Language Models with Diverse and Multi-Objective Instruction Tuning
Yejie Wang
Keqing He
Guanting Dong
Pei Wang
Weihao Zeng
...
Yutao Mou
Mengdi Zhang
Jingang Wang
Xunliang Cai
Weiran Xu
ALM
28
9
0
14 Feb 2024
Knowledge Editing on Black-box Large Language Models
Xiaoshuai Song
Zhengyang Wang
Keqing He
Guanting Dong
Yutao Mou
Jinxu Zhao
Weiran Xu
KELM
31
2
0
13 Feb 2024
Rethinking Data Selection for Supervised Fine-Tuning
Ming Shen
34
17
0
08 Feb 2024
Long Is More for Alignment: A Simple but Tough-to-Beat Baseline for Instruction Fine-Tuning
Hao Zhao
Maksym Andriushchenko
Francesco Croce
Nicolas Flammarion
ALM
97
44
0
07 Feb 2024
LESS: Selecting Influential Data for Targeted Instruction Tuning
Mengzhou Xia
Sadhika Malladi
Suchin Gururangan
Sanjeev Arora
Danqi Chen
91
193
0
06 Feb 2024
EasyInstruct: An Easy-to-use Instruction Processing Framework for Large Language Models
Yixin Ou
Ningyu Zhang
Honghao Gui
Ziwen Xu
Shuofei Qiao
...
Kangwei Liu
Lei Li
Zhen Bi
Guozhou Zheng
Huajun Chen
SyDa
42
0
0
05 Feb 2024
A Survey on Data Selection for LLM Instruction Tuning
Jiahao Wang
Bolin Zhang
Qianlong Du
Jiajun Zhang
Dianhui Chu
43
42
0
04 Feb 2024
Diversity Measurement and Subset Selection for Instruction Tuning Datasets
Peiqi Wang
Songlin Yang
Zhen Guo
Matt Stallone
Yoon Kim
Polina Golland
Yikang Shen
31
9
0
04 Feb 2024
Weaver: Foundation Models for Creative Writing
Tiannan Wang
Jiamin Chen
Qingrui Jia
Shuai Wang
Ruoyu Fang
...
Xiaohua Xu
Ningyu Zhang
Huajun Chen
Yuchen Eleanor Jiang
Wangchunshu Zhou
35
20
0
30 Jan 2024
Data Diversity Matters for Robust Instruction Tuning
Alexander Bukharin
Tuo Zhao
81
36
0
21 Nov 2023
How Abilities in Large Language Models are Affected by Supervised Fine-tuning Data Composition
Guanting Dong
Hongyi Yuan
Keming Lu
Chengpeng Li
Mingfeng Xue
Dayiheng Liu
Wei Wang
Zheng Yuan
Chang Zhou
Jingren Zhou
LRM
CLL
34
121
0
09 Oct 2023
JsonTuning: Towards Generalizable, Robust, and Controllable Instruction Tuning
Chang Gao
Wenxuan Zhang
Guizhen Chen
Wai Lam
55
5
0
04 Oct 2023
Error Norm Truncation: Robust Training in the Presence of Data Noise for Text Generation Models
Tianjian Li
Haoran Xu
Philipp Koehn
Daniel Khashabi
Kenton W. Murray
38
4
0
02 Oct 2023
Think-on-Graph: Deep and Responsible Reasoning of Large Language Model on Knowledge Graph
Jiashuo Sun
Chengjin Xu
Lumingyuan Tang
Sai Wang
Chen Lin
Yeyun Gong
Lionel M. Ni
H. Shum
Jian Guo
LRM
38
63
0
15 Jul 2023
Large Language Model Instruction Following: A Survey of Progresses and Challenges
Renze Lou
Kai Zhang
Wenpeng Yin
ALM
LRM
32
20
0
18 Mar 2023
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
351
12,003
0
04 Mar 2022
ZeRO-Offload: Democratizing Billion-Scale Model Training
Jie Ren
Samyam Rajbhandari
Reza Yazdani Aminabadi
Olatunji Ruwase
Shuangyang Yang
Minjia Zhang
Dong Li
Yuxiong He
MoE
177
416
0
18 Jan 2021