ResearchTrend.AI
MDIT: A Model-free Data Interpolation Method for Diverse Instruction Tuning


9 April 2025
Yangning Li
Zihua Lan
Lv Qingsong
Hai-Tao Zheng
Hai-Tao Zheng

Papers citing "MDIT: A Model-free Data Interpolation Method for Diverse Instruction Tuning"

48 / 48 papers shown
One Example Shown, Many Concepts Known! Counterexample-Driven Conceptual Reasoning in Mathematical LLMs
Hai-Tao Zheng
Jiayi Kuang
Haojing Huang
Zhikun Xu
Xinnian Liang
...
Jue Chen
Chao Qu
Ying Shen
Hai-Tao Zheng
Philip S. Yu
LRM
101
2
0
12 Feb 2025
Refine Knowledge of Large Language Models via Adaptive Contrastive Learning
Hai-Tao Zheng
Haojing Huang
Jiayi Kuang
Yangning Li
Shu Guo
Chao Qu
Jue Chen
Hai-Tao Zheng
Ying Shen
Philip S. Yu
CLL
90
5
0
11 Feb 2025
Benchmarking Multimodal Retrieval Augmented Generation with Dynamic VQA Dataset and Self-adaptive Planning Agent
Yangning Li
Hai-Tao Zheng
Xinyu Wang
Yong Jiang
Zhen Zhang
...
Hui Wang
Hai-Tao Zheng
Pengjun Xie
Philip S. Yu
Fei Huang
94
22
0
05 Nov 2024
Recent Advances of Multimodal Continual Learning: A Comprehensive Survey
Dianzhi Yu
Xinni Zhang
Yankai Chen
Aiwei Liu
Yifei Zhang
Philip S. Yu
Irwin King
VLM
CLL
79
12
0
07 Oct 2024
COMMUNITY-CROSS-INSTRUCT: Unsupervised Instruction Generation for Aligning Large Language Models to Online Communities
Zihao He
Rebecca Dorn
Siyi Guo
Minh Duc Hoang Chu
Kristina Lerman
78
8
0
17 Jun 2024
G-DIG: Towards Gradient-based Diverse and High-quality Instruction Data Selection for Machine Translation
Xingyuan Pan
Luyang Huang
Liyan Kang
Zhicheng Liu
Yu Lu
Shanbo Cheng
ALM
92
14
0
21 May 2024
DiffuseMix: Label-Preserving Data Augmentation with Diffusion Models
Khawar Islam
Muhammad Zaigham Zaheer
Arif Mahmood
Karthik Nandakumar
DiffM
50
38
0
05 Apr 2024
Enhance Image Classification via Inter-Class Image Mixup with Diffusion Model
Zhicai Wang
Longhui Wei
Tan Wang
Heyu Chen
Yanbin Hao
Xiang Wang
Xiangnan He
Qi Tian
VLM
DiffM
50
17
0
28 Mar 2024
Let LLMs Take on the Latest Challenges! A Chinese Dynamic Question Answering Benchmark
Zhikun Xu
Hai-Tao Zheng
Ruixue Ding
Xinyu Wang
Boli Chen
Yong Jiang
Hai-Tao Zheng
Wenlian Lu
Pengjun Xie
Fei Huang
82
11
0
29 Feb 2024
Clustering and Ranking: Diversity-preserved Instruction Selection through Expert-aligned Quality Estimation
Yuan Ge
Yilun Liu
Chi Hu
Weibin Meng
Shimin Tao
Xiaofeng Zhao
Hongxia Ma
Li Zhang
Hao Yang
Tong Xiao
ALM
53
34
0
28 Feb 2024
A Survey on Recent Advances in LLM-Based Multi-turn Dialogue Systems
Zihao Yi
Jiarui Ouyang
Yuwen Liu
Tianhao Liao
Zhe Xu
Ying Shen
LLMAG
LRM
94
68
0
28 Feb 2024
Evaluating Robustness of Generative Search Engine on Adversarial Factual Questions
Xuming Hu
Xiaochuan Li
Junzhe Chen
Hai-Tao Zheng
Yangning Li
...
Yasheng Wang
Qun Liu
Lijie Wen
Philip S. Yu
Zhijiang Guo
AAML
ELM
59
4
0
25 Feb 2024
What Makes Good Data for Alignment? A Comprehensive Study of Automatic Data Selection in Instruction Tuning
Wei Liu
Weihao Zeng
Keqing He
Yong Jiang
Junxian He
ALM
94
235
0
25 Dec 2023
Self-Evolved Diverse Data Sampling for Efficient Instruction Tuning
Shengguang Wu
Keming Lu
Benfeng Xu
Junyang Lin
Qi Su
Chang Zhou
SyDa
ALM
32
39
0
14 Nov 2023
Active Instruction Tuning: Improving Cross-Task Generalization by Training on Prompt Sensitive Tasks
Po-Nien Kung
Fan Yin
Di Wu
Kai-Wei Chang
Nanyun Peng
117
43
0
01 Nov 2023
Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning
Mengzhou Xia
Tianyu Gao
Zhiyuan Zeng
Danqi Chen
107
300
0
10 Oct 2023
DoG-Instruct: Towards Premium Instruction-Tuning Data via Text-Grounded Instruction Wrapping
Yongrui Chen
Haiyun Jiang
Xinting Huang
Shuming Shi
Guilin Qi
SyDa
27
11
0
11 Sep 2023
InstructionGPT-4: A 200-Instruction Paradigm for Fine-Tuning MiniGPT-4
Lai Wei
Zihao Jiang
Weiran Huang
Lichao Sun
VLM
MLLM
74
60
0
23 Aug 2023
From Quantity to Quality: Boosting LLM Performance with Self-Guided Data Selection for Instruction Tuning
Ming Li
Yong Zhang
Zhitao Li
Jiuhai Chen
Lichang Chen
Ning Cheng
Jianzong Wang
Dinesh Manocha
Jing Xiao
102
203
0
23 Aug 2023
Instruction Tuning for Large Language Models: A Survey
Shengyu Zhang
Linfeng Dong
Xiaoya Li
Sen Zhang
Xiaofei Sun
...
Jiwei Li
Runyi Hu
Tianwei Zhang
Leilei Gan
Guoyin Wang
LM&MA
75
591
0
21 Aug 2023
SeqGPT: An Out-of-the-box Large Language Model for Open Domain Sequence Understanding
Tianyu Yu
Chengyue Jiang
Chao Lou
Shen Huang
Xiaobin Wang
...
Haitao Zheng
Ningyu Zhang
Pengjun Xie
Fei Huang
Yong Jiang
LRM
98
16
0
21 Aug 2023
A Preliminary Study of the Intrinsic Relationship between Complexity and Alignment
Ying Zhao
Yu Bowen
Binyuan Hui
Haiyang Yu
Fei Huang
Yongbin Li
N. Zhang
81
24
0
10 Aug 2023
MESED: A Multi-modal Entity Set Expansion Dataset with Fine-grained Semantic Classes and Hard Negative Entities
Yongqian Li
Tingwei Lu
Hai-Tao Zheng
Tianyu Yu
Shulin Huang
Haitao Zheng
Rui Zhang
Jun Yuan
74
11
0
27 Jul 2023
Llama 2: Open Foundation and Fine-Tuned Chat Models
Hugo Touvron
Louis Martin
Kevin R. Stone
Peter Albert
Amjad Almahairi
...
Sharan Narang
Aurelien Rodriguez
Robert Stojnic
Sergey Edunov
Thomas Scialom
AI4MH
ALM
280
11,828
0
18 Jul 2023
On the (In)Effectiveness of Large Language Models for Chinese Text Correction
Hai-Tao Zheng
Haojing Huang
Shirong Ma
Yong Jiang
Yongqian Li
F. Zhou
Haitao Zheng
Qingyu Zhou
57
46
0
18 Jul 2023
Enhancing Chat Language Models by Scaling High-quality Instructional Conversations
Ning Ding
Yulin Chen
Bokai Xu
Yujia Qin
Zhi Zheng
Shengding Hu
Zhiyuan Liu
Maosong Sun
Bowen Zhou
ALM
131
533
0
23 May 2023
Vision, Deduction and Alignment: An Empirical Study on Multi-modal Knowledge Graph Alignment
Yongqian Li
Jiaoyan Chen
Hai-Tao Zheng
Yuejia Xiang
Xi Chen
Haitao Zheng
70
25
0
17 Feb 2023
The Flan Collection: Designing Data and Methods for Effective Instruction Tuning
Shayne Longpre
Le Hou
Tu Vu
Albert Webson
Hyung Won Chung
...
Denny Zhou
Quoc V. Le
Barret Zoph
Jason W. Wei
Adam Roberts
ALM
98
669
0
31 Jan 2023
Self-Instruct: Aligning Language Models with Self-Generated Instructions
Yizhong Wang
Yeganeh Kordi
Swaroop Mishra
Alisa Liu
Noah A. Smith
Daniel Khashabi
Hannaneh Hajishirzi
ALM
SyDa
LRM
100
2,212
0
20 Dec 2022
Demystifying Prompts in Language Models via Perplexity Estimation
Hila Gonen
Srini Iyer
Terra Blevins
Noah A. Smith
Luke Zettlemoyer
LRM
115
210
0
08 Dec 2022
Embracing Ambiguity: Improving Similarity-oriented Tasks with Contextual Synonym Knowledge
Yongqian Li
Jiaoyan Chen
Hai-Tao Zheng
Tianyu Yu
Xi Chen
Haitao Zheng
50
14
0
20 Nov 2022
Linguistic Rules-Based Corpus Generation for Native Chinese Grammatical Error Correction
Shirong Ma
Hai-Tao Zheng
Rongyi Sun
Qingyu Zhou
Shulin Huang
...
Ruiyang Liu
Zhongli Li
Yunbo Cao
Haitao Zheng
Ying Shen
58
27
0
19 Oct 2022
Learning from the Dictionary: Heterogeneous Knowledge Guided Fine-tuning for Chinese Spell Checking
Hai-Tao Zheng
Shirong Ma
Qingyu Zhou
Zhongli Li
Li Yangning
Shulin Huang
R. Liu
Chao Li
Yunbo Cao
Haitao Zheng
36
36
0
19 Oct 2022
Automatic Context Pattern Generation for Entity Set Expansion
Hai-Tao Zheng
Shulin Huang
Xinwei Zhang
Qingyu Zhou
Yongqian Li
Ruiyang Liu
Yunbo Cao
Haitao Zheng
Ying Shen
68
23
0
17 Jul 2022
Contrastive Learning with Hard Negative Entities for Entity Set Expansion
Hai-Tao Zheng
Yongqian Li
Yuxin He
Tianyu Yu
Ying Shen
Haitao Zheng
46
32
0
16 Apr 2022
The Past Mistake is the Future Wisdom: Error-driven Contrastive Probability Optimization for Chinese Spell Checking
Hai-Tao Zheng
Qingyu Zhou
Yongqian Li
Zhongli Li
Ruiyang Liu
Rongyi Sun
Zizhen Wang
Chao Li
Yunbo Cao
Haitao Zheng
109
70
0
02 Mar 2022
Are we ready for a new paradigm shift? A Survey on Visual Deep MLP
Ruiyang Liu
Hai-Tao Zheng
Li Tao
Dun Liang
Haitao Zheng
145
99
0
07 Nov 2021
Training Verifiers to Solve Math Word Problems
K. Cobbe
V. Kosaraju
Mohammad Bavarian
Mark Chen
Heewoo Jun
...
Jerry Tworek
Jacob Hilton
Reiichiro Nakano
Christopher Hesse
John Schulman
ReLM
OffRL
LRM
267
4,397
0
27 Oct 2021
Program Synthesis with Large Language Models
Jacob Austin
Augustus Odena
Maxwell Nye
Maarten Bosma
Henryk Michalewski
...
Ellen Jiang
Carrie J. Cai
Michael Terry
Quoc V. Le
Charles Sutton
ELM
AIMat
ReCod
ALM
193
1,948
0
16 Aug 2021
Evaluating Large Language Models Trained on Code
Mark Chen
Jerry Tworek
Heewoo Jun
Qiming Yuan
Henrique Pondé
...
Bob McGrew
Dario Amodei
Sam McCandlish
Ilya Sutskever
Wojciech Zaremba
ELM
ALM
222
5,513
0
07 Jul 2021
Learning Transferable Visual Models From Natural Language Supervision
Alec Radford
Jong Wook Kim
Chris Hallacy
Aditya A. Ramesh
Gabriel Goh
...
Amanda Askell
Pamela Mishkin
Jack Clark
Gretchen Krueger
Ilya Sutskever
CLIP
VLM
871
29,372
0
26 Feb 2021
ResizeMix: Mixing Data with Preserved Object Information and True Labels
Jie Qin
Jiemin Fang
Qian Zhang
Wenyu Liu
Xingang Wang
Xinggang Wang
63
86
0
21 Dec 2020
Puzzle Mix: Exploiting Saliency and Local Statistics for Optimal Mixup
Jang-Hyun Kim
Wonho Choo
Hyun Oh Song
AAML
81
390
0
15 Sep 2020
Measuring Massive Multitask Language Understanding
Dan Hendrycks
Collin Burns
Steven Basart
Andy Zou
Mantas Mazeika
D. Song
Jacob Steinhardt
ELM
RALM
171
4,418
0
07 Sep 2020
Patch-level Neighborhood Interpolation: A General and Effective Graph-based Regularization Strategy
Ke Sun
Bin Yu
Zhouchen Lin
Zhanxing Zhu
110
5
0
21 Nov 2019
CutMix: Regularization Strategy to Train Strong Classifiers with Localizable Features
Sangdoo Yun
Dongyoon Han
Seong Joon Oh
Sanghyuk Chun
Junsuk Choe
Y. Yoo
OOD
609
4,777
0
13 May 2019
Think you have Solved Question Answering? Try ARC, the AI2 Reasoning Challenge
Peter Clark
Isaac Cowhey
Oren Etzioni
Tushar Khot
Ashish Sabharwal
Carissa Schoenick
Oyvind Tafjord
ELM
RALM
LRM
158
2,583
0
14 Mar 2018
mixup: Beyond Empirical Risk Minimization
Hongyi Zhang
Moustapha Cissé
Yann N. Dauphin
David Lopez-Paz
NoLa
273
9,760
0
25 Oct 2017