ResearchTrend.AI
DoG-Instruct: Towards Premium Instruction-Tuning Data via Text-Grounded Instruction Wrapping

11 September 2023
Yongrui Chen
Haiyun Jiang
Xinting Huang
Shuming Shi
Guilin Qi
    SyDa

Papers citing "DoG-Instruct: Towards Premium Instruction-Tuning Data via Text-Grounded Instruction Wrapping"

10 / 10 papers shown
Instruction-Tuning Data Synthesis from Scratch via Web Reconstruction
Yuxin Jiang
Yijiao Wang
Chuhan Wu
Xinyi Dai
Yan Xu
...
Yucheng Wang
Xin Jiang
Lifeng Shang
Ruiming Tang
Wenjie Wang
138
0
0
22 Apr 2025
MAIN: Mutual Alignment Is Necessary for instruction tuning
Fanyi Yang
Jianfeng Liu
Xinsong Zhang
Haoyu Liu
Xixin Cao
Yuefeng Zhan
H. Sun
Weiwei Deng
Feng Sun
Qi Zhang
ALM
55
0
0
17 Apr 2025
MDIT: A Model-free Data Interpolation Method for Diverse Instruction Tuning
Yangning Li
Zihua Lan
Lv Qingsong
Hai-Tao Zheng
110
0
0
09 Apr 2025
XL-Instruct: Synthetic Data for Cross-Lingual Open-Ended Generation
Vivek Iyer
Ricardo Rei
Pinzhen Chen
Alexandra Birch
SyDa, LM&MA
170
0
0
29 Mar 2025
CRAFT Your Dataset: Task-Specific Synthetic Dataset Generation Through Corpus Retrieval and Augmentation
Ingo Ziegler
Abdullatif Köksal
Desmond Elliott
Hinrich Schütze
78
6
0
03 Sep 2024
SHED: Shapley-Based Automated Dataset Refinement for Instruction Fine-Tuning
Yexiao He
Ziyao Wang
Zheyu Shen
Guoheng Sun
Yucong Dai
Yongkai Wu
Hongyi Wang
Ang Li
94
13
0
23 Apr 2024
A Survey on Data Selection for LLM Instruction Tuning
Bolin Zhang
Jiahao Wang
Qianlong Du
Jiajun Zhang
Zhiying Tu
Dianhui Chu
101
48
0
04 Feb 2024
Quokka: An Open-source Large Language Model ChatBot for Material Science
Xianjun Yang
Stephen D. Wilson
Linda R. Petzold
OSLM
74
2
0
02 Jan 2024
Shadow Alignment: The Ease of Subverting Safely-Aligned Language Models
Xianjun Yang
Xiao Wang
Qi Zhang
Linda R. Petzold
William Y. Wang
Xun Zhao
Dahua Lin
83
190
0
04 Oct 2023
LongForm: Effective Instruction Tuning with Reverse Instructions
Abdullatif Köksal
Timo Schick
Anna Korhonen
Hinrich Schütze
SyDa, ALM
162
40
0
17 Apr 2023