CodecLM: Aligning Language Models with Tailored Synthetic Data
8 April 2024 · arXiv:2404.05875
Zifeng Wang, Chun-Liang Li, Vincent Perot, Long T. Le, Jin Miao, Zizhao Zhang, Chen-Yu Lee, Tomas Pfister
SyDa, ALM

Papers citing "CodecLM: Aligning Language Models with Tailored Synthetic Data" (17 / 17 papers shown)

KodCode: A Diverse, Challenging, and Verifiable Synthetic Dataset for Coding
Zhangchen Xu, Yang Liu, Yueqin Yin, Mingyuan Zhou, Radha Poovendran
ALM, OffRL · 84 · 7 · 0 · 04 Mar 2025

CrowdSelect: Synthetic Instruction Data Selection with Multi-LLM Wisdom
Yisen Li, Lingfeng Yang, Wenxuan Shen, Pan Zhou, Yao Wan, Weiwei Lin, Danny Chen
70 · 0 · 0 · 03 Mar 2025

Building A Proof-Oriented Programmer That Is 64% Better Than GPT-4o Under Data Scarcity
Dylan Zhang, Justin Wang, Tianran Sun
45 · 1 · 0 · 17 Feb 2025

A Domain Adaptation Framework for Speech Recognition Systems with Only Synthetic Data
Minh Tran, Yutong Pang, Debjyoti Paul, Laxmi Pandey, Kevin Jiang, Jinxi Guo, Ke Li, Shun Zhang, X. Zhang, Xin Lei
AI4CE · 36 · 0 · 0 · 21 Jan 2025

LLM Teacher-Student Framework for Text Classification With No Manually Annotated Data: A Case Study in IPTC News Topic Classification
Taja Kuzman, Nikola Ljubesic
72 · 0 · 0 · 29 Nov 2024

Stronger Models are NOT Stronger Teachers for Instruction Tuning
Zhangchen Xu, Fengqing Jiang, Luyao Niu, Bill Yuchen Lin, Radha Poovendran
ALM · 53 · 5 · 0 · 11 Nov 2024

Forewarned is Forearmed: Leveraging LLMs for Data Synthesis through Failure-Inducing Exploration
Qintong Li, Jiahui Gao, Sheng Wang, Renjie Pi, Xueliang Zhao, Chuan Wu, Xin Jiang, Z. Li, Lingpeng Kong
SyDa · 28 · 3 · 0 · 22 Oct 2024

A Comprehensive Evaluation of Cognitive Biases in LLMs
Simon Malberg, Roman Poletukhin, Carolin M. Schuster, Georg Groh
ELM · 34 · 5 · 0 · 20 Oct 2024

Balancing Cost and Effectiveness of Synthetic Data Generation Strategies for LLMs
Yung-Chieh Chan, George Pu, Apaar Shanker, Parth Suresh, Penn Jenks, John Heyer, Sam Denton
SyDa · 29 · 8 · 0 · 29 Sep 2024

Self-play with Execution Feedback: Improving Instruction-following Capabilities of Large Language Models
Guanting Dong, K. Lu, Chengpeng Li, Tingyu Xia, Bowen Yu, Chang Zhou, Jingren Zhou
SyDa, ALM, LRM · 47 · 13 · 0 · 19 Jun 2024

Magpie: Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing
Zhangchen Xu, Fengqing Jiang, Luyao Niu, Yuntian Deng, Radha Poovendran, Yejin Choi, Bill Yuchen Lin
SyDa · 36 · 120 · 0 · 12 Jun 2024

Continual Learning of Large Language Models: A Comprehensive Survey
Haizhou Shi, Zihao Xu, Hengyi Wang, Weiyi Qin, Wenyuan Wang, Yibin Wang, Zifeng Wang, Sayna Ebrahimi, Hao Wang
CLL, KELM, LRM · 52 · 64 · 0 · 25 Apr 2024

Large Language Models for Data Annotation: A Survey
Zhen Tan, Dawei Li, Song Wang, Alimohammad Beigi, Bohan Jiang, Amrita Bhattacharjee, Mansooreh Karami, Wenlin Yao, Lu Cheng, Huan Liu
SyDa · 56 · 50 · 0 · 21 Feb 2024

Distilling Step-by-Step! Outperforming Larger Language Models with Less Training Data and Smaller Model Sizes
Lokesh Nagalapatti, Chun-Liang Li, Chih-Kuan Yeh, Hootan Nakhost, Yasuhisa Fujii, Alexander Ratner, Ranjay Krishna, Chen-Yu Lee, Tomas Pfister
ALM · 220 · 502 · 0 · 03 May 2023

Mixture of Soft Prompts for Controllable Data Generation
Derek Chen, Celine Lee, Yunan Lu, Domenic Rosati, Zhou Yu
114 · 22 · 0 · 02 Mar 2023

Training language models to follow instructions with human feedback
Long Ouyang, Jeff Wu, Xu Jiang, Diogo Almeida, Carroll L. Wainwright, ..., Amanda Askell, Peter Welinder, Paul Christiano, Jan Leike, Ryan J. Lowe
OSLM, ALM · 313 · 11,953 · 0 · 04 Mar 2022

Multitask Prompted Training Enables Zero-Shot Task Generalization
Victor Sanh, Albert Webson, Colin Raffel, Stephen H. Bach, Lintang Sutawika, ..., T. Bers, Stella Biderman, Leo Gao, Thomas Wolf, Alexander M. Rush
LRM · 213 · 1,657 · 0 · 15 Oct 2021