Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2311.03301
Cited By
Ziya2: Data-centric Learning is All LLMs Need
6 November 2023
Ruyi Gan
Ziwei Wu
Renliang Sun
Junyu Lu
Xiaojun Wu
Di Zhang
Kunhao Pan
Junqing He
Yuanhe Tian
Ping Yang
Qi Yang
Hao Wang
Jiaxing Zhang
Yan Song
VLM
ALM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Ziya2: Data-centric Learning is All LLMs Need"
24 / 24 papers shown
Title
Chinese-Vicuna: A Chinese Instruction-following Llama-based Model
Chenghao Fan
Zhenyi Lu
Jie Tian
ALM
62
12
0
17 Apr 2025
Preference Leakage: A Contamination Problem in LLM-as-a-judge
Dawei Li
Renliang Sun
Yue Huang
Ming Zhong
Bohan Jiang
Jiawei Han
Wei Wei
Wei Wang
Huan Liu
117
29
0
03 Feb 2025
Training on the Test Task Confounds Evaluation and Emergence
Ricardo Dominguez-Olmedo
Florian E. Dorner
Moritz Hardt
ELM
126
8
1
10 Jul 2024
DCA-Bench: A Benchmark for Dataset Curation Agents
Benhao Huang
Yingzhuo Yu
Jin Huang
Xingjian Zhang
Jiaqi Ma
64
1
0
11 Jun 2024
Baichuan 2: Open Large-scale Language Models
Ai Ming Yang
Bin Xiao
Bingning Wang
Borong Zhang
Ce Bian
...
Youxin Jiang
Yuchen Gao
Yupeng Zhang
Guosheng Dong
Zhiying Wu
ELM
LRM
160
743
0
19 Sep 2023
CMMLU: Measuring massive multitask language understanding in Chinese
Haonan Li
Yixuan Zhang
Fajri Koto
Yifei Yang
Hai Zhao
Yeyun Gong
Nan Duan
Tim Baldwin
ALM
ELM
89
259
0
15 Jun 2023
Zero-Shot Learners for Natural Language Understanding via a Unified Multiple Choice Perspective
Ping Yang
Junjie Wang
Ruyi Gan
Xinyu Zhu
Lin Zhang
Ziwei Wu
Xinyu Gao
Jiaxing Zhang
Tetsuya Sakai
BDL
42
26
0
16 Oct 2022
Towards No.1 in CLUE Semantic Matching Challenge: Pre-trained Language Model Erlangshen with Propensity-Corrected Loss
Junjie Wang
Yuxiang Zhang
Ping Yang
Ruyi Gan
35
2
0
05 Aug 2022
GPT-NeoX-20B: An Open-Source Autoregressive Language Model
Sid Black
Stella Biderman
Eric Hallahan
Quentin G. Anthony
Leo Gao
...
Shivanshu Purohit
Laria Reynolds
J. Tow
Benqi Wang
Samuel Weinbach
164
826
0
14 Apr 2022
Training Verifiers to Solve Math Word Problems
K. Cobbe
V. Kosaraju
Mohammad Bavarian
Mark Chen
Heewoo Jun
...
Jerry Tworek
Jacob Hilton
Reiichiro Nakano
Christopher Hesse
John Schulman
ReLM
OffRL
LRM
274
4,397
0
27 Oct 2021
Yuan 1.0: Large-Scale Pre-trained Language Model in Zero-Shot and Few-Shot Learning
Shaohua Wu
Xudong Zhao
Tong Yu
Rongguo Zhang
C. Shen
...
Feng Li
Hong Zhu
Jiangang Luo
Liang Xu
Xuanwei Zhang
ALM
46
60
0
10 Oct 2021
Deduplicating Training Data Makes Language Models Better
Katherine Lee
Daphne Ippolito
A. Nystrom
Chiyuan Zhang
Douglas Eck
Chris Callison-Burch
Nicholas Carlini
SyDa
356
628
0
14 Jul 2021
Evaluating Large Language Models Trained on Code
Mark Chen
Jerry Tworek
Heewoo Jun
Qiming Yuan
Henrique Pondé
...
Bob McGrew
Dario Amodei
Sam McCandlish
Ilya Sutskever
Wojciech Zaremba
ELM
ALM
224
5,518
0
07 Jul 2021
RoFormer: Enhanced Transformer with Rotary Position Embedding
Jianlin Su
Yu Lu
Shengfeng Pan
Ahmed Murtadha
Bo Wen
Yunfeng Liu
268
2,443
0
20 Apr 2021
The Pile: An 800GB Dataset of Diverse Text for Language Modeling
Leo Gao
Stella Biderman
Sid Black
Laurence Golding
Travis Hoppe
...
Horace He
Anish Thite
Noa Nabeshima
Shawn Presser
Connor Leahy
AIMat
441
2,093
0
31 Dec 2020
Scaling Laws for Neural Language Models
Jared Kaplan
Sam McCandlish
T. Henighan
Tom B. Brown
B. Chess
R. Child
Scott Gray
Alec Radford
Jeff Wu
Dario Amodei
593
4,801
0
23 Jan 2020
ZEN: Pre-training Chinese Text Encoder Enhanced by N-gram Representations
Shizhe Diao
Jiaxin Bai
Yan Song
Tong Zhang
Yonggang Wang
AI4CE
52
135
0
02 Nov 2019
CCNet: Extracting High Quality Monolingual Datasets from Web Crawl Data
Guillaume Wenzek
Marie-Anne Lachaux
Alexis Conneau
Vishrav Chaudhary
Francisco Guzmán
Armand Joulin
Edouard Grave
81
654
0
01 Nov 2019
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
Colin Raffel
Noam M. Shazeer
Adam Roberts
Katherine Lee
Sharan Narang
Michael Matena
Yanqi Zhou
Wei Li
Peter J. Liu
AIMat
419
20,127
0
23 Oct 2019
Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
Mohammad Shoeybi
M. Patwary
Raul Puri
P. LeGresley
Jared Casper
Bryan Catanzaro
MoE
324
1,899
0
17 Sep 2019
RoBERTa: A Robustly Optimized BERT Pretraining Approach
Yinhan Liu
Myle Ott
Naman Goyal
Jingfei Du
Mandar Joshi
Danqi Chen
Omer Levy
M. Lewis
Luke Zettlemoyer
Veselin Stoyanov
AIMat
612
24,431
0
26 Jul 2019
SpanBERT: Improving Pre-training by Representing and Predicting Spans
Mandar Joshi
Danqi Chen
Yinhan Liu
Daniel S. Weld
Luke Zettlemoyer
Omer Levy
136
1,964
0
24 Jul 2019
Neural Machine Translation of Rare Words with Subword Units
Rico Sennrich
Barry Haddow
Alexandra Birch
212
7,735
0
31 Aug 2015
Efficient Estimation of Word Representations in Vector Space
Tomas Mikolov
Kai Chen
G. Corrado
J. Dean
3DV
665
31,489
0
16 Jan 2013
1