arXiv:2110.04725
Yuan 1.0: Large-Scale Pre-trained Language Model in Zero-Shot and Few-Shot Learning
10 October 2021
Shaohua Wu
Xudong Zhao
Tong Yu
Rongguo Zhang
C. Shen
Hongli Liu
Feng Li
Hong Zhu
Jiangang Luo
Liang Xu
Xuanwei Zhang
ALM
Papers citing "Yuan 1.0: Large-Scale Pre-trained Language Model in Zero-Shot and Few-Shot Learning"
34 / 34 papers shown
Title
ChineseWebText 2.0: Large-Scale High-quality Chinese Web Text with Multi-dimensional and fine-grained information
Wanyue Zhang
Ziyong Li
Wen Yang
Chunlin Leng
Yinan Bai
Qianlong Du
Chengqing Zong
Jiajun Zhang
66
0
0
29 Nov 2024
DecorateLM: Data Engineering through Corpus Rating, Tagging, and Editing with Language Models
Ranchi Zhao
Zhen Leng Thai
Yifan Zhang
Shengding Hu
Yunqi Ba
Jie Zhou
Jie Cai
Zhiyuan Liu
Maosong Sun
36
1
0
08 Oct 2024
Navigating the Landscape of Large Language Models: A Comprehensive Review and Analysis of Paradigms and Fine-Tuning Strategies
Benjue Weng
LM&MA
46
7
0
13 Apr 2024
LHMKE: A Large-scale Holistic Multi-subject Knowledge Evaluation Benchmark for Chinese Large Language Models
Chuang Liu
Renren Jin
Yuqi Ren
Deyi Xiong
ELM
37
0
0
19 Mar 2024
Exploring ChatGPT for Next-generation Information Retrieval: Opportunities and Challenges
Yizheng Huang
Jimmy X. Huang
35
10
0
17 Feb 2024
DolphCoder: Echo-Locating Code Large Language Models with Diverse and Multi-Objective Instruction Tuning
Yejie Wang
Keqing He
Guanting Dong
Pei Wang
Weihao Zeng
...
Yutao Mou
Mengdi Zhang
Jingang Wang
Xunliang Cai
Weiran Xu
ALM
26
9
0
14 Feb 2024
TeleChat Technical Report
Zhongjiang He
Zihan Wang
Xinzhan Liu
Shixuan Liu
Yitong Yao
...
Zilu Huang
Sishi Xiong
Yuxiang Zhang
Chao Wang
Shuangyong Song
AI4MH
LRM
ALM
60
3
0
08 Jan 2024
YUAN 2.0: A Large Language Model with Localized Filtering-based Attention
Shaohua Wu
Xudong Zhao
Shenling Wang
Jiangang Luo
Lingjun Li
...
Wei Wang
Tong Yu
Rongguo Zhang
Jiahua Zhang
Chao Wang
OSLM
48
6
0
27 Nov 2023
Who is leading in AI? An analysis of industry AI research
Ben Cottier
T. Besiroglu
David Owen
33
7
0
24 Nov 2023
Oasis: Data Curation and Assessment System for Pretraining of Large Language Models
Tong Zhou
Yubo Chen
Pengfei Cao
Kang Liu
Jun Zhao
Shengping Liu
29
3
0
21 Nov 2023
Ziya2: Data-centric Learning is All LLMs Need
Ruyi Gan
Ziwei Wu
Renliang Sun
Junyu Lu
Xiaojun Wu
...
Ping Yang
Qi Yang
Hao Wang
Jiaxing Zhang
Yan Song
VLM
ALM
23
16
0
06 Nov 2023
How Abilities in Large Language Models are Affected by Supervised Fine-tuning Data Composition
Guanting Dong
Hongyi Yuan
Keming Lu
Chengpeng Li
Mingfeng Xue
Dayiheng Liu
Wei Wang
Zheng Yuan
Chang Zhou
Jingren Zhou
LRM
CLL
32
121
0
09 Oct 2023
A Survey of Techniques for Optimizing Transformer Inference
Krishna Teja Chitty-Venkata
Sparsh Mittal
M. Emani
V. Vishwanath
Arun Somani
43
62
0
16 Jul 2023
A Comprehensive Overview of Large Language Models
Humza Naveed
Asad Ullah Khan
Shi Qiu
Muhammad Saqib
Saeed Anwar
Muhammad Usman
Naveed Akhtar
Nick Barnes
Ajmal Saeed Mian
OffRL
70
525
0
12 Jul 2023
M3KE: A Massive Multi-Level Multi-Subject Knowledge Evaluation Benchmark for Chinese Large Language Models
Chuang Liu
Renren Jin
Yuqi Ren
Linhao Yu
Tianyu Dong
...
Peiyi Zhang
Qingqing Lyu
Xiaowen Su
Qun Liu
Deyi Xiong
ELM
ALM
16
24
0
17 May 2023
When Gradient Descent Meets Derivative-Free Optimization: A Match Made in Black-Box Scenario
Chengcheng Han
Liqing Cui
Renyu Zhu
J. Wang
Nuo Chen
Qiushi Sun
Xiang Li
Ming Gao
33
7
0
17 May 2023
Cerebras-GPT: Open Compute-Optimal Language Models Trained on the Cerebras Wafer-Scale Cluster
Nolan Dey
Gurpreet Gosal
Zhiming Chen
Hemant Khachane
William Marshall
Ribhu Pathria
Marvin Tom
Joel Hestness
MoE
LRM
25
98
0
06 Apr 2023
REPLUG: Retrieval-Augmented Black-Box Language Models
Weijia Shi
Sewon Min
Michihiro Yasunaga
Minjoon Seo
Rich James
M. Lewis
Luke Zettlemoyer
Wen-tau Yih
RALM
VLM
KELM
59
580
0
30 Jan 2023
ATP: Adaptive Tensor Parallelism for Foundation Models
Shenggan Cheng
Ziming Liu
Jiangsu Du
Yang You
21
6
0
20 Jan 2023
Changes from Classical Statistics to Modern Statistics and Data Science
Kai Zhang
Shan-Yu Liu
M. Xiong
31
0
0
30 Oct 2022
What Language Model to Train if You Have One Million GPU Hours?
Teven Le Scao
Thomas Wang
Daniel Hesslow
Lucile Saulnier
Stas Bekman
...
Lintang Sutawika
Jaesung Tae
Zheng-Xin Yong
Julien Launay
Iz Beltagy
MoE
AI4CE
230
103
0
27 Oct 2022
PQLM -- Multilingual Decentralized Portable Quantum Language Model for Privacy Protection
Shuyue Stella Li
Xiangyu Zhang
Shu Zhou
Hongchao Shu
Ruixing Liang
Hexin Liu
Leibny Paola García
40
22
0
06 Oct 2022
GLM-130B: An Open Bilingual Pre-trained Model
Aohan Zeng
Xiao Liu
Zhengxiao Du
Zihan Wang
Hanyu Lai
...
Jidong Zhai
Wenguang Chen
Peng-Zhen Zhang
Yuxiao Dong
Jie Tang
BDL
LRM
250
1,073
0
05 Oct 2022
WeLM: A Well-Read Pre-trained Language Model for Chinese
Hui Su
Xiao Zhou
Houjin Yu
Xiaoyu Shen
Yuwen Chen
Zilin Zhu
Yang Yu
Jie Zhou
34
23
0
21 Sep 2022
Machine Learning Model Sizes and the Parameter Gap
Pablo Villalobos
J. Sevilla
T. Besiroglu
Lennart Heim
A. Ho
Marius Hobbhahn
ALM
ELM
AI4CE
30
58
0
05 Jul 2022
On the Role of Bidirectionality in Language Model Pre-Training
Mikel Artetxe
Jingfei Du
Naman Goyal
Luke Zettlemoyer
Ves Stoyanov
24
16
0
24 May 2022
What Language Model Architecture and Pretraining Objective Work Best for Zero-Shot Generalization?
Thomas Wang
Adam Roberts
Daniel Hesslow
Teven Le Scao
Hyung Won Chung
Iz Beltagy
Julien Launay
Colin Raffel
31
167
0
12 Apr 2022
EVA2.0: Investigating Open-Domain Chinese Dialogue Systems with Large-Scale Pre-Training
Yuxian Gu
Jiaxin Wen
Hao Sun
Yi Song
Pei Ke
...
Zheng Zhang
Jianzhu Yao
Lei Liu
Xiaoyan Zhu
Minlie Huang
21
55
0
17 Mar 2022
Compute Trends Across Three Eras of Machine Learning
J. Sevilla
Lennart Heim
A. Ho
T. Besiroglu
Marius Hobbhahn
Pablo Villalobos
27
269
0
11 Feb 2022
Using DeepSpeed and Megatron to Train Megatron-Turing NLG 530B, A Large-Scale Generative Language Model
Shaden Smith
M. Patwary
Brandon Norick
P. LeGresley
Samyam Rajbhandari
...
M. Shoeybi
Yuxiong He
Michael Houston
Saurabh Tiwary
Bryan Catanzaro
MoE
90
730
0
28 Jan 2022
Black-Box Tuning for Language-Model-as-a-Service
Tianxiang Sun
Yunfan Shao
Hong Qian
Xuanjing Huang
Xipeng Qiu
VLM
50
256
0
10 Jan 2022
ERNIE 3.0 Titan: Exploring Larger-scale Knowledge Enhanced Pre-training for Language Understanding and Generation
Shuohuan Wang
Yu Sun
Yang Xiang
Zhihua Wu
Siyu Ding
...
Tian Wu
Wei Zeng
Ge Li
Wen Gao
Haifeng Wang
ELM
39
79
0
23 Dec 2021
Scaling Laws for Neural Language Models
Jared Kaplan
Sam McCandlish
T. Henighan
Tom B. Brown
B. Chess
R. Child
Scott Gray
Alec Radford
Jeff Wu
Dario Amodei
258
4,489
0
23 Jan 2020
Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
M. Shoeybi
M. Patwary
Raul Puri
P. LeGresley
Jared Casper
Bryan Catanzaro
MoE
245
1,821
0
17 Sep 2019