ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2210.02414
  4. Cited By
GLM-130B: An Open Bilingual Pre-trained Model

GLM-130B: An Open Bilingual Pre-trained Model

5 October 2022
Aohan Zeng
Xiao Liu
Zhengxiao Du
Zihan Wang
Hanyu Lai
Ming Ding
Zhuoyi Yang
Yifan Xu
Wendi Zheng
Xiao Xia
Weng Lam Tam
Zixuan Ma
Yufei Xue
Jidong Zhai
Wenguang Chen
Peng Zhang
Yuxiao Dong
Jie Tang
    BDL
    LRM
ArXivPDFHTML

Papers citing "GLM-130B: An Open Bilingual Pre-trained Model"

50 / 149 papers shown
Title
Memory-Based Model Editing at Scale
Memory-Based Model Editing at Scale
E. Mitchell
Charles Lin
Antoine Bosselut
Christopher D. Manning
Chelsea Finn
KELM
59
337
0
13 Jun 2022
Beyond the Imitation Game: Quantifying and extrapolating the
  capabilities of language models
Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models
Aarohi Srivastava
Abhinav Rastogi
Abhishek Rao
Abu Awal Md Shoeb
Abubakar Abid
...
Zhuoye Zhao
Zijian Wang
Zijie J. Wang
Zirui Wang
Ziyi Wu
ELM
63
1,726
0
09 Jun 2022
CogVideo: Large-scale Pretraining for Text-to-Video Generation via
  Transformers
CogVideo: Large-scale Pretraining for Text-to-Video Generation via Transformers
Wenyi Hong
Ming Ding
Wendi Zheng
Xinghan Liu
Jie Tang
DiffM
270
585
0
29 May 2022
DeepStruct: Pretraining of Language Models for Structure Prediction
DeepStruct: Pretraining of Language Models for Structure Prediction
Chenguang Wang
Xiao Liu
Zui Chen
Haoyun Hong
Jie Tang
Dawn Song
250
67
0
21 May 2022
OPT: Open Pre-trained Transformer Language Models
OPT: Open Pre-trained Transformer Language Models
Susan Zhang
Stephen Roller
Naman Goyal
Mikel Artetxe
Moya Chen
...
Daniel Simig
Punit Singh Koura
Anjali Sridhar
Tianlu Wang
Luke Zettlemoyer
VLM
OSLM
AI4CE
272
3,583
0
02 May 2022
GPT-NeoX-20B: An Open-Source Autoregressive Language Model
GPT-NeoX-20B: An Open-Source Autoregressive Language Model
Sid Black
Stella Biderman
Eric Hallahan
Quentin G. Anthony
Leo Gao
...
Shivanshu Purohit
Laria Reynolds
J. Tow
Benqi Wang
Samuel Weinbach
115
812
0
14 Apr 2022
PaLM: Scaling Language Modeling with Pathways
PaLM: Scaling Language Modeling with Pathways
Aakanksha Chowdhery
Sharan Narang
Jacob Devlin
Maarten Bosma
Gaurav Mishra
...
Kathy Meier-Hellstern
Douglas Eck
J. Dean
Slav Petrov
Noah Fiedel
PILM
LRM
297
6,132
0
05 Apr 2022
Training Compute-Optimal Large Language Models
Training Compute-Optimal Large Language Models
Jordan Hoffmann
Sebastian Borgeaud
A. Mensch
Elena Buchatskaya
Trevor Cai
...
Karen Simonyan
Erich Elsen
Jack W. Rae
Oriol Vinyals
Laurent Sifre
AI4TS
105
1,894
0
29 Mar 2022
Compression of Generative Pre-trained Language Models via Quantization
Compression of Generative Pre-trained Language Models via Quantization
Chaofan Tao
Lu Hou
Wei Zhang
Lifeng Shang
Xin Jiang
Qun Liu
Ping Luo
Ngai Wong
MQ
50
103
0
21 Mar 2022
DeepNet: Scaling Transformers to 1,000 Layers
DeepNet: Scaling Transformers to 1,000 Layers
Hongyu Wang
Shuming Ma
Li Dong
Shaohan Huang
Dongdong Zhang
Furu Wei
MoE
AI4CE
88
160
0
01 Mar 2022
PromptSource: An Integrated Development Environment and Repository for
  Natural Language Prompts
PromptSource: An Integrated Development Environment and Repository for Natural Language Prompts
Stephen H. Bach
Victor Sanh
Zheng-Xin Yong
Albert Webson
Colin Raffel
...
Khalid Almubarak
Xiangru Tang
Dragomir R. Radev
Mike Tian-Jian Jiang
Alexander M. Rush
VLM
286
345
0
02 Feb 2022
Using DeepSpeed and Megatron to Train Megatron-Turing NLG 530B, A
  Large-Scale Generative Language Model
Using DeepSpeed and Megatron to Train Megatron-Turing NLG 530B, A Large-Scale Generative Language Model
Shaden Smith
M. Patwary
Brandon Norick
P. LeGresley
Samyam Rajbhandari
...
Mohammad Shoeybi
Yuxiong He
Michael Houston
Saurabh Tiwary
Bryan Catanzaro
MoE
127
737
0
28 Jan 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason W. Wei
Xuezhi Wang
Dale Schuurmans
Maarten Bosma
Brian Ichter
F. Xia
Ed H. Chi
Quoc Le
Denny Zhou
LM&Ro
LRM
AI4CE
ReLM
536
9,009
0
28 Jan 2022
LaMDA: Language Models for Dialog Applications
LaMDA: Language Models for Dialog Applications
R. Thoppilan
Daniel De Freitas
Jamie Hall
Noam M. Shazeer
Apoorv Kulshreshtha
...
Blaise Aguera-Arcas
Claire Cui
M. Croak
Ed H. Chi
Quoc Le
ALM
91
1,577
0
20 Jan 2022
Counterfactual Memorization in Neural Language Models
Counterfactual Memorization in Neural Language Models
Chiyuan Zhang
Daphne Ippolito
Katherine Lee
Matthew Jagielski
Florian Tramèr
Nicholas Carlini
53
131
0
24 Dec 2021
ERNIE 3.0 Titan: Exploring Larger-scale Knowledge Enhanced Pre-training
  for Language Understanding and Generation
ERNIE 3.0 Titan: Exploring Larger-scale Knowledge Enhanced Pre-training for Language Understanding and Generation
Shuohuan Wang
Yu Sun
Yang Xiang
Zhihua Wu
Siyu Ding
...
Tian Wu
Wei Zeng
Ge Li
Wen Gao
Haifeng Wang
ELM
49
79
0
23 Dec 2021
Efficient Large Scale Language Modeling with Mixtures of Experts
Efficient Large Scale Language Modeling with Mixtures of Experts
Mikel Artetxe
Shruti Bhosale
Naman Goyal
Todor Mihaylov
Myle Ott
...
Jeff Wang
Luke Zettlemoyer
Mona T. Diab
Zornitsa Kozareva
Ves Stoyanov
MoE
107
192
0
20 Dec 2021
Scaling Language Models: Methods, Analysis & Insights from Training
  Gopher
Scaling Language Models: Methods, Analysis & Insights from Training Gopher
Jack W. Rae
Sebastian Borgeaud
Trevor Cai
Katie Millican
Jordan Hoffmann
...
Jeff Stanway
L. Bennett
Demis Hassabis
Koray Kavukcuoglu
G. Irving
37
1,303
0
08 Dec 2021
Ethical and social risks of harm from Language Models
Ethical and social risks of harm from Language Models
Laura Weidinger
John F. J. Mellor
Maribeth Rauh
Conor Griffin
J. Uesato
...
Lisa Anne Hendricks
William S. Isaac
Sean Legassick
G. Irving
Iason Gabriel
PILM
58
1,009
0
08 Dec 2021
Do Language Models Have Beliefs? Methods for Detecting, Updating, and
  Visualizing Model Beliefs
Do Language Models Have Beliefs? Methods for Detecting, Updating, and Visualizing Model Beliefs
Peter Hase
Mona T. Diab
Asli Celikyilmaz
Xian Li
Zornitsa Kozareva
Veselin Stoyanov
Joey Tianyi Zhou
Srini Iyer
KELM
LRM
32
79
0
26 Nov 2021
ExT5: Towards Extreme Multi-Task Scaling for Transfer Learning
ExT5: Towards Extreme Multi-Task Scaling for Transfer Learning
V. Aribandi
Yi Tay
Tal Schuster
J. Rao
H. Zheng
...
Jianmo Ni
Jai Gupta
Kai Hui
Sebastian Ruder
Donald Metzler
MoE
58
215
0
22 Nov 2021
NormFormer: Improved Transformer Pretraining with Extra Normalization
NormFormer: Improved Transformer Pretraining with Extra Normalization
Sam Shleifer
Jason Weston
Myle Ott
AI4CE
42
76
0
18 Oct 2021
Multitask Prompted Training Enables Zero-Shot Task Generalization
Multitask Prompted Training Enables Zero-Shot Task Generalization
Victor Sanh
Albert Webson
Colin Raffel
Stephen H. Bach
Lintang Sutawika
...
T. Bers
Stella Biderman
Leo Gao
Thomas Wolf
Alexander M. Rush
LRM
293
1,679
0
15 Oct 2021
Learning Compact Metrics for MT
Learning Compact Metrics for MT
Amy Pu
Hyung Won Chung
Ankur P. Parikh
Sebastian Gehrmann
Thibault Sellam
53
99
0
12 Oct 2021
Yuan 1.0: Large-Scale Pre-trained Language Model in Zero-Shot and
  Few-Shot Learning
Yuan 1.0: Large-Scale Pre-trained Language Model in Zero-Shot and Few-Shot Learning
Shaohua Wu
Xudong Zhao
Tong Yu
Rongguo Zhang
C. Shen
...
Feng Li
Hong Zhu
Jiangang Luo
Liang Xu
Xuanwei Zhang
ALM
39
60
0
10 Oct 2021
8-bit Optimizers via Block-wise Quantization
8-bit Optimizers via Block-wise Quantization
Tim Dettmers
M. Lewis
Sam Shleifer
Luke Zettlemoyer
MQ
90
286
0
06 Oct 2021
TruthfulQA: Measuring How Models Mimic Human Falsehoods
TruthfulQA: Measuring How Models Mimic Human Falsehoods
Stephanie C. Lin
Jacob Hilton
Owain Evans
HILM
85
1,825
0
08 Sep 2021
Finetuned Language Models Are Zero-Shot Learners
Finetuned Language Models Are Zero-Shot Learners
Jason W. Wei
Maarten Bosma
Vincent Zhao
Kelvin Guu
Adams Wei Yu
Brian Lester
Nan Du
Andrew M. Dai
Quoc V. Le
ALM
UQCV
53
3,678
0
03 Sep 2021
Train Short, Test Long: Attention with Linear Biases Enables Input
  Length Extrapolation
Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation
Ofir Press
Noah A. Smith
M. Lewis
274
728
0
27 Aug 2021
Harms of Gender Exclusivity and Challenges in Non-Binary Representation
  in Language Technologies
Harms of Gender Exclusivity and Challenges in Non-Binary Representation in Language Technologies
Sunipa Dev
Masoud Monajatipoor
Anaelia Ovalle
Arjun Subramonian
J. M. Phillips
Kai-Wei Chang
55
166
0
27 Aug 2021
Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods
  in Natural Language Processing
Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing
Pengfei Liu
Weizhe Yuan
Jinlan Fu
Zhengbao Jiang
Hiroaki Hayashi
Graham Neubig
VLM
SyDa
155
3,906
0
28 Jul 2021
FewCLUE: A Chinese Few-shot Learning Evaluation Benchmark
FewCLUE: A Chinese Few-shot Learning Evaluation Benchmark
Liang Xu
Xiaojing Lu
Chenyang Yuan
Xuanwei Zhang
Huilin Xu
...
Guoao Wei
X. Pan
Xin Tian
Libo Qin
Hai Hu
ELM
37
57
0
15 Jul 2021
Towards Understanding and Mitigating Social Biases in Language Models
Towards Understanding and Mitigating Social Biases in Language Models
Paul Pu Liang
Chiyu Wu
Louis-Philippe Morency
Ruslan Salakhutdinov
63
387
0
24 Jun 2021
CogView: Mastering Text-to-Image Generation via Transformers
CogView: Mastering Text-to-Image Generation via Transformers
Ming Ding
Zhuoyi Yang
Wenyi Hong
Wendi Zheng
Chang Zhou
...
Junyang Lin
Xu Zou
Zhou Shao
Hongxia Yang
Jie Tang
ViT
VLM
75
773
0
26 May 2021
PanGu-$α$: Large-scale Autoregressive Pretrained Chinese Language
  Models with Auto-parallel Computation
PanGu-ααα: Large-scale Autoregressive Pretrained Chinese Language Models with Auto-parallel Computation
Wei Zeng
Xiaozhe Ren
Teng Su
Hui Wang
Yi-Lun Liao
...
Gaojun Fan
Yaowei Wang
Xuefeng Jin
Qun Liu
Yonghong Tian
ALM
MoE
AI4CE
60
213
0
26 Apr 2021
Carbon Emissions and Large Neural Network Training
Carbon Emissions and Large Neural Network Training
David A. Patterson
Joseph E. Gonzalez
Quoc V. Le
Chen Liang
Lluís-Miquel Munguía
D. Rothchild
David R. So
Maud Texier
J. Dean
AI4CE
311
653
0
21 Apr 2021
RoFormer: Enhanced Transformer with Rotary Position Embedding
RoFormer: Enhanced Transformer with Rotary Position Embedding
Jianlin Su
Yu Lu
Shengfeng Pan
Ahmed Murtadha
Bo Wen
Yunfeng Liu
112
2,307
0
20 Apr 2021
The Power of Scale for Parameter-Efficient Prompt Tuning
The Power of Scale for Parameter-Efficient Prompt Tuning
Brian Lester
Rami Al-Rfou
Noah Constant
VPVLM
435
3,952
0
18 Apr 2021
Editing Factual Knowledge in Language Models
Editing Factual Knowledge in Language Models
Nicola De Cao
Wilker Aziz
Ivan Titov
KELM
85
496
0
16 Apr 2021
An Empirical Study of Training Self-Supervised Vision Transformers
An Empirical Study of Training Self-Supervised Vision Transformers
Xinlei Chen
Saining Xie
Kaiming He
ViT
101
1,837
0
05 Apr 2021
Detecting Hate Speech with GPT-3
Detecting Hate Speech with GPT-3
Ke-Li Chiu
Annie Collins
Rohan Alexander
AILaw
28
110
0
23 Mar 2021
GPT Understands, Too
GPT Understands, Too
Xiao Liu
Yanan Zheng
Zhengxiao Du
Ming Ding
Yujie Qian
Zhilin Yang
Jie Tang
VLM
125
1,161
0
18 Mar 2021
GLM: General Language Model Pretraining with Autoregressive Blank
  Infilling
GLM: General Language Model Pretraining with Autoregressive Blank Infilling
Zhengxiao Du
Yujie Qian
Xiao Liu
Ming Ding
J. Qiu
Zhilin Yang
Jie Tang
BDL
AI4CE
78
1,520
0
18 Mar 2021
Self-Diagnosis and Self-Debiasing: A Proposal for Reducing Corpus-Based
  Bias in NLP
Self-Diagnosis and Self-Debiasing: A Proposal for Reducing Corpus-Based Bias in NLP
Timo Schick
Sahana Udupa
Hinrich Schütze
273
380
0
28 Feb 2021
Zero-Shot Text-to-Image Generation
Zero-Shot Text-to-Image Generation
Aditya A. Ramesh
Mikhail Pavlov
Gabriel Goh
Scott Gray
Chelsea Voss
Alec Radford
Mark Chen
Ilya Sutskever
VLM
291
4,873
0
24 Feb 2021
The GEM Benchmark: Natural Language Generation, its Evaluation and
  Metrics
The GEM Benchmark: Natural Language Generation, its Evaluation and Metrics
Sebastian Gehrmann
Tosin Adewumi
Karmanya Aggarwal
Pawan Sasanka Ammanamanchi
Aremu Anuoluwapo
...
Nishant Subramani
Wei Xu
Diyi Yang
Akhila Yerukola
Jiawei Zhou
VLM
276
285
0
02 Feb 2021
Exploring Text-transformers in AAAI 2021 Shared Task: COVID-19 Fake News
  Detection in English
Exploring Text-transformers in AAAI 2021 Shared Task: COVID-19 Fake News Detection in English
Xiangyang Li
Yu Xia
Xiang Long
Zheng Li
Sujian Li
226
37
0
07 Jan 2021
Did Aristotle Use a Laptop? A Question Answering Benchmark with Implicit
  Reasoning Strategies
Did Aristotle Use a Laptop? A Question Answering Benchmark with Implicit Reasoning Strategies
Mor Geva
Daniel Khashabi
Elad Segal
Tushar Khot
Dan Roth
Jonathan Berant
RALM
295
697
0
06 Jan 2021
Prefix-Tuning: Optimizing Continuous Prompts for Generation
Prefix-Tuning: Optimizing Continuous Prompts for Generation
Xiang Lisa Li
Percy Liang
152
4,167
0
01 Jan 2021
The Pile: An 800GB Dataset of Diverse Text for Language Modeling
The Pile: An 800GB Dataset of Diverse Text for Language Modeling
Leo Gao
Stella Biderman
Sid Black
Laurence Golding
Travis Hoppe
...
Horace He
Anish Thite
Noa Nabeshima
Shawn Presser
Connor Leahy
AIMat
365
2,051
0
31 Dec 2020
Previous
123
Next