ResearchTrend.AI
PaLM: Scaling Language Modeling with Pathways

5 April 2022
Aakanksha Chowdhery
Sharan Narang
Jacob Devlin
Maarten Bosma
Gaurav Mishra
Adam Roberts
P. Barham
Hyung Won Chung
Charles Sutton
Sebastian Gehrmann
Parker Schuh
Kensen Shi
Sasha Tsvyashchenko
Joshua Maynez
Abhishek Rao
Parker Barnes
Yi Tay
Noam M. Shazeer
Vinodkumar Prabhakaran
Emily Reif
Nan Du
Ben Hutchinson
Reiner Pope
James Bradbury
Jacob Austin
Michael Isard
Guy Gur-Ari
Pengcheng Yin
Toju Duke
Anselm Levskaya
Sanjay Ghemawat
Sunipa Dev
Henryk Michalewski
Xavier Garcia
Vedant Misra
Kevin Robinson
Liam Fedus
Denny Zhou
Daphne Ippolito
D. Luan
Hyeontaek Lim
Barret Zoph
A. Spiridonov
Ryan Sepassi
David Dohan
Shivani Agrawal
Mark Omernick
Andrew M. Dai
Thanumalayan Sankaranarayana Pillai
Marie Pellat
Aitor Lewkowycz
Erica Moreira
R. Child
Oleksandr Polozov
Katherine Lee
Zongwei Zhou
Xuezhi Wang
Brennan Saeta
Mark Díaz
Orhan Firat
Michele Catasta
Jason W. Wei
Kathy Meier-Hellstern
Douglas Eck
J. Dean
Slav Petrov
Noah Fiedel
    PILM
    LRM

Papers citing "PaLM: Scaling Language Modeling with Pathways"

44 / 4,244 papers shown
Iteratively Prompt Pre-trained Language Models for Chain of Thought
Boshi Wang
Xiang Deng
Huan Sun
KELM
ReLM
LRM
44
95
0
16 Mar 2022
HyperMixer: An MLP-based Low Cost Alternative to Transformers
Florian Mai
Arnaud Pannatier
Fabio Fehr
Haolin Chen
François Marelli
François Fleuret
James Henderson
35
11
0
07 Mar 2022
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
384
12,081
0
04 Mar 2022
SGPT: GPT Sentence Embeddings for Semantic Search
Niklas Muennighoff
RALM
35
176
0
17 Feb 2022
Repairing the Cracked Foundation: A Survey of Obstacles in Evaluation Practices for Generated Text
Sebastian Gehrmann
Elizabeth Clark
Thibault Sellam
ELM
AI4CE
71
184
0
14 Feb 2022
Scaling Laws Under the Microscope: Predicting Transformer Performance from Small Scale Experiments
Maor Ivgi
Y. Carmon
Jonathan Berant
19
17
0
13 Feb 2022
Compute Trends Across Three Eras of Machine Learning
J. Sevilla
Lennart Heim
A. Ho
T. Besiroglu
Marius Hobbhahn
Pablo Villalobos
39
272
0
11 Feb 2022
Robust Training of Neural Networks Using Scale Invariant Architectures
Zhiyuan Li
Srinadh Bhojanapalli
Manzil Zaheer
Sashank J. Reddi
Surinder Kumar
29
27
0
02 Feb 2022
Locally Typical Sampling
Clara Meister
Tiago Pimentel
Gian Wiher
Ryan Cotterell
145
86
0
01 Feb 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason W. Wei
Xuezhi Wang
Dale Schuurmans
Maarten Bosma
Brian Ichter
F. Xia
Ed H. Chi
Quoc Le
Denny Zhou
LM&Ro
LRM
AI4CE
ReLM
447
8,650
0
28 Jan 2022
Reasoning Like Program Executors
Xinyu Pi
Qian Liu
Bei Chen
Morteza Ziyadi
Zeqi Lin
Qiang Fu
Yan Gao
Jian-Guang Lou
Weizhu Chen
ReLM
LRM
253
52
0
27 Jan 2022
Instance-aware Prompt Learning for Language Understanding and Generation
Feihu Jin
Jinliang Lu
Jiajun Zhang
Chengqing Zong
25
32
0
18 Jan 2022
A Survey of Controllable Text Generation using Transformer-based Pre-trained Language Models
Hanqing Zhang
Haolin Song
Shaoyu Li
Ming Zhou
Dawei Song
54
215
0
14 Jan 2022
Counterfactual Memorization in Neural Language Models
Chiyuan Zhang
Daphne Ippolito
Katherine Lee
Matthew Jagielski
Florian Tramèr
Nicholas Carlini
32
129
0
24 Dec 2021
CRASS: A Novel Data Set and Benchmark to Test Counterfactual Reasoning of Large Language Models
Jorg Frohberg
Frank Binder
SLR
6
27
0
22 Dec 2021
Few-shot Learning with Multilingual Language Models
Xi Lin
Todor Mihaylov
Mikel Artetxe
Tianlu Wang
Shuohui Chen
...
Luke Zettlemoyer
Zornitsa Kozareva
Mona T. Diab
Ves Stoyanov
Xian Li
BDL
ELM
LRM
64
286
0
20 Dec 2021
Few-Shot Semantic Parsing with Language Models Trained On Code
Richard Shin
Benjamin Van Durme
22
64
0
16 Dec 2021
Few-Shot Self-Rationalization with Natural Language Prompts
Ana Marasović
Iz Beltagy
Doug Downey
Matthew E. Peters
LRM
28
106
0
16 Nov 2021
EncT5: A Framework for Fine-tuning T5 as Non-autoregressive Models
Frederick Liu
T. Huang
Shihang Lyu
Siamak Shakeri
Hongkun Yu
Jing Li
42
8
0
16 Oct 2021
Multitask Prompted Training Enables Zero-Shot Task Generalization
Victor Sanh
Albert Webson
Colin Raffel
Stephen H. Bach
Lintang Sutawika
...
T. Bers
Stella Biderman
Leo Gao
Thomas Wolf
Alexander M. Rush
LRM
218
1,663
0
15 Oct 2021
A Framework for Learning to Request Rich and Contextually Useful Information from Humans
Khanh Nguyen
Yonatan Bisk
Hal Daumé
49
16
0
14 Oct 2021
Pre-trained Language Models in Biomedical Domain: A Systematic Survey
Benyou Wang
Qianqian Xie
Jiahuan Pei
Zhihong Chen
Prayag Tiwari
Zhao Li
Jie Fu
LM&MA
AI4CE
37
164
0
11 Oct 2021
PLATO-XL: Exploring the Large-scale Pre-training of Dialogue Generation
Siqi Bao
H. He
Fan Wang
Hua Wu
Haifeng Wang
...
Xinxian Huang
Xin Tian
Xinchao Xu
Yingzhan Lin
Zhengyu Niu
VLM
ALM
26
60
0
20 Sep 2021
Challenges in Detoxifying Language Models
Johannes Welbl
Amelia Glaese
J. Uesato
Sumanth Dathathri
John F. J. Mellor
Lisa Anne Hendricks
Kirsty Anderson
Pushmeet Kohli
Ben Coppin
Po-Sen Huang
LM&MA
253
193
0
15 Sep 2021
The Stability-Efficiency Dilemma: Investigating Sequence Length Warmup for Training GPT Models
Conglong Li
Minjia Zhang
Yuxiong He
20
38
0
13 Aug 2021
Deduplicating Training Data Makes Language Models Better
Katherine Lee
Daphne Ippolito
A. Nystrom
Chiyuan Zhang
Douglas Eck
Chris Callison-Burch
Nicholas Carlini
SyDa
242
599
0
14 Jul 2021
Can Deep Neural Networks Predict Data Correlations from Column Names?
Immanuel Trummer
17
8
0
09 Jul 2021
A Primer on Pretrained Multilingual Language Models
Sumanth Doddapaneni
Gowtham Ramesh
Mitesh M. Khapra
Anoop Kunchukuttan
Pratyush Kumar
LRM
43
74
0
01 Jul 2021
Efficient Deep Learning: A Survey on Making Deep Learning Models Smaller, Faster, and Better
Gaurav Menghani
VLM
MedIm
23
367
0
16 Jun 2021
Asynchronous speedup in decentralized optimization
Mathieu Even
Hadrien Hendrikx
Laurent Massoulie
34
4
0
07 Jun 2021
Carbon Emissions and Large Neural Network Training
David A. Patterson
Joseph E. Gonzalez
Quoc V. Le
Chen Liang
Lluís-Miquel Munguía
D. Rothchild
David R. So
Maud Texier
J. Dean
AI4CE
253
646
0
21 Apr 2021
Quantifying Gender Bias Towards Politicians in Cross-Lingual Language Models
Karolina Stañczak
Sagnik Ray Choudhury
Tiago Pimentel
Ryan Cotterell
Isabelle Augenstein
30
23
0
15 Apr 2021
The GEM Benchmark: Natural Language Generation, its Evaluation and Metrics
Sebastian Gehrmann
Tosin Adewumi
Karmanya Aggarwal
Pawan Sasanka Ammanamanchi
Aremu Anuoluwapo
...
Nishant Subramani
Wei Xu
Diyi Yang
Akhila Yerukola
Jiawei Zhou
VLM
260
285
0
02 Feb 2021
ZeRO-Offload: Democratizing Billion-Scale Model Training
Jie Ren
Samyam Rajbhandari
Reza Yazdani Aminabadi
Olatunji Ruwase
Shuangyang Yang
Minjia Zhang
Dong Li
Yuxiong He
MoE
177
417
0
18 Jan 2021
Did Aristotle Use a Laptop? A Question Answering Benchmark with Implicit Reasoning Strategies
Mor Geva
Daniel Khashabi
Elad Segal
Tushar Khot
Dan Roth
Jonathan Berant
RALM
259
682
0
06 Jan 2021
Inductive Biases for Deep Learning of Higher-Level Cognition
Anirudh Goyal
Yoshua Bengio
AI4CE
18
347
0
30 Nov 2020
Data-Informed Global Sparseness in Attention Mechanisms for Deep Neural Networks
Ileana Rugina
Rumen Dangovski
L. Jing
Preslav Nakov
Marin Soljacic
26
0
0
20 Nov 2020
Transfer Learning in Deep Reinforcement Learning: A Survey
Zhuangdi Zhu
Kaixiang Lin
Anil K. Jain
Jiayu Zhou
OffRL
LRM
30
564
0
16 Sep 2020
Efficient Transformers: A Survey
Yi Tay
Mostafa Dehghani
Dara Bahri
Donald Metzler
VLM
114
1,104
0
14 Sep 2020
Big Bird: Transformers for Longer Sequences
Manzil Zaheer
Guru Guruganesh
Kumar Avinava Dubey
Joshua Ainslie
Chris Alberti
...
Philip Pham
Anirudh Ravula
Qifan Wang
Li Yang
Amr Ahmed
VLM
288
2,023
0
28 Jul 2020
Contextualizing Enhances Gradient Based Meta Learning
Evan Vogelbaum
Rumen Dangovski
L. Jing
Marin Soljacic
34
3
0
17 Jul 2020
Improving Readability for Automatic Speech Recognition Transcription
Junwei Liao
Sefik Emre Eskimez
Liyang Lu
Yu Shi
Ming Gong
Linjun Shou
Hong Qu
Michael Zeng
27
55
0
09 Apr 2020
Efficient Content-Based Sparse Attention with Routing Transformers
Aurko Roy
M. Saffar
Ashish Vaswani
David Grangier
MoE
255
580
0
12 Mar 2020
Scaling Laws for Neural Language Models
Jared Kaplan
Sam McCandlish
T. Henighan
Tom B. Brown
B. Chess
R. Child
Scott Gray
Alec Radford
Jeff Wu
Dario Amodei
264
4,532
0
23 Jan 2020