ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2204.02311
  4. Cited By
PaLM: Scaling Language Modeling with Pathways
v1v2v3v4v5 (latest)

PaLM: Scaling Language Modeling with Pathways

5 April 2022
Aakanksha Chowdhery
Sharan Narang
Jacob Devlin
Maarten Bosma
Gaurav Mishra
Adam Roberts
P. Barham
Hyung Won Chung
Charles Sutton
Sebastian Gehrmann
Parker Schuh
Kensen Shi
Sasha Tsvyashchenko
Joshua Maynez
Abhishek Rao
Parker Barnes
Yi Tay
Noam M. Shazeer
Vinodkumar Prabhakaran
Emily Reif
Nan Du
Ben Hutchinson
Reiner Pope
James Bradbury
Jacob Austin
Michael Isard
Guy Gur-Ari
Pengcheng Yin
Toju Duke
Anselm Levskaya
Sanjay Ghemawat
Sunipa Dev
Henryk Michalewski
Xavier Garcia
Vedant Misra
Kevin Robinson
Liam Fedus
Denny Zhou
Daphne Ippolito
D. Luan
Hyeontaek Lim
Barret Zoph
A. Spiridonov
Ryan Sepassi
David Dohan
Shivani Agrawal
Mark Omernick
Andrew M. Dai
Thanumalayan Sankaranarayana Pillai
Marie Pellat
Aitor Lewkowycz
Erica Moreira
R. Child
Oleksandr Polozov
Katherine Lee
Zongwei Zhou
Xuezhi Wang
Brennan Saeta
Mark Díaz
Orhan Firat
Michele Catasta
Jason W. Wei
Kathy Meier-Hellstern
Douglas Eck
J. Dean
Slav Petrov
Noah Fiedel
    PILMLRM
ArXiv (abs)PDFHTML

Papers citing "PaLM: Scaling Language Modeling with Pathways"

50 / 4,332 papers shown
Title
Scaling Vision Transformers to 22 Billion Parameters
Scaling Vision Transformers to 22 Billion Parameters
Mostafa Dehghani
Josip Djolonga
Basil Mustafa
Piotr Padlewski
Jonathan Heek
...
Mario Luvcić
Xiaohua Zhai
Daniel Keysers
Jeremiah Harmsen
N. Houlsby
MLLM
203
615
0
10 Feb 2023
DNArch: Learning Convolutional Neural Architectures by Backpropagation
DNArch: Learning Convolutional Neural Architectures by Backpropagation
David W. Romero
Neil Zeghidour
AI4CE
68
4
0
10 Feb 2023
Large Language Models for Code: Security Hardening and Adversarial
  Testing
Large Language Models for Code: Security Hardening and Adversarial Testing
Jingxuan He
Martin Vechev
ELMAAML
93
123
0
10 Feb 2023
Translating Natural Language to Planning Goals with Large-Language
  Models
Translating Natural Language to Planning Goals with Large-Language Models
Yaqi Xie
Chenyao Yu
Tongyao Zhu
Jinbin Bai
Ze Gong
Harold Soh
LM&RoLRM
96
149
0
10 Feb 2023
In-Context Learning with Many Demonstration Examples
In-Context Learning with Many Demonstration Examples
Mukai Li
Shansan Gong
Jiangtao Feng
Yiheng Xu
Jinchao Zhang
Zhiyong Wu
Lingpeng Kong
113
38
0
09 Feb 2023
Binarized Neural Machine Translation
Binarized Neural Machine Translation
Yichi Zhang
Ankush Garg
Yuan Cao
Lukasz Lew
Behrooz Ghorbani
Zhiru Zhang
Orhan Firat
MQ
76
14
0
09 Feb 2023
Explanation Selection Using Unlabeled Data for Chain-of-Thought
  Prompting
Explanation Selection Using Unlabeled Data for Chain-of-Thought Prompting
Xi Ye
Greg Durrett
LRMReLM
61
12
0
09 Feb 2023
Toolformer: Language Models Can Teach Themselves to Use Tools
Toolformer: Language Models Can Teach Themselves to Use Tools
Timo Schick
Jane Dwivedi-Yu
Roberto Dessì
Roberta Raileanu
Maria Lomeli
Luke Zettlemoyer
Nicola Cancedda
Thomas Scialom
SyDaRALM
231
1,781
0
09 Feb 2023
Global Constraints with Prompting for Zero-Shot Event Argument
  Classification
Global Constraints with Prompting for Zero-Shot Event Argument Classification
Zizheng Lin
Hongming Zhang
Yangqiu Song
70
16
0
09 Feb 2023
Black Box Adversarial Prompting for Foundation Models
Black Box Adversarial Prompting for Foundation Models
Natalie Maus
Patrick Chao
Eric Wong
Jacob R. Gardner
VLM
85
60
0
08 Feb 2023
A Multitask, Multilingual, Multimodal Evaluation of ChatGPT on
  Reasoning, Hallucination, and Interactivity
A Multitask, Multilingual, Multimodal Evaluation of ChatGPT on Reasoning, Hallucination, and Interactivity
Yejin Bang
Samuel Cahyawijaya
Nayeon Lee
Wenliang Dai
Jane Polak Scowcroft
...
Tiezheng Yu
Willy Chung
Quyet V. Do
Yan Xu
Pascale Fung
ReLMLRM
181
1,403
0
08 Feb 2023
Is ChatGPT a General-Purpose Natural Language Processing Task Solver?
Is ChatGPT a General-Purpose Natural Language Processing Task Solver?
Chengwei Qin
Aston Zhang
Zhuosheng Zhang
Jiaao Chen
Michihiro Yasunaga
Diyi Yang
LM&MAAI4MHLRMELM
176
708
0
08 Feb 2023
Learning Translation Quality Evaluation on Low Resource Languages from
  Large Language Models
Learning Translation Quality Evaluation on Low Resource Languages from Large Language Models
Amirkeivan Mohtashami
M. Verzetti
Paul Kishan Rubenstein
67
4
0
07 Feb 2023
Exploring the Benefits of Training Expert Language Models over
  Instruction Tuning
Exploring the Benefits of Training Expert Language Models over Instruction Tuning
Joel Jang
Seungone Kim
Seonghyeon Ye
Doyoung Kim
Lajanugen Logeswaran
Moontae Lee
Kyungjae Lee
Minjoon Seo
LRMALM
139
83
0
07 Feb 2023
Data Selection for Language Models via Importance Resampling
Data Selection for Language Models via Importance Resampling
Sang Michael Xie
Shibani Santurkar
Tengyu Ma
Percy Liang
144
196
0
06 Feb 2023
U-Clip: On-Average Unbiased Stochastic Gradient Clipping
U-Clip: On-Average Unbiased Stochastic Gradient Clipping
Bryn Elesedy
Marcus Hutter
67
1
0
06 Feb 2023
Computation vs. Communication Scaling for Future Transformers on Future
  Hardware
Computation vs. Communication Scaling for Future Transformers on Future Hardware
Suchita Pati
Shaizeen Aga
Mahzabeen Islam
Nuwan Jayasena
Matthew D. Sinclair
75
10
0
06 Feb 2023
Chain of Hindsight Aligns Language Models with Feedback
Chain of Hindsight Aligns Language Models with Feedback
Hao Liu
Carmelo Sferrazza
Pieter Abbeel
ALM
159
124
0
06 Feb 2023
Grounding Large Language Models in Interactive Environments with Online
  Reinforcement Learning
Grounding Large Language Models in Interactive Environments with Online Reinforcement Learning
Thomas Carta
Clément Romac
Thomas Wolf
Sylvain Lamprier
Olivier Sigaud
Pierre-Yves Oudeyer
LM&RoLLMAG
121
194
0
06 Feb 2023
Ten Lessons We Have Learned in the New "Sparseland": A Short Handbook
  for Sparse Neural Network Researchers
Ten Lessons We Have Learned in the New "Sparseland": A Short Handbook for Sparse Neural Network Researchers
Shiwei Liu
Zhangyang Wang
131
32
0
06 Feb 2023
A Categorical Archive of ChatGPT Failures
A Categorical Archive of ChatGPT Failures
Ali Borji
ELM
147
397
0
06 Feb 2023
The Gradient of Generative AI Release: Methods and Considerations
The Gradient of Generative AI Release: Methods and Considerations
Irene Solaiman
85
104
0
05 Feb 2023
Quantized Distributed Training of Large Models with Convergence
  Guarantees
Quantized Distributed Training of Large Models with Convergence Guarantees
I. Markov
Adrian Vladu
Qi Guo
Dan Alistarh
MQ
100
11
0
05 Feb 2023
Measuring The Impact Of Programming Language Distribution
Measuring The Impact Of Programming Language Distribution
Gabriel Orlanski
Kefan Xiao
Xavier Garcia
Jeffrey Hui
Joshua Howland
J. Malmaud
Jacob Austin
Rishah Singh
Michele Catasta
167
33
0
03 Feb 2023
The unreasonable effectiveness of few-shot learning for machine
  translation
The unreasonable effectiveness of few-shot learning for machine translation
Xavier Garcia
Yamini Bansal
Colin Cherry
George F. Foster
M. Krikun
Fan Feng
Melvin Johnson
Orhan Firat
120
109
0
02 Feb 2023
Accelerating Large Language Model Decoding with Speculative Sampling
Accelerating Large Language Model Decoding with Speculative Sampling
Charlie Chen
Sebastian Borgeaud
G. Irving
Jean-Baptiste Lespiau
Laurent Sifre
J. Jumper
BDLLRM
119
436
0
02 Feb 2023
Mnemosyne: Learning to Train Transformers with Transformers
Mnemosyne: Learning to Train Transformers with Transformers
Deepali Jain
K. Choromanski
Kumar Avinava Dubey
Sumeet Singh
Vikas Sindhwani
Tingnan Zhang
Jie Tan
OffRL
147
9
0
02 Feb 2023
QR-CLIP: Introducing Explicit Open-World Knowledge for Location and Time
  Reasoning
QR-CLIP: Introducing Explicit Open-World Knowledge for Location and Time Reasoning
Weimin Shi
Mingchen Zhuge
D. Gao
Zhong Zhou
Ming-Ming Cheng
Deng-Ping Fan
LRMVLM
90
0
0
02 Feb 2023
Language Quantized AutoEncoders: Towards Unsupervised Text-Image
  Alignment
Language Quantized AutoEncoders: Towards Unsupervised Text-Image Alignment
Hao Liu
Wilson Yan
Pieter Abbeel
99
25
0
02 Feb 2023
A Survey of Deep Learning: From Activations to Transformers
A Survey of Deep Learning: From Activations to Transformers
Johannes Schneider
Michalis Vlachos
ViTMedImAI4TSAI4CE
112
10
0
01 Feb 2023
Learning Universal Policies via Text-Guided Video Generation
Learning Universal Policies via Text-Guided Video Generation
Yilun Du
Mengjiao Yang
Bo Dai
H. Dai
Ofir Nachum
J. Tenenbaum
Dale Schuurmans
Pieter Abbeel
PINNLM&Ro
142
264
0
31 Jan 2023
Large Language Models Can Be Easily Distracted by Irrelevant Context
Large Language Models Can Be Easily Distracted by Irrelevant Context
Freda Shi
Xinyun Chen
Kanishka Misra
Nathan Scales
David Dohan
Ed H. Chi
Nathanael Scharli
Denny Zhou
ReLMRALMLRM
149
600
0
31 Jan 2023
Mathematical Capabilities of ChatGPT
Mathematical Capabilities of ChatGPT
Simon Frieder
Luca Pinchetti
Alexis Chevalier
Ryan-Rhys Griffiths
Tommaso Salvatori
Thomas Lukasiewicz
P. Petersen
Julius Berner
ELMAI4MH
148
434
0
31 Jan 2023
Benchmarking Large Language Models for News Summarization
Benchmarking Large Language Models for News Summarization
Tianyi Zhang
Faisal Ladhak
Esin Durmus
Percy Liang
Kathleen McKeown
Tatsunori B. Hashimoto
ELM
137
535
0
31 Jan 2023
Grounding Language Models to Images for Multimodal Inputs and Outputs
Grounding Language Models to Images for Multimodal Inputs and Outputs
Jing Yu Koh
Ruslan Salakhutdinov
Daniel Fried
MLLM
148
123
0
31 Jan 2023
Execution-based Code Generation using Deep Reinforcement Learning
Execution-based Code Generation using Deep Reinforcement Learning
Parshin Shojaee
Aneesh Jain
Sindhu Tipirneni
Chandan K. Reddy
152
58
0
31 Jan 2023
FLAME: A small language model for spreadsheet formulas
FLAME: A small language model for spreadsheet formulas
Harshit Joshi
Abishai Ebenezer
J. Cambronero
Sumit Gulwani
Aditya Kanade
Vu Le
Ivan Radivcek
Gust Verbruggen
LMTD
116
13
0
31 Jan 2023
The Flan Collection: Designing Data and Methods for Effective
  Instruction Tuning
The Flan Collection: Designing Data and Methods for Effective Instruction Tuning
Shayne Longpre
Le Hou
Tu Vu
Albert Webson
Hyung Won Chung
...
Denny Zhou
Quoc V. Le
Barret Zoph
Jason W. Wei
Adam Roberts
ALM
134
678
0
31 Jan 2023
The Power of External Memory in Increasing Predictive Model Capacity
The Power of External Memory in Increasing Predictive Model Capacity
Cenk Baykal
D. Cutler
Nishanth Dikkala
Nikhil Ghosh
Rina Panigrahy
Xin Wang
KELM
46
0
0
31 Jan 2023
Alternating Updates for Efficient Transformers
Alternating Updates for Efficient Transformers
Cenk Baykal
D. Cutler
Nishanth Dikkala
Nikhil Ghosh
Rina Panigrahy
Xin Wang
MoE
82
5
0
30 Jan 2023
Adaptive Machine Translation with Large Language Models
Adaptive Machine Translation with Large Language Models
Yasmin Moslem
Rejwanul Haque
John D. Kelleher
Andy Way
AI4CE
87
82
0
30 Jan 2023
Looped Transformers as Programmable Computers
Looped Transformers as Programmable Computers
Angeliki Giannou
Shashank Rajput
Jy-yong Sohn
Kangwook Lee
Jason D. Lee
Dimitris Papailiopoulos
103
107
0
30 Jan 2023
Red teaming ChatGPT via Jailbreaking: Bias, Robustness, Reliability and
  Toxicity
Red teaming ChatGPT via Jailbreaking: Bias, Robustness, Reliability and Toxicity
Terry Yue Zhuo
Yujin Huang
Chunyang Chen
Zhenchang Xing
SILM
105
107
0
30 Jan 2023
Crawling the Internal Knowledge-Base of Language Models
Crawling the Internal Knowledge-Base of Language Models
Roi Cohen
Mor Geva
Jonathan Berant
Amir Globerson
231
85
0
30 Jan 2023
Distributed Stochastic Optimization under a General Variance Condition
Distributed Stochastic Optimization under a General Variance Condition
Kun-Yen Huang
Xiao Li
Shin-Yi Pu
FedML
102
7
0
30 Jan 2023
REPLUG: Retrieval-Augmented Black-Box Language Models
REPLUG: Retrieval-Augmented Black-Box Language Models
Weijia Shi
Sewon Min
Michihiro Yasunaga
Minjoon Seo
Rich James
M. Lewis
Luke Zettlemoyer
Wen-tau Yih
RALMVLMKELM
278
647
0
30 Jan 2023
Understanding the Effectiveness of Very Large Language Models on Dialog
  Evaluation
Understanding the Effectiveness of Very Large Language Models on Dialog Evaluation
Jessica Huynh
Cathy Jiao
Prakhar Gupta
Shikib Mehri
Payal Bajaj
Vishrav Chaudhary
M. Eskénazi
ELMLM&MA
73
17
0
27 Jan 2023
SWARM Parallelism: Training Large Models Can Be Surprisingly
  Communication-Efficient
SWARM Parallelism: Training Large Models Can Be Surprisingly Communication-Efficient
Max Ryabinin
Tim Dettmers
Michael Diskin
Alexander Borzunov
MoE
111
38
0
27 Jan 2023
Call for Papers -- The BabyLM Challenge: Sample-efficient pretraining on
  a developmentally plausible corpus
Call for Papers -- The BabyLM Challenge: Sample-efficient pretraining on a developmentally plausible corpus
Alex Warstadt
Leshem Choshen
Aaron Mueller
Adina Williams
Ethan Gotlieb Wilcox
Chengxu Zhuang
115
57
0
27 Jan 2023
Probing Out-of-Distribution Robustness of Language Models with
  Parameter-Efficient Transfer Learning
Probing Out-of-Distribution Robustness of Language Models with Parameter-Efficient Transfer Learning
Hyunsoo Cho
Choonghyun Park
Junyeop Kim
Sungmin Cho
Kang Min Yoo
Sang-goo Lee
OODD
102
3
0
27 Jan 2023
Previous
123...767778...858687
Next