PaLM: Scaling Language Modeling with Pathways

5 April 2022
Aakanksha Chowdhery
Sharan Narang
Jacob Devlin
Maarten Bosma
Gaurav Mishra
Adam Roberts
P. Barham
Hyung Won Chung
Charles Sutton
Sebastian Gehrmann
Parker Schuh
Kensen Shi
Sasha Tsvyashchenko
Joshua Maynez
Abhishek Rao
Parker Barnes
Yi Tay
Noam M. Shazeer
Vinodkumar Prabhakaran
Emily Reif
Nan Du
Ben Hutchinson
Reiner Pope
James Bradbury
Jacob Austin
Michael Isard
Guy Gur-Ari
Pengcheng Yin
Toju Duke
Anselm Levskaya
Sanjay Ghemawat
Sunipa Dev
Henryk Michalewski
Xavier Garcia
Vedant Misra
Kevin Robinson
Liam Fedus
Denny Zhou
Daphne Ippolito
D. Luan
Hyeontaek Lim
Barret Zoph
A. Spiridonov
Ryan Sepassi
David Dohan
Shivani Agrawal
Mark Omernick
Andrew M. Dai
Thanumalayan Sankaranarayana Pillai
Marie Pellat
Aitor Lewkowycz
Erica Moreira
R. Child
Oleksandr Polozov
Katherine Lee
Zongwei Zhou
Xuezhi Wang
Brennan Saeta
Mark Díaz
Orhan Firat
Michele Catasta
Jason W. Wei
Kathy Meier-Hellstern
Douglas Eck
J. Dean
Slav Petrov
Noah Fiedel
PILM, LRM

Papers citing "PaLM: Scaling Language Modeling with Pathways"

50 / 4,332 papers shown
InstructZero: Efficient Instruction Optimization for Black-Box Large Language Models
Lichang Chen
Jiuhai Chen
Tom Goldstein
Heng-Chiao Huang
Dinesh Manocha
105
45
0
05 Jun 2023
SpQR: A Sparse-Quantized Representation for Near-Lossless LLM Weight Compression
Tim Dettmers
Ruslan Svirschevski
Vage Egiazarian
Denis Kuznedelev
Elias Frantar
Saleh Ashkboos
Alexander Borzunov
Torsten Hoefler
Dan Alistarh
MQ
83
257
0
05 Jun 2023
SelfEvolve: A Code Evolution Framework via Large Language Models
Shuyang Jiang
Yuhao Wang
Yu Wang
98
39
0
05 Jun 2023
On "Scientific Debt" in NLP: A Case for More Rigour in Language Model Pre-Training Research
Made Nindyatama Nityasya
Haryo Akbarianto Wibowo
Alham Fikri Aji
Genta Indra Winata
Radityo Eko Prasojo
Phil Blunsom
A. Kuncoro
72
8
0
05 Jun 2023
Transferring Annotator- and Instance-dependent Transition Matrix for Learning from Crowds
Shikun Li
Xiaobo Xia
Jiankang Deng
Shiming Ge
Tongliang Liu
118
15
0
05 Jun 2023
Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding
Hang Zhang
Xin Li
Lidong Bing
MLLM
253
1,068
0
05 Jun 2023
MCTS: A Multi-Reference Chinese Text Simplification Dataset
Ruining Chong
Luming Lu
Liner Yang
Jinran Nie
Zhenghao Liu
Shuo Wang
Shuhan Zhou
Yaoxin Li
Erhong Yang
96
1
0
05 Jun 2023
Prompt to be Consistent is Better than Self-Consistent? Few-Shot and Zero-Shot Fact Verification with Pre-trained Language Models
Fengzhu Zeng
Wei Gao
82
7
0
05 Jun 2023
LLM-Blender: Ensembling Large Language Models with Pairwise Ranking and Generative Fusion
Dongfu Jiang
Xiang Ren
Bill Yuchen Lin
ELM
182
334
0
05 Jun 2023
Data Quality in Imitation Learning
Suneel Belkhale
Yuchen Cui
Dorsa Sadigh
98
52
0
04 Jun 2023
bgGLUE: A Bulgarian General Language Understanding Evaluation Benchmark
Momchil Hardalov
Pepa Atanasova
Todor Mihaylov
G. Angelova
K. Simov
P. Osenova
Ves Stoyanov
Ivan Koychev
Preslav Nakov
Dragomir R. Radev
ELM, FedML
88
4
0
04 Jun 2023
Exploring the Impact of Model Scaling on Parameter-Efficient Tuning
Yusheng Su
Chi-Min Chan
Jiali Cheng
Yujia Qin
Yankai Lin
...
Ning Ding
Xingzhi Sun
Guotong Xie
Zhiyuan Liu
Maosong Sun
104
6
0
04 Jun 2023
A Mathematical Abstraction for Balancing the Trade-off Between Creativity and Reality in Large Language Models
Ritwik Sinha
Zhao Song
Dinesh Manocha
104
25
0
04 Jun 2023
A Technical Report for Polyglot-Ko: Open-Source Large-Scale Korean Language Models
H. Ko
Kichang Yang
Minho Ryu
Taekyoon Choi
Seungmu Yang
Jiwung Hyun
Sung-Yong Park
Kyubyong Park
93
30
0
04 Jun 2023
Auto-GPT for Online Decision Making: Benchmarks and Additional Opinions
Hui Yang
Sifu Yue
Yunzhong He
RALM
85
172
0
04 Jun 2023
Utilizing ChatGPT to Enhance Clinical Trial Enrollment
Georgios Peikos
S. Symeonidis
Pranav Kasela
G. Pasi
LM&MA
59
13
0
03 Jun 2023
LambdaBeam: Neural Program Search with Higher-Order Functions and Lambdas
Kensen Shi
H. Dai
Wen-Ding Li
Kevin Ellis
Charles Sutton
87
8
0
03 Jun 2023
On Optimal Caching and Model Multiplexing for Large Model Inference
Banghua Zhu
Ying Sheng
Lianmin Zheng
Clark W. Barrett
Michael I. Jordan
Jiantao Jiao
101
21
0
03 Jun 2023
Probabilistic Adaptation of Text-to-Video Models
Mengjiao Yang
Yilun Du
Bo Dai
Dale Schuurmans
J. Tenenbaum
Pieter Abbeel
VGen, DiffM
139
26
0
02 Jun 2023
Learning Multi-Step Reasoning by Solving Arithmetic Tasks
Tianduo Wang
Wei Lu
ReLM, LRM
73
16
0
02 Jun 2023
Centered Self-Attention Layers
Ameen Ali
Tomer Galanti
Lior Wolf
144
8
0
02 Jun 2023
PassGPT: Password Modeling and (Guided) Generation with Large Language Models
Javier Rando
Fernando Perez-Cruz
Briland Hitaj
GAN
54
10
0
02 Jun 2023
GAIA Search: Hugging Face and Pyserini Interoperability for NLP Training Data Exploration
Aleksandra Piktus
Odunayo Ogundepo
Christopher Akiki
Akintunde Oladipo
Xinyu Crystina Zhang
Hailey Schoelkopf
Stella Biderman
Martin Potthast
Jimmy J. Lin
CVBM
80
10
0
02 Jun 2023
An Overview on Generative AI at Scale with Edge-Cloud Computing
Yun Cheng Wang
Jintang Xue
Chengwei Wei
C.-C. Jay Kuo
75
35
0
02 Jun 2023
KL-Divergence Guided Temperature Sampling
Chung-Ching Chang
David Reitter
Renat Aksitov
Yun-hsuan Sung
HILM
65
7
0
02 Jun 2023
Responsible Task Automation: Empowering Large Language Models as Responsible Task Automators
Zhizheng Zhang
Xiaoyi Zhang
Wenxuan Xie
Yan Lu
77
15
0
02 Jun 2023
Systematic Evaluation of GPT-3 for Zero-Shot Personality Estimation
Adithya Ganesan
Yash Kumar Lal
August Håkan Nilsson
H. Andrew Schwartz
82
24
0
01 Jun 2023
The RefinedWeb Dataset for Falcon LLM: Outperforming Curated Corpora with Web Data, and Web Data Only
Guilherme Penedo
Quentin Malartic
Daniel Hesslow
Ruxandra-Aimée Cojocaru
Alessandro Cappelli
Hamza Alobeidli
B. Pannier
Ebtesam Almazrouei
Julien Launay
207
778
0
01 Jun 2023
STEVE-1: A Generative Model for Text-to-Behavior in Minecraft
Shalev Lifshitz
Keiran Paster
Harris Chan
Jimmy Ba
Sheila A. McIlraith
LM&Ro
145
76
0
01 Jun 2023
Birth of a Transformer: A Memory Viewpoint
A. Bietti
Vivien A. Cabannes
Diane Bouchacourt
Hervé Jégou
Léon Bottou
116
96
0
01 Jun 2023
Interpretable Math Word Problem Solution Generation Via Step-by-step Planning
Mengxue Zhang
Zichao Wang
Zhichao Yang
Weiqi Feng
Andrew Lan
LRM
71
18
0
01 Jun 2023
Column Type Annotation using ChatGPT
Keti Korini
Christian Bizer
LMTD
118
28
0
01 Jun 2023
Predicting the Quality of Revisions in Argumentative Writing
Zhexiong Liu
Diane Litman
E. Wang
L. Matsumura
Richard Correnti
69
5
0
01 Jun 2023
Make Pre-trained Model Reversible: From Parameter to Memory Efficient Fine-Tuning
Baohao Liao
Shaomu Tan
Christof Monz
KELM
116
30
0
01 Jun 2023
CFL: Causally Fair Language Models Through Token-level Attribute Controlled Generation
Rahul Madhavan
Rishabh Garg
Kahini Wadhawan
S. Mehta
94
5
0
01 Jun 2023
Prompt Algebra for Task Composition
Pramuditha Perera
Matthew Trager
Luca Zancato
Alessandro Achille
Stefano Soatto
VLM
82
8
0
01 Jun 2023
An Invariant Learning Characterization of Controlled Text Generation
Carolina Zheng
Claudia Shi
Keyon Vafa
Amir Feder
David M. Blei
OOD
103
8
0
31 May 2023
Measuring the Robustness of NLP Models to Domain Shifts
Nitay Calderon
Naveh Porat
Eyal Ben-David
Alexander Chapanin
Zorik Gekhman
Nadav Oved
Vitaly Shalumov
Roi Reichart
150
8
0
31 May 2023
Better patching using LLM prompting, via Self-Consistency
Toufique Ahmed
Prem Devanbu
LRM, ReLM, KELM
118
31
0
31 May 2023
Improving CLIP Training with Language Rewrites
Lijie Fan
Dilip Krishnan
Phillip Isola
Dina Katabi
Yonglong Tian
BDL, VLM, CLIP
128
179
0
31 May 2023
Decision-Oriented Dialogue for Human-AI Collaboration
Jessy Lin
Nicholas Tomlin
Jacob Andreas
J. Eisner
LLMAG
120
28
0
31 May 2023
Evaluating Machine Learning Models with NERO: Non-Equivariance Revealed on Orbits
Zhuokai Zhao
Takumi Matsuzawa
W. Irvine
Michael Maire
G. Kindlmann
104
2
0
31 May 2023
Red Teaming Language Model Detectors with Language Models
Zhouxing Shi
Yihan Wang
Fan Yin
Xiangning Chen
Kai-Wei Chang
Cho-Jui Hsieh
DeLMO
92
57
0
31 May 2023
Large Language Models Are Not Strong Abstract Reasoners
Gaël Gendron
Qiming Bao
Michael Witbrock
Gillian Dobbie
ELM, LRM
129
37
0
31 May 2023
Harnessing Explanations: LLM-to-LM Interpreter for Enhanced Text-Attributed Graph Representation Learning
Xiaoxin He
Xavier Bresson
T. Laurent
Adam Perold
Yann LeCun
Bryan Hooi
150
86
0
31 May 2023
The Impact of Positional Encoding on Length Generalization in Transformers
Amirhossein Kazemnejad
Inkit Padhi
Karthikeyan N. Ramamurthy
Payel Das
Siva Reddy
104
209
0
31 May 2023
Are Large Kernels Better Teachers than Transformers for ConvNets?
Tianjin Huang
Lu Yin
Zhenyu Zhang
Lijuan Shen
Meng Fang
Mykola Pechenizkiy
Zhangyang Wang
Shiwei Liu
102
13
0
30 May 2023
GPT4GEO: How a Language Model Sees the World's Geography
Jonathan Roberts
Timo Lüddecke
Sowmen Das
Kai Han
Samuel Albanie
90
64
0
30 May 2023
Intriguing Properties of Quantization at Scale
Arash Ahmadian
Saurabh Dash
Hongyu Chen
Bharat Venkitesh
Stephen Gou
Phil Blunsom
Ahmet Üstün
Sara Hooker
MQ
131
38
0
30 May 2023
Forward-Forward Training of an Optical Neural Network
Ilker Oguz
Junjie Ke
Qifei Wang
Feng Yang
Mustafa Yildirim
Niyazi Ulaş Dinç
J. Hsieh
C. Moser
D. Psaltis
70
17
0
30 May 2023