ResearchTrend.AI
PaLM: Scaling Language Modeling with Pathways

5 April 2022
Aakanksha Chowdhery
Sharan Narang
Jacob Devlin
Maarten Bosma
Gaurav Mishra
Adam Roberts
P. Barham
Hyung Won Chung
Charles Sutton
Sebastian Gehrmann
Parker Schuh
Kensen Shi
Sasha Tsvyashchenko
Joshua Maynez
Abhishek Rao
Parker Barnes
Yi Tay
Noam M. Shazeer
Vinodkumar Prabhakaran
Emily Reif
Nan Du
Ben Hutchinson
Reiner Pope
James Bradbury
Jacob Austin
Michael Isard
Guy Gur-Ari
Pengcheng Yin
Toju Duke
Anselm Levskaya
Sanjay Ghemawat
Sunipa Dev
Henryk Michalewski
Xavier Garcia
Vedant Misra
Kevin Robinson
Liam Fedus
Denny Zhou
Daphne Ippolito
D. Luan
Hyeontaek Lim
Barret Zoph
A. Spiridonov
Ryan Sepassi
David Dohan
Shivani Agrawal
Mark Omernick
Andrew M. Dai
Thanumalayan Sankaranarayana Pillai
Marie Pellat
Aitor Lewkowycz
Erica Moreira
R. Child
Oleksandr Polozov
Katherine Lee
Zongwei Zhou
Xuezhi Wang
Brennan Saeta
Mark Díaz
Orhan Firat
Michele Catasta
Jason W. Wei
Kathy Meier-Hellstern
Douglas Eck
J. Dean
Slav Petrov
Noah Fiedel
    PILM
    LRM

Papers citing "PaLM: Scaling Language Modeling with Pathways"

50 / 1,253 papers shown
Title
Grammatical Error Correction: A Survey of the State of the Art
Christopher Bryant
Zheng Yuan
Muhammad Reza Qorib
Hannan Cao
Hwee Tou Ng
Ted Briscoe
3DV
29
79
0
09 Nov 2022
Large Language Models with Controllable Working Memory
Daliang Li
A. S. Rawat
Manzil Zaheer
Xin Wang
Michal Lukasik
Andreas Veit
Felix X. Yu
Surinder Kumar
KELM
61
154
0
09 Nov 2022
Efficiently Scaling Transformer Inference
Reiner Pope
Sholto Douglas
Aakanksha Chowdhery
Jacob Devlin
James Bradbury
Anselm Levskaya
Jonathan Heek
Kefan Xiao
Shivani Agrawal
J. Dean
37
297
0
09 Nov 2022
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model
BigScience Workshop:
Teven Le Scao
Angela Fan
Christopher Akiki
...
Zhongli Xie
Zifan Ye
M. Bras
Younes Belkada
Thomas Wolf
VLM
118
2,315
0
09 Nov 2022
What is Wrong with Language Models that Can Not Tell a Story?
Ivan P. Yamshchikov
Alexey Tikhonov
32
6
0
09 Nov 2022
Conciseness: An Overlooked Language Task
Felix Stahlberg
Aashish Kumar
Chris Alberti
Shankar Kumar
21
1
0
08 Nov 2022
COPEN: Probing Conceptual Knowledge in Pre-trained Language Models
Hao Peng
Xiaozhi Wang
Shengding Hu
Hailong Jin
Lei Hou
Juanzi Li
Zhiyuan Liu
Qun Liu
18
22
0
08 Nov 2022
Astronomia ex machina: a history, primer, and outlook on neural networks in astronomy
Michael J. Smith
James E. Geach
35
32
0
07 Nov 2022
On minimal variations for unsupervised representation learning
Vivien A. Cabannes
A. Bietti
Randall Balestriero
SSL
DRL
33
8
0
07 Nov 2022
Okapi: Generalising Better by Making Statistical Matches Match
Myles Bartlett
Sara Romiti
V. Sharmanska
Novi Quadrianto
45
3
0
07 Nov 2022
How Much Does Attention Actually Attend? Questioning the Importance of Attention in Pretrained Transformers
Michael Hassid
Hao Peng
Daniel Rotem
Jungo Kasai
Ivan Montero
Noah A. Smith
Roy Schwartz
32
24
0
07 Nov 2022
Intriguing Properties of Compression on Multilingual Models
Kelechi Ogueji
Orevaoghene Ahia
Gbemileke Onilude
Sebastian Gehrmann
Sara Hooker
Julia Kreutzer
21
12
0
04 Nov 2022
MolE: a molecular foundation model for drug discovery
Oscar Méndez-Lucio
C. Nicolaou
Berton Earnshaw
8
11
0
03 Nov 2022
LMentry: A Language Model Benchmark of Elementary Language Tasks
Avia Efrat
Or Honovich
Omer Levy
29
20
0
03 Nov 2022
Inverse scaling can become U-shaped
Jason W. Wei
Najoung Kim
Yi Tay
Quoc V. Le
LRM
29
60
0
03 Nov 2022
eDiff-I: Text-to-Image Diffusion Models with an Ensemble of Expert Denoisers
Yogesh Balaji
Seungjun Nah
Xun Huang
Arash Vahdat
Jiaming Song
...
Timo Aila
S. Laine
Bryan Catanzaro
Tero Karras
Xuan Li
VLM
MoE
76
804
0
02 Nov 2022
Two-stage LLM Fine-tuning with Less Specialization and More Generalization
Yihan Wang
Si Si
Daliang Li
Michal Lukasik
Felix X. Yu
Cho-Jui Hsieh
Inderjit S Dhillon
Sanjiv Kumar
46
29
0
01 Nov 2022
SSD-LM: Semi-autoregressive Simplex-based Diffusion Language Model for Text Generation and Modular Control
Xiaochuang Han
Sachin Kumar
Yulia Tsvetkov
45
79
0
31 Oct 2022
A Simple, Yet Effective Approach to Finding Biases in Code Generation
Spyridon Mouselinos
Mateusz Malinowski
Henryk Michalewski
18
7
0
31 Oct 2022
Changes from Classical Statistics to Modern Statistics and Data Science
Kai Zhang
Shan-Yu Liu
M. Xiong
34
0
0
30 Oct 2022
A Solvable Model of Neural Scaling Laws
A. Maloney
Daniel A. Roberts
J. Sully
47
51
0
30 Oct 2022
Beyond Prompting: Making Pre-trained Language Models Better Zero-shot Learners by Clustering Representations
Yu Fei
Ping Nie
Zhao Meng
Roger Wattenhofer
Mrinmaya Sachan
VLM
53
20
0
29 Oct 2022
Solving Math Word Problems via Cooperative Reasoning induced Language Models
Xinyu Zhu
Junjie Wang
Lin Zhang
Yuxiang Zhang
Ruyi Gan
Jiaxing Zhang
Yujiu Yang
ReLM
LRM
30
76
0
28 Oct 2022
Truncation Sampling as Language Model Desmoothing
John Hewitt
Christopher D. Manning
Percy Liang
BDL
44
76
0
27 Oct 2022
Open-vocabulary Semantic Segmentation with Frozen Vision-Language Models
Chaofan Ma
Yu-Hao Yang
Yanfeng Wang
Ya Zhang
Weidi Xie
VLM
31
48
0
27 Oct 2022
Multi-lingual Evaluation of Code Generation Models
Ben Athiwaratkun
Sanjay Krishna Gouda
Zijian Wang
Xiaopeng Li
Yuchen Tian
...
Baishakhi Ray
Parminder Bhatia
Sudipta Sengupta
Dan Roth
Bing Xiang
ELM
120
161
0
26 Oct 2022
Piloting Copilot and Codex: Hot Temperature, Cold Prompts, or Black Magic?
Jean-Baptiste Döderlein
M. Acher
D. Khelladi
B. Combemale
34
33
0
26 Oct 2022
Universal Evasion Attacks on Summarization Scoring
Wenchuan Mu
Kwan Hui Lim
AAML
38
1
0
25 Oct 2022
Rich Knowledge Sources Bring Complex Knowledge Conflicts: Recalibrating Models to Reflect Conflicting Evidence
Hung-Ting Chen
Michael J.Q. Zhang
Eunsol Choi
RALM
HILM
47
92
0
25 Oct 2022
Reinforcement Learning and Bandits for Speech and Language Processing: Tutorial, Review and Outlook
Baihan Lin
OffRL
AI4TS
32
27
0
24 Oct 2022
MetaFormer Baselines for Vision
Weihao Yu
Chenyang Si
Pan Zhou
Mi Luo
Yichen Zhou
Jiashi Feng
Shuicheng Yan
Xinchao Wang
MoE
40
156
0
24 Oct 2022
Towards Better Few-Shot and Finetuning Performance with Forgetful Causal Language Models
Hao Liu
Xinyang Geng
Lisa Lee
Igor Mordatch
Sergey Levine
Sharan Narang
Pieter Abbeel
KELM
CLL
35
2
0
24 Oct 2022
Neural Theory-of-Mind? On the Limits of Social Intelligence in Large LMs
Maarten Sap
Ronan Le Bras
Daniel Fried
Yejin Choi
27
209
0
24 Oct 2022
Finding Memo: Extractive Memorization in Constrained Sequence Generation Tasks
Vikas Raunak
Arul Menezes
38
13
0
24 Oct 2022
The Curious Case of Absolute Position Embeddings
Koustuv Sinha
Amirhossein Kazemnejad
Siva Reddy
J. Pineau
Dieuwke Hupkes
Adina Williams
87
15
0
23 Oct 2022
Draft, Sketch, and Prove: Guiding Formal Theorem Provers with Informal Proofs
Albert Q. Jiang
Sean Welleck
Jin Peng Zhou
Wenda Li
Jiacheng Liu
M. Jamnik
Timothée Lacroix
Yuhuai Wu
Guillaume Lample
AIMat
75
159
0
21 Oct 2022
Amos: An Adam-style Optimizer with Adaptive Weight Decay towards Model-Oriented Scale
Ran Tian
Ankur P. Parikh
ODL
23
6
0
21 Oct 2022
Large Language Models Can Self-Improve
Jiaxin Huang
S. Gu
Le Hou
Yuexin Wu
Xuezhi Wang
Hongkun Yu
Jiawei Han
ReLM
AI4MH
LRM
47
566
0
20 Oct 2022
Composing Ensembles of Pre-trained Models via Iterative Consensus
Shuang Li
Yilun Du
J. Tenenbaum
Antonio Torralba
Igor Mordatch
MoMe
19
23
0
20 Oct 2022
Scaling Instruction-Finetuned Language Models
Hyung Won Chung
Le Hou
Shayne Longpre
Barret Zoph
Yi Tay
...
Jacob Devlin
Adam Roberts
Denny Zhou
Quoc V. Le
Jason W. Wei
ReLM
LRM
103
3,012
0
20 Oct 2022
Transcending Scaling Laws with 0.1% Extra Compute
Yi Tay
Jason W. Wei
Hyung Won Chung
Vinh Q. Tran
David R. So
...
Donald Metzler
Slav Petrov
N. Houlsby
Quoc V. Le
Mostafa Dehghani
LRM
47
68
0
20 Oct 2022
lo-fi: distributed fine-tuning without communication
Mitchell Wortsman
Suchin Gururangan
Shen Li
Ali Farhadi
Ludwig Schmidt
Michael G. Rabbat
Ari S. Morcos
32
24
0
19 Oct 2022
On the Feasibility of Cross-Task Transfer with Model-Based Reinforcement Learning
Yifan Xu
Nicklas Hansen
Zirui Wang
Yung-Chieh Chan
H. Su
Zhuowen Tu
OffRL
31
16
0
19 Oct 2022
Attribution and Obfuscation of Neural Text Authorship: A Data Mining Perspective
Adaku Uchendu
Thai Le
Dongwon Lee
DeLMO
32
41
0
19 Oct 2022
The Tail Wagging the Dog: Dataset Construction Biases of Social Bias Benchmarks
Nikil Selvam
Sunipa Dev
Daniel Khashabi
Tushar Khot
Kai-Wei Chang
ALM
24
25
0
18 Oct 2022
Challenging BIG-Bench Tasks and Whether Chain-of-Thought Can Solve Them
Mirac Suzgun
Nathan Scales
Nathanael Scharli
Sebastian Gehrmann
Yi Tay
...
Aakanksha Chowdhery
Quoc V. Le
Ed H. Chi
Denny Zhou
Jason W. Wei
ALM
ELM
LRM
ReLM
119
1,011
0
17 Oct 2022
Table-To-Text generation and pre-training with TabT5
Ewa Andrejczuk
Julian Martin Eisenschlos
Francesco Piccinno
Syrine Krichene
Yasemin Altun
LMTD
34
31
0
17 Oct 2022
RARR: Researching and Revising What Language Models Say, Using Language Models
Luyu Gao
Zhuyun Dai
Panupong Pasupat
Anthony Chen
Arun Tejasvi Chaganty
...
Vincent Zhao
Ni Lao
Hongrae Lee
Da-Cheng Juan
Kelvin Guu
HILM
KELM
41
257
0
17 Oct 2022
Zero-Shot Learners for Natural Language Understanding via a Unified Multiple Choice Perspective
Ping Yang
Junjie Wang
Ruyi Gan
Xinyu Zhu
Lin Zhang
Ziwei Wu
Xinyu Gao
Jiaxing Zhang
Tetsuya Sakai
BDL
22
25
0
16 Oct 2022
The Debate Over Understanding in AI's Large Language Models
Melanie Mitchell
D. Krakauer
ELM
74
203
0
14 Oct 2022