ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2204.02311
  4. Cited By
PaLM: Scaling Language Modeling with Pathways
v1v2v3v4v5 (latest)

PaLM: Scaling Language Modeling with Pathways

5 April 2022
Aakanksha Chowdhery
Sharan Narang
Jacob Devlin
Maarten Bosma
Gaurav Mishra
Adam Roberts
P. Barham
Hyung Won Chung
Charles Sutton
Sebastian Gehrmann
Parker Schuh
Kensen Shi
Sasha Tsvyashchenko
Joshua Maynez
Abhishek Rao
Parker Barnes
Yi Tay
Noam M. Shazeer
Vinodkumar Prabhakaran
Emily Reif
Nan Du
Ben Hutchinson
Reiner Pope
James Bradbury
Jacob Austin
Michael Isard
Guy Gur-Ari
Pengcheng Yin
Toju Duke
Anselm Levskaya
Sanjay Ghemawat
Sunipa Dev
Henryk Michalewski
Xavier Garcia
Vedant Misra
Kevin Robinson
Liam Fedus
Denny Zhou
Daphne Ippolito
D. Luan
Hyeontaek Lim
Barret Zoph
A. Spiridonov
Ryan Sepassi
David Dohan
Shivani Agrawal
Mark Omernick
Andrew M. Dai
Thanumalayan Sankaranarayana Pillai
Marie Pellat
Aitor Lewkowycz
Erica Moreira
R. Child
Oleksandr Polozov
Katherine Lee
Zongwei Zhou
Xuezhi Wang
Brennan Saeta
Mark Díaz
Orhan Firat
Michele Catasta
Jason W. Wei
Kathy Meier-Hellstern
Douglas Eck
J. Dean
Slav Petrov
Noah Fiedel
    PILMLRM
ArXiv (abs)PDFHTML

Papers citing "PaLM: Scaling Language Modeling with Pathways"

50 / 4,332 papers shown
Title
What is Wrong with Language Models that Can Not Tell a Story?
What is Wrong with Language Models that Can Not Tell a Story?
Ivan P. Yamshchikov
Alexey Tikhonov
80
7
0
09 Nov 2022
Creative Writing with an AI-Powered Writing Assistant: Perspectives from
  Professional Writers
Creative Writing with an AI-Powered Writing Assistant: Perspectives from Professional Writers
Daphne Ippolito
Ann Yuan
Andy Coenen
Sehmon Burnam
106
101
0
09 Nov 2022
Conciseness: An Overlooked Language Task
Conciseness: An Overlooked Language Task
Felix Stahlberg
Aashish Kumar
Chris Alberti
Shankar Kumar
47
1
0
08 Nov 2022
COPEN: Probing Conceptual Knowledge in Pre-trained Language Models
COPEN: Probing Conceptual Knowledge in Pre-trained Language Models
Hao Peng
Xiaozhi Wang
Shengding Hu
Hailong Jin
Lei Hou
Juanzi Li
Zhiyuan Liu
Qun Liu
96
25
0
08 Nov 2022
Pretraining in Deep Reinforcement Learning: A Survey
Pretraining in Deep Reinforcement Learning: A Survey
Zhihui Xie
Zichuan Lin
Junyou Li
Shuai Li
Deheng Ye
OffRLOnRLAI4CE
87
23
0
08 Nov 2022
Astronomia ex machina: a history, primer, and outlook on neural networks
  in astronomy
Astronomia ex machina: a history, primer, and outlook on neural networks in astronomy
Michael J. Smith
James E. Geach
76
36
0
07 Nov 2022
On minimal variations for unsupervised representation learning
On minimal variations for unsupervised representation learning
Vivien A. Cabannes
A. Bietti
Randall Balestriero
SSLDRL
92
8
0
07 Nov 2022
On the importance of data collection for training general goal-reaching
  policies
On the importance of data collection for training general goal-reaching policies
Alexis Jacq
Manu Orsini
Gabriel Dulac-Arnold
Olivier Pietquin
Matthieu Geist
Olivier Bachem
OffRL
70
1
0
07 Nov 2022
Okapi: Generalising Better by Making Statistical Matches Match
Okapi: Generalising Better by Making Statistical Matches Match
Myles Bartlett
Sara Romiti
V. Sharmanska
Novi Quadrianto
88
3
0
07 Nov 2022
How Much Does Attention Actually Attend? Questioning the Importance of
  Attention in Pretrained Transformers
How Much Does Attention Actually Attend? Questioning the Importance of Attention in Pretrained Transformers
Michael Hassid
Hao Peng
Daniel Rotem
Jungo Kasai
Ivan Montero
Noah A. Smith
Roy Schwartz
96
26
0
07 Nov 2022
Intriguing Properties of Compression on Multilingual Models
Intriguing Properties of Compression on Multilingual Models
Kelechi Ogueji
Orevaoghene Ahia
Gbemileke Onilude
Sebastian Gehrmann
Sara Hooker
Julia Kreutzer
93
14
0
04 Nov 2022
Zero-shot Video Moment Retrieval With Off-the-Shelf Models
Zero-shot Video Moment Retrieval With Off-the-Shelf Models
Anuj Diwan
Puyuan Peng
Raymond J. Mooney
VLM
74
3
0
03 Nov 2022
MolE: a molecular foundation model for drug discovery
MolE: a molecular foundation model for drug discovery
Oscar Méndez-Lucio
C. Nicolaou
Berton Earnshaw
91
11
0
03 Nov 2022
LMentry: A Language Model Benchmark of Elementary Language Tasks
LMentry: A Language Model Benchmark of Elementary Language Tasks
Avia Efrat
Or Honovich
Omer Levy
111
20
0
03 Nov 2022
Inverse scaling can become U-shaped
Inverse scaling can become U-shaped
Jason W. Wei
Najoung Kim
Yi Tay
Quoc V. Le
LRM
116
64
0
03 Nov 2022
Fine-Tuning Language Models via Epistemic Neural Networks
Fine-Tuning Language Models via Epistemic Neural Networks
Ian Osband
S. Asghari
Benjamin Van Roy
Nat McAleese
John Aslanides
G. Irving
UQLM
89
20
0
03 Nov 2022
eDiff-I: Text-to-Image Diffusion Models with an Ensemble of Expert
  Denoisers
eDiff-I: Text-to-Image Diffusion Models with an Ensemble of Expert Denoisers
Yogesh Balaji
Seungjun Nah
Xun Huang
Arash Vahdat
Jiaming Song
...
Timo Aila
S. Laine
Bryan Catanzaro
Tero Karras
Xuan Li
VLMMoE
225
833
0
02 Nov 2022
PLATO-K: Internal and External Knowledge Enhanced Dialogue Generation
PLATO-K: Internal and External Knowledge Enhanced Dialogue Generation
Siqi Bao
H. He
Jun Xu
Hua Lu
Fan Wang
Hua Wu
Han Zhou
Wenquan Wu
Zheng-Yu Niu
Haifeng Wang
54
4
0
02 Nov 2022
Two-stage LLM Fine-tuning with Less Specialization and More
  Generalization
Two-stage LLM Fine-tuning with Less Specialization and More Generalization
Yihan Wang
Si Si
Daliang Li
Michal Lukasik
Felix X. Yu
Cho-Jui Hsieh
Inderjit S Dhillon
Sanjiv Kumar
137
30
0
01 Nov 2022
ClassActionPrediction: A Challenging Benchmark for Legal Judgment
  Prediction of Class Action Cases in the US
ClassActionPrediction: A Challenging Benchmark for Legal Judgment Prediction of Class Action Cases in the US
Gil Semo
Dor Bernsohn
Ben Hagag
Gila Hayat
Joel Niklaus
AILawELM
107
20
0
01 Nov 2022
Preventing Verbatim Memorization in Language Models Gives a False Sense
  of Privacy
Preventing Verbatim Memorization in Language Models Gives a False Sense of Privacy
Daphne Ippolito
Florian Tramèr
Milad Nasr
Chiyuan Zhang
Matthew Jagielski
Katherine Lee
Christopher A. Choquette-Choo
Nicholas Carlini
PILMMU
102
69
0
31 Oct 2022
SSD-LM: Semi-autoregressive Simplex-based Diffusion Language Model for
  Text Generation and Modular Control
SSD-LM: Semi-autoregressive Simplex-based Diffusion Language Model for Text Generation and Modular Control
Xiaochuang Han
Sachin Kumar
Yulia Tsvetkov
169
91
0
31 Oct 2022
A Simple, Yet Effective Approach to Finding Biases in Code Generation
A Simple, Yet Effective Approach to Finding Biases in Code Generation
Spyridon Mouselinos
Mateusz Malinowski
Henryk Michalewski
129
9
0
31 Oct 2022
Changes from Classical Statistics to Modern Statistics and Data Science
Changes from Classical Statistics to Modern Statistics and Data Science
Kai Zhang
Shan-Yu Liu
M. Xiong
92
0
0
30 Oct 2022
A Solvable Model of Neural Scaling Laws
A Solvable Model of Neural Scaling Laws
A. Maloney
Daniel A. Roberts
J. Sully
126
57
0
30 Oct 2022
Beyond Prompting: Making Pre-trained Language Models Better Zero-shot
  Learners by Clustering Representations
Beyond Prompting: Making Pre-trained Language Models Better Zero-shot Learners by Clustering Representations
Yu Fei
Ping Nie
Zhao Meng
Roger Wattenhofer
Mrinmaya Sachan
VLM
100
20
0
29 Oct 2022
Aligning Offline Metrics and Human Judgments of Value for Code
  Generation Models
Aligning Offline Metrics and Human Judgments of Value for Code Generation Models
Victor C. Dibia
Adam Fourney
Gagan Bansal
Forough Poursabzi-Sangdeh
Han Liu
Saleema Amershi
ALMOffRL
103
13
0
29 Oct 2022
Knowledge-in-Context: Towards Knowledgeable Semi-Parametric Language
  Models
Knowledge-in-Context: Towards Knowledgeable Semi-Parametric Language Models
Xiaoman Pan
Wenlin Yao
Hongming Zhang
Dian Yu
Dong Yu
Jianshu Chen
KELM
303
25
0
28 Oct 2022
Solving Math Word Problems via Cooperative Reasoning induced Language
  Models
Solving Math Word Problems via Cooperative Reasoning induced Language Models
Xinyu Zhu
Junjie Wang
Lin Zhang
Yuxiang Zhang
Ruyi Gan
Jiaxing Zhang
Yujiu Yang
ReLMLRM
183
84
0
28 Oct 2022
QUILL: Query Intent with Large Language Models using Retrieval
  Augmentation and Multi-stage Distillation
QUILL: Query Intent with Large Language Models using Retrieval Augmentation and Multi-stage Distillation
Krishna Srinivasan
K. Raman
Anupam Samanta
Ling-Yen Liao
L. Bertelli
Michael Bendersky
RALMLRM
81
20
0
27 Oct 2022
Truncation Sampling as Language Model Desmoothing
Truncation Sampling as Language Model Desmoothing
John Hewitt
Christopher D. Manning
Percy Liang
BDL
97
84
0
27 Oct 2022
Open-vocabulary Semantic Segmentation with Frozen Vision-Language Models
Open-vocabulary Semantic Segmentation with Frozen Vision-Language Models
Chaofan Ma
Yu-Hao Yang
Yanfeng Wang
Ya Zhang
Weidi Xie
VLM
81
48
0
27 Oct 2022
Multi-lingual Evaluation of Code Generation Models
Multi-lingual Evaluation of Code Generation Models
Ben Athiwaratkun
Sanjay Krishna Gouda
Zijian Wang
Xiaopeng Li
Yuchen Tian
...
Baishakhi Ray
Parminder Bhatia
Sudipta Sengupta
Dan Roth
Bing Xiang
ELM
193
177
0
26 Oct 2022
Don't Prompt, Search! Mining-based Zero-Shot Learning with Language
  Models
Don't Prompt, Search! Mining-based Zero-Shot Learning with Language Models
Mozes van de Kar
Mengzhou Xia
Danqi Chen
Mikel Artetxe
93
19
0
26 Oct 2022
Scaling Laws Beyond Backpropagation
Scaling Laws Beyond Backpropagation
Matthew J. Filipovich
Alessandro Cappelli
Daniel Hesslow
Julien Launay
62
3
0
26 Oct 2022
Piloting Copilot, Codex, and StarCoder2: Hot Temperature, Cold Prompts, or Black Magic?
Piloting Copilot, Codex, and StarCoder2: Hot Temperature, Cold Prompts, or Black Magic?
Jean-Baptiste Döderlein
Nguessan Hermann Kouadio
M. Acher
D. Khelladi
B. Combemale
92
36
0
26 Oct 2022
Universal Evasion Attacks on Summarization Scoring
Universal Evasion Attacks on Summarization Scoring
Wenchuan Mu
Kwan Hui Lim
AAML
85
1
0
25 Oct 2022
IELM: An Open Information Extraction Benchmark for Pre-Trained Language
  Models
IELM: An Open Information Extraction Benchmark for Pre-Trained Language Models
Chenguang Wang
Xiao Liu
Dawn Song
VLM
41
2
0
25 Oct 2022
Rich Knowledge Sources Bring Complex Knowledge Conflicts: Recalibrating
  Models to Reflect Conflicting Evidence
Rich Knowledge Sources Bring Complex Knowledge Conflicts: Recalibrating Models to Reflect Conflicting Evidence
Hung-Ting Chen
Michael J.Q. Zhang
Eunsol Choi
RALMHILM
141
100
0
25 Oct 2022
Reinforcement Learning and Bandits for Speech and Language Processing:
  Tutorial, Review and Outlook
Reinforcement Learning and Bandits for Speech and Language Processing: Tutorial, Review and Outlook
Baihan Lin
OffRLAI4TS
137
27
0
24 Oct 2022
MetaFormer Baselines for Vision
MetaFormer Baselines for Vision
Weihao Yu
Chenyang Si
Pan Zhou
Mi Luo
Yichen Zhou
Jiashi Feng
Shuicheng Yan
Xinchao Wang
MoE
110
171
0
24 Oct 2022
Towards Better Few-Shot and Finetuning Performance with Forgetful Causal
  Language Models
Towards Better Few-Shot and Finetuning Performance with Forgetful Causal Language Models
Hao Liu
Xinyang Geng
Lisa Lee
Igor Mordatch
Sergey Levine
Sharan Narang
Pieter Abbeel
KELMCLL
91
2
0
24 Oct 2022
Instruction-Following Agents with Multimodal Transformer
Instruction-Following Agents with Multimodal Transformer
Hao Liu
Lisa Lee
Kimin Lee
Pieter Abbeel
LM&Ro
135
11
0
24 Oct 2022
Neural Theory-of-Mind? On the Limits of Social Intelligence in Large LMs
Neural Theory-of-Mind? On the Limits of Social Intelligence in Large LMs
Maarten Sap
Ronan Le Bras
Daniel Fried
Yejin Choi
103
232
0
24 Oct 2022
Finding Memo: Extractive Memorization in Constrained Sequence Generation
  Tasks
Finding Memo: Extractive Memorization in Constrained Sequence Generation Tasks
Vikas Raunak
Arul Menezes
71
14
0
24 Oct 2022
Code4Struct: Code Generation for Few-Shot Event Structure Prediction
Code4Struct: Code Generation for Few-Shot Event Structure Prediction
Xingyao Wang
Sha Li
Heng Ji
131
84
0
23 Oct 2022
Neural Eigenfunctions Are Structured Representation Learners
Neural Eigenfunctions Are Structured Representation Learners
Zhijie Deng
Jiaxin Shi
Hao Zhang
Peng Cui
Cewu Lu
Jun Zhu
115
14
0
23 Oct 2022
The Curious Case of Absolute Position Embeddings
The Curious Case of Absolute Position Embeddings
Koustuv Sinha
Amirhossein Kazemnejad
Siva Reddy
J. Pineau
Dieuwke Hupkes
Adina Williams
148
15
0
23 Oct 2022
Exploring The Landscape of Distributional Robustness for Question
  Answering Models
Exploring The Landscape of Distributional Robustness for Question Answering Models
Anas Awadalla
Mitchell Wortsman
Gabriel Ilharco
Sewon Min
Ian H. Magnusson
Hannaneh Hajishirzi
Ludwig Schmidt
ELMOODKELM
121
21
0
22 Oct 2022
Robots-Dont-Cry: Understanding Falsely Anthropomorphic Utterances in
  Dialog Systems
Robots-Dont-Cry: Understanding Falsely Anthropomorphic Utterances in Dialog Systems
David Gros
Yu Li
Zhou Yu
90
11
0
22 Oct 2022
Previous
123...808182...858687
Next