ResearchTrend.AI
PaLM: Scaling Language Modeling with Pathways

5 April 2022
Aakanksha Chowdhery
Sharan Narang
Jacob Devlin
Maarten Bosma
Gaurav Mishra
Adam Roberts
P. Barham
Hyung Won Chung
Charles Sutton
Sebastian Gehrmann
Parker Schuh
Kensen Shi
Sasha Tsvyashchenko
Joshua Maynez
Abhishek Rao
Parker Barnes
Yi Tay
Noam M. Shazeer
Vinodkumar Prabhakaran
Emily Reif
Nan Du
Ben Hutchinson
Reiner Pope
James Bradbury
Jacob Austin
Michael Isard
Guy Gur-Ari
Pengcheng Yin
Toju Duke
Anselm Levskaya
Sanjay Ghemawat
Sunipa Dev
Henryk Michalewski
Xavier Garcia
Vedant Misra
Kevin Robinson
Liam Fedus
Denny Zhou
Daphne Ippolito
D. Luan
Hyeontaek Lim
Barret Zoph
A. Spiridonov
Ryan Sepassi
David Dohan
Shivani Agrawal
Mark Omernick
Andrew M. Dai
Thanumalayan Sankaranarayana Pillai
Marie Pellat
Aitor Lewkowycz
Erica Moreira
R. Child
Oleksandr Polozov
Katherine Lee
Zongwei Zhou
Xuezhi Wang
Brennan Saeta
Mark Díaz
Orhan Firat
Michele Catasta
Jason W. Wei
Kathy Meier-Hellstern
Douglas Eck
J. Dean
Slav Petrov
Noah Fiedel
PILM LRM

Papers citing "PaLM: Scaling Language Modeling with Pathways"

50 / 4,332 papers shown
Why Transformers Need Adam: A Hessian Perspective
Yushun Zhang
Congliang Chen
Tian Ding
Ziniu Li
Ruoyu Sun
Zhimin Luo
134
57
0
26 Feb 2024
Multi-Bit Distortion-Free Watermarking for Large Language Models
Massieh Kordi Boroujeny
Ya Jiang
Kai Zeng
Brian L. Mark
WaLM VLM
89
7
0
26 Feb 2024
Defending LLMs against Jailbreaking Attacks via Backtranslation
Yihan Wang
Zhouxing Shi
Andrew Bai
Cho-Jui Hsieh
AAML
102
42
0
26 Feb 2024
RoCoIns: Enhancing Robustness of Large Language Models through Code-Style Instructions
Yuan Zhang
Xiao Wang
Zhiheng Xi
Han Xia
Tao Gui
Qi Zhang
Xuanjing Huang
95
4
0
26 Feb 2024
LLM Inference Unveiled: Survey and Roofline Model Insights
Zhihang Yuan
Yuzhang Shang
Yang Zhou
Zhen Dong
Zhe Zhou
...
Yong Jae Lee
Yan Yan
Beidi Chen
Guangyu Sun
Kurt Keutzer
247
91
0
26 Feb 2024
An Integrated Data Processing Framework for Pretraining Foundation Models
Yiding Sun
Feng Wang
Yutao Zhu
Wayne Xin Zhao
Jiaxin Mao
149
5
0
26 Feb 2024
CodeS: Towards Building Open-source Language Models for Text-to-SQL
Haoyang Li
Jing Zhang
Hanbing Liu
Ju Fan
Yanling Wang
Jun Zhu
Renjie Wei
Hongyan Pan
Cuiping Li
Hong Chen
ELM AI4TS
120
119
0
26 Feb 2024
DrAttack: Prompt Decomposition and Reconstruction Makes Powerful LLM Jailbreakers
Xirui Li
Ruochen Wang
Minhao Cheng
Tianyi Zhou
Cho-Jui Hsieh
AAML
94
50
0
25 Feb 2024
StochCA: A Novel Approach for Exploiting Pretrained Models with Cross-Attention
SeungWon Seo
Suho Lee
Sangheum Hwang
96
0
0
25 Feb 2024
How Large Language Models Encode Context Knowledge? A Layer-Wise Probing Study
Tianjie Ju
Weiwei Sun
Wei Du
Xinwei Yuan
Zhaochun Ren
Gongshen Liu
KELM
68
33
0
25 Feb 2024
IR2: Information Regularization for Information Retrieval
Jianyou Wang
Kaicheng Wang
Xiaoyue Wang
Weili Cao
R. Paturi
Leon Bergen
127
1
0
25 Feb 2024
Prompt Perturbation Consistency Learning for Robust Language Models
Yao Qiang
Subhrangshu Nandi
Ninareh Mehrabi
Greg Ver Steeg
Anoop Kumar
Anna Rumshisky
Aram Galstyan
135
10
0
24 Feb 2024
Look Before You Leap: Problem Elaboration Prompting Improves Mathematical Reasoning in Large Language Models
Haoran Liao
Jidong Tian
Shaohua Hu
Hao He
Yaohui Jin
ReLM LRM
86
0
0
24 Feb 2024
Hal-Eval: A Universal and Fine-grained Hallucination Evaluation Framework for Large Vision Language Models
Chaoya Jiang
Wei Ye
Mengfan Dong
Hongrui Jia
Haiyang Xu
Mingshi Yan
Ji Zhang
Shikun Zhang
VLM MLLM
120
16
0
24 Feb 2024
MegaScale: Scaling Large Language Model Training to More Than 10,000 GPUs
Ziheng Jiang
Yanghua Peng
Yinmin Zhong
Qi Huang
Yangrui Chen
...
Zhe Li
X. Jia
Jia-jun Ye
Xin Jin
Xin Liu
LRM
129
124
0
23 Feb 2024
How Do Nonlinear Transformers Learn and Generalize in In-Context Learning?
Hongkang Li
Meng Wang
Songtao Lu
Xiaodong Cui
Pin-Yu Chen
MLT
121
18
0
23 Feb 2024
AutoMMLab: Automatically Generating Deployable Models from Language Instructions for Computer Vision Tasks
Zekang Yang
Wang Zeng
Sheng Jin
Chao Qian
Ping Luo
Wentao Liu
MLLM VLM
106
10
0
23 Feb 2024
DEEM: Dynamic Experienced Expert Modeling for Stance Detection
Xiaolong Wang
Yile Wang
Sijie Cheng
Peng Li
Yang Liu
66
9
0
23 Feb 2024
DeMPT: Decoding-enhanced Multi-phase Prompt Tuning for Making LLMs Be Better Context-aware Translators
Xinglin Lyu
Junhui Li
Yanqing Zhao
Min Zhang
Daimeng Wei
Shimin Tao
Hao Yang
Min Zhang
95
5
0
23 Feb 2024
Entity-level Factual Adaptiveness of Fine-tuning based Abstractive Summarization Models
Jongyoon Song
Nohil Park
Bongkyu Hwang
Jaewoong Yun
Seongho Joe
Youngjune Gwon
Sungroh Yoon
KELM HILM
77
1
0
23 Feb 2024
AttributionBench: How Hard is Automatic Attribution Evaluation?
Yifei Li
Xiang Yue
Zeyi Liao
Huan Sun
HILM
89
13
0
23 Feb 2024
Hands-Free VR
J. Fernandez
Jae Joong Lee
Santiago Andrés Serrano Vacca
Alejandra Magana
Bedrich Benes
V. Popescu
53
0
0
23 Feb 2024
Optimizing Language Models for Human Preferences is a Causal Inference Problem
Victoria Lin
Eli Ben-Michael
Louis-Philippe Morency
113
5
0
22 Feb 2024
In-Context Learning of a Linear Transformer Block: Benefits of the MLP Component and One-Step GD Initialization
Ruiqi Zhang
Jingfeng Wu
Peter L. Bartlett
115
16
0
22 Feb 2024
PALO: A Polyglot Large Multimodal Model for 5B People
Muhammad Maaz
H. Rasheed
Abdelrahman M. Shaker
Salman Khan
Hisham Cholakal
Rao M. Anwer
Timothy Baldwin
Michael Felsberg
Fahad S. Khan
VLM LRM
154
15
0
22 Feb 2024
CriticBench: Benchmarking LLMs for Critique-Correct Reasoning
Zicheng Lin
Zhibin Gou
Tian Liang
Ruilin Luo
Haowei Liu
Yujiu Yang
LRM
113
56
0
22 Feb 2024
MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases
Zechun Liu
Changsheng Zhao
Forrest N. Iandola
Chen Lai
Yuandong Tian
...
Ernie Chang
Yangyang Shi
Raghuraman Krishnamoorthi
Liangzhen Lai
Vikas Chandra
ALM
141
103
0
22 Feb 2024
Self-Guided Masked Autoencoders for Domain-Agnostic Self-Supervised Learning
Johnathan Xie
Yoonho Lee
Annie S. Chen
Chelsea Finn
87
3
0
22 Feb 2024
Zero-shot cross-lingual transfer in instruction tuning of large language models
Nadezhda Chirkova
Vassilina Nikoulina
LRM
89
4
0
22 Feb 2024
DualFocus: Integrating Macro and Micro Perspectives in Multi-modal Large Language Models
Yuhang Cao
Pan Zhang
Xiao-wen Dong
Dahua Lin
Jiaqi Wang
82
12
0
22 Feb 2024
Cleaner Pretraining Corpus Curation with Neural Web Scraping
Zhipeng Xu
Zhenghao Liu
Yukun Yan
Zhiyuan Liu
Ge Yu
Chenyan Xiong
CLIP OnRL
118
5
0
22 Feb 2024
LLM-DA: Data Augmentation via Large Language Models for Few-Shot Named Entity Recognition
Junjie Ye
Nuo Xu
Yikun Wang
Jie Zhou
Qi Zhang
Tao Gui
Xuanjing Huang
72
16
0
22 Feb 2024
Towards Robust Instruction Tuning on Multimodal Large Language Models
Wei Han
Hui Chen
Soujanya Poria
MLLM
82
1
0
22 Feb 2024
Vygotsky Distance: Measure for Benchmark Task Similarity
Maxim K. Surkov
Ivan P. Yamshchikov
93
0
0
22 Feb 2024
KoCoSa: Korean Context-aware Sarcasm Detection Dataset
Yumin Kim
Heejae Suh
Mingi Kim
Dongyeon Won
Donghoon Shin
93
1
0
22 Feb 2024
Hint-before-Solving Prompting: Guiding LLMs to Effectively Utilize Encoded Knowledge
Jinlan Fu
Shenzhen Huangfu
Hang Yan
See-Kiong Ng
Xipeng Qiu
LRM
96
8
0
22 Feb 2024
Qsnail: A Questionnaire Dataset for Sequential Question Generation
Yan Lei
Liang Pang
Yuanzhuo Wang
Huawei Shen
Xueqi Cheng
60
0
0
22 Feb 2024
Word-Sequence Entropy: Towards Uncertainty Estimation in Free-Form Medical Question Answering Applications and Beyond
Zhiyuan Wang
Jinhao Duan
Chenxi Yuan
Qingyu Chen
Tianlong Chen
Huaxiu Yao
Yue Zhang
Ren Wang
Kaidi Xu
Xiaoshuang Shi
UQLM
185
13
0
22 Feb 2024
LLMs with Industrial Lens: Deciphering the Challenges and Prospects -- A Survey
Ashok Urlana
Charaka Vinayak Kumar
Ajeet Kumar Singh
B. Garlapati
S. Chalamala
Rahul Mishra
126
8
0
22 Feb 2024
Automatic Histograms: Leveraging Language Models for Text Dataset Exploration
Emily Reif
Crystal Qian
James Wexler
Minsuk Kahng
58
12
0
21 Feb 2024
Beyond Probabilities: Unveiling the Misalignment in Evaluating Large Language Models
Chenyang Lyu
Minghao Wu
Alham Fikri Aji
ELM
66
14
0
21 Feb 2024
Self-Distillation Bridges Distribution Gap in Language Model Fine-Tuning
Zhaorui Yang
Tianyu Pang
Hao Feng
Han Wang
Wei Chen
Minfeng Zhu
Qian Liu
ALM
101
50
0
21 Feb 2024
Infrastructure Ombudsman: Mining Future Failure Concerns from Structural Disaster Response
Towhid Chowdhury
Soumyajit Datta
Naveen Sharma
Ashiqur R. KhudaBukhsh
AI4CE
86
4
0
21 Feb 2024
RecMind: Japanese Movie Recommendation Dialogue with Seeker's Internal State
Takashi Kodama
Hirokazu Kiyomaru
Yin Jou Huang
Sadao Kurohashi
60
0
0
21 Feb 2024
ProSparse: Introducing and Enhancing Intrinsic Activation Sparsity within Large Language Models
Chenyang Song
Xu Han
Zhengyan Zhang
Shengding Hu
Xiyu Shi
...
Chen Chen
Zhiyuan Liu
Guanglin Li
Tao Yang
Maosong Sun
167
32
0
21 Feb 2024
From Self-Attention to Markov Models: Unveiling the Dynamics of Generative Transformers
M. E. Ildiz
Yixiao Huang
Yingcong Li
A. S. Rawat
Samet Oymak
90
23
0
21 Feb 2024
GradSafe: Detecting Jailbreak Prompts for LLMs via Safety-Critical Gradient Analysis
Yueqi Xie
Minghong Fang
Renjie Pi
Neil Zhenqiang Gong
117
36
0
21 Feb 2024
Healthcare Copilot: Eliciting the Power of General LLMs for Medical Consultation
Zhiyao Ren
Yibing Zhan
Baosheng Yu
Liang Ding
Dacheng Tao
LM&MA
65
14
0
20 Feb 2024
Transformer tricks: Precomputing the first layer
Nils Graef
MoE
67
4
0
20 Feb 2024
ChatEL: Entity Linking with Chatbots
Yifan Ding
Qingkai Zeng
Tim Weninger
KELM
93
6
0
20 Feb 2024