Scaling Laws for Autoregressive Generative Modeling

28 October 2020
T. Henighan
Jared Kaplan
Mor Katz
Mark Chen
Christopher Hesse
Jacob Jackson
Heewoo Jun
Tom B. Brown
Prafulla Dhariwal
Scott Gray
Chris Hallacy
Benjamin Mann
Alec Radford
Aditya A. Ramesh
Nick Ryder
Daniel M. Ziegler
John Schulman
Dario Amodei
Sam McCandlish
arXiv: 2010.14701

Papers citing "Scaling Laws for Autoregressive Generative Modeling"

50 / 310 papers shown
Title
Predictable Emergent Abilities of LLMs: Proxy Tasks Are All You Need
Bo Zhang
Yan Yan
Boxiang Yang
Yifei Xue
Guang Liu
LRM
76
1
0
10 Dec 2024
Scaling Particle Collision Data Analysis
Hengkui Wu
Panpan Chi
Yongfeng Zhu
Liujiang Liu
Shuyang Hu
...
Yingsi Xin
Bruce Liu
Dahao Liang
Xiaojun Jia
Manqi Ruan
79
0
0
28 Nov 2024
Predicting Emergent Capabilities by Finetuning
Charlie Snell
Eric Wallace
Dan Klein
Sergey Levine
ELM
LRM
84
5
0
25 Nov 2024
Autoregressive Models in Vision: A Survey
Jing Xiong
Gongye Liu
Lun Huang
Chengyue Wu
Taiqiang Wu
...
Hao Fei
Guillermo Sapiro
Jiebo Luo
Ping Luo
Ngai Wong
VGen
48
9
0
08 Nov 2024
Scaling Laws for Pre-training Agents and World Models
Tim Pearce
Tabish Rashid
Dave Bignell
Raluca Georgescu
Sam Devlin
Katja Hofmann
LM&Ro
42
6
0
07 Nov 2024
Training Compute-Optimal Protein Language Models
Xingyi Cheng
Bo Chen
Pan Li
Jing Gong
Jie Tang
Le Song
84
13
0
04 Nov 2024
Does equivariance matter at scale?
Johann Brehmer
S. Behrends
P. D. Haan
Taco S. Cohen
49
11
0
30 Oct 2024
How Does Critical Batch Size Scale in Pre-training?
Hanlin Zhang
Depen Morwani
Nikhil Vyas
Jingfeng Wu
Difan Zou
Udaya Ghai
Dean Phillips Foster
Sham Kakade
80
8
0
29 Oct 2024
LARP: Tokenizing Videos with a Learned Autoregressive Generative Prior
Hanyu Wang
Saksham Suri
Yixuan Ren
Hao Chen
Abhinav Shrivastava
VGen
31
10
0
28 Oct 2024
Meta-Learning for Speeding Up Large Model Inference in Decentralized Environments
Yuzhe Yang
Yipeng Du
Ahmad Farhan
Claudio Angione
Yue Zhao
Harry Yang
Fielding Johnston
James Buban
Patrick Colangelo
29
0
0
28 Oct 2024
A Simple Model of Inference Scaling Laws
Noam Levi
LRM
32
6
0
21 Oct 2024
Elucidating the design space of language models for image generation
Xuantong Liu
Shaozhe Hao
Xianbiao Qi
Tianyang Hu
Jun Wang
Rong Xiao
Yuan Yao
VLM
32
3
0
21 Oct 2024
Fluid: Scaling Autoregressive Text-to-image Generative Models with Continuous Tokens
Lijie Fan
Tianhong Li
Siyang Qin
Yuanzhen Li
Chen Sun
Michael Rubinstein
Deqing Sun
Kaiming He
Yonglong Tian
VLM
DiffM
48
42
0
17 Oct 2024
A Comparative Study on Reasoning Patterns of OpenAI's o1 Model
Siwei Wu
Zhongyuan Peng
Xinrun Du
Tuney Zheng
Minghao Liu
...
Zhaoxiang Zhang
Wenhao Huang
Ge Zhang
Chenghua Lin
J. H. Liu
ELM
LLMAG
LRM
AI4CE
40
30
0
17 Oct 2024
Self-Comparison for Dataset-Level Membership Inference in Large (Vision-)Language Models
J. Ren
Kangrui Chen
Chen Chen
Vikash Sehwag
Yue Xing
Jiliang Tang
Lingjuan Lyu
32
1
0
16 Oct 2024
Towards Neural Scaling Laws for Time Series Foundation Models
Qingren Yao
Chao-Han Huck Yang
Renhe Jiang
Keli Zhang
Ming Jin
Shirui Pan
AI4TS
AI4CE
47
7
0
16 Oct 2024
Scaling laws for post-training quantized large language models
Zifei Xu
Alexander Lan
W. Yazar
T. Webb
Sayeh Sharify
Xin Wang
MQ
30
0
0
15 Oct 2024
Scaling Laws for Multilingual Language Models
Yifei He
Alon Benhaim
Barun Patra
Praneetha Vaddamanu
Sanchit Ahuja
Parul Chopra
Vishrav Chaudhary
Han Zhao
Xia Song
36
4
0
15 Oct 2024
Universal scaling laws in quantum-probabilistic machine learning by tensor network towards interpreting representation and generalization powers
Sheng-Chen Bai
Shi-Ju Ran
64
1
0
13 Oct 2024
Scaling Laws for Predicting Downstream Performance in LLMs
Yangyi Chen
Binxuan Huang
Yifan Gao
Zhengyang Wang
Jingfeng Yang
Heng Ji
LRM
53
9
0
11 Oct 2024
Scaling Laws For Diffusion Transformers
Zhengyang Liang
Hao He
Ceyuan Yang
Bo Dai
35
9
0
10 Oct 2024
CAR: Controllable Autoregressive Modeling for Visual Generation
Ziyu Yao
Jialin Li
Yifeng Zhou
Yong Liu
Xi Jiang
Chengjie Wang
Feng Zheng
Yuexian Zou
Lei Li
DiffM
37
13
0
07 Oct 2024
No Need to Talk: Asynchronous Mixture of Language Models
Anastasiia Filippova
Angelos Katharopoulos
David Grangier
Ronan Collobert
MoE
44
0
0
04 Oct 2024
Searching for Efficient Linear Layers over a Continuous Space of Structured Matrices
Andres Potapczynski
Shikai Qiu
Marc Finzi
Christopher Ferri
Zixi Chen
Micah Goldblum
Bayan Bruss
Christopher De Sa
Andrew Gordon Wilson
45
1
0
03 Oct 2024
Scaling Proprioceptive-Visual Learning with Heterogeneous Pre-trained Transformers
Lirui Wang
Xinlei Chen
Jialiang Zhao
Kaiming He
36
34
0
30 Sep 2024
A method for identifying causality in the response of nonlinear dynamical systems
Joseph Massingham
Ole Nielsen
Tore Butlin
CML
24
0
0
26 Sep 2024
BeanCounter: A low-toxicity, large-scale, and open dataset of business-oriented text
Siyan Wang
Bradford Levy
34
2
0
26 Sep 2024
VLM's Eye Examination: Instruct and Inspect Visual Competency of Vision Language Models
Nam Hyeon-Woo
Moon Ye-Bin
Wonseok Choi
Lee Hyun
Tae-Hyun Oh
CoGe
28
3
0
23 Sep 2024
An overview of domain-specific foundation model: key technologies, applications and challenges
Haolong Chen
Hanzhi Chen
Zijian Zhao
Kaifeng Han
Guangxu Zhu
Yichen Zhao
Ying Du
Wei Xu
Qingjiang Shi
ALM
VLM
66
4
0
06 Sep 2024
AgentMonitor: A Plug-and-Play Framework for Predictive and Secure Multi-Agent Systems
Chi-Min Chan
Jianxuan Yu
Weize Chen
Chunyang Jiang
Xinyu Liu
Weijie Shi
Zhiyuan Liu
Wei Xue
Yike Guo
LLMAG
46
0
0
27 Aug 2024
Do Neural Scaling Laws Exist on Graph Self-Supervised Learning?
Qian Ma
Haitao Mao
Jingzhe Liu
Zhehua Zhang
Chunlin Feng
Yu Song
Yihan Shao
Yao Ma
40
3
0
20 Aug 2024
Scaling Law with Learning Rate Annealing
Howe Tissue
Venus Wang
Lu Wang
26
7
0
20 Aug 2024
ScalingFilter: Assessing Data Quality through Inverse Utilization of Scaling Laws
Ruihang Li
Yixuan Wei
Miaosen Zhang
Nenghai Yu
Han Hu
Houwen Peng
50
2
0
15 Aug 2024
Understanding the Interplay of Scale, Data, and Bias in Language Models: A Case Study with BERT
Muhammad Ali
Swetasudha Panda
Qinlan Shen
Michael Wick
Ari Kobren
MILM
36
3
0
25 Jul 2024
PEFT-U: Parameter-Efficient Fine-Tuning for User Personalization
Christopher Clarke
Yuzhao Heng
Lingjia Tang
Jason Mars
17
3
0
25 Jul 2024
Scaling Training Data with Lossy Image Compression
Katherine L. Mentzer
Andrea Montanari
36
0
0
25 Jul 2024
CMR Scaling Law: Predicting Critical Mixture Ratios for Continual Pre-training of Language Models
Jiawei Gu
Zacc Yang
Chuanghao Ding
Rui Zhao
Fei Tan
CLL
47
4
0
24 Jul 2024
Rethinking Learned Image Compression: Context is All You Need
Jixiang Luo
34
0
0
16 Jul 2024
Optical Diffusion Models for Image Generation
Ilker Oguz
Niyazi Ulaş Dinç
Mustafa Yildirim
Junjie Ke
Innfarn Yoo
Qifei Wang
Feng Yang
Christophe Moser
D. Psaltis
40
0
0
15 Jul 2024
52B to 1T: Lessons Learned via Tele-FLM Series
Xiang Li
Yiqun Yao
Xin Jiang
Xuezhi Fang
Chao Wang
...
Yequan Wang
Zhongjiang He
Zhongyuan Wang
Xuelong Li
Tiejun Huang
ALM
LRM
47
3
0
03 Jul 2024
Collaborative Performance Prediction for Large Language Models
Qiyuan Zhang
Fuyuan Lyu
Xue Liu
Chen Ma
32
3
0
01 Jul 2024
Resolving Discrepancies in Compute-Optimal Scaling of Language Models
Tomer Porian
Mitchell Wortsman
J. Jitsev
Ludwig Schmidt
Y. Carmon
60
20
0
27 Jun 2024
Scaling Laws for Linear Complexity Language Models
Xuyang Shen
Dong Li
Ruitao Leng
Zhen Qin
Weigao Sun
Yiran Zhong
LRM
33
6
0
24 Jun 2024
Towards a Science Exocortex
Kevin G. Yager
80
0
0
24 Jun 2024
Optimizing Psychological Counseling with Instruction-Tuned Large Language Models
Wenjie Li
Tianyu Sun
Kun Qian
Wenhong Wang
LM&MA
41
1
0
19 Jun 2024
OmniTokenizer: A Joint Image-Video Tokenizer for Visual Generation
Junke Wang
Yi-Xin Jiang
Zehuan Yuan
Binyue Peng
Zuxuan Wu
Yu-Gang Jiang
ViT
VGen
80
38
0
13 Jun 2024
Towards an Improved Understanding and Utilization of Maximum Manifold Capacity Representations
Rylan Schaeffer
Victor Lecomte
Dhruv Pai
Andres Carranza
Berivan Isik
...
Yann LeCun
SueYeon Chung
Andrey Gromov
Ravid Shwartz-Ziv
Sanmi Koyejo
49
6
0
13 Jun 2024
Scaling Laws in Linear Regression: Compute, Parameters, and Data
Licong Lin
Jingfeng Wu
Sham Kakade
Peter L. Bartlett
Jason D. Lee
LRM
44
15
0
12 Jun 2024
Reconciling Kaplan and Chinchilla Scaling Laws
Tim Pearce
Jinyeop Song
34
8
0
12 Jun 2024
Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation
Peize Sun
Yi Jiang
Shoufa Chen
Shilong Zhang
Bingyue Peng
Ping Luo
Zehuan Yuan
VLM
66
229
0
10 Jun 2024