ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2010.14701
  4. Cited By
Scaling Laws for Autoregressive Generative Modeling

Scaling Laws for Autoregressive Generative Modeling

28 October 2020
T. Henighan
Jared Kaplan
Mor Katz
Mark Chen
Christopher Hesse
Jacob Jackson
Heewoo Jun
Tom B. Brown
Prafulla Dhariwal
Scott Gray
Chris Hallacy
Benjamin Mann
Alec Radford
Aditya A. Ramesh
Nick Ryder
Daniel M. Ziegler
John Schulman
Dario Amodei
Sam McCandlish
ArXivPDFHTML

Papers citing "Scaling Laws for Autoregressive Generative Modeling"

50 / 310 papers shown
Title
Survey for Landing Generative AI in Social and E-commerce Recsys -- the
  Industry Perspectives
Survey for Landing Generative AI in Social and E-commerce Recsys -- the Industry Perspectives
Da Xu
Danqing Zhang
Guangyu Yang
Bo Yang
Shuyuan Xu
Lingling Zheng
Cindy Liang
32
2
0
10 Jun 2024
Compute Better Spent: Replacing Dense Layers with Structured Matrices
Compute Better Spent: Replacing Dense Layers with Structured Matrices
Shikai Qiu
Andres Potapczynski
Marc Finzi
Micah Goldblum
Andrew Gordon Wilson
40
11
0
10 Jun 2024
Scaling and evaluating sparse autoencoders
Scaling and evaluating sparse autoencoders
Leo Gao
Tom Dupré la Tour
Henk Tillman
Gabriel Goh
Rajan Troll
Alec Radford
Ilya Sutskever
Jan Leike
Jeffrey Wu
38
118
0
06 Jun 2024
Xmodel-LM Technical Report
Xmodel-LM Technical Report
Yichuan Wang
Yang Liu
Yu Yan
Qun Wang
Xucheng Huang
Ling Jiang
OSLM
ALM
35
1
0
05 Jun 2024
SAVA: Scalable Learning-Agnostic Data Valuation
SAVA: Scalable Learning-Agnostic Data Valuation
Samuel Kessler
Tam Le
Vu Nguyen
TDI
61
0
0
03 Jun 2024
LLMs Could Autonomously Learn Without External Supervision
LLMs Could Autonomously Learn Without External Supervision
Ke Ji
Junying Chen
Anningzhe Gao
Wenya Xie
Xiang Wan
Benyou Wang
37
4
0
02 Jun 2024
Quest: Query-centric Data Synthesis Approach for Long-context Scaling of
  Large Language Model
Quest: Query-centric Data Synthesis Approach for Long-context Scaling of Large Language Model
Chaochen Gao
Xing Wu
Qingfang Fu
Songlin Hu
SyDa
34
5
0
30 May 2024
LLMs achieve adult human performance on higher-order theory of mind
  tasks
LLMs achieve adult human performance on higher-order theory of mind tasks
Winnie Street
John Oliver Siy
Geoff Keeling
Adrien Baranes
Benjamin Barnett
Michael McKibben
Tatenda Kanyere
Alison Lentz
Blaise Agüera y Arcas
Robin I. M. Dunbar
LRM
51
33
0
29 May 2024
CLAQ: Pushing the Limits of Low-Bit Post-Training Quantization for LLMs
CLAQ: Pushing the Limits of Low-Bit Post-Training Quantization for LLMs
Haoyu Wang
Bei Liu
Hang Shao
Bo Xiao
Ke Zeng
Guanglu Wan
Yanmin Qian
MQ
31
0
0
27 May 2024
The Scaling Law in Stellar Light Curves
The Scaling Law in Stellar Light Curves
Jiashu Pan
Yuan-Sen Ting
Yang Huang
Jie Yu
Ji-Feng Liu
21
0
0
27 May 2024
Phase Transitions in the Output Distribution of Large Language Models
Phase Transitions in the Output Distribution of Large Language Models
Julian Arnold
Flemming Holtorf
Frank Schafer
Niels Lörch
41
1
0
27 May 2024
Bridging The Gap between Low-rank and Orthogonal Adaptation via
  Householder Reflection Adaptation
Bridging The Gap between Low-rank and Orthogonal Adaptation via Householder Reflection Adaptation
Shen Yuan
Haotian Liu
Hongteng Xu
44
2
0
24 May 2024
AstroPT: Scaling Large Observation Models for Astronomy
AstroPT: Scaling Large Observation Models for Astronomy
Michael J. Smith
Ryan J. Roberts
E. Angeloudi
M. Huertas-Company
46
1
0
23 May 2024
Scaling-laws for Large Time-series Models
Scaling-laws for Large Time-series Models
Thomas D. P. Edwards
James Alvey
Justin Alsing
Nam H. Nguyen
Benjamin Dan Wandelt
AI4TS
AIFin
36
7
0
22 May 2024
Large Language Models are Effective Priors for Causal Graph Discovery
Large Language Models are Effective Priors for Causal Graph Discovery
Victor-Alexandru Darvariu
Stephen Hailes
Mirco Musolesi
45
8
0
22 May 2024
A Survey of Robotic Language Grounding: Tradeoffs between Symbols and
  Embeddings
A Survey of Robotic Language Grounding: Tradeoffs between Symbols and Embeddings
Vanya Cohen
J. Liu
Raymond J. Mooney
Stefanie Tellex
David Watkins
LM&Ro
43
12
0
21 May 2024
OpenCarbonEval: A Unified Carbon Emission Estimation Framework in
  Large-Scale AI Models
OpenCarbonEval: A Unified Carbon Emission Estimation Framework in Large-Scale AI Models
Zhaojian Yu
Yinghao Wu
Zhuotao Deng
Yansong Tang
Xiao-Ping Zhang
52
2
0
21 May 2024
Separable Power of Classical and Quantum Learning Protocols Through the
  Lens of No-Free-Lunch Theorem
Separable Power of Classical and Quantum Learning Protocols Through the Lens of No-Free-Lunch Theorem
Xinbiao Wang
Yuxuan Du
Kecheng Liu
Yong Luo
Bo Du
Dacheng Tao
30
1
0
12 May 2024
Fake it to make it: Using synthetic data to remedy the data shortage in
  joint multimodal speech-and-gesture synthesis
Fake it to make it: Using synthetic data to remedy the data shortage in joint multimodal speech-and-gesture synthesis
Shivam Mehta
Anna Deichler
Jim O'Regan
Birger Moëll
Jonas Beskow
G. Henter
Simon Alexanderson
46
4
0
30 Apr 2024
KAN: Kolmogorov-Arnold Networks
KAN: Kolmogorov-Arnold Networks
Ziming Liu
Yixuan Wang
Sachin Vaidya
Fabian Ruehle
James Halverson
Marin Soljacic
Thomas Y. Hou
Max Tegmark
98
475
0
30 Apr 2024
Temporal Scaling Law for Large Language Models
Temporal Scaling Law for Large Language Models
Yizhe Xiong
Xiansheng Chen
Xin Ye
Hui Chen
Zijia Lin
...
Zhenpeng Su
Wei Huang
Jianwei Niu
J. Han
Guiguang Ding
43
9
0
27 Apr 2024
Tele-FLM Technical Report
Tele-FLM Technical Report
Xiang Li
Yiqun Yao
Xin Jiang
Xuezhi Fang
Chao Wang
...
Yequan Wang
Zhongjiang He
Zhongyuan Wang
Xuelong Li
Tiejun Huang
38
3
0
25 Apr 2024
Understanding Optimal Feature Transfer via a Fine-Grained Bias-Variance Analysis
Understanding Optimal Feature Transfer via a Fine-Grained Bias-Variance Analysis
Yufan Li
Subhabrata Sen
Ben Adlam
MLT
51
1
0
18 Apr 2024
MuPT: A Generative Symbolic Music Pretrained Transformer
MuPT: A Generative Symbolic Music Pretrained Transformer
Xingwei Qu
Yuelin Bai
Yi Ma
Ziya Zhou
Ka Man Lo
...
Xu Tan
Stephen W. Huang
Wenhu Chen
Jie Fu
Ge Zhang
57
10
0
09 Apr 2024
Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale
  Prediction
Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction
Keyu Tian
Yi-Xin Jiang
Zehuan Yuan
Bingyue Peng
Liwei Wang
VGen
42
250
0
03 Apr 2024
The Fine Line: Navigating Large Language Model Pretraining with
  Down-streaming Capability Analysis
The Fine Line: Navigating Large Language Model Pretraining with Down-streaming Capability Analysis
Chen Yang
Junzhuo Li
Xinyao Niu
Xinrun Du
Songyang Gao
...
Stephen W. Huang
Shawn Yue
Wenhu Chen
Jie Fu
Ge Zhang
43
2
0
01 Apr 2024
Global Vegetation Modeling with Pre-Trained Weather Transformers
Global Vegetation Modeling with Pre-Trained Weather Transformers
Pascal Janetzky
Florian Gallusser
Simon Hentschel
Andreas Hotho
Anna Krause
26
0
0
27 Mar 2024
Data Mixing Laws: Optimizing Data Mixtures by Predicting Language Modeling Performance
Data Mixing Laws: Optimizing Data Mixtures by Predicting Language Modeling Performance
Jiasheng Ye
Peiju Liu
Tianxiang Sun
Yunhua Zhou
Jun Zhan
Xipeng Qiu
55
64
0
25 Mar 2024
Understanding Emergent Abilities of Language Models from the Loss Perspective
Understanding Emergent Abilities of Language Models from the Loss Perspective
Zhengxiao Du
Aohan Zeng
Yuxiao Dong
Jie Tang
UQCV
LRM
70
46
0
23 Mar 2024
MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training
MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training
Brandon McKinzie
Zhe Gan
J. Fauconnier
Sam Dodge
Bowen Zhang
...
Zirui Wang
Ruoming Pang
Peter Grasch
Alexander Toshev
Yinfei Yang
MLLM
43
187
0
14 Mar 2024
Language models scale reliably with over-training and on downstream
  tasks
Language models scale reliably with over-training and on downstream tasks
S. Gadre
Georgios Smyrnis
Vaishaal Shankar
Suchin Gururangan
Mitchell Wortsman
...
Y. Carmon
Achal Dave
Reinhard Heckel
Niklas Muennighoff
Ludwig Schmidt
ALM
ELM
LRM
108
40
0
13 Mar 2024
In-context learning enables multimodal large language models to classify
  cancer pathology images
In-context learning enables multimodal large language models to classify cancer pathology images
Dyke Ferber
Georg Wolflein
Isabella C. Wiest
M. Ligero
Srividhya Sainath
...
Omar S. M. El Nahhas
Gustav Muller-Franzes
Dirk Jager
Daniel Truhn
Jakob Nikolas Kather
VLM
MedIm
19
28
0
12 Mar 2024
Unraveling the Mystery of Scaling Laws: Part I
Unraveling the Mystery of Scaling Laws: Part I
Hui Su
Zhi Tian
Xiaoyu Shen
Xunliang Cai
33
19
0
11 Mar 2024
Yi: Open Foundation Models by 01.AI
Yi: Open Foundation Models by 01.AI
01. AI
Alex Young
01.AI Alex Young
Bei Chen
Chao Li
...
Yue Wang
Yuxuan Cai
Zhenyu Gu
Zhiyuan Liu
Zonghong Dai
OSLM
LRM
144
502
0
07 Mar 2024
On the Challenges and Opportunities in Generative AI
On the Challenges and Opportunities in Generative AI
Laura Manduchi
Kushagra Pandey
Robert Bamler
Ryan Cotterell
Sina Daubener
...
F. Wenzel
Frank Wood
Stephan Mandt
Vincent Fortuin
Vincent Fortuin
56
17
0
28 Feb 2024
When Scaling Meets LLM Finetuning: The Effect of Data, Model and
  Finetuning Method
When Scaling Meets LLM Finetuning: The Effect of Data, Model and Finetuning Method
Biao Zhang
Zhongtao Liu
Colin Cherry
Orhan Firat
LRM
63
126
0
27 Feb 2024
Scaling Laws for Fine-Grained Mixture of Experts
Scaling Laws for Fine-Grained Mixture of Experts
Jakub Krajewski
Jan Ludziejewski
Kamil Adamczewski
Maciej Pióro
Michal Krutul
...
Krystian Król
Tomasz Odrzygó'zd'z
Piotr Sankowski
Marek Cygan
Sebastian Jaszczur
MoE
51
54
0
12 Feb 2024
A Tale of Tails: Model Collapse as a Change of Scaling Laws
A Tale of Tails: Model Collapse as a Change of Scaling Laws
Elvis Dohmatob
Yunzhen Feng
Pu Yang
Francois Charton
Julia Kempe
29
64
0
10 Feb 2024
Offline Actor-Critic Reinforcement Learning Scales to Large Models
Offline Actor-Critic Reinforcement Learning Scales to Large Models
Jost Tobias Springenberg
A. Abdolmaleki
Jingwei Zhang
Oliver Groth
Michael Bloesch
...
Sarah Bechtle
Steven Kapturowski
Roland Hafner
N. Heess
Martin Riedmiller
OffRL
LRM
27
12
0
08 Feb 2024
Towards Understanding Inductive Bias in Transformers: A View From
  Infinity
Towards Understanding Inductive Bias in Transformers: A View From Infinity
Itay Lavie
Guy Gur-Ari
Zohar Ringel
37
1
0
07 Feb 2024
A Resource Model For Neural Scaling Law
A Resource Model For Neural Scaling Law
Jinyeop Song
Ziming Liu
Max Tegmark
Jeff Gore
96
4
0
07 Feb 2024
Scaling laws for learning with real and surrogate data
Scaling laws for learning with real and surrogate data
Ayush Jain
Andrea Montanari
Eren Sasoglu
40
12
0
06 Feb 2024
Selecting Large Language Model to Fine-tune via Rectified Scaling Law
Selecting Large Language Model to Fine-tune via Rectified Scaling Law
Haowei Lin
Baizhou Huang
Haotian Ye
Qinyu Chen
Zihao Wang
Sujian Li
Jianzhu Ma
Xiaojun Wan
James Zou
Yitao Liang
90
20
0
04 Feb 2024
Scaling Laws for Forgetting When Fine-Tuning Large Language Models
Scaling Laws for Forgetting When Fine-Tuning Large Language Models
Damjan Kalajdzievski
CLL
39
9
0
11 Jan 2024
DeepSeek LLM: Scaling Open-Source Language Models with Longtermism
DeepSeek LLM: Scaling Open-Source Language Models with Longtermism
DeepSeek-AI Xiao Bi
:
Xiao Bi
Deli Chen
Guanting Chen
...
Yao Zhao
Shangyan Zhou
Shunfeng Zhou
Qihao Zhu
Yuheng Zou
LRM
ALM
139
309
0
05 Jan 2024
Astraios: Parameter-Efficient Instruction Tuning Code Large Language
  Models
Astraios: Parameter-Efficient Instruction Tuning Code Large Language Models
Terry Yue Zhuo
A. Zebaze
Nitchakarn Suppattarachai
Leandro von Werra
H. D. Vries
Qian Liu
Niklas Muennighoff
ALM
41
15
0
01 Jan 2024
Beyond Chinchilla-Optimal: Accounting for Inference in Language Model Scaling Laws
Beyond Chinchilla-Optimal: Accounting for Inference in Language Model Scaling Laws
Nikhil Sardana
Jacob P. Portes
Sasha Doubov
Jonathan Frankle
LRM
240
69
0
31 Dec 2023
An Empirical Study of Scaling Law for OCR
An Empirical Study of Scaling Law for OCR
Miao Rang
Zhenni Bi
Chuanjian Liu
Yunhe Wang
Kai Han
41
6
0
29 Dec 2023
Gemini Pro Defeated by GPT-4V: Evidence from Education
Gemini Pro Defeated by GPT-4V: Evidence from Education
Gyeong-Geon Lee
Ehsan Latif
Lehong Shi
Xiaoming Zhai
31
22
0
27 Dec 2023
Tell, don't show: Declarative facts influence how LLMs generalize
Tell, don't show: Declarative facts influence how LLMs generalize
Alexander Meinke
Owain Evans
26
7
0
12 Dec 2023
Previous
1234567
Next