Revisiting Neural Scaling Laws in Language and Vision

13 September 2022

Papers citing "Revisiting Neural Scaling Laws in Language and Vision"

29 / 29 papers shown

Title
ORXE: Orchestrating Experts for Dynamically Configurable Efficiency Qingyuan Wang Guoxin Wang B. Cardiff Deepu John 38 0 0 07 May 2025
Scaling Laws For Scalable Oversight Joshua Engels David D. Baek Subhash Kantamneni Max Tegmark ELM 72 0 0 25 Apr 2025
MaskAttn-UNet: A Mask Attention-Driven Framework for Universal Low-Resolution Image Segmentation Anzhe Cheng Chenzhong Yin Yu Chang Heng Ping Shixuan Li Shahin Nazarian Paul Bogdan SSeg 86 0 0 11 Mar 2025
Shh, don't say that! Domain Certification in LLMs Cornelius Emde Alasdair Paren Preetham Arvind Maxime Kayser Tom Rainforth Thomas Lukasiewicz Bernard Ghanem Philip H. S. Torr Adel Bibi 50 1 0 26 Feb 2025
TimeHF: Billion-Scale Time Series Models Guided by Human Feedback Yongzhi Qi Hao Hu Dazhou Lei Jianshen Zhang Zhengxin Shi Yulin Huang Zhengyu Chen Xiaoming Lin Zuo-jun Shen AI4TS AI4CE 41 1 0 28 Jan 2025
Next Patch Prediction for Autoregressive Visual Generation Yatian Pang Peng Jin Shuo Yang Bin Lin Bin Zhu ... Liuhan Chen Francis E. H. Tay Ser-Nam Lim Harry Yang Li Yuan 120 8 0 19 Dec 2024
Scaling laws for post-training quantized large language models Zifei Xu Alexander Lan W. Yazar T. Webb Sayeh Sharify Xin Eric Wang MQ 28 0 0 15 Oct 2024
Scaling Laws for Predicting Downstream Performance in LLMs Yangyi Chen Binxuan Huang Yifan Gao Zhengyang Wang Jingfeng Yang Heng Ji LRM 43 8 0 11 Oct 2024
Towards Generalisable Time Series Understanding Across Domains Özgün Turgut Philip Muller M. Menten Daniel Rueckert AI4TS 48 1 0 09 Oct 2024
Time-MoE: Billion-Scale Time Series Foundation Models with Mixture of Experts X. Shi Shiyu Wang Yuqi Nie Dianqi Li Zhou Ye Qingsong Wen Ming Jin AI4TS 36 26 0 24 Sep 2024
Breaking Neural Network Scaling Laws with Modularity Akhilan Boopathy Sunshine Jiang William Yue Jaedong Hwang Abhiram Iyer Ila Fiete OOD 39 2 0 09 Sep 2024
DeepGate3: Towards Scalable Circuit Representation Learning Zhengyuan Shi Ziyang Zheng Sadaf Khan Jianyuan Zhong Min Li Qiang Xu GNN AI4CE 36 8 0 15 Jul 2024
Scaling Laws in Linear Regression: Compute, Parameters, and Data Licong Lin Jingfeng Wu Sham Kakade Peter L. Bartlett Jason D. Lee LRM 35 15 0 12 Jun 2024
Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation Peize Sun Yi Jiang Shoufa Chen Shilong Zhang Bingyue Peng Ping Luo Zehuan Yuan VLM 66 222 0 10 Jun 2024
Data Mixing Laws: Optimizing Data Mixtures by Predicting Language Modeling Performance Jiasheng Ye Peiju Liu Tianxiang Sun Yunhua Zhou Jun Zhan Xipeng Qiu 42 62 0 25 Mar 2024
Is Adversarial Training with Compressed Datasets Effective? Tong Chen Raghavendra Selvan AAML 52 0 0 08 Feb 2024
End-to-End Temporal Action Detection with 1B Parameters Across 1000 Frames Shuming Liu Chen-Da Liu-Zhang Chen Zhao Bernard Ghanem 33 25 0 28 Nov 2023
The Universal Statistical Structure and Scaling Laws of Chaos and Turbulence Noam Levi Yaron Oz AI4CE 24 1 0 02 Nov 2023
No Train No Gain: Revisiting Efficient Training Algorithms For Transformer-based Language Models Jean Kaddour Oscar Key Piotr Nawrot Pasquale Minervini Matt J. Kusner 20 41 0 12 Jul 2023
Performance Scaling via Optimal Transport: Enabling Data Selection from Partially Revealed Sources Feiyang Kang H. Just Anit Kumar Sahu R. Jia 53 10 0 05 Jul 2023
Federated Conformal Predictors for Distributed Uncertainty Quantification Charles Lu Yaodong Yu Sai Praneeth Karimireddy Michael I. Jordan Ramesh Raskar FedML 34 21 0 27 May 2023
The Framework Tax: Disparities Between Inference Efficiency in NLP Research and Deployment Jared Fernandez Jacob Kahn Clara Na Yonatan Bisk Emma Strubell FedML 28 10 0 13 Feb 2023
Generalization on the Unseen, Logic Reasoning and Degree Curriculum Emmanuel Abbe Samy Bengio Aryo Lotfi Kevin Rizk LRM 36 48 0 30 Jan 2023
Broken Neural Scaling Laws Ethan Caballero Kshitij Gupta Irina Rish David M. Krueger 19 74 0 26 Oct 2022
Transfer Learning with Pretrained Remote Sensing Transformers A. Fuller K. Millard J.R. Green 27 11 0 28 Sep 2022
ResNet strikes back: An improved training procedure in timm Ross Wightman Hugo Touvron Hervé Jégou AI4TS 209 487 0 01 Oct 2021
MLP-Mixer: An all-MLP Architecture for Vision Ilya O. Tolstikhin N. Houlsby Alexander Kolesnikov Lucas Beyer Xiaohua Zhai ... Andreas Steiner Daniel Keysers Jakob Uszkoreit Mario Lucic Alexey Dosovitskiy 271 2,603 0 04 May 2021
Learning Curve Theory Marcus Hutter 132 58 0 08 Feb 2021
Scaling Laws for Neural Language Models Jared Kaplan Sam McCandlish T. Henighan Tom B. Brown B. Chess R. Child Scott Gray Alec Radford Jeff Wu Dario Amodei 231 4,460 0 23 Jan 2020