ResearchTrend.AI
The Road Less Scheduled
arXiv: 2405.15682

24 May 2024
Aaron Defazio
Xingyu Yang
Harsh Mehta
Konstantin Mishchenko
Ahmed Khaled
Ashok Cutkosky

Papers citing "The Road Less Scheduled"

34 / 34 papers shown
Deep Learning-Based Robust Optical Guidance for Hypersonic Platforms
Adrien Chan-Hon-Tong
A. Plyer
Baptiste Cadalen
Laurent Serre
58
0
0
09 May 2025
S-EO: A Large-Scale Dataset for Geometry-Aware Shadow Detection in Remote Sensing Applications
Elías Masquil
Roger Marí
Thibaud Ehret
Enric Meinhardt-Llopis
Pablo Musé
Gabriele Facciolo
MDE
92
0
0
09 Apr 2025
TiC-LM: A Web-Scale Benchmark for Time-Continual LLM Pretraining
Jeffrey Li
Mohammadreza Armandpour
Iman Mirzadeh
Sachin Mehta
Vaishaal Shankar
...
Samy Bengio
Oncel Tuzel
Mehrdad Farajtabar
Hadi Pouransari
Fartash Faghri
CLL
KELM
124
0
0
02 Apr 2025
A Probabilistic Neuro-symbolic Layer for Algebraic Constraint Satisfaction
Leander Kurscheidt
Paolo Morettin
Roberto Sebastiani
Andrea Passerini
Antonio Vergari
126
0
0
25 Mar 2025
Improving Diffusion-based Inverse Algorithms under Few-Step Constraint via Learnable Linear Extrapolation
Jiawei Zhang
Ziyuan Liu
Leon Yan
Gen Li
Yuantao Gu
113
1
0
13 Mar 2025
MedConv: Convolutions Beat Transformers on Long-Tailed Bone Density Prediction
Xuyin Qi
Zeyu Zhang
Huazhan Zheng
Mingxi Chen
Numan Kutaiba
...
Hongtao Mao
Yongbin Li
Zhibin Liao
Yang Zhao
Minh-Son To
MedIm
83
8
0
02 Feb 2025
360Brew: A Decoder-only Foundation Model for Personalized Ranking and Recommendation
Hamed Firooz
Maziar Sanjabi
Adrian Englhardt
Aman Gupta
Ben Levine
...
Xiaoling Zhai
Ya Xu
Yu Wang
Yun Dai
ALM
104
4
0
27 Jan 2025
Cautious Optimizers: Improving Training with One Line of Code
Kaizhao Liang
Lizhang Chen
B. Liu
Qiang Liu
ODL
166
9
0
25 Nov 2024
How Does Critical Batch Size Scale in Pre-training?
Hanlin Zhang
Depen Morwani
Nikhil Vyas
Jingfeng Wu
Difan Zou
Udaya Ghai
Dean Phillips Foster
Sham Kakade
125
15
0
29 Oct 2024
Analyzing Generative Models by Manifold Entropic Metrics
Daniel Galperin
Ullrich Köthe
DRL
103
0
0
25 Oct 2024
SOAP: Improving and Stabilizing Shampoo using Adam
Nikhil Vyas
Depen Morwani
Rosie Zhao
Itai Shapira
David Brandfonbrener
Lucas Janson
Sham Kakade
135
37
0
17 Sep 2024
4-bit Shampoo for Memory-Efficient Network Training
Sike Wang
Jia Li
Pan Zhou
Hua Huang
MQ
111
9
0
28 May 2024
Benchmarking Neural Network Training Algorithms
George E. Dahl
Frank Schneider
Zachary Nado
Naman Agarwal
Chandramouli Shama Sastry
...
Chris J. Maddison
R. Vasudev
Michal Badura
Ankush Garg
Peter Mattson
60
33
0
12 Jun 2023
Stop Wasting My Time! Saving Days of ImageNet and BERT Training with Latest Weight Averaging
Jean Kaddour
MoMe
3DH
54
41
0
29 Sep 2022
Masked Autoencoders Are Scalable Vision Learners
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM
451
7,739
0
11 Nov 2021
Rethinking "Batch" in BatchNorm
Yuxin Wu
Justin Johnson
BDL
96
66
0
17 May 2021
Adaptivity without Compromise: A Momentumized, Adaptive, Dual Averaged Gradient Method for Stochastic Optimization
Aaron Defazio
Samy Jelassi
ODL
45
68
0
26 Jan 2021
Momentum via Primal Averaging: Theoretical Insights and Learning Rate Schedules for Non-Convex Optimization
Aaron Defazio
54
23
0
01 Oct 2020
Conformer: Convolution-augmented Transformer for Speech Recognition
Anmol Gulati
James Qin
Chung-Cheng Chiu
Niki Parmar
Yu Zhang
...
Wei Han
Shibo Wang
Zhengdong Zhang
Yonghui Wu
Ruoming Pang
220
3,131
0
16 May 2020
Open Graph Benchmark: Datasets for Machine Learning on Graphs
Weihua Hu
Matthias Fey
Marinka Zitnik
Yuxiao Dong
Hongyu Ren
Bowen Liu
Michele Catasta
J. Leskovec
303
2,730
0
02 May 2020
UniXGrad: A Universal, Adaptive Algorithm with Optimal Guarantees for Constrained Optimization
Ali Kavis
Kfir Y. Levy
Francis R. Bach
Volkan Cevher
ODL
64
59
0
30 Oct 2019
Introduction to Online Convex Optimization
Elad Hazan
OffRL
170
1,929
0
07 Sep 2019
Lookahead Optimizer: k steps forward, 1 step back
Michael Ruogu Zhang
James Lucas
Geoffrey E. Hinton
Jimmy Ba
ODL
132
730
0
19 Jul 2019
Deep Learning Recommendation Model for Personalization and Recommendation Systems
Maxim Naumov
Dheevatsa Mudigere
Hao-Jun Michael Shi
Jianyu Huang
Narayanan Sundaraman
...
Wenlin Chen
Vijay Rao
Bill Jia
Liang Xiong
M. Smelyanskiy
88
733
0
31 May 2019
fastMRI: An Open Dataset and Benchmarks for Accelerated MRI
Jure Zbontar
Florian Knoll
Anuroop Sriram
Tullie Murrell
Zhengnan Huang
...
Erich Owens
C. L. Zitnick
M. Recht
D. Sodickson
Yvonne W. Lui
OOD
65
843
0
21 Nov 2018
Relational inductive biases, deep learning, and graph networks
Peter W. Battaglia
Jessica B. Hamrick
V. Bapst
Alvaro Sanchez-Gonzalez
V. Zambaldi
...
Pushmeet Kohli
M. Botvinick
Oriol Vinyals
Yujia Li
Razvan Pascanu
AI4CE
NAI
750
3,119
0
04 Jun 2018
Densely Connected Convolutional Networks
Gao Huang
Zhuang Liu
Laurens van der Maaten
Kilian Q. Weinberger
PINN
3DV
766
36,794
0
25 Aug 2016
Wide Residual Networks
Sergey Zagoruyko
N. Komodakis
334
7,984
0
23 May 2016
Rethinking the Inception Architecture for Computer Vision
Christian Szegedy
Vincent Vanhoucke
Sergey Ioffe
Jonathon Shlens
Z. Wojna
3DV
BDL
878
27,358
0
02 Dec 2015
ImageNet Large Scale Visual Recognition Challenge
Olga Russakovsky
Jia Deng
Hao Su
J. Krause
S. Satheesh
...
A. Karpathy
A. Khosla
Michael S. Bernstein
Alexander C. Berg
Li Fei-Fei
VLM
ObjD
1.7K
39,525
0
01 Sep 2014
Non-strongly-convex smooth stochastic approximation with convergence rate O(1/n)
Francis R. Bach
Eric Moulines
87
405
0
10 Jun 2013
A simpler approach to obtaining an O(1/t) convergence rate for the projected stochastic subgradient method
Simon Lacoste-Julien
Mark Schmidt
Francis R. Bach
178
260
0
10 Dec 2012
Stochastic Gradient Descent for Non-smooth Optimization: Convergence Results and Optimal Averaging Schemes
Ohad Shamir
Tong Zhang
148
574
0
08 Dec 2012
Online Learning with Predictable Sequences
Alexander Rakhlin
Karthik Sridharan
207
357
0
18 Aug 2012