SGDR: Stochastic Gradient Descent with Warm Restarts

13 August 2016
I. Loshchilov
Frank Hutter
    ODL

Papers citing "SGDR: Stochastic Gradient Descent with Warm Restarts"

50 / 4,280 papers shown
Training a Generally Curious Agent
Fahim Tajwar
Yiding Jiang
Abitha Thankaraj
Sumaita Sadia Rahman
J. Zico Kolter
Jeff Schneider
Ruslan Salakhutdinov
126
1
0
24 Feb 2025
REINFORCE Adversarial Attacks on Large Language Models: An Adaptive, Distributional, and Semantic Objective
Simon Geisler
Tom Wollschlager
M. H. I. Abdalla
Vincent Cohen-Addad
Johannes Gasteiger
Stephan Günnemann
AAML
88
2
0
24 Feb 2025
Decoding for Punctured Convolutional and Turbo Codes: A Deep Learning Solution for Protocols Compliance
Yongli Yan
Linglong Dai
37
0
0
24 Feb 2025
Mantis: Lightweight Calibrated Foundation Model for User-Friendly Time Series Classification
Vasilii Feofanov
Songkang Wen
Marius Alonso
Romain Ilbert
Hongbo Guo
Malik Tiomoko
Lujia Pan
Jianfeng Zhang
I. Redko
AI4TS
VLM
60
1
0
24 Feb 2025
LongSpec: Long-Context Speculative Decoding with Efficient Drafting and Verification
Penghui Yang
Cunxiao Du
Fengzhuo Zhang
Haonan Wang
Tianyu Pang
Chao Du
Bo An
RALM
50
0
0
24 Feb 2025
Mojito: LLM-Aided Motion Instructor with Jitter-Reduced Inertial Tokens
Ziwei Shan
Yaoyu He
Chengfeng Zhao
Jiashen Du
Jingyan Zhang
Qixuan Zhang
Jingyi Yu
Lan Xu
66
1
0
22 Feb 2025
Patch Stitching Data Augmentation for Cancer Classification in Pathology Images
Jiamu Wang
Chang-Su Kim
Jin Tae Kwak
MedIm
37
1
0
22 Feb 2025
Exploiting Deblurring Networks for Radiance Fields
Haeyun Choi
Heemin Yang
Janghyeok Han
Sunghyun Cho
66
0
0
20 Feb 2025
Slamming: Training a Speech Language Model on One GPU in a Day
Gallil Maimon
Avishai Elmakies
Yossi Adi
38
3
0
19 Feb 2025
Multi-Step Alignment as Markov Games: An Optimistic Online Gradient Descent Approach with Convergence Guarantees
Yongtao Wu
Luca Viano
Yihang Chen
Zhenyu Zhu
Kimon Antonakopoulos
Quanquan Gu
V. Cevher
62
0
0
18 Feb 2025
Improving the Stability of GNN Force Field Models by Reducing Feature Correlation
Y. Zeng
Wenlong He
Ihor Vasyltsov
Jiaxin Wei
Ying Zhang
Lin Chen
Yuehua Dai
45
0
0
18 Feb 2025
An Efficient Row-Based Sparse Fine-Tuning
Cen-Jhih Li
Aditya Bhaskara
58
0
0
17 Feb 2025
The Graph's Apprentice: Teaching an LLM Low Level Knowledge for Circuit Quality Estimation
Reza Moravej
Saurabh Bodhe
Zhanguang Zhang
Didier Chetelat
Dimitrios Tsaras
Yingxue Zhang
Hui-Ling Zhen
Jianye Hao
M. Yuan
60
1
0
17 Feb 2025
Increasing Both Batch Size and Learning Rate Accelerates Stochastic Gradient Descent
Hikaru Umeda
Hideaki Iiduka
72
2
0
17 Feb 2025
CoLA: Compute-Efficient Pre-Training of LLMs via Low-Rank Activation
Zhengwu Liu
Ruijie Zhang
Zhilin Wang
Zi Yang
Paul Hovland
Bogdan Nicolae
Franck Cappello
Z. Zhang
54
0
0
16 Feb 2025
CoCoEvo: Co-Evolution of Programs and Test Cases to Enhance Code Generation
Kefan Li
Hongyue Yu
Tingyu Guo
Shijie Cao
Yuan Yuan
55
0
0
15 Feb 2025
Diffusing DeBias: Synthetic Bias Amplification for Model Debiasing
Massimiliano Ciranni
Vito Paolo Pastore
Roberto Di Via
Enzo Tartaglione
Francesca Odone
Vittorio Murino
DiffM
105
0
0
13 Feb 2025
CANeRV: Content Adaptive Neural Representation for Video Compression
Lv Tang
Jun Zhu
Xinfeng Zhang
Li Zhang
Siwei Ma
Qingming Huang
77
1
0
10 Feb 2025
UniDemoiré: Towards Universal Image Demoiréing with Data Generation and Synthesis
Zemin Yang
Yujing Sun
Xidong Peng
Siu-Ming Yiu
Yuexin Ma
DiffM
79
1
0
10 Feb 2025
EquiTabPFN: A Target-Permutation Equivariant Prior Fitted Networks
Michael Arbel
David Salinas
Frank Hutter
78
2
0
10 Feb 2025
When, Where and Why to Average Weights?
Niccolò Ajroldi
Antonio Orvieto
Jonas Geiping
MoMe
101
0
0
10 Feb 2025
Provably Near-Optimal Federated Ensemble Distillation with Negligible Overhead
Won-Jun Jang
Hyeon-Seo Park
Si-Hyeon Lee
FedML
262
0
0
10 Feb 2025
Detecting Content Rating Violations in Android Applications: A Vision-Language Approach
Dishanika Denipitiyage
B. Silva
Suranga Seneviratne
A. Seneviratne
Sanjay Chawla
48
0
0
07 Feb 2025
Advancing Weight and Channel Sparsification with Enhanced Saliency
Xinglong Sun
Maying Shen
Hongxu Yin
Lei Mao
Pavlo Molchanov
Jose M. Alvarez
58
1
0
05 Feb 2025
Deep Weight Factorization: Sparse Learning Through the Lens of Artificial Symmetries
Chris Kolb
T. Weber
Bernd Bischl
David Rügamer
117
0
0
04 Feb 2025
SimBEV: A Synthetic Multi-Task Multi-Sensor Driving Data Generation Tool and Dataset
Goodarz Mehr
A. Eskandarian
72
1
0
04 Feb 2025
FireCastNet: Earth-as-a-Graph for Seasonal Fire Prediction
Dimitrios Michail
Charalampos Davalas
Lefki-Ioanna Panagiotou
Ioannis Prapas
Spyros Kondylatos
N. Bountos
Ioannis Papoutsis
47
0
0
03 Feb 2025
Learning Hyperparameters via a Data-Emphasized Variational Objective
Ethan Harvey
Mikhail Petrov
Michael C. Hughes
65
0
0
03 Feb 2025
Target-driven Self-Distillation for Partial Observed Trajectories Forecasting
Pengfei Zhu
Peng Shu
Mengshi Qi
Liang Liu
Huadong Ma
83
0
0
28 Jan 2025
DiffSplat: Repurposing Image Diffusion Models for Scalable Gaussian Splat Generation
Chenguo Lin
Panwang Pan
Bangbang Yang
Zeming Li
Yadong Mu
3DGS
81
7
0
28 Jan 2025
Learning the Regularization Strength for Deep Fine-Tuning via a Data-Emphasized Variational Objective
Ethan Harvey
Mikhail Petrov
Michael C. Hughes
45
0
0
28 Jan 2025
FIT-Print: Towards False-claim-resistant Model Ownership Verification via Targeted Fingerprint
Shuo Shao
Haozhe Zhu
Hongwei Yao
Yiming Li
Tianwei Zhang
Zhan Qin
Kui Ren
230
0
0
28 Jan 2025
Towards Scalable Topological Regularizers
Hiu-Tung Wong
Darrick Lee
Hong Yan
BDL
67
0
0
24 Jan 2025
Learning Versatile Optimizers on a Compute Diet
A. Moudgil
Boris Knyazev
Guillaume Lajoie
Eugene Belilovsky
235
0
0
22 Jan 2025
FOCUS: First Order Concentrated Updating Scheme
Yizhou Liu
Ziming Liu
Jeff Gore
ODL
113
1
0
21 Jan 2025
ENTIRE: Learning-based Volume Rendering Time Prediction
Zikai Yin
Hamid Gadirov
Jiri Kosinka
Steffen Frey
3DH
36
0
0
21 Jan 2025
WaveDH: Wavelet Sub-bands Guided ConvNet for Efficient Image Dehazing
Seongmin Hwang
Daeyoung Han
Cheolkon Jung
Moongu Jeon
78
5
0
20 Jan 2025
Text-guided Synthetic Geometric Augmentation for Zero-shot 3D Understanding
Kohei Torimi
Ryosuke Yamada
Daichi Otsuka
Kensho Hara
Yuki M. Asano
Hirokatsu Kataoka
Y. Aoki
3DV
42
0
0
20 Jan 2025
Self-supervised Transformation Learning for Equivariant Representations
Jaemyung Yu
Jaehyun Choi
Dong-Jae Lee
H. Hong
Junmo Kim
49
0
0
15 Jan 2025
A Heterogeneous Multimodal Graph Learning Framework for Recognizing User Emotions in Social Networks
Sree Bhattacharyya
Shuhua Yang
James Z. Wang
41
0
0
13 Jan 2025
A Hessian-informed hyperparameter optimization for differential learning rate
Shiyun Xu
Zhiqi Bu
Yiliang Zhang
Ian Barnett
52
1
0
12 Jan 2025
ARES: Auxiliary Range Expansion for Outlier Synthesis
Eui-Soo Jung
Hae-Hun Seo
Hyun-Woo Jung
Je-Geon Oh
Yoon-Yeong Kim
OODD
61
0
0
11 Jan 2025
Tensor Product Attention Is All You Need
Yifan Zhang
Yifeng Liu
Huizhuo Yuan
Zhen Qin
Yang Yuan
Q. Gu
Andrew Chi-Chih Yao
96
9
0
11 Jan 2025
TTS-Transducer: End-to-End Speech Synthesis with Neural Transducer
Vladimir Bataev
Subhankar Ghosh
Vitaly Lavrukhin
Jason Chun Lok Li
AI4TS
48
0
0
10 Jan 2025
Improving the U-Net Configuration for Automated Delineation of Head and Neck Cancer on MRI
Andrei Iantsen
26
1
0
10 Jan 2025
H-MBA: Hierarchical MamBa Adaptation for Multi-Modal Video Understanding in Autonomous Driving
Tian Jin
Yuxiao Luo
Yue Ma
Yu Qiao
Yali Wang
Mamba
64
1
0
08 Jan 2025
CURing Large Models: Compression via CUR Decomposition
Sanghyeon Park
Soo-Mook Moon
46
0
0
08 Jan 2025
Noise-Robust Target-Speaker Voice Activity Detection Through Self-Supervised Pretraining
H. S. Bovbjerg
Jan Østergaard
Jesper Jensen
Zheng-Hua Tan
45
0
0
06 Jan 2025
Noise-Tolerant Hybrid Prototypical Learning with Noisy Web Data
Chao Liang
Linchao Zhu
Zongxin Yang
Wei Chen
Yi Yang
NoLa
76
0
0
05 Jan 2025
Swift Cross-Dataset Pruning: Enhancing Fine-Tuning Efficiency in Natural Language Understanding
Binh-Nguyen Nguyen
Yang He
43
1
0
05 Jan 2025