1608.03983
Cited By
SGDR: Stochastic Gradient Descent with Warm Restarts
13 August 2016
I. Loshchilov
Frank Hutter
ODL
Papers citing
"SGDR: Stochastic Gradient Descent with Warm Restarts"
50 / 4,280 papers shown
Title
Training a Generally Curious Agent
Fahim Tajwar
Yiding Jiang
Abitha Thankaraj
Sumaita Sadia Rahman
J. Zico Kolter
Jeff Schneider
Ruslan Salakhutdinov
126
1
0
24 Feb 2025
REINFORCE Adversarial Attacks on Large Language Models: An Adaptive, Distributional, and Semantic Objective
Simon Geisler
Tom Wollschlager
M. H. I. Abdalla
Vincent Cohen-Addad
Johannes Gasteiger
Stephan Günnemann
AAML
88
2
0
24 Feb 2025
Decoding for Punctured Convolutional and Turbo Codes: A Deep Learning Solution for Protocols Compliance
Yongli Yan
Linglong Dai
37
0
0
24 Feb 2025
Mantis: Lightweight Calibrated Foundation Model for User-Friendly Time Series Classification
Vasilii Feofanov
Songkang Wen
Marius Alonso
Romain Ilbert
Hongbo Guo
Malik Tiomoko
Lujia Pan
Jianfeng Zhang
I. Redko
AI4TS
VLM
60
1
0
24 Feb 2025
LongSpec: Long-Context Speculative Decoding with Efficient Drafting and Verification
Penghui Yang
Cunxiao Du
Fengzhuo Zhang
Haonan Wang
Tianyu Pang
Chao Du
Bo An
RALM
50
0
0
24 Feb 2025
Mojito: LLM-Aided Motion Instructor with Jitter-Reduced Inertial Tokens
Ziwei Shan
Yaoyu He
Chengfeng Zhao
Jiashen Du
Jingyan Zhang
Qixuan Zhang
Jingyi Yu
Lan Xu
66
1
0
22 Feb 2025
Patch Stitching Data Augmentation for Cancer Classification in Pathology Images
Jiamu Wang
Chang-Su Kim
Jin Tae Kwak
MedIm
37
1
0
22 Feb 2025
Exploiting Deblurring Networks for Radiance Fields
Haeyun Choi
Heemin Yang
Janghyeok Han
Sunghyun Cho
66
0
0
20 Feb 2025
Slamming: Training a Speech Language Model on One GPU in a Day
Gallil Maimon
Avishai Elmakies
Yossi Adi
38
3
0
19 Feb 2025
Multi-Step Alignment as Markov Games: An Optimistic Online Gradient Descent Approach with Convergence Guarantees
Yongtao Wu
Luca Viano
Yihang Chen
Zhenyu Zhu
Kimon Antonakopoulos
Quanquan Gu
V. Cevher
62
0
0
18 Feb 2025
Improving the Stability of GNN Force Field Models by Reducing Feature Correlation
Y. Zeng
Wenlong He
Ihor Vasyltsov
Jiaxin Wei
Ying Zhang
Lin Chen
Yuehua Dai
45
0
0
18 Feb 2025
An Efficient Row-Based Sparse Fine-Tuning
Cen-Jhih Li
Aditya Bhaskara
58
0
0
17 Feb 2025
The Graph's Apprentice: Teaching an LLM Low Level Knowledge for Circuit Quality Estimation
Reza Moravej
Saurabh Bodhe
Zhanguang Zhang
Didier Chetelat
Dimitrios Tsaras
Yingxue Zhang
Hui-Ling Zhen
Jianye Hao
M. Yuan
60
1
0
17 Feb 2025
Increasing Both Batch Size and Learning Rate Accelerates Stochastic Gradient Descent
Hikaru Umeda
Hideaki Iiduka
72
2
0
17 Feb 2025
CoLA: Compute-Efficient Pre-Training of LLMs via Low-Rank Activation
Zhengwu Liu
Ruijie Zhang
Zhilin Wang
Zi Yang
Paul Hovland
Bogdan Nicolae
Franck Cappello
Z. Zhang
54
0
0
16 Feb 2025
CoCoEvo: Co-Evolution of Programs and Test Cases to Enhance Code Generation
Kefan Li
Hongyue Yu
Tingyu Guo
Shijie Cao
Yuan Yuan
55
0
0
15 Feb 2025
Diffusing DeBias: Synthetic Bias Amplification for Model Debiasing
Massimiliano Ciranni
Vito Paolo Pastore
Roberto Di Via
Enzo Tartaglione
Francesca Odone
Vittorio Murino
DiffM
105
0
0
13 Feb 2025
CANeRV: Content Adaptive Neural Representation for Video Compression
Lv Tang
Jun Zhu
Xinfeng Zhang
Li Zhang
Siwei Ma
Qingming Huang
77
1
0
10 Feb 2025
UniDemoiré: Towards Universal Image Demoiréing with Data Generation and Synthesis
Zemin Yang
Yujing Sun
Xidong Peng
Siu-Ming Yiu
Yuexin Ma
DiffM
79
1
0
10 Feb 2025
EquiTabPFN: A Target-Permutation Equivariant Prior Fitted Networks
Michael Arbel
David Salinas
Frank Hutter
78
2
0
10 Feb 2025
When, Where and Why to Average Weights?
Niccolò Ajroldi
Antonio Orvieto
Jonas Geiping
MoMe
101
0
0
10 Feb 2025
Provably Near-Optimal Federated Ensemble Distillation with Negligible Overhead
Won-Jun Jang
Hyeon-Seo Park
Si-Hyeon Lee
FedML
262
0
0
10 Feb 2025
Detecting Content Rating Violations in Android Applications: A Vision-Language Approach
Dishanika Denipitiyage
B. Silva
Suranga Seneviratne
A. Seneviratne
Sanjay Chawla
48
0
0
07 Feb 2025
Advancing Weight and Channel Sparsification with Enhanced Saliency
Xinglong Sun
Maying Shen
Hongxu Yin
Lei Mao
Pavlo Molchanov
Jose M. Alvarez
58
1
0
05 Feb 2025
Deep Weight Factorization: Sparse Learning Through the Lens of Artificial Symmetries
Chris Kolb
T. Weber
Bernd Bischl
David Rügamer
117
0
0
04 Feb 2025
SimBEV: A Synthetic Multi-Task Multi-Sensor Driving Data Generation Tool and Dataset
Goodarz Mehr
A. Eskandarian
72
1
0
04 Feb 2025
FireCastNet: Earth-as-a-Graph for Seasonal Fire Prediction
Dimitrios Michail
Charalampos Davalas
Lefki-Ioanna Panagiotou
Ioannis Prapas
Spyros Kondylatos
N. Bountos
Ioannis Papoutsis
47
0
0
03 Feb 2025
Learning Hyperparameters via a Data-Emphasized Variational Objective
Ethan Harvey
Mikhail Petrov
Michael C. Hughes
65
0
0
03 Feb 2025
Target-driven Self-Distillation for Partial Observed Trajectories Forecasting
Pengfei Zhu
Peng Shu
Mengshi Qi
Liang Liu
Huadong Ma
83
0
0
28 Jan 2025
DiffSplat: Repurposing Image Diffusion Models for Scalable Gaussian Splat Generation
Chenguo Lin
Panwang Pan
Bangbang Yang
Zeming Li
Yadong Mu
3DGS
81
7
0
28 Jan 2025
Learning the Regularization Strength for Deep Fine-Tuning via a Data-Emphasized Variational Objective
Ethan Harvey
Mikhail Petrov
Michael C. Hughes
45
0
0
28 Jan 2025
FIT-Print: Towards False-claim-resistant Model Ownership Verification via Targeted Fingerprint
Shuo Shao
Haozhe Zhu
Hongwei Yao
Yiming Li
Tianwei Zhang
Zhan Qin
Kui Ren
230
0
0
28 Jan 2025
Towards Scalable Topological Regularizers
Hiu-Tung Wong
Darrick Lee
Hong Yan
BDL
67
0
0
24 Jan 2025
Learning Versatile Optimizers on a Compute Diet
A. Moudgil
Boris Knyazev
Guillaume Lajoie
Eugene Belilovsky
235
0
0
22 Jan 2025
FOCUS: First Order Concentrated Updating Scheme
Yizhou Liu
Ziming Liu
Jeff Gore
ODL
113
1
0
21 Jan 2025
ENTIRE: Learning-based Volume Rendering Time Prediction
Zikai Yin
Hamid Gadirov
Jiri Kosinka
Steffen Frey
3DH
36
0
0
21 Jan 2025
WaveDH: Wavelet Sub-bands Guided ConvNet for Efficient Image Dehazing
Seongmin Hwang
Daeyoung Han
Cheolkon Jung
Moongu Jeon
78
5
0
20 Jan 2025
Text-guided Synthetic Geometric Augmentation for Zero-shot 3D Understanding
Kohei Torimi
Ryosuke Yamada
Daichi Otsuka
Kensho Hara
Yuki M. Asano
Hirokatsu Kataoka
Y. Aoki
3DV
42
0
0
20 Jan 2025
Self-supervised Transformation Learning for Equivariant Representations
Jaemyung Yu
Jaehyun Choi
Dong-Jae Lee
H. Hong
Junmo Kim
49
0
0
15 Jan 2025
A Heterogeneous Multimodal Graph Learning Framework for Recognizing User Emotions in Social Networks
Sree Bhattacharyya
Shuhua Yang
James Z. Wang
41
0
0
13 Jan 2025
A Hessian-informed hyperparameter optimization for differential learning rate
Shiyun Xu
Zhiqi Bu
Yiliang Zhang
Ian Barnett
52
1
0
12 Jan 2025
ARES: Auxiliary Range Expansion for Outlier Synthesis
Eui-Soo Jung
Hae-Hun Seo
Hyun-Woo Jung
Je-Geon Oh
Yoon-Yeong Kim
OODD
61
0
0
11 Jan 2025
Tensor Product Attention Is All You Need
Yifan Zhang
Yifeng Liu
Huizhuo Yuan
Zhen Qin
Yang Yuan
Q. Gu
Andrew Chi-Chih Yao
96
9
0
11 Jan 2025
TTS-Transducer: End-to-End Speech Synthesis with Neural Transducer
Vladimir Bataev
Subhankar Ghosh
Vitaly Lavrukhin
Jason Chun Lok Li
AI4TS
48
0
0
10 Jan 2025
Improving the U-Net Configuration for Automated Delineation of Head and Neck Cancer on MRI
Andrei Iantsen
26
1
0
10 Jan 2025
H-MBA: Hierarchical MamBa Adaptation for Multi-Modal Video Understanding in Autonomous Driving
Tian Jin
Yuxiao Luo
Yue Ma
Yu Qiao
Yali Wang
Mamba
64
1
0
08 Jan 2025
CURing Large Models: Compression via CUR Decomposition
Sanghyeon Park
Soo-Mook Moon
46
0
0
08 Jan 2025
Noise-Robust Target-Speaker Voice Activity Detection Through Self-Supervised Pretraining
H. S. Bovbjerg
Jan Østergaard
Jesper Jensen
Zheng-Hua Tan
45
0
0
06 Jan 2025
Noise-Tolerant Hybrid Prototypical Learning with Noisy Web Data
Chao Liang
Linchao Zhu
Zongxin Yang
Wei Chen
Yi Yang
NoLa
76
0
0
05 Jan 2025
Swift Cross-Dataset Pruning: Enhancing Fine-Tuning Efficiency in Natural Language Understanding
Binh-Nguyen Nguyen
Yang He
43
1
0
05 Jan 2025