arXiv:1908.03265
On the Variance of the Adaptive Learning Rate and Beyond
8 August 2019
Liyuan Liu
Haoming Jiang
Pengcheng He
Weizhu Chen
Xiaodong Liu
Jianfeng Gao
Jiawei Han
ODL
Papers citing
"On the Variance of the Adaptive Learning Rate and Beyond"
50 / 373 papers shown
True Zero-Shot Inference of Dynamical Systems Preserving Long-Term Statistics
Christoph Jürgen Hemmer
Daniel Durstewitz
AI4TS
SyDa
AI4CE
30
0
0
19 May 2025
LoRASuite: Efficient LoRA Adaptation Across Large Language Model Upgrades
Yanan Li
Fanxu Meng
Muhan Zhang
Shiai Zhu
Shangguang Wang
Mengwei Xu
MoMe
19
0
0
17 May 2025
Fixing Incomplete Value Function Decomposition for Multi-Agent Reinforcement Learning
Andrea Baisero
Rupali Bhati
Shuo Liu
Aathira Pillai
Christopher Amato
31
0
0
15 May 2025
ICE-Pruning: An Iterative Cost-Efficient Pruning Pipeline for Deep Neural Networks
Wenhao Hu
Paul Henderson
José Cano
55
0
0
12 May 2025
TopicVD: A Topic-Based Dataset of Video-Guided Multimodal Machine Translation for Documentaries
Jinze Lv
Jian Chen
Zi Long
Xianghua Fu
Yin Chen
VGen
72
0
0
09 May 2025
Spline-based Transformers
Prashanth Chandran
Agon Serifi
Markus Gross
Moritz Bächer
68
0
0
03 Apr 2025
MRI super-resolution reconstruction using efficient diffusion probabilistic model with residual shifting
Mojtaba Safari
Shansong Wang
Zach Eidex
Qiang Li
Erik H. Middlebrooks
D. Yu
Xiaofeng Yang
MedIm
98
1
0
03 Mar 2025
Carefully Blending Adversarial Training, Purification, and Aggregation Improves Adversarial Robustness
Emanuele Ballarin
A. Ansuini
Luca Bortolussi
AAML
79
0
0
20 Feb 2025
Increasing Both Batch Size and Learning Rate Accelerates Stochastic Gradient Descent
Hikaru Umeda
Hideaki Iiduka
74
2
0
17 Feb 2025
Amortized Safe Active Learning for Real-Time Data Acquisition: Pretrained Neural Policies from Simulated Nonparametric Functions
Cen-You Li
Marc Toussaint
Barbara Rakitsch
Christoph Zimmer
OffRL
306
0
0
26 Jan 2025
Learning Versatile Optimizers on a Compute Diet
A. Moudgil
Boris Knyazev
Guillaume Lajoie
Eugene Belilovsky
294
0
0
22 Jan 2025
ReFlow6D: Refraction-Guided Transparent Object 6D Pose Estimation via Intermediate Representation Learning
Hrishikesh Gupta
S. Thalhammer
Jean-Baptiste Weibel
Alexander Haberl
Markus Vincze
43
0
0
31 Dec 2024
Learning state and proposal dynamics in state-space models using differentiable particle filters and neural networks
Benjamin Cox
Santiago Segarra
Victor Elvira
99
0
0
23 Nov 2024
USpeech: Ultrasound-Enhanced Speech with Minimal Human Effort via Cross-Modal Synthesis
Luca Jiang-Tao Yu
Running Zhao
Sijie Ji
Edith C.H. Ngai
Chenshu Wu
45
0
0
29 Oct 2024
Data Generation for Hardware-Friendly Post-Training Quantization
Lior Dikstein
Ariel Lapid
Arnon Netzer
H. Habi
MQ
292
0
0
29 Oct 2024
What Does It Mean to Be a Transformer? Insights from a Theoretical Hessian Analysis
Weronika Ormaniec
Felix Dangel
Sidak Pal Singh
61
7
0
14 Oct 2024
Learning Interpretable Hierarchical Dynamical Systems Models from Time Series Data
Manuel Brenner
Elias Weber
G. Koppe
Daniel Durstewitz
AI4TS
AI4CE
44
4
0
07 Oct 2024
Dual Encoder GAN Inversion for High-Fidelity 3D Head Reconstruction from Single Images
Bahri Batuhan Bilecen
Ahmet Berke Gokmen
Aysegül Dündar
56
1
0
30 Sep 2024
Retro-li: Small-Scale Retrieval Augmented Generation Supporting Noisy Similarity Searches and Domain Shift Generalization
Gentiana Rashiti
G. Karunaratne
Mrinmaya Sachan
Abu Sebastian
Abbas Rahimi
RALM
67
0
0
12 Sep 2024
Dynamic Decoupling of Placid Terminal Attractor-based Gradient Descent Algorithm
Jinwei Zhao
Marco Gori
Alessandro Betti
S. Melacci
Hongtao Zhang
Jiedong Liu
Xinhong Hei
47
0
0
10 Sep 2024
DeMansia: Mamba Never Forgets Any Tokens
Ricky Fang
Mamba
34
0
0
04 Aug 2024
Deep Learning Framework for History Matching CO2 Storage with 4D Seismic and Monitoring Well Data
Ekta U. Samani
A. Banerjee
45
0
0
02 Aug 2024
ReLiK: Retrieve and LinK, Fast and Accurate Entity Linking and Relation Extraction on an Academic Budget
Adam Gould
Pere-Lluis Huguet-Cabot
S. Dadhania
Francesca Toni
91
9
0
31 Jul 2024
Latent Space Imaging
Matheus Souza
Yidan Zheng
Kaizhang Kang
Yogeshwar Nath Mishra
Qiang Fu
Wolfgang Heidrich
67
0
0
09 Jul 2024
Simplifying Deep Temporal Difference Learning
Matteo Gallici
Mattie Fellows
Benjamin Ellis
B. Pou
Ivan Masmitja
Jakob Foerster
Mario Martin
OffRL
62
18
0
05 Jul 2024
Inferring stochastic low-rank recurrent neural networks from neural data
Matthijs Pals
A Erdem Sağtekin
Felix Pei
Manuel Gloeckler
Jakob H Macke
46
7
0
24 Jun 2024
Optimizing Automatic Speech Assessment: W-RankSim Regularization and Hybrid Feature Fusion Strategies
Chung-Wen Wu
Berlin Chen
53
0
0
16 Jun 2024
Optimal Recurrent Network Topologies for Dynamical Systems Reconstruction
Christoph Jürgen Hemmer
Manuel Brenner
Florian Hess
Daniel Durstewitz
43
4
0
07 Jun 2024
AdaFisher: Adaptive Second Order Optimization via Fisher Information
Damien Martins Gomes
Yanlei Zhang
Eugene Belilovsky
Guy Wolf
Mahdi S. Hosseini
ODL
84
2
0
26 May 2024
Distilling Diffusion Models into Conditional GANs
Minguk Kang
Richard Zhang
Connelly Barnes
Sylvain Paris
Suha Kwak
Jaesik Park
Eli Shechtman
Jun-Yan Zhu
Taesung Park
48
38
0
09 May 2024
Toward end-to-end interpretable convolutional neural networks for waveform signals
Linh Vu
Thu Tran
Wern-Han Lim
Raphael Phan
33
1
0
03 May 2024
Image segmentation of treated and untreated tumor spheroids by Fully Convolutional Networks
Matthias Streller
S. Michlíková
Willy Ciecior
Katharina Lönnecke
L. Kunz-Schughart
Steffen Lange
Anja Voss-Böhme
69
1
0
02 May 2024
LidaRF: Delving into Lidar for Neural Radiance Field on Street Scenes
Shanlin Sun
Bingbing Zhuang
Ziyu Jiang
Buyu Liu
Xiaohui Xie
Manmohan Chandraker
82
3
0
01 May 2024
FisheyeDetNet: 360° Surround view Fisheye Camera based Object Detection System for Autonomous Driving
Ganesh Sistu
S. Yogamani
49
0
0
20 Apr 2024
Faster Convergence for Transformer Fine-tuning with Line Search Methods
Philip Kenneweg
Leonardo Galli
Tristan Kenneweg
Barbara Hammer
ODL
51
2
0
27 Mar 2024
Integrative Graph-Transformer Framework for Histopathology Whole Slide Image Representation and Classification
Zhan Shi
Jingwei Zhang
Jun Kong
Fusheng Wang
MedIm
51
4
0
26 Mar 2024
Bidirectional Consistency Models
Liangchen Li
Jiajun He
DiffM
72
12
0
26 Mar 2024
MamMIL: Multiple Instance Learning for Whole Slide Images with State Space Models
Zijie Fang
Yifeng Wang
Zhi Wang
Jian Zhang
Xiangyang Ji
Yongbing Zhang
Mamba
47
6
0
08 Mar 2024
Hierarchical Multi-Relational Graph Representation Learning for Large-Scale Prediction of Drug-Drug Interactions
Mengying Jiang
Guizhong Liu
Yuanchao Su
Weiqiang Jin
Biao Zhao
43
2
0
28 Feb 2024
Radar-Based Recognition of Static Hand Gestures in American Sign Language
C. Schuessler
Wenxuan Zhang
Johanna Braunig
Marcel Hoffmann
Michael Stelzig
Martin Vossiek
22
3
0
20 Feb 2024
DeepATLAS: One-Shot Localization for Biomedical Data
Peter D. Chang
37
0
0
14 Feb 2024
Fast Registration of Photorealistic Avatars for VR Facial Animation
Chaitanya Patel
Shaojie Bai
Tenia Wang
Jason M. Saragih
S. Wei
36
0
0
19 Jan 2024
MADA: Meta-Adaptive Optimizers through hyper-gradient Descent
Kaan Ozkara
Can Karakus
Parameswaran Raman
Mingyi Hong
Shoham Sabach
Branislav Kveton
Volkan Cevher
40
2
0
17 Jan 2024
A Novel Paradigm for Neural Computation: X-Net with Learnable Neurons and Adaptable Structure
Yanjie Li
Weijun Li
Lina Yu
Min Wu
Jinyi Liu
...
Xin Ning
Yugui Zhang
Baoli Lu
Jian Xu
Shuang Li
33
0
0
03 Jan 2024
A Coefficient Makes SVRG Effective
Yida Yin
Zhiqiu Xu
Zhiyuan Li
Trevor Darrell
Zhuang Liu
52
1
0
09 Nov 2023
Learning Object Permanence from Videos via Latent Imaginations
Manuel Traub
Frederic Becker
S. Otte
Martin Volker Butz
38
1
0
16 Oct 2023
MoConVQ: Unified Physics-Based Motion Control via Scalable Discrete Representations
Heyuan Yao
Zhenhua Song
Yuyang Zhou
Tenglong Ao
Baoquan Chen
Libin Liu
30
39
0
16 Oct 2023
SSG2: A new modelling paradigm for semantic segmentation
F. Diakogiannis
S. Furby
P. Caccetta
Xiaoliang Wu
Rodrigo Ibata
O. Hlinka
John Taylor
VLM
45
0
0
12 Oct 2023
Larth: Dataset and Machine Translation for Etruscan
Gianluca Vico
Gerasimos Spanakis
22
1
0
09 Oct 2023
Small-scale proxies for large-scale Transformer training instabilities
Mitchell Wortsman
Peter J. Liu
Lechao Xiao
Katie Everett
A. Alemi
...
Jascha Narain Sohl-Dickstein
Kelvin Xu
Jaehoon Lee
Justin Gilmer
Simon Kornblith
42
88
0
25 Sep 2023