On the difficulty of training Recurrent Neural Networks


21 November 2012
Razvan Pascanu
Tomas Mikolov
Yoshua Bengio
    ODL

Papers citing "On the difficulty of training Recurrent Neural Networks"

50 / 53 papers shown
Deformable Attentive Visual Enhancement for Referring Segmentation Using Vision-Language Model
Alaa Dalaq
Muzammil Behzad
VLM
36
0
0
25 May 2025
DataRater: Meta-Learned Dataset Curation
Dan A. Calian
Gregory Farquhar
Iurii Kemaev
Luisa M. Zintgraf
Matteo Hessel
...
András Gyorgy
Tom Schaul
Jeffrey Dean
Hado van Hasselt
David Silver
106
0
0
23 May 2025
Sharpness-Aware Minimization with Z-Score Gradient Filtering for Neural Networks
Juyoung Yun
91
0
0
05 May 2025
Three-Factor Learning in Spiking Neural Networks: An Overview of Methods and Trends from a Machine Learning Perspective
Szymon Mazurek
Jakub Caputa
Jan K. Argasiñski
Maciej Wielgosz
37
0
0
06 Apr 2025
Catch Me if You Search: When Contextual Web Search Results Affect the Detection of Hallucinations
Mahjabin Nahar
Eun-Ju Lee
Jin Won Park
Dongwon Lee
HILM
104
0
0
01 Apr 2025
Harnessing uncertainty when learning through Equilibrium Propagation in neural networks
Jonathan Peters
Philippe Talatchian
51
0
0
28 Mar 2025
Natural Language Generation
Emiel van Miltenburg
Chenghua Lin
45
2
0
20 Mar 2025
Forecasting Empty Container availability for Vehicle Booking System Application
Arthur Cartel Foahom Gouabou
Mohammed Al-Kharaz
Faouzi Hakimi
Tarek Khaled
Kenza Amzil
57
0
0
14 Mar 2025
On Vanishing Gradients, Over-Smoothing, and Over-Squashing in GNNs: Bridging Recurrent and Graph Learning
Alvaro Arroyo
Alessio Gravina
Benjamin Gutteridge
Federico Barbero
Claudio Gallicchio
Xiaowen Dong
Michael M. Bronstein
P. Vandergheynst
76
8
0
15 Feb 2025
LOB-Bench: Benchmarking Generative AI for Finance - an Application to Limit Order Book Data
Peer Nagy
Sascha Frey
Kang Li
Bidipta Sarkar
Svitlana Vyetrenko
Stefan Zohren
Ani Calinescu
Jakob Foerster
120
1
0
13 Feb 2025
State-space models are accurate and efficient neural operators for dynamical systems
Zheyuan Hu
Nazanin Ahmadi Daryakenari
Qianli Shen
Kenji Kawaguchi
George Karniadakis
Mamba
AI4CE
117
15
0
28 Jan 2025
HadamRNN: Binary and Sparse Ternary Orthogonal RNNs
Armand Foucault
Franck Mamalet
François Malgouyres
MQ
139
0
0
28 Jan 2025
Modeling Time-Variant Responses of Optical Compressors with Selective State Space Models
Riccardo Simionato
Stefano Fasciani
90
1
0
17 Jan 2025
MUNBa: Machine Unlearning via Nash Bargaining
Jing Wu
Mehrtash Harandi
MU
96
4
0
23 Nov 2024
Heuristical Comparison of Vision Transformers Against Convolutional Neural Networks for Semantic Segmentation on Remote Sensing Imagery
Ashim Dahal
Saydul Akbar Murad
Nick Rahimi
ViT
99
1
0
14 Nov 2024
A Parameter Update Balancing Algorithm for Multi-task Ranking Models in Recommendation Systems
Jun Yuan
Guohao Cai
Zhenhua Dong
98
0
0
08 Oct 2024
Oscillatory State-Space Models
T. Konstantin Rusch
Daniela Rus
AI4TS
290
6
0
04 Oct 2024
Learning from Demonstration with Implicit Nonlinear Dynamics Models
Peter David Fagan
Subramanian Ramamoorthy
312
0
0
27 Sep 2024
Robust Federated Learning Over the Air: Combating Heavy-Tailed Noise with Median Anchored Clipping
Jiaxing Li
Zihan Chen
Kai Fong Ernest Chong
Bikramjit Das
Tony Q.S. Quek
Howard H. Yang
55
0
0
23 Sep 2024
AutoFlow: An Autoencoder-based Approach for IP Flow Record Compression with Minimal Impact on Traffic Classification
Adrian Pekar
30
1
0
17 Sep 2024
Oja's plasticity rule overcomes several challenges of training neural networks under biological constraints
Navid Shervani-Tabar
Marzieh Alireza Mirhoseini
Robert Rosenbaum
AAML
AI4CE
87
0
0
15 Aug 2024
Differentially Private Block-wise Gradient Shuffle for Deep Learning
Zilong Zhang
FedML
57
0
0
31 Jul 2024
How Well Can a Long Sequence Model Model Long Sequences? Comparing Architechtural Inductive Biases on Long-Context Abilities
Jerry Huang
78
7
0
11 Jul 2024
Compressing Search with Language Models
Thomas Mulc
Jennifer L. Steele
70
1
0
24 Jun 2024
Regularized Gradient Clipping Provably Trains Wide and Deep Neural Networks
Matteo Tucat
Anirbit Mukherjee
Procheta Sen
Mingfei Sun
Omar Rivasplata
MLT
51
1
0
12 Apr 2024
PARMESAN: Parameter-Free Memory Search and Transduction for Dense Prediction Tasks
Philip Matthias Winter
M. Wimmer
David Major
Dimitrios Lenis
Astrid Berg
Theresa Neubauer
Gaia Romana De Paolis
Johannes Novotny
Sophia Ulonska
Katja Bühler
94
0
0
18 Mar 2024
Leveraging Continuously Differentiable Activation Functions for Learning in Quantized Noisy Environments
Vivswan Shah
Nathan Youngblood
49
2
0
04 Feb 2024
NACHOS: Neural Architecture Search for Hardware Constrained Early Exit Neural Networks
Matteo Gambella
Jary Pomponi
Simone Scardapane
Manuel Roveri
40
2
0
24 Jan 2024
Diffusion-EXR: Controllable Review Generation for Explainable Recommendation via Diffusion Models
Ling Li
Shaohua Li
Winda Marantika
Alex C. Kot
Huijing Zhan
78
2
0
24 Dec 2023
Prompt-Driven Building Footprint Extraction in Aerial Images with Offset-Building Model
Kai Li
Yupeng Deng
Yun-long Kong
Diyou Liu
Jingbo Chen
Yu Meng
Junxian Ma
Chenhao Wang
119
1
0
25 Oct 2023
FLTrojan: Privacy Leakage Attacks against Federated Language Models Through Selective Weight Tampering
Md Rafi Ur Rashid
Vishnu Asutosh Dasu
Kang Gu
Najrin Sultana
Shagufta Mehnaz
AAML
FedML
71
11
0
24 Oct 2023
Smooth Exact Gradient Descent Learning in Spiking Neural Networks
Christian Klos
Raoul-Martin Memmesheimer
74
6
0
25 Sep 2023
Nonconvex Stochastic Bregman Proximal Gradient Method with Application to Deep Learning
Kuan-Fu Ding
Jingyang Li
Kim-Chuan Toh
62
8
0
26 Jun 2023
Gated Recurrent Neural Networks with Weighted Time-Delay Feedback
N. Benjamin Erichson
Soon Hoe Lim
Michael W. Mahoney
49
6
0
01 Dec 2022
Designing Universal Causal Deep Learning Models: The Case of Infinite-Dimensional Dynamical Systems from Stochastic Analysis
Luca Galimberti
Anastasis Kratsios
Giulia Livieri
OOD
50
14
0
24 Oct 2022
Gaussian Pre-Activations in Neural Networks: Myth or Reality?
Pierre Wolinski
Julyan Arbel
AI4CE
118
8
0
24 May 2022
Forecasting Sequential Data using Consistent Koopman Autoencoders
Omri Azencot
N. Benjamin Erichson
Vanessa Lin
Michael W. Mahoney
AI4TS
AI4CE
87
147
0
04 Mar 2020
Spatial As Deep: Spatial CNN for Traffic Scene Understanding
Xingang Pan
Jianping Shi
Ping Luo
Xiaogang Wang
Xiaoou Tang
3DPC
118
980
0
17 Dec 2017
Adaptive Detrending to Accelerate Convolutional Gated Recurrent Unit Training for Contextual Video Recognition
Minju Jung
Haanvid Lee
Jun Tani
AI4TS
40
42
0
24 May 2017
Practical Processing of Mobile Sensor Data for Continual Deep Learning Predictions
Kleomenis Katevas
Ilias Leontiadis
M. Pielot
Joan Serrà
HAI
37
12
0
17 May 2017
Frustratingly Short Attention Spans in Neural Language Modeling
Michal Daniluk
Tim Rocktaschel
Johannes Welbl
Sebastian Riedel
66
111
0
15 Feb 2017
Deep Learning with Low Precision by Half-wave Gaussian Quantization
Zhaowei Cai
Xiaodong He
Jian Sun
Nuno Vasconcelos
MQ
100
504
0
03 Feb 2017
Deep Reinforcement Learning for Visual Object Tracking in Videos
Da Zhang
H. Maei
Xin Eric Wang
Yuan-fang Wang
80
116
0
31 Jan 2017
Fully Character-Level Neural Machine Translation without Explicit Segmentation
Jason D. Lee
Kyunghyun Cho
Thomas Hofmann
VLM
86
456
0
10 Oct 2016
Pointer Sentinel Mixture Models
Stephen Merity
Caiming Xiong
James Bradbury
R. Socher
RALM
147
2,783
0
26 Sep 2016
End-to-End Tracking and Semantic Segmentation Using Recurrent Neural Networks
Peter Ondruska
J. Dequaire
Dominic Zeng Wang
Ingmar Posner
52
62
0
18 Apr 2016
Exploring the Limits of Language Modeling
Rafal Jozefowicz
Oriol Vinyals
M. Schuster
Noam M. Shazeer
Yonghui Wu
103
1,143
0
07 Feb 2016
A Kronecker-factored approximate Fisher matrix for convolution layers
Roger C. Grosse
James Martens
ODL
65
260
0
03 Feb 2016
Implicit Distortion and Fertility Models for Attention-based Encoder-Decoder NMT Model
Shi Feng
Shujie Liu
Mu Li
M. Zhou
51
44
0
13 Jan 2016
Language to Logical Form with Neural Attention
Li Dong
Mirella Lapata
AI4CE
NAI
58
729
0
06 Jan 2016