ResearchTrend.AI
Decoupled Weight Decay Regularization

14 November 2017
I. Loshchilov
Frank Hutter
    OffRL

Papers citing "Decoupled Weight Decay Regularization"

50 / 369 papers shown
Title
ConTNet: Why not use convolution and transformer at the same time?
Haotian Yan
Zhe Li
Weijian Li
Changhu Wang
Ming Wu
Chuang Zhang
ViT
20
76
0
27 Apr 2021
Multiscale Vision Transformers
Haoqi Fan
Bo Xiong
K. Mangalam
Yanghao Li
Zhicheng Yan
Jitendra Malik
Christoph Feichtenhofer
ViT
63
1,224
0
22 Apr 2021
How to Train BERT with an Academic Budget
Peter Izsak
Moshe Berchansky
Omer Levy
17
113
0
15 Apr 2021
Emotion Dynamics Modeling via BERT
Haiqing Yang
Jianping Shen
24
11
0
15 Apr 2021
Fighting the COVID-19 Infodemic with a Holistic BERT Ensemble
Georgios Tziafas
Konstantinos Kogkalidis
Tommaso Caselli
24
9
0
12 Apr 2021
A Deep Learning Based Cost Model for Automatic Code Optimization
Riyadh Baghdadi
Massinissa Merouani
Mohamed-Hicham Leghettas
K. Abdous
T. Arbaoui
K. Benatchba
Saman P. Amarasinghe
19
68
0
11 Apr 2021
SiT: Self-supervised vIsion Transformer
Sara Atito Ali Ahmed
Muhammad Awais
J. Kittler
ViT
39
139
0
08 Apr 2021
Going deeper with Image Transformers
Hugo Touvron
Matthieu Cord
Alexandre Sablayrolles
Gabriel Synnaeve
Hervé Jégou
ViT
27
986
0
31 Mar 2021
R-GSN: The Relation-based Graph Similar Network for Heterogeneous Graph
Xinliang Wu
Mengying Jiang
Guizhong Liu
GNN
24
7
0
14 Mar 2021
Bidirectional Machine Reading Comprehension for Aspect Sentiment Triplet Extraction
Shaowei Chen
Yu Wang
Jie Liu
Yuelin Wang
24
176
0
13 Mar 2021
ZJUKLAB at SemEval-2021 Task 4: Negative Augmentation with Language Model for Reading Comprehension of Abstract Meaning
Xin Xie
Xiangnan Chen
Xiang Chen
Yong Wang
Ningyu Zhang
Shumin Deng
Huajun Chen
42
2
0
25 Feb 2021
Multilingual Answer Sentence Reranking via Automatically Translated Data
Thuy Vu
Alessandro Moschitti
22
5
0
20 Feb 2021
Meta-Learning for Effective Multi-task and Multilingual Modelling
Ishan Tarunesh
Sushil Khyalia
Vishwajeet Kumar
Ganesh Ramakrishnan
P. Jyothi
31
16
0
25 Jan 2021
g2tmn at Constraint@AAAI2021: Exploiting CT-BERT and Ensembling Learning for COVID-19 Fake News Detection
Anna Glazkova
Maksim Glazkov
T. Trifonov
24
58
0
22 Dec 2020
DialogXL: All-in-One XLNet for Multi-Party Conversation Emotion Recognition
Weizhou Shen
Junqing Chen
Xiaojun Quan
Zhixiang Xie
22
200
0
16 Dec 2020
Topological Planning with Transformers for Vision-and-Language Navigation
Kevin Chen
Junshen K. Chen
Jo Chuang
Marynel Vázquez
Silvio Savarese
LM&Ro
27
99
0
09 Dec 2020
End-to-End Object Detection with Adaptive Clustering Transformer
Minghang Zheng
Peng Gao
Renrui Zhang
Kunchang Li
Xiaogang Wang
Hongsheng Li
Hao Dong
ViT
24
193
0
18 Nov 2020
EffiScene: Efficient Per-Pixel Rigidity Inference for Unsupervised Joint Learning of Optical Flow, Depth, Camera Pose and Motion Segmentation
Yang Jiao
T. Tran
Guangming Shi
35
33
0
16 Nov 2020
Reverse engineering learned optimizers reveals known and novel mechanisms
Niru Maheswaranathan
David Sussillo
Luke Metz
Ruoxi Sun
Jascha Narain Sohl-Dickstein
22
21
0
04 Nov 2020
Multi-View Adaptive Fusion Network for 3D Object Detection
Guojun Wang
Bin Tian
Yachen Zhang
Long Chen
Dongpu Cao
Jian Wu
3DPC
26
25
0
02 Nov 2020
EDCNN: Edge enhancement-based Densely Connected Network with Compound Loss for Low-Dose CT Denoising
Tengfei Liang
Yi Jin
Yidong Li
Tao Wang
Songhe Feng
Congyan Lang
16
94
0
30 Oct 2020
Scaling Laws for Autoregressive Generative Modeling
T. Henighan
Jared Kaplan
Mor Katz
Mark Chen
Christopher Hesse
...
Nick Ryder
Daniel M. Ziegler
John Schulman
Dario Amodei
Sam McCandlish
32
405
0
28 Oct 2020
Discriminative Nearest Neighbor Few-Shot Intent Detection by Transferring Natural Language Inference
Jianguo Zhang
Kazuma Hashimoto
Wenhao Liu
Chien-Sheng Wu
Yao Wan
Philip S. Yu
R. Socher
Caiming Xiong
14
92
0
25 Oct 2020
DialogueTRM: Exploring the Intra- and Inter-Modal Emotional Behaviors in the Conversation
Yuzhao Mao
Qi Sun
Guang Liu
Xiaojie Wang
Weiguo Gao
Xuan Li
Jianping Shen
27
24
0
15 Oct 2020
Extracting a Knowledge Base of Mechanisms from COVID-19 Papers
Tom Hope
Aida Amini
David Wadden
Madeleine van Zuylen
Sravanthi Parasa
Eric Horvitz
Daniel S. Weld
Roy Schwartz
Hannaneh Hajishirzi
24
29
0
08 Oct 2020
Like hiking? You probably enjoy nature: Persona-grounded Dialog with Commonsense Expansions
Bodhisattwa Prasad Majumder
Harsh Jhamtani
Taylor Berg-Kirkpatrick
Julian McAuley
27
85
0
07 Oct 2020
Towards a Multi-modal, Multi-task Learning based Pre-training Framework for Document Representation Learning
Subhojeet Pramanik
Shashank Mujumdar
Hima Patel
19
31
0
30 Sep 2020
DSC IIT-ISM at SemEval-2020 Task 6: Boosting BERT with Dependencies for Definition Extraction
Aadarsh Singh
Priyanshu Kumar
Aman Sinha
14
4
0
17 Sep 2020
Length-Controllable Image Captioning
Chaorui Deng
Ning Ding
Mingkui Tan
Qi Wu
VLM
30
56
0
19 Jul 2020
UniTrans: Unifying Model Transfer and Data Transfer for Cross-Lingual Named Entity Recognition with Unlabeled Data
Qianhui Wu
Zijia Lin
Börje F. Karlsson
Biqing Huang
Jian-Guang Lou
18
46
0
15 Jul 2020
Ensemble Transfer Learning for Emergency Landing Field Identification on Moderate Resource Heterogeneous Kubernetes Cluster
Andreas Klos
Marius Rosenbaum
W. Schiffmann
10
2
0
26 Jun 2020
FrostNet: Towards Quantization-Aware Network Architecture Search
Taehoon Kim
Y. Yoo
Jihoon Yang
MQ
22
2
0
17 Jun 2020
DeBERTa: Decoding-enhanced BERT with Disentangled Attention
Pengcheng He
Xiaodong Liu
Jianfeng Gao
Weizhu Chen
AAML
62
2,622
0
05 Jun 2020
Pseudo-Representation Labeling Semi-Supervised Learning
Song-Bo Yang
Tian-li Yu
24
3
0
31 May 2020
Real-Time Apple Detection System Using Embedded Systems With Hardware Accelerators: An Edge AI Application
Vittorio Mazzia
Francesco Salvetti
Aleem Khaliq
Marcello Chiaberge
22
152
0
28 Apr 2020
Generating Fact Checking Explanations
Pepa Atanasova
J. Simonsen
Christina Lioma
Isabelle Augenstein
19
189
0
13 Apr 2020
FastBERT: a Self-distilling BERT with Adaptive Inference Time
Weijie Liu
Peng Zhou
Zhe Zhao
Zhiruo Wang
Haotang Deng
Qi Ju
31
354
0
05 Apr 2020
PointNetKL: Deep Inference for GICP Covariance Estimation in Bathymetric SLAM
Ignacio Torroba
Christopher Iliffe Sprague
Nils Bore
John Folkesson
3DPC
22
16
0
24 Mar 2020
Training Question Answering Models From Synthetic Data
Raul Puri
Ryan Spring
M. Patwary
M. Shoeybi
Bryan Catanzaro
ELM
24
159
0
22 Feb 2020
Guider l'attention dans les modeles de sequence a sequence pour la prediction des actes de dialogue
Pierre Colombo
E. Chapuis
Matteo Manica
Emmanuel Vignon
Giovanna Varni
Chloé Clavel
3DV
31
2
0
21 Feb 2020
Keyphrase Extraction with Span-based Feature Representations
Funan Mu
Zhenting Yu
Lifeng Wang
Yequan Wang
Qingyu Yin
Yibo Sun
Liqun Liu
Teng Ma
Jing Tang
Xing Zhou
29
17
0
13 Feb 2020
SegVoxelNet: Exploring Semantic Context and Depth-aware Features for 3D Vehicle Detection from Point Cloud
Hongwei Yi
Shaoshuai Shi
Mingyu Ding
Jiankai Sun
Kui Xu
Hui Zhou
Zhe Wang
Sheng Li
Guoping Wang
3DPC
83
56
0
13 Feb 2020
LaProp: Separating Momentum and Adaptivity in Adam
Liu Ziyin
Zhikang T. Wang
Masahito Ueda
ODL
8
18
0
12 Feb 2020
fastai: A Layered API for Deep Learning
Jeremy Howard
Sylvain Gugger
AI4CE
6
857
0
11 Feb 2020
Machine-Learning-Based Diagnostics of EEG Pathology
Lukas A. W. Gemein
R. Schirrmeister
P. Chrabaszcz
Daniel Wilson
Joschka Boedecker
A. Schulze-Bonhage
Frank Hutter
T. Ball
22
153
0
11 Feb 2020
Faster On-Device Training Using New Federated Momentum Algorithm
Zhouyuan Huo
Qian Yang
Bin Gu
Heng-Chiao Huang
FedML
22
47
0
06 Feb 2020
Object as Hotspots: An Anchor-Free 3D Object Detection Approach via Firing of Hotspots
Qi Chen
Lin Sun
Zhixin Wang
Kui Jia
Alan Yuille
3DPC
170
169
0
30 Dec 2019
E-BERT: Efficient-Yet-Effective Entity Embeddings for BERT
Nina Poerner
Ulli Waltinger
Hinrich Schütze
11
156
0
09 Nov 2019
An Adaptive and Momental Bound Method for Stochastic Learning
Jianbang Ding
Xuancheng Ren
Ruixuan Luo
Xu Sun
ODL
11
46
0
27 Oct 2019
Finding New Diagnostic Information for Detecting Glaucoma using Neural Networks
Erfan Noury
Suria S. Mannil
R. Chang
A. Ran
C. Cheung
...
M. Riyazuddin
Dolly Chang
Sriharsha Nagaraj
Clement C. Tham
R. Zadeh
14
3
0
14 Oct 2019