1711.05101
Cited By
Decoupled Weight Decay Regularization
14 November 2017
I. Loshchilov
Frank Hutter
OffRL
Papers citing "Decoupled Weight Decay Regularization"
50 / 369 papers shown
Title
Sequential Classification of Misinformation
Daniel Toma
Wasim Huleihel
35
0
0
07 Sep 2024
Information-Theoretic Progress Measures reveal Grokking is an Emergent Phase Transition
Kenzo Clauw
S. Stramaglia
Daniele Marinazzo
50
3
0
16 Aug 2024
GLDiTalker: Speech-Driven 3D Facial Animation with Graph Latent Diffusion Transformer
Yihong Lin
Zhaoxin Fan
Lingyu Xiong
Liang Peng
Xiandong Li
Wenxiong Kang
Huang Xu
42
3
0
03 Aug 2024
Meltemi: The first open Large Language Model for Greek
Leon Voukoutis
Dimitris Roussis
Georgios Paraskevopoulos
Sokratis Sofianopoulos
Prokopis Prokopidis
Vassilis Papavasileiou
Athanasios Katsamanis
Stelios Piperidis
V. Katsouros
VLM
35
7
0
30 Jul 2024
VisFocus: Prompt-Guided Vision Encoders for OCR-Free Dense Document Understanding
Ofir Abramovich
Niv Nayman
Sharon Fogel
I. Lavi
Ron Litman
Shahar Tsiper
Royee Tichauer
Srikar Appalaraju
Shai Mazor
R. Manmatha
VLM
33
3
0
17 Jul 2024
RoboMorph: Evolving Robot Morphology using Large Language Models
Kevin Qiu
Krzysztof Ciebiera
Marek Cygan
Łukasz Kuciński
LM&Ro
49
0
0
11 Jul 2024
Predicting Visual Attention in Graphic Design Documents
Souradeep Chakraborty
Zijun Wei
Conor Kelton
Seoyoung Ahn
A. Balasubramanian
G. Zelinsky
Dimitris Samaras
38
15
0
02 Jul 2024
Enhancing Travel Decision-Making: A Contrastive Learning Approach for Personalized Review Rankings in Accommodations
Reda Igebaria
Eran Fainman
Sarai Mizrachi
Moran Beladev
Fengjun Wang
27
1
0
30 Jun 2024
Molecular Diffusion Models with Virtual Receptors
Matan Halfon
Eyal Rozenberg
Ehud Rivlin
Daniel Freedman
53
0
0
26 Jun 2024
Active Diffusion Subsampling
Oisin Nolan
Tristan S. W. Stevens
Wessel L. van Nierop
Ruud J. G. van Sloun
DiffM
MedIm
37
2
0
20 Jun 2024
MoE-RBench: Towards Building Reliable Language Models with Sparse Mixture-of-Experts
Guanjie Chen
Xinyu Zhao
Tianlong Chen
Yu Cheng
MoE
76
5
0
17 Jun 2024
P-TA: Using Proximal Policy Optimization to Enhance Tabular Data Augmentation via Large Language Models
Shuo Yang
Chenchen Yuan
Yao Rong
Felix Steinbauer
Gjergji Kasneci
38
1
0
17 Jun 2024
What Languages are Easy to Language-Model? A Perspective from Learning Probabilistic Regular Languages
Nadav Borenstein
Anej Svete
R. Chan
Josef Valvoda
Franz Nowak
Isabelle Augenstein
Eleanor Chodroff
Ryan Cotterell
42
11
0
06 Jun 2024
Infinite 3D Landmarks: Improving Continuous 2D Facial Landmark Detection
Prashanth Chandran
Gaspard Zoss
Paulo F. U. Gotardo
Derek Bradley
CVBM
37
1
0
30 May 2024
BenthicNet: A global compilation of seafloor images for deep learning applications
Scott C. Lowe
B. Misiuk
Isaac Xu
Shakhboz Abdulazizov
A. R. Baroi
...
Jordan A. Thomson
Brittany R. Wilson
Melisa C. Wong
Craig J. Brown
Thomas Trappenberg
49
3
0
08 May 2024
Performance-Aligned LLMs for Generating Fast Code
Daniel Nichols
Pranav Polasam
Harshitha Menon
Aniruddha Marathe
T. Gamblin
A. Bhatele
35
8
0
29 Apr 2024
Event-based Video Frame Interpolation with Edge Guided Motion Refinement
Yuhan Liu
Yongjian Deng
Hao Chen
Bochen Xie
Youfu Li
Zhen Yang
39
1
0
28 Apr 2024
Point-JEPA: A Joint Embedding Predictive Architecture for Self-Supervised Learning on Point Cloud
Ayumu Saito
Prachi Kudeshia
Jiju Poovvancheri
3DPC
45
7
0
25 Apr 2024
GeMQuAD : Generating Multilingual Question Answering Datasets from Large Language Models using Few Shot Learning
Amani Namboori
Shivam Mangale
Andrew Rosenbaum
Saleh Soltan
40
0
0
14 Apr 2024
OPSD: an Offensive Persian Social media Dataset and its baseline evaluations
M. Safayani
Amir Sartipi
Amir Hossein Ahmadi
Parniyan Jalali
Amir Hossein Mansouri
Mohammad Bisheh-Niasar
Zahra Pourbahman
16
0
0
08 Apr 2024
PejorativITy: Disambiguating Pejorative Epithets to Improve Misogyny Detection in Italian Tweets
Arianna Muti
Federico Ruggeri
Cagri Toraman
Lorenzo Musetti
Samuel Algherini
Silvia Ronchi
G. Saretto
Caterina Zapparoli
Alberto Barrón-Cedeño
23
3
0
03 Apr 2024
M-HOF-Opt: Multi-Objective Hierarchical Output Feedback Optimization via Multiplier Induced Loss Landscape Scheduling
Xudong Sun
Nutan Chen
Alexej Gossmann
Yu Xing
Carla Feistner
...
Felix Drost
Daniele Scarcella
Lisa Beer
Carsten Marr
59
1
0
20 Mar 2024
Take Care of Your Prompt Bias! Investigating and Mitigating Prompt Bias in Factual Knowledge Extraction
Ziyang Xu
Keqin Peng
Liang Ding
Dacheng Tao
Xiliang Lu
34
10
0
15 Mar 2024
Platypose: Calibrated Zero-Shot Multi-Hypothesis 3D Human Motion Estimation
Paweł Antoni Pierzchlewicz
Caio da Silva
R. J. Cotton
Fabian H. Sinz
30
0
0
10 Mar 2024
XPSR: Cross-modal Priors for Diffusion-based Image Super-Resolution
Yunpeng Qu
Kun Yuan
Kai Zhao
Qizhi Xie
Jinhua Hao
Ming-hui Sun
Chao Zhou
27
17
0
08 Mar 2024
Scaling Rectified Flow Transformers for High-Resolution Image Synthesis
Patrick Esser
Sumith Kulal
A. Blattmann
Rahim Entezari
Jonas Muller
...
Zion English
Kyle Lacey
Alex Goodwin
Yannik Marek
Robin Rombach
DiffM
109
1,071
0
05 Mar 2024
NeuraLUT: Hiding Neural Network Density in Boolean Synthesizable Functions
Marta Andronic
G. Constantinides
26
5
0
29 Feb 2024
Can LLMs Learn New Concepts Incrementally without Forgetting?
Junhao Zheng
Shengjie Qiu
Qianli Ma
CLL
35
0
0
13 Feb 2024
Pushing Boundaries: Mixup's Influence on Neural Collapse
Quinn Fisher
Haoming Meng
V. Papyan
AAML
UQCV
44
5
0
09 Feb 2024
Mesoscale Traffic Forecasting for Real-Time Bottleneck and Shockwave Prediction
Raphael Chekroun
Han Wang
Jonathan W. Lee
Marin Toromanoff
Sascha Hornauer
Fabien Moutarde
M. D. Monache
28
0
0
08 Feb 2024
Sample, estimate, aggregate: A recipe for causal discovery foundation models
Menghua Wu
Yujia Bao
Regina Barzilay
Tommi Jaakkola
CML
49
7
0
02 Feb 2024
Triple Disentangled Representation Learning for Multimodal Affective Analysis
Ying Zhou
Xuefeng Liang
Han Chen
Yin Zhao
Xin Chen
Lida Yu
52
3
0
29 Jan 2024
P2Seg: Pointly-supervised Segmentation via Mutual Distillation
Zipeng Wang
Xuehui Yu
Xumeng Han
Wenwen Yu
Zhixun Huang
Jianbin Jiao
Zhenjun Han
31
0
0
18 Jan 2024
Stream Query Denoising for Vectorized HD Map Construction
Shuo Wang
Fan Jia
Yingfei Liu
Yucheng Zhao
Zehui Chen
Tiancai Wang
Chi Zhang
Xiangyu Zhang
Feng Zhao
36
19
0
17 Jan 2024
DiffDA: a Diffusion Model for Weather-scale Data Assimilation
Langwen Huang
Lukas Gianinazzi
Yuejiang Yu
P. Dueben
Torsten Hoefler
28
28
0
11 Jan 2024
Learn or Recall? Revisiting Incremental Learning with Pre-trained Language Models
Junhao Zheng
Shengjie Qiu
Qianli Ma
25
9
0
13 Dec 2023
GQKVA: Efficient Pre-training of Transformers by Grouping Queries, Keys, and Values
Farnoosh Javadi
Walid Ahmed
Habib Hajimolahoseini
Foozhan Ataiefard
Mohammad Hassanpour
Saina Asani
Austin Wen
Omar Mohamed Awad
Kangling Liu
Yang Liu
VLM
34
7
0
06 Nov 2023
Video Frame Interpolation with Many-to-many Splatting and Spatial Selective Refinement
Ping Hu
Simon Niklaus
Lu Zhang
Stan Sclaroff
Kate Saenko
25
6
0
29 Oct 2023
Expression Syntax Information Bottleneck for Math Word Problems
Jing Xiong
Chengming Li
Min Yang
Xiping Hu
Bin Hu
25
5
0
24 Oct 2023
Text2Topic: Multi-Label Text Classification System for Efficient Topic Detection in User Generated Content with Zero-Shot Capabilities
Fengjun Wang
Moran Beladev
Ofri Kleinfeld
Elina Frayerman
Tal Shachar
Eran Fainman
Karen Lastmann Assaraf
Sarai Mizrachi
Benjamin Wang
VLM
10
8
0
23 Oct 2023
RMap: Millimeter-Wave Radar Mapping Through Volumetric Upsampling
Ajay Narasimha Mopidevi
Kyle Harlow
Christoffer Heckman
27
1
0
19 Oct 2023
Identifying and Adapting Transformer-Components Responsible for Gender Bias in an English Language Model
Abhijith Chintam
Rahel Beloch
Willem H. Zuidema
Michael Hanna
Oskar van der Wal
28
16
0
19 Oct 2023
BanglaNLP at BLP-2023 Task 2: Benchmarking different Transformer Models for Sentiment Analysis of Bangla Social Media Posts
Saumajit Saha
Albert Nanda
29
0
0
13 Oct 2023
Argumentative Stance Prediction: An Exploratory Study on Multimodality and Few-Shot Learning
Arushi Sharma
Abhibha Gupta
Maneesh Bilalpur
24
4
0
11 Oct 2023
Weakly-supervised Automated Audio Captioning via text only training
Theodoros Kouzelis
V. Katsouros
CLIP
32
6
0
21 Sep 2023
Measuring Catastrophic Forgetting in Cross-Lingual Transfer Paradigms: Exploring Tuning Strategies
Boshko Koloski
Blaž Škrlj
Marko Robnik-Šikonja
Senja Pollak
CLL
24
2
0
12 Sep 2023
FLM-101B: An Open LLM and How to Train It with $100K Budget
Xiang Li
Yiqun Yao
Xin Jiang
Xuezhi Fang
Xuying Meng
...
LI DU
Bowen Qin
Zheng-Wei Zhang
Aixin Sun
Yequan Wang
60
21
0
07 Sep 2023
Taxonomic Loss for Morphological Glossing of Low-Resource Languages
Michael Ginn
Alexis Palmer
21
0
0
29 Aug 2023
Pro-Cap: Leveraging a Frozen Vision-Language Model for Hateful Meme Detection
Rui Cao
Ming Shan Hee
Adriel Kuek
Wen-Haw Chong
Roy Ka-Wei Lee
Jing Jiang
VLM
MLLM
27
36
0
16 Aug 2023
Complementary Frequency-Varying Awareness Network for Open-Set Fine-Grained Image Recognition
Jiaying Sun
Hong Wang
Qiulei Dong
25
0
0
14 Jul 2023