1911.07013
Understanding and Improving Layer Normalization
16 November 2019
Jingjing Xu
Xu Sun
Zhiyuan Zhang
Guangxiang Zhao
Junyang Lin
FAtt
Papers citing
"Understanding and Improving Layer Normalization"
50 / 129 papers shown
Title
NetSight: Graph Attention Based Traffic Forecasting in Computer Networks
Jinming Xing
Guoheng Sun
Hui Sun
Linchao Pan
Shakir Mahmood
Xuanhao Luo
Muhammad Shahzad
31
0
0
11 May 2025
Mitigating Image Captioning Hallucinations in Vision-Language Models
Fei Zhao
Chenyi Zhang
Runlin Zhang
Tianyang Wang
Xi Li
VLM
44
0
0
06 May 2025
BARIS: Boundary-Aware Refinement with Environmental Degradation Priors for Robust Underwater Instance Segmentation
Pin-Chi Pan
Soo-Chang Pei
64
0
0
28 Apr 2025
Learning to Drive from a World Model
Mitchell Goff
Greg Hogan
George Hotz
Armand du Parc Locmaria
Kacper Raczy
Harald Schäfer
Adeeb Shihadeh
Weixing Zhang
Yassine Yousfi
39
0
0
27 Apr 2025
WORLDMEM: Long-term Consistent World Simulation with Memory
Zeqi Xiao
Yushi Lan
Yifan Zhou
Wenqi Ouyang
Shuai Yang
Yanhong Zeng
Xingang Pan
78
0
0
16 Apr 2025
Decentralized Federated Domain Generalization with Style Sharing: A Formal Modeling and Convergence Analysis
Shahryar Zehtabi
Dong-Jun Han
Seyyedali Hosseinalipour
Christopher G. Brinton
FedML
AI4CE
50
0
0
08 Apr 2025
Simple yet Effective Node Property Prediction on Edge Streams under Distribution Shifts
Jongha Lee
Taehyung Kwon
Heechan Moon
Kijung Shin
AI4TS
46
0
0
01 Apr 2025
Simple Feedfoward Neural Networks are Almost All You Need for Time Series Forecasting
Fan-Keng Sun
Yu-Cheng Wu
Duane S. Boning
AI4TS
51
0
0
30 Mar 2025
IgCraft: A versatile sequence generation framework for antibody discovery and engineering
Matthew Greenig
Haowen Zhao
Vladimir Radenkovic
Aubin Ramon
Pietro Sormanni
49
0
0
25 Mar 2025
Boosting Resolution Generalization of Diffusion Transformers with Randomized Positional Encodings
Cong Liu
Liang Hou
Mingwu Zheng
Xin Tao
Pengfei Wan
Di Zhang
Kun Gai
49
0
0
24 Mar 2025
HunyuanPortrait: Implicit Condition Control for Enhanced Portrait Animation
Zunnan Xu
Zhentao Yu
Zixiang Zhou
Jun Zhou
Xiaoyu Jin
...
Chengfei Cai
Shiyu Tang
Qin Lin
Xiu Li
Qinglin Lu
DiffM
VGen
92
8
0
24 Mar 2025
Diffusion Dynamics Models with Generative State Estimation for Cloth Manipulation
Tongxuan Tian
Haoyang Li
Bo Ai
Xiaodi Yuan
Zhiao Huang
H. Su
DiffM
AI4CE
73
3
0
15 Mar 2025
Transformers without Normalization
Jiachen Zhu
Xinlei Chen
Kaiming He
Yann LeCun
Zhuang Liu
ViT
OffRL
65
7
0
13 Mar 2025
Self-Adjust Softmax
Chuanyang Zheng
Yihang Gao
Guoxuan Chen
Han Shi
Jing Xiong
Xiaozhe Ren
Chao Huang
Xin Jiang
Zhiyu Li
Yu Li
50
0
0
25 Feb 2025
A Transformer-in-Transformer Network Utilizing Knowledge Distillation for Image Recognition
Dewan Tauhid Rahman
Yeahia Sarker
Antar Mazumder
Md. Shamim Anower
ViT
53
0
0
24 Feb 2025
Baichuan-M1: Pushing the Medical Capability of Large Language Models
Binghui Wang
Haizhou Zhao
Huozhi Zhou
Liang Song
Mingyu Xu
...
Yan Zhang
Yifei Duan
Yuyan Zhou
Zhi-Ming Ma
Zhikai Wu
LM&MA
ELM
AI4MH
42
4
0
18 Feb 2025
Hypergraph Diffusion for High-Order Recommender Systems
Darnbi Sakong
T. T. Huynh
Jun Jo
DiffM
82
0
0
28 Jan 2025
Unveiling Discrete Clues: Superior Healthcare Predictions for Rare Diseases
Chuang Zhao
Hui Tang
Jiheng Zhang
Xiaomeng Li
37
0
0
23 Jan 2025
Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference
Benjamin Warner
Antoine Chaffin
Benjamin Clavié
Orion Weller
Oskar Hallström
...
Tom Aarsen
Nathan Cooper
Griffin Adams
Jeremy Howard
Iacopo Poli
93
79
0
18 Dec 2024
Transducer Tuning: Efficient Model Adaptation for Software Tasks Using Code Property Graphs
Imam Nur Bani Yusuf
Lingxiao Jiang
90
0
0
18 Dec 2024
Navigation World Models
Amir Bar
G. Zhou
Danny Tran
Trevor Darrell
Yann LeCun
VGen
EgoV
82
14
0
04 Dec 2024
ENAT: Rethinking Spatial-temporal Interactions in Token-based Image Synthesis
Zanlin Ni
Yulin Wang
Renping Zhou
Yizeng Han
Jiayi Guo
Zhiyuan Liu
Yuan Yao
Gao Huang
63
4
0
11 Nov 2024
MEANT: Multimodal Encoder for Antecedent Information
Benjamin Iyoya Irving
Annika Marie Schoene
AIFin
34
0
0
10 Nov 2024
A Closer Look at Neural Codec Resynthesis: Bridging the Gap between Codec and Waveform Generation
Alexander H. Liu
Qirui Wang
Yuan Gong
James Glass
33
0
0
29 Oct 2024
SINGAPO: Single Image Controlled Generation of Articulated Parts in Objects
Jiayi Liu
Denys Iliash
Angel X. Chang
Manolis Savva
Ali Mahdavi-Amiri
61
8
0
21 Oct 2024
Action abstractions for amortized sampling
Oussama Boussif
Léna Néhale Ezzine
J. Viviano
Michał Koziarski
Moksh Jain
Nikolay Malkin
Emmanuel Bengio
Rim Assouel
Yoshua Bengio
28
0
0
19 Oct 2024
Streaming Deep Reinforcement Learning Finally Works
Mohamed Elsayed
Gautham Vasan
A. R. Mahmood
OffRL
50
4
0
18 Oct 2024
DARE the Extreme: Revisiting Delta-Parameter Pruning For Fine-Tuned Models
Wenlong Deng
Yize Zhao
V. Vakilian
Minghui Chen
Xiaoxiao Li
Christos Thrampoulidis
45
3
0
12 Oct 2024
Cross-conditioned Diffusion Model for Medical Image to Image Translation
Zhaohu Xing
Sicheng Yang
Sixiang Chen
Tian-Chun Ye
Yijun Yang
Jing Qin
Lei Zhu
DiffM
MedIm
42
7
0
13 Sep 2024
CALM: Cognitive Assessment using Light-insensitive Model
Akhil Meethal
Anita Paas
Nerea Urrestilla Anguiozar
David St-Onge
22
0
0
05 Sep 2024
5%>100%: Breaking Performance Shackles of Full Fine-Tuning on Visual Recognition Tasks
Dongshuo Yin
Leiyi Hu
Bin Li
Youqun Zhang
Xue Yang
46
7
0
15 Aug 2024
Expanding the Medical Decathlon dataset: segmentation of colon and colorectal cancer from computed tomography images
Miao Cao
Y. A. Drach
S. R. Mustakimova
Huan Wang
Xin Yuan
S. K. Efetov
M. V. Feldsherov
35
0
0
31 Jul 2024
Transformer Normalisation Layers and the Independence of Semantic Subspaces
S. Menary
Samuel Kaski
Andre Freitas
44
2
0
25 Jun 2024
GraphKAN: Enhancing Feature Extraction with Graph Kolmogorov Arnold Networks
Fan Zhang
Xin Zhang
38
26
0
19 Jun 2024
On the Nonlinearity of Layer Normalization
Yunhao Ni
Yuxin Guo
Junlong Jia
Lei Huang
47
4
0
03 Jun 2024
Lateralization MLP: A Simple Brain-inspired Architecture for Diffusion
Zizhao Hu
Mohammad Rostami
34
0
0
25 May 2024
NFCL: Simply interpretable neural networks for a short-term multivariate forecasting
Wonkeun Jo
Dongil Kim
AI4TS
38
0
0
22 May 2024
Privacy Backdoors: Stealing Data with Corrupted Pretrained Models
Shanglun Feng
Florian Tramèr
SILM
40
14
0
30 Mar 2024
LayerNorm: A key component in parameter-efficient fine-tuning
Taha ValizadehAslani
Hualou Liang
51
1
0
29 Mar 2024
TRIP: Temporal Residual Learning with Image Noise Prior for Image-to-Video Diffusion Models
Zhongwei Zhang
Fuchen Long
Yingwei Pan
Zhaofan Qiu
Ting Yao
Yang Cao
Tao Mei
VGen
43
23
0
25 Mar 2024
Opportunities and challenges in the application of large artificial intelligence models in radiology
Liangrui Pan
Zhenyu Zhao
Ying Lu
Kewei Tang
Liyong Fu
Qingchun Liang
Shaoliang Peng
LM&MA
MedIm
AI4CE
45
5
0
24 Mar 2024
Dissecting Deep RL with High Update Ratios: Combatting Value Divergence
Marcel Hussing
C. Voelcker
Igor Gilitschenski
Amir-massoud Farahmand
Eric Eaton
39
3
0
09 Mar 2024
RFWave: Multi-band Rectified Flow for Audio Waveform Reconstruction
Peng Liu
Dongyang Dai
Zhiyong Wu
38
2
0
08 Mar 2024
Self-Attention Empowered Graph Convolutional Network for Structure Learning and Node Embedding
Mengying Jiang
Guizhong Liu
Yuanchao Su
Xinliang Wu
GNN
SSL
32
2
0
06 Mar 2024
Neural Redshift: Random Networks are not Random Functions
Damien Teney
A. Nicolicioiu
Valentin Hartmann
Ehsan Abbasnejad
103
18
0
04 Mar 2024
ProtoP-OD: Explainable Object Detection with Prototypical Parts
Pavlos Rath-Manakidis
Frederik Strothmann
Tobias Glasmachers
Laurenz Wiskott
ViT
35
1
0
29 Feb 2024
Disentangling the Causes of Plasticity Loss in Neural Networks
Clare Lyle
Zeyu Zheng
Khimya Khetarpal
H. V. Hasselt
Razvan Pascanu
James Martens
Will Dabney
AI4CE
55
32
0
29 Feb 2024
Supervised Contrastive Learning based Dual-Mixer Model for Remaining Useful Life Prediction
En Fu
Yanyan Hu
Kaixiang Peng
Yuxin Chu
21
4
0
29 Jan 2024
Correlation-Embedded Transformer Tracking: A Single-Branch Framework
Fei Xie
Wankou Yang
Chunyu Wang
Lei Chu
Yue Cao
Chao Ma
Wenjun Zeng
35
5
0
23 Jan 2024
Tuning LayerNorm in Attention: Towards Efficient Multi-Modal LLM Finetuning
Bingchen Zhao
Haoqin Tu
Chen Wei
Jieru Mei
Cihang Xie
25
32
0
18 Dec 2023