ResearchTrend.AI
Layer Normalization

21 July 2016
Jimmy Lei Ba
J. Kiros
Geoffrey E. Hinton

Papers citing "Layer Normalization"

50 / 5,502 papers shown
Metric as Transform: Exploring beyond Affine Transform for Interpretable Neural Network
Suman Sapkota
28
0
0
21 Oct 2024
Focus on BEV: Self-calibrated Cycle View Transformation for Monocular Birds-Eye-View Segmentation
Jiawei Zhao
Qixing Jiang
Xuede Li
Junfeng Luo
44
0
0
21 Oct 2024
Mesa-Extrapolation: A Weave Position Encoding Method for Enhanced Extrapolation in LLMs
Xin Ma
Yang Liu
Qingbin Liu
Xiaoxu Ma
31
1
0
21 Oct 2024
Object-Centric Temporal Consistency via Conditional Autoregressive Inductive Biases
Cristian Meo
Akihiro Nakano
Mircea Lica
Aniket Didolkar
Masahiro Suzuki
Anirudh Goyal
Mengmi Zhang
Justin Dauwels
Y. Matsuo
Yoshua Bengio
OCL
41
2
0
21 Oct 2024
All You Need is an Improving Column: Enhancing Column Generation for Parallel Machine Scheduling via Transformers
Amira Hijazi
Osman Ozaltin
Reha Uzsoy
41
0
0
21 Oct 2024
Multimodal Learning for Embryo Viability Prediction in Clinical IVF
Junsik Kim
Zhiyi Shi
Davin Jeong
Johannes Knittel
H. Yang
...
Wanhua Li
Yicong Li
D. Ben-Yosef
D. Needleman
Hanspeter Pfister
36
0
0
21 Oct 2024
Streaming Deep Reinforcement Learning Finally Works
Mohamed Elsayed
Gautham Vasan
A. R. Mahmood
OffRL
52
4
0
18 Oct 2024
CaTs and DAGs: Integrating Directed Acyclic Graphs with Transformers and Fully-Connected Neural Networks for Causally Constrained Predictions
M. Vowels
Mathieu Rochat
S. Akbari
CML
GNN
OOD
27
0
0
18 Oct 2024
Improving Vector-Quantized Image Modeling with Latent Consistency-Matching Diffusion
Bac Nguyen
Chieh-Hsin Lai
Yuhta Takida
Naoki Murata
Toshimitsu Uesaka
Stefano Ermon
Yuki Mitsufuji
66
0
0
18 Oct 2024
Improving Vision Transformers by Overlapping Heads in Multi-Head Self-Attention
Tianxiao Zhang
Bo Luo
G. Wang
ViT
21
1
0
18 Oct 2024
On the Regularization of Learnable Embeddings for Time Series Forecasting
L. Butera
G. Felice
Andrea Cini
Cesare Alippi
AI4TS
34
0
0
18 Oct 2024
State Estimation Transformers for Agile Legged Locomotion
Chen Yu
Yichu Yang
Tianlin Liu
Yangwei You
Mingliang Zhou
Diyun Xiang
31
1
0
17 Oct 2024
SemSim: Revisiting Weak-to-Strong Consistency from a Semantic Similarity Perspective for Semi-supervised Medical Image Segmentation
Shiao Xie
Haoran Wang
Ziwei Niu
Hao Sun
Shuyi Ouyang
Yen-Wei Chen
Lanfen Lin
21
0
0
17 Oct 2024
Hiformer: Hybrid Frequency Feature Enhancement Inverted Transformer for Long-Term Wind Power Prediction
Chongyang Wan
Shunbo Lei
Yuan Luo
21
0
0
17 Oct 2024
Reward-free World Models for Online Imitation Learning
Shangzhe Li
Zhiao Huang
H. Su
OffRL
67
1
0
17 Oct 2024
Artificial Kuramoto Oscillatory Neurons
Takeru Miyato
Sindy Löwe
Andreas Geiger
Max Welling
AI4CE
77
6
0
17 Oct 2024
Super-resolving Real-world Image Illumination Enhancement: A New Dataset and A Conditional Diffusion Model
Yang Liu
Yaofang Liu
J. Pan
Yuxiang Hui
Fan Jia
Raymond H. Chan
T. Zeng
49
0
0
16 Oct 2024
Radon Implicit Field Transform (RIFT): Learning Scenes from Radar Signals
Daqian Bao
Alex Saad-Falcon
Justin Romberg
29
0
0
16 Oct 2024
RAFA-Net: Region Attention Network For Food Items And Agricultural Stress Recognition
Asish Bera
O. Krejcar
D. Bhattacharjee
52
6
0
16 Oct 2024
Mind the Gap Between Prototypes and Images in Cross-domain Finetuning
Hongduan Tian
Feng Liu
Zhanke Zhou
Tongliang Liu
Chengqi Zhang
Bo Han
VLM
40
1
0
16 Oct 2024
Loss Landscape Characterization of Neural Networks without Over-Parametrization
Rustem Islamov
Niccolò Ajroldi
Antonio Orvieto
Aurelien Lucchi
43
4
0
16 Oct 2024
A Prompt-Based Knowledge Graph Foundation Model for Universal In-Context Reasoning
Yuanning Cui
Zequn Sun
Wei Hu
ReLM
LRM
28
2
0
16 Oct 2024
Is Less More? Exploring Token Condensation as Training-free Test-time Adaptation
Zixin Wang
Dong Gong
Sen Wang
Zi Huang
Yadan Luo
VLM
36
0
0
16 Oct 2024
MMFuser: Multimodal Multi-Layer Feature Fuser for Fine-Grained Vision-Language Understanding
Yue Cao
Yangzhou Liu
Zhe Chen
Guangchen Shi
Wenhai Wang
Danhuai Zhao
Tong Lu
57
7
0
15 Oct 2024
Regional Ocean Forecasting with Hierarchical Graph Neural Networks
Daniel Holmberg
Emanuela Clementi
Teemu Roos
AI4Cl
37
1
0
15 Oct 2024
Enhancing Unimodal Latent Representations in Multimodal VAEs through Iterative Amortized Inference
Yuta Oshima
Masahiro Suzuki
Y. Matsuo
38
0
0
15 Oct 2024
Survey and Evaluation of Converging Architecture in LLMs based on Footsteps of Operations
Seongho Kim
Jihyun Moon
Juntaek Oh
Insu Choi
Joon-Sung Yang
26
0
0
15 Oct 2024
Communication-Control Codesign for Large-Scale Wireless Networked Control Systems
Gaoyang Pang
Wanchun Liu
Dusit Niyato
Branka Vucetic
Yonghui Li
AI4CE
26
0
0
15 Oct 2024
Optimizing Encoder-Only Transformers for Session-Based Recommendation Systems
Anis Redjdal
Luis Pinto
Michel Desmarais
23
0
0
15 Oct 2024
Liger Kernel: Efficient Triton Kernels for LLM Training
Pin-Lun Hsu
Yun Dai
Vignesh Kothapalli
Qingquan Song
Shao Tang
Siyu Zhu
Steven Shimizu
Shivam Sahni
Haowen Ning
Yanning Chen
53
30
0
14 Oct 2024
ControlMM: Controllable Masked Motion Generation
Ekkasit Pinyoanuntapong
Muhammad Usama Saleem
Korrawe Karunratanakul
Pu Wang
Hongfei Xue
Chong Chen
Chuan Guo
Junli Cao
J. Ren
Sergey Tulyakov
VGen
42
4
0
14 Oct 2024
SLaNC: Static LayerNorm Calibration
Mahsa Salmani
Nikita Trukhanov
I. Soloveychik
MQ
33
0
0
14 Oct 2024
Hybrid Transformer for Early Alzheimer's Detection: Integration of Handwriting-Based 2D Images and 1D Signal Features
Changqing Gong
Huafeng Qin
M. El-Yacoubi
32
0
0
14 Oct 2024
Lambda-Skip Connections: the architectural component that prevents Rank Collapse
Federico Arangath Joseph
Jerome Sieber
Melanie Zeilinger
Carmen Amo Alonso
35
0
0
14 Oct 2024
Understanding Robustness of Parameter-Efficient Tuning for Image Classification
Jiacheng Ruan
Xian Gao
Suncheng Xiang
Mingye Xie
Ting Liu
Yuzhuo Fu
AAML
VLM
26
0
0
13 Oct 2024
SimBa: Simplicity Bias for Scaling Up Parameters in Deep Reinforcement Learning
Hojoon Lee
Dongyoon Hwang
Donghu Kim
Hyunseung Kim
Jun Jet Tai
K. Subramanian
Peter R. Wurman
Jaegul Choo
Peter Stone
Takuma Seno
OffRL
75
7
0
13 Oct 2024
ReLU's Revival: On the Entropic Overload in Normalization-Free Large Language Models
N. Jha
Brandon Reagen
OffRL
AI4CE
33
0
0
12 Oct 2024
ControLRM: Fast and Controllable 3D Generation via Large Reconstruction Model
Hongbin Xu
Weitao Chen
Zhipeng Zhou
Feng Xiao
Baigui Sun
Mike Zheng Shou
Wenxiong Kang
31
2
0
12 Oct 2024
TOP-ERL: Transformer-based Off-Policy Episodic Reinforcement Learning
Ge Li
Dong Tian
Hongyi Zhou
Xinkai Jiang
Rudolf Lioutikov
Gerhard Neumann
OffRL
241
3
0
12 Oct 2024
M3Hop-CoT: Misogynous Meme Identification with Multimodal Multi-hop Chain-of-Thought
G. Kumari
Kirtan Jain
Asif Ekbal
25
1
0
11 Oct 2024
Identifying Money Laundering Subgraphs on the Blockchain
Kiwhan Song
Mohamed Ali Dhraief
Muhua Xu
Locke Cai
Xuhao Chen
Arvind
Jie Chen
20
0
0
10 Oct 2024
Self-Attention Mechanism in Multimodal Context for Banking Transaction Flow
Cyrile Delestre
Yoann Sola
34
0
0
10 Oct 2024
Masked Generative Priors Improve World Models Sequence Modelling Capabilities
Cristian Meo
Mircea Lica
Zarif Ikram
Akihiro Nakano
Vedant Shah
Aniket Didolkar
Dianbo Liu
Anirudh Goyal
Justin Dauwels
OffRL
90
0
0
10 Oct 2024
Deep Correlated Prompting for Visual Recognition with Missing Modalities
Lianyu Hu
Tongkai Shi
Wei Feng
Fanhua Shang
Liang Wan
VLM
44
1
0
09 Oct 2024
Compressing Large Language Models with Automated Sub-Network Search
R. Sukthanker
B. Staffler
Frank Hutter
Aaron Klein
LRM
38
0
0
09 Oct 2024
A Survey: Collaborative Hardware and Software Design in the Era of Large Language Models
Cong Guo
Feng Cheng
Zhixu Du
James Kiessling
Jonathan Ku
...
Qilin Zheng
Guanglei Zhou
Hai
Li-Wei Li
Yiran Chen
31
7
0
08 Oct 2024
Learning in complex action spaces without policy gradients
Arash Tavakoli
Sina Ghiassian
Nemanja Rakićević
OffRL
28
0
0
08 Oct 2024
Gesture2Text: A Generalizable Decoder for Word-Gesture Keyboards in XR Through Trajectory Coarse Discretization and Pre-training
Junxiao Shen
Khadija Khaldi
Enmin Zhou
Hemant Bhaskar Surale
Amy Karlson
16
0
0
08 Oct 2024
The USTC-NERCSLIP Systems for the CHiME-8 MMCSG Challenge
Ya Jiang
Hongbo Lan
Jun Du
Qing Wang
Shutong Niu
45
1
0
08 Oct 2024
Guided Self-attention: Find the Generalized Necessarily Distinct Vectors for Grain Size Grading
Fang Gao
XueTao Li
Jiabao Wang
Shengheng Ma
Jun Yu
28
0
0
08 Oct 2024