ResearchTrend.AI
Layer Normalization

21 July 2016
Jimmy Lei Ba
J. Kiros
Geoffrey E. Hinton

Papers citing "Layer Normalization"

50 / 5,513 papers shown
Title
M3Hop-CoT: Misogynous Meme Identification with Multimodal Multi-hop Chain-of-Thought
G. Kumari
Kirtan Jain
Asif Ekbal
25
1
0
11 Oct 2024
Identifying Money Laundering Subgraphs on the Blockchain
Kiwhan Song
Mohamed Ali Dhraief
Muhua Xu
Locke Cai
Xuhao Chen
Arvind
Jie Chen
20
0
0
10 Oct 2024
Self-Attention Mechanism in Multimodal Context for Banking Transaction Flow
Cyrile Delestre
Yoann Sola
34
0
0
10 Oct 2024
Masked Generative Priors Improve World Models Sequence Modelling Capabilities
Cristian Meo
Mircea Lica
Zarif Ikram
Akihiro Nakano
Vedant Shah
Aniket Didolkar
Dianbo Liu
Anirudh Goyal
Justin Dauwels
OffRL
90
0
0
10 Oct 2024
Deep Correlated Prompting for Visual Recognition with Missing Modalities
Lianyu Hu
Tongkai Shi
Wei Feng
Fanhua Shang
Liang Wan
VLM
44
1
0
09 Oct 2024
Compressing Large Language Models with Automated Sub-Network Search
R. Sukthanker
B. Staffler
Frank Hutter
Aaron Klein
LRM
38
0
0
09 Oct 2024
A Survey: Collaborative Hardware and Software Design in the Era of Large Language Models
Cong Guo
Feng Cheng
Zhixu Du
James Kiessling
Jonathan Ku
...
Qilin Zheng
Guanglei Zhou
Hai
Li-Wei Li
Yiran Chen
33
7
0
08 Oct 2024
Learning in complex action spaces without policy gradients
Arash Tavakoli
Sina Ghiassian
Nemanja Rakićević
OffRL
34
0
0
08 Oct 2024
Gesture2Text: A Generalizable Decoder for Word-Gesture Keyboards in XR Through Trajectory Coarse Discretization and Pre-training
Junxiao Shen
Khadija Khaldi
Enmin Zhou
Hemant Bhaskar Surale
Amy Karlson
21
0
0
08 Oct 2024
The USTC-NERCSLIP Systems for the CHiME-8 MMCSG Challenge
Ya Jiang
Hongbo Lan
Jun Du
Qing Wang
Shutong Niu
45
1
0
08 Oct 2024
Guided Self-attention: Find the Generalized Necessarily Distinct Vectors for Grain Size Grading
Fang Gao
XueTao Li
Jiabao Wang
Shengheng Ma
Jun Yu
28
0
0
08 Oct 2024
Enhancing Temporal Modeling of Video LLMs via Time Gating
Zi-Yuan Hu
Yiwu Zhong
Shijia Huang
M. Lyu
Liwei Wang
VLM
33
0
0
08 Oct 2024
A Parameter Update Balancing Algorithm for Multi-task Ranking Models in Recommendation Systems
Jun Yuan
Guohao Cai
Zhenhua Dong
28
0
0
08 Oct 2024
Diffusion Model Predictive Control
Guangyao Zhou
Sivaramakrishnan Swaminathan
Rajkumar Vasudeva Raju
J. S. Guntupalli
Wolfgang Lehrach
Joseph Ortiz
Antoine Dedieu
Miguel Lázaro-Gredilla
Kevin P. Murphy
39
6
0
07 Oct 2024
TuneVLSeg: Prompt Tuning Benchmark for Vision-Language Segmentation Models
Rabin Adhikari
Safal Thapaliya
Manish Dhakal
Bishesh Khanal
MLLM
VLM
38
0
0
07 Oct 2024
Tourism destination events classifier based on artificial intelligence techniques
Miguel Camacho-Ruiz
Ramón Alberto Carrasco
Gema Fernández-Avilés
Antonio LaTorre
29
4
0
07 Oct 2024
Initialization of Large Language Models via Reparameterization to Mitigate Loss Spikes
Kosuke Nishida
Kyosuke Nishida
Kuniko Saito
36
2
0
07 Oct 2024
Active Fine-Tuning of Generalist Policies
Marco Bagatella
Jonas Hübotter
Georg Martius
Andreas Krause
37
0
0
07 Oct 2024
Stage-Wise and Prior-Aware Neural Speech Phase Prediction
Fei Liu
Yang Ai
Hui-Peng Du
Ye-Xin Lu
Rui Zheng
Zhen-Hua Ling
32
0
0
07 Oct 2024
Activation Scaling for Steering and Interpreting Language Models
Niklas Stoehr
Kevin Du
Vésteinn Snæbjarnarson
Robert West
Ryan Cotterell
Aaron Schein
LLMSV
LRM
39
4
0
07 Oct 2024
On Efficient Variants of Segment Anything Model: A Survey
Xiaorui Sun
Jing Liu
H. Shen
Xiaofeng Zhu
Ping Hu
VLM
53
4
0
07 Oct 2024
Efficient transformer with reinforced position embedding for language models
Yen-Che Hsiao
Abhishek Dutta
31
0
0
07 Oct 2024
NeuroBOLT: Resting-state EEG-to-fMRI Synthesis with Multi-dimensional Feature Mapping
Yamin Li
Ange Lou
Ziyuan Xu
Shengchao Zhang
Shiyu Wang
Dario J. Englot
Soheil Kolouri
Daniel Moyer
Roza G. Bayrak
Catie Chang
27
4
0
07 Oct 2024
CAR: Controllable Autoregressive Modeling for Visual Generation
Ziyu Yao
Jialin Li
Yifeng Zhou
Yong Liu
Xi Jiang
Chengjie Wang
Feng Zheng
Yuexian Zou
Lei Li
DiffM
45
13
0
07 Oct 2024
Cross Resolution Encoding-Decoding For Detection Transformers
Ashish Kumar
Jaesik Park
ViT
38
0
0
05 Oct 2024
GraphCroc: Cross-Correlation Autoencoder for Graph Structural Reconstruction
Shijin Duan
Ruyi Ding
Jiaxing He
A. A. Ding
Yunsi Fei
Xiaolin Xu
31
0
0
04 Oct 2024
Test-time Adaptation for Regression by Subspace Alignment
Kazuki Adachi
Shin'ya Yamaguchi
Atsutoshi Kumagai
Tomoki Hamagami
TTA
48
0
0
04 Oct 2024
In-context Learning in Presence of Spurious Correlations
Hrayr Harutyunyan
R. Darbinyan
Samvel Karapetyan
Hrant Khachatrian
LRM
54
1
0
04 Oct 2024
Multilingual Topic Classification in X: Dataset and Analysis
Dimosthenis Antypas
Asahi Ushio
Francesco Barbieri
Jose Camacho-Collados
32
1
0
04 Oct 2024
MLP-KAN: Unifying Deep Representation and Function Learning
Yunhong He
Yifeng Xie
Zhengqing Yuan
Lichao Sun
29
1
0
03 Oct 2024
Graph-tree Fusion Model with Bidirectional Information Propagation for Long Document Classification
Sudipta Singha Roy
Xindi Wang
Robert E. Mercer
Frank Rudzicz
24
0
0
03 Oct 2024
Diss-l-ECT: Dissecting Graph Data with local Euler Characteristic Transforms
Julius von Rohrscheidt
Bastian Alexander Rieck
31
0
0
03 Oct 2024
Learning from Offline Foundation Features with Tensor Augmentations
Emir Konuk
Christos Matsoukas
Moein Sorkhei
Phitchapha Lertsiravaramet
Kevin Smith
OffRL
26
1
0
03 Oct 2024
Mamba Neural Operator: Who Wins? Transformers vs. State-Space Models for PDEs
Chun-Wun Cheng
Jiahao Huang
Yi Zhang
Guang Yang
Carola-Bibiane Schonlieb
Angelica I Aviles-Rivero
Mamba
AI4CE
93
3
0
03 Oct 2024
Deep Signature: Characterization of Large-Scale Molecular Dynamics
Tiexin Qin
Mengxu Zhu
Chunyang Li
Terry Lyons
Hong Yan
Haoliang Li
35
0
0
03 Oct 2024
Robust Weight Initialization for Tanh Neural Networks with Fixed Point Analysis
Hyunwoo Lee
Hayoung Choi
Hyunju Kim
44
1
0
03 Oct 2024
Training Language Models on Synthetic Edit Sequences Improves Code Synthesis
Ulyana Piterbarg
Lerrel Pinto
Rob Fergus
SyDa
37
2
0
03 Oct 2024
FeelAnyForce: Estimating Contact Force Feedback from Tactile Sensation for Vision-Based Tactile Sensors
A. Shahidzadeh
G. Caddeo
Koushik Alapati
Lorenzo Natale
Cornelia Fermuller
Yiannis Aloimonos
28
2
0
02 Oct 2024
Addressing Data Heterogeneity in Federated Learning with Adaptive Normalization-Free Feature Recalibration
Vasilis Siomos
Sergio Naval Marimont
Jonathan Passerat-Palmbach
G. Tarroni
32
0
0
02 Oct 2024
Scale-Invariant Learning-to-Rank
Alessio Petrozziello
Christian Sommeregger
Ye-Sheen Lim
32
0
0
02 Oct 2024
Samba: Synchronized Set-of-Sequences Modeling for Multiple Object Tracking
Mattia Segu
Luigi Piccinelli
Siyuan Li
Yung-Hsu Yang
Bernt Schiele
Luc Van Gool
Mamba
50
2
0
02 Oct 2024
PASS:Test-Time Prompting to Adapt Styles and Semantic Shapes in Medical Image Segmentation
Chuyan Zhang
Hao Zheng
Xin You
Yefeng Zheng
Yun Gu
VLM
OOD
MedIm
47
1
0
02 Oct 2024
Foldable SuperNets: Scalable Merging of Transformers with Different Initializations and Tasks
Edan Kinderman
Itay Hubara
Haggai Maron
Daniel Soudry
MoMe
54
1
0
02 Oct 2024
Getting Free Bits Back from Rotational Symmetries in LLMs
Jiajun He
Gergely Flamich
José Miguel Hernández-Lobato
MQ
23
0
0
02 Oct 2024
LaGeM: A Large Geometry Model for 3D Representation Learning and Diffusion
Biao Zhang
Peter Wonka
AI4CE
3DV
DiffM
36
7
0
02 Oct 2024
Denoising with a Joint-Embedding Predictive Architecture
Dengsheng Chen
Jie Hu
Xiaoming Wei
Enhua Wu
DiffM
52
2
0
02 Oct 2024
House of Cards: Massive Weights in LLMs
Jaehoon Oh
Seungjun Shin
Dokwan Oh
45
1
0
02 Oct 2024
Using Interleaved Ensemble Unlearning to Keep Backdoors at Bay for Finetuning Vision Transformers
Zeyu Michael Li
AAML
26
0
0
01 Oct 2024
softmax is not enough (for sharp out-of-distribution)
Petar Veličković
Christos Perivolaropoulos
Federico Barbero
Razvan Pascanu
47
18
0
01 Oct 2024
Squeeze-and-Remember Block
Rinor Cakaj
Jens Mehnert
Bin Yang
18
0
0
01 Oct 2024