1606.08415
Gaussian Error Linear Units (GELUs)
27 June 2016
Dan Hendrycks
Kevin Gimpel
Papers citing
"Gaussian Error Linear Units (GELUs)"
50 / 840 papers shown
Title
Lift-Attend-Splat: Bird's-eye-view camera-lidar fusion using transformers
James Gunn
Zygmunt Lenyk
Anuj Sharma
Andrea Donati
Alexandru Buburuzan
John Redford
Romain Mueller
MDE
38
8
0
22 Dec 2023
SCEdit: Efficient and Controllable Image Diffusion Generation via Skip Connection Editing
Zeyinzi Jiang
Chaojie Mao
Yulin Pan
Zhen Han
Jingfeng Zhang
29
28
0
18 Dec 2023
Guided Image Restoration via Simultaneous Feature and Image Guided Fusion
Xinyi Liu
Qian Zhao
Jie-Kai Liang
Huiyu Zeng
Deyu Meng
Lei Zhang
37
0
0
14 Dec 2023
4M: Massively Multimodal Masked Modeling
David Mizrahi
Roman Bachmann
Oğuzhan Fatih Kar
Teresa Yeo
Mingfei Gao
Afshin Dehghan
Amir Zamir
MLLM
50
63
0
11 Dec 2023
DreamVideo: Composing Your Dream Videos with Customized Subject and Motion
Yujie Wei
Shiwei Zhang
Zhiwu Qing
Hangjie Yuan
Zhiheng Liu
Yu Liu
Yingya Zhang
Jingren Zhou
Hongming Shan
DiffM
VGen
17
89
0
07 Dec 2023
Defense Against Adversarial Attacks using Convolutional Auto-Encoders
Shreyasi Mandal
AAML
23
1
0
06 Dec 2023
C3: High-performance and low-complexity neural compression from a single image or video
Hyunjik Kim
Matthias Bauer
Lucas Theis
Jonathan Richard Schwarz
Emilien Dupont
VGen
22
23
0
05 Dec 2023
Analyzing and Improving the Training Dynamics of Diffusion Models
Tero Karras
M. Aittala
J. Lehtinen
Janne Hellsten
Timo Aila
S. Laine
42
155
0
05 Dec 2023
HUGS: Human Gaussian Splats
Muhammed Kocabas
Jen-Hao Rick Chang
J. Gabriel
Oncel Tuzel
Anurag Ranjan
3DGS
42
91
0
29 Nov 2023
Improving Feature Stability during Upsampling -- Spectral Artifacts and the Importance of Spatial Context
Shashank Agnihotri
Julia Grabinski
M. Keuper
30
6
0
29 Nov 2023
End-to-End Temporal Action Detection with 1B Parameters Across 1000 Frames
Shuming Liu
Chen-Da Liu-Zhang
Chen Zhao
Guohao Li
33
25
0
28 Nov 2023
Compositional Capabilities of Autoregressive Transformers: A Study on Synthetic, Interpretable Tasks
Rahul Ramesh
Ekdeep Singh Lubana
Mikail Khona
Robert P. Dick
Hidenori Tanaka
CoGe
36
6
0
21 Nov 2023
Deep Learning-Based Real-Time Quality Control of Standard Video Compression for Live Streaming
Matin Mortaheb
M. A. Khojastepour
S. Chakradhar
S. Ulukus
13
1
0
21 Nov 2023
GRAM: An Interpretable Approach for Graph Anomaly Detection using Gradient Attention Maps
Yifei Yang
Peng Wang
Xiaofan He
Dongmian Zou
14
5
0
10 Nov 2023
Towards a Unified Framework of Contrastive Learning for Disentangled Representations
Stefan Matthes
Zhiwei Han
Hao Shen
34
4
0
08 Nov 2023
OmniVec: Learning robust representations with cross modal sharing
Siddharth Srivastava
Gaurav Sharma
SSL
29
64
0
07 Nov 2023
Copilot4D: Learning Unsupervised World Models for Autonomous Driving via Discrete Diffusion
Lunjun Zhang
Yuwen Xiong
Ze Yang
Sergio Casas
Rui Hu
R. Urtasun
41
50
0
02 Nov 2023
End-to-End Single-Channel Speaker-Turn Aware Conversational Speech Translation
Juan Pablo Zuluaga
Zhaocheng Huang
Xing Niu
Rohit Paturi
S. Srinivasan
Prashant Mathur
Brian Thompson
Marcello Federico
BDL
35
2
0
01 Nov 2023
Learn to Categorize or Categorize to Learn? Self-Coding for Generalized Category Discovery
Sarah Rastegar
Hazel Doughty
Cees G. M. Snoek
33
15
0
30 Oct 2023
Video Frame Interpolation with Many-to-many Splatting and Spatial Selective Refinement
Ping Hu
Simon Niklaus
Lu Zhang
Stan Sclaroff
Kate Saenko
25
6
0
29 Oct 2023
TorchDEQ: A Library for Deep Equilibrium Models
Zhengyang Geng
J. Zico Kolter
VLM
56
12
0
28 Oct 2023
Understanding the Effects of Projectors in Knowledge Distillation
Yudong Chen
Sen Wang
Jiajun Liu
Xuwei Xu
Frank de Hoog
Brano Kusy
Zi Huang
26
0
0
26 Oct 2023
Cross-attention Spatio-temporal Context Transformer for Semantic Segmentation of Historical Maps
Sidi Wu
Yizi Chen
Konrad Schindler
L. Hurni
26
2
0
19 Oct 2023
From Alexnet to Transformers: Measuring the Non-linearity of Deep Neural Networks with Affine Optimal Transport
Quentin Bouniot
I. Redko
Anton Mallasto
Charlotte Laclau
Karol Arndt
Oliver Struckmeier
Markus Heinonen
Ville Kyrki
Samuel Kaski
58
2
0
17 Oct 2023
A Non-monotonic Smooth Activation Function
Koushik Biswas
Meghana Karri
Ulaş Bağcı
16
1
0
16 Oct 2023
SeUNet-Trans: A Simple yet Effective UNet-Transformer Model for Medical Image Segmentation
Tan-Hanh Pham
Xianqi Li
Kim-Doang Nguyen
MedIm
ViT
26
8
0
16 Oct 2023
Homophone Disambiguation Reveals Patterns of Context Mixing in Speech Transformers
Hosein Mohebbi
Grzegorz Chrupała
Willem H. Zuidema
A. Alishahi
36
12
0
15 Oct 2023
Temporally Aligning Long Audio Interviews with Questions: A Case Study in Multimodal Data Integration
Piyush Singh Pasi
Karthikeya Battepati
P. Jyothi
Ganesh Ramakrishnan
T. Mahapatra
Manoj Singh
51
0
0
10 Oct 2023
Understanding the Feature Norm for Out-of-Distribution Detection
Jaewoo Park
Jacky Chen Long Chai
Jaeho Yoon
Andrew Beng Jin Teoh
OODD
24
12
0
09 Oct 2023
Low-Resolution Self-Attention for Semantic Segmentation
Yu-Huan Wu
Shi-Chen Zhang
Yun-Hai Liu
Le Zhang
Xin Zhan
Daquan Zhou
Jiashi Feng
Ming-Ming Cheng
Liangli Zhen
ViT
45
3
0
08 Oct 2023
Deep Learning Based Uplink Multi-User SIMO Beamforming Design
Cemil Vahapoglu
Tim O'Shea
Tamoghna Roy
S. Ulukus
26
7
0
28 Sep 2023
Deep Learning-Based Real-Time Rate Control for Live Streaming on Wireless Networks
Matin Mortaheb
M. A. Khojastepour
S. Chakradhar
S. Ulukus
13
0
0
27 Sep 2023
Rethinking Session Variability: Leveraging Session Embeddings for Session Robustness in Speaker Verification
Hee-Soo Heo
Ki-hyun Nam
Bong-Jin Lee
Youngki Kwon
Min-Ji Lee
You Jin Kim
Joon Son Chung
26
1
0
26 Sep 2023
Introducing DictaLM -- A Large Generative Language Model for Modern Hebrew
Shaltiel Shmidman
Avi Shmidman
Amir DN Cohen
Moshe Koppel
27
0
0
25 Sep 2023
Small-scale proxies for large-scale Transformer training instabilities
Mitchell Wortsman
Peter J. Liu
Lechao Xiao
Katie Everett
A. Alemi
...
Jascha Narain Sohl-Dickstein
Kelvin Xu
Jaehoon Lee
Justin Gilmer
Simon Kornblith
35
81
0
25 Sep 2023
On the Posterior Distribution in Denoising: Application to Uncertainty Quantification
Hila Manor
T. Michaeli
UQCV
23
17
0
24 Sep 2023
Large-scale Pretraining Improves Sample Efficiency of Active Learning based Molecule Virtual Screening
Zhonglin Cao
Simone Sciabola
Ye Wang
35
1
0
20 Sep 2023
PDPCRN: Parallel Dual-Path CRN with Bi-directional Inter-Branch Interactions for Multi-Channel Speech Enhancement
Jia-Yu Pan
Shulin He
Tianci Wu
Hui Zhang
Xueliang Zhang
19
0
0
19 Sep 2023
Limited-Angle Tomography Reconstruction via Deep End-To-End Learning on Synthetic Data
Thomas Germer
Jan Robine
S. Konietzny
Stefan Harmeling
Tobias Uelwer
MedIm
23
5
0
13 Sep 2023
Advancing Parsimonious Deep Learning Weather Prediction using the HEALPix Mesh
Matthias Karlbauer
Nathaniel Cresswell-Clay
Dale Durran
Raul A Moreno
Thorsten Kurth
Boris Bonev
Noah D. Brenowitz
Martin Volker Butz
MDE
28
20
0
11 Sep 2023
ImageBind-LLM: Multi-modality Instruction Tuning
Jiaming Han
Renrui Zhang
Wenqi Shao
Peng Gao
Peng-Tao Xu
...
Yafei Wen
Xiaoxin Chen
Xiangyu Yue
Hongsheng Li
Yu Qiao
MLLM
49
116
0
07 Sep 2023
3D Transformer based on deformable patch location for differential diagnosis between Alzheimer's disease and Frontotemporal dementia
H. Nguyen
Michael Clement
Boris Mansencal
Pierrick Coupé
MedIm
31
0
0
06 Sep 2023
Character Queries: A Transformer-based Approach to On-Line Handwritten Character Segmentation
Michael Jungo
Beat Wolf
Andrii Maksai
C. Musat
Andreas Fischer
27
2
0
06 Sep 2023
A Unified Masked Autoencoder with Patchified Skeletons for Motion Synthesis
Esteve Valls Mascaro
Hyemin Ahn
Dongheui Lee
CVBM
37
4
0
14 Aug 2023
Large-kernel Attention for Efficient and Robust Brain Lesion Segmentation
Liam Chalcroft
Ruben Lourenço Pereira
Mikael Brudfors
Andrew S. Kayser
M. D’Esposito
Cathy J. Price
Ioannis Pappas
John Ashburner
ViT
3DV
MedIm
29
8
0
14 Aug 2023
Composable Function-preserving Expansions for Transformer Architectures
Andrea Gesmundo
Kaitlin Maile
AI4CE
40
8
0
11 Aug 2023
Graph Embedding Dynamic Feature-based Supervised Contrastive Learning of Transient Stability for Changing Power Grid Topologies
Zijian Lv
Xinyu Chen
Zijian Feng
22
0
0
01 Aug 2023
Generative Models as a Complex Systems Science: How can we make sense of large language model behavior?
Ari Holtzman
Peter West
Luke Zettlemoyer
AI4CE
30
14
0
31 Jul 2023
Efficient Federated Learning via Local Adaptive Amended Optimizer with Linear Speedup
Yan Sun
Li Shen
Hao Sun
Liang Ding
Dacheng Tao
FedML
24
17
0
30 Jul 2023
BARTPhoBEiT: Pre-trained Sequence-to-Sequence and Image Transformers Models for Vietnamese Visual Question Answering
Khiem Vinh Tran
Kiet Van Nguyen
Ngan Luu-Thuy Nguyen
ViT
31
2
0
28 Jul 2023