1606.08415
Gaussian Error Linear Units (GELUs)
27 June 2016
Dan Hendrycks
Kevin Gimpel
Papers citing "Gaussian Error Linear Units (GELUs)"
50 / 876 papers shown
Title
Curve Your Enthusiasm: Concurvity Regularization in Differentiable Generalized Additive Models
Julien N. Siems
Konstantin Ditschuneit
Winfried Ripken
Alma Lindborg
Maximilian Schambach
Johannes Otterbach
Martin Genzel
19
6
0
19 May 2023
Boost Vision Transformer with GPU-Friendly Sparsity and Quantization
Chong Yu
Tao Chen
Zhongxue Gan
Jiayuan Fan
MQ
ViT
30
23
0
18 May 2023
Token-wise Decomposition of Autoregressive Language Model Hidden States for Analyzing Model Predictions
Byung-Doh Oh
William Schuler
29
2
0
17 May 2023
Multi-Level Global Context Cross Consistency Model for Semi-Supervised Ultrasound Image Segmentation with Diffusion Model
Fenghe Tang
Jianrui Ding
Lingtao Wang
Min Xian
C. Ning
DiffM
MedIm
34
12
0
16 May 2023
Evaluation of self-supervised pre-training for automatic infant movement classification using wearable movement sensors
Einari Vaaras
Manu Airaksinen
S. Vanhatalo
Okko Rasanen
25
4
0
16 May 2023
Toward Moiré-Free and Detail-Preserving Demosaicking
Xuan-Yi Li
Y. Niu
Bo Zhao
Haoyuan Shi
Zitong An
31
1
0
15 May 2023
MaxViT-UNet: Multi-Axis Attention for Medical Image Segmentation
Abdul Rehman Khan
Asifullah Khan
ViT
MedIm
44
14
0
15 May 2023
A Multidimensional Graph Fourier Transformation Neural Network for Vehicle Trajectory Prediction
Marion Neumeier
Andreas Tollkühn
M. Botsch
Wolfgang Utschick
22
5
0
12 May 2023
Multitask learning in Audio Captioning: a sentence embedding regression loss acts as a regularizer
Etienne Labbé
J. Pinquier
Thomas Pellegrini
48
5
0
02 May 2023
Consolidator: Mergeable Adapter with Grouped Connections for Visual Adaptation
Tianxiang Hao
Hui Chen
Yuchen Guo
Guiguang Ding
44
16
0
30 Apr 2023
MINN: Learning the dynamics of differential-algebraic equations and application to battery modeling
Yicun Huang
Changfu Zou
Yong Li
T. Wik
PINN
31
10
0
27 Apr 2023
Training Large Scale Polynomial CNNs for E2E Inference over Homomorphic Encryption
Moran Baruch
Nir Drucker
Gilad Ezov
Yoav Goldberg
Eyal Kushnir
Jenny Lerner
Omri Soceanu
Itamar Zimerman
49
6
0
26 Apr 2023
State Spaces Aren't Enough: Machine Translation Needs Attention
Ali Vardasbi
Telmo Pires
Robin M. Schmidt
Stephan Peitz
24
9
0
25 Apr 2023
End-to-End Spatio-Temporal Action Localisation with Video Transformers
A. Gritsenko
Xuehan Xiong
Josip Djolonga
Mostafa Dehghani
Chen Sun
Mario Lucic
Cordelia Schmid
Anurag Arnab
ViT
37
13
0
24 Apr 2023
The Disharmony between BN and ReLU Causes Gradient Explosion, but is Offset by the Correlation between Activations
Inyoung Paik
Jaesik Choi
18
0
0
23 Apr 2023
Variance-Reduced Gradient Estimation via Noise-Reuse in Online Evolution Strategies
Oscar Li
James Harrison
Jascha Narain Sohl-Dickstein
Virginia Smith
Luke Metz
51
5
0
21 Apr 2023
Transformer-based models and hardware acceleration analysis in autonomous driving: A survey
J. Zhong
Zheng Liu
Xiangshan Chen
ViT
44
17
0
21 Apr 2023
LLIC: Large Receptive Field Transform Coding with Adaptive Weights for Learned Image Compression
Wei Jiang
Peirong Ning
Jiayu Yang
Yongqi Zhai
Feng Gao
Ronggang Wang
38
6
0
19 Apr 2023
CoPR: Towards Accurate Visual Localization With Continuous Place-descriptor Regression
Mubariz Zaffar
Liangliang Nan
Julian F. P. Kooij
22
2
0
14 Apr 2023
Reinforcement Learning Tutor Better Supported Lower Performers in a Math Task
S. Ruan
Allen Nie
William Steenbergen
Jiayu He
JQ Zhang
...
Kyle Dang Nguyen
Catherine Y Wang
Rui Ying
James A. Landay
Emma Brunskill
28
18
0
11 Apr 2023
Embodied Concept Learner: Self-supervised Learning of Concepts and Mapping through Instruction Following
Mingyu Ding
Yan Xu
Zhenfang Chen
David D. Cox
Ping Luo
J. Tenenbaum
Chuang Gan
LM&Ro
59
21
0
07 Apr 2023
On Efficient Training of Large-Scale Deep Learning Models: A Literature Review
Li Shen
Yan Sun
Zhiyuan Yu
Liang Ding
Xinmei Tian
Dacheng Tao
VLM
30
41
0
07 Apr 2023
ClothCombo: Modeling Inter-Cloth Interaction for Draping Multi-Layered Clothes
Dohae Lee
Hyun Kang
In-Kwon Lee
3DH
AI4CE
32
7
0
07 Apr 2023
Anomaly Detection via Gumbel Noise Score Matching
Ahsan Mahmood
Junier Oliva
Martin Styner
24
1
0
06 Apr 2023
Segment Anything
A. Kirillov
Eric Mintun
Nikhila Ravi
Hanzi Mao
Chloe Rolland
...
Spencer Whitehead
Alexander C. Berg
Wan-Yen Lo
Piotr Dollár
Ross B. Girshick
MLLM
VLM
60
6,822
0
05 Apr 2023
Industrial Anomaly Detection with Domain Shift: A Real-world Dataset and Masked Multi-scale Reconstruction
Zilong Zhang
Zhibin Zhao
Xingwu Zhang
Chuang Sun
Xuefeng Chen
27
50
0
05 Apr 2023
Blockwise Compression of Transformer-based Models without Retraining
Gaochen Dong
W. Chen
20
3
0
04 Apr 2023
TransPimLib: A Library for Efficient Transcendental Functions on Processing-in-Memory Systems
Maurus Item
Juan Gómez Luna
Yu-Yin Guo
Geraldo F. Oliveira
Mohammad Sadrosadati
O. Mutlu
37
4
0
03 Apr 2023
Transformer-based interpretable multi-modal data fusion for skin lesion classification
Theodor Cheslerean-Boghiu
Melia-Evelina Fleischmann
Theresa Willem
Tobias Lasser
ViT
MedIm
AI4CE
24
2
0
03 Apr 2023
CNNs with Multi-Level Attention for Domain Generalization
Aristotelis Ballas
Christos Diou
OOD
27
6
0
02 Apr 2023
Resolution-Invariant Image Classification based on Fourier Neural Operators
Samira Kabri
Tim Roith
Daniel Tenbrinck
Martin Burger
23
5
0
02 Apr 2023
Hierarchical Vision Transformers for Cardiac Ejection Fraction Estimation
Lhuqita Fazry
Asep Haryono
Nuzulul Khairu Nissa
Sunarno
Naufal Muhammad Hirzi
M. F. Rachmadi
W. Jatmiko
MedIm
16
16
0
31 Mar 2023
CodeGeeX: A Pre-Trained Model for Code Generation with Multilingual Benchmarking on HumanEval-X
Qinkai Zheng
Xiao Xia
Xu Zou
Yuxiao Dong
Shanshan Wang
...
Andi Wang
Yang Li
Teng Su
Zhilin Yang
Jie Tang
ELM
ALM
SyDa
57
317
0
30 Mar 2023
BloombergGPT: A Large Language Model for Finance
Shijie Wu
Ozan Irsoy
Steven Lu
Vadim Dabravolski
Mark Dredze
Sebastian Gehrmann
P. Kambadur
David S. Rosenberg
Gideon Mann
AIFin
76
786
0
30 Mar 2023
Ensemble weather forecast post-processing with a flexible probabilistic neural network approach
P. Mlakar
J. Merse
Jana Faganeli Pucer
22
4
0
29 Mar 2023
GNNBuilder: An Automated Framework for Generic Graph Neural Network Accelerator Generation, Simulation, and Optimization
Stefan Abi-Karam
Cong Hao
GNN
36
7
0
29 Mar 2023
InceptionNeXt: When Inception Meets ConvNeXt
Weihao Yu
Pan Zhou
Shuicheng Yan
Xinchao Wang
48
119
0
29 Mar 2023
Multi-modal learning for geospatial vegetation forecasting
V. Benson
Claire Robin
C. Requena-Mesa
Lazaro Alonso
Nuno Carvalhais
José A. Cortés
Zhihan Gao
Nora Linscheid
M. Weynants
Markus Reichstein
30
11
0
28 Mar 2023
SELF-VS: Self-supervised Encoding Learning For Video Summarization
Hojjat Mokhtarabadi
Kaveh Bahraman
M. Hosseinzadeh
M. Eftekhari
AI4TS
SSL
ViT
25
0
0
28 Mar 2023
Progressive Semantic-Visual Mutual Adaption for Generalized Zero-Shot Learning
Man Liu
Feng Li
Chunjie Zhang
Yunchao Wei
H. Bai
Yao-Min Zhao
47
39
0
27 Mar 2023
Troika: Multi-Path Cross-Modal Traction for Compositional Zero-Shot Learning
Siteng Huang
Biao Gong
Yutong Feng
Min Zhang
Yiliang Lv
Donglin Wang
CoGe
35
10
0
27 Mar 2023
Towards Better Dynamic Graph Learning: New Architecture and Unified Library
Le Yu
Leilei Sun
Bowen Du
Weifeng Lv
AI4CE
29
96
0
23 Mar 2023
Online Transformers with Spiking Neurons for Fast Prosthetic Hand Control
Nathan Leroux
Jan Finkbeiner
Emre Neftci
33
9
0
21 Mar 2023
GlueGen: Plug and Play Multi-modal Encoders for X-to-image Generation
Can Qin
Ning Yu
Chen Xing
Shu Zhen Zhang
Zeyuan Chen
Stefano Ermon
Yun Fu
Caiming Xiong
Ran Xu
DiffM
40
20
0
17 Mar 2023
MedNeXt: Transformer-driven Scaling of ConvNets for Medical Image Segmentation
Saikat Roy
Gregor Koehler
Constantin Ulrich
Michael Baumgartner
Jens Petersen
Fabian Isensee
Paul F. Jaeger
Klaus Maier-Hein
ViT
MedIm
35
138
0
17 Mar 2023
Block-wise Bit-Compression of Transformer-based Models
Gaochen Dong
W. Chen
24
0
0
16 Mar 2023
Graph Transformer GANs for Graph-Constrained House Generation
H. Tang
Zhenyu Zhang
Humphrey Shi
Bo-wen Li
Lin Shao
N. Sebe
Radu Timofte
Luc Van Gool
46
19
0
14 Mar 2023
Good Neighbors Are All You Need for Chinese Grapheme-to-Phoneme Conversion
Jungjun Kim
C. Han
Gyuhyeon Nam
Gyeongsu Chae
11
2
0
14 Mar 2023
ViM: Vision Middleware for Unified Downstream Transferring
Yutong Feng
Biao Gong
Jianwen Jiang
Yiliang Lv
Yujun Shen
Deli Zhao
Jingren Zhou
32
1
0
13 Mar 2023
Transformer Encoder with Multiscale Deep Learning for Pain Classification Using Physiological Signals
Zhenyu Lu
Burcu Ozek
S. Kamarthi
ViT
MedIm
29
14
0
13 Mar 2023