ResearchTrend.AI
Gaussian Error Linear Units (GELUs)

27 June 2016
Dan Hendrycks
Kevin Gimpel

Papers citing "Gaussian Error Linear Units (GELUs)"

50 / 876 papers shown
Curve Your Enthusiasm: Concurvity Regularization in Differentiable Generalized Additive Models
Julien N. Siems
Konstantin Ditschuneit
Winfried Ripken
Alma Lindborg
Maximilian Schambach
Johannes Otterbach
Martin Genzel
19
6
0
19 May 2023
Boost Vision Transformer with GPU-Friendly Sparsity and Quantization
Chong Yu
Tao Chen
Zhongxue Gan
Jiayuan Fan
MQ
ViT
30
23
0
18 May 2023
Token-wise Decomposition of Autoregressive Language Model Hidden States for Analyzing Model Predictions
Byung-Doh Oh
William Schuler
29
2
0
17 May 2023
Multi-Level Global Context Cross Consistency Model for Semi-Supervised Ultrasound Image Segmentation with Diffusion Model
Fenghe Tang
Jianrui Ding
Lingtao Wang
Min Xian
C. Ning
DiffM
MedIm
34
12
0
16 May 2023
Evaluation of self-supervised pre-training for automatic infant movement classification using wearable movement sensors
Einari Vaaras
Manu Airaksinen
S. Vanhatalo
Okko Rasanen
25
4
0
16 May 2023
Toward Moiré-Free and Detail-Preserving Demosaicking
Xuan-Yi Li
Y. Niu
Bo Zhao
Haoyuan Shi
Zitong An
31
1
0
15 May 2023
MaxViT-UNet: Multi-Axis Attention for Medical Image Segmentation
Abdul Rehman Khan
Asifullah Khan
ViT
MedIm
44
14
0
15 May 2023
A Multidimensional Graph Fourier Transformation Neural Network for Vehicle Trajectory Prediction
Marion Neumeier
Andreas Tollkühn
M. Botsch
Wolfgang Utschick
22
5
0
12 May 2023
Multitask learning in Audio Captioning: a sentence embedding regression loss acts as a regularizer
Etienne Labbé
J. Pinquier
Thomas Pellegrini
48
5
0
02 May 2023
Consolidator: Mergeable Adapter with Grouped Connections for Visual Adaptation
Tianxiang Hao
Hui Chen
Yuchen Guo
Guiguang Ding
44
16
0
30 Apr 2023
MINN: Learning the dynamics of differential-algebraic equations and application to battery modeling
Yicun Huang
Changfu Zou
Yong Li
T. Wik
PINN
31
10
0
27 Apr 2023
Training Large Scale Polynomial CNNs for E2E Inference over Homomorphic Encryption
Moran Baruch
Nir Drucker
Gilad Ezov
Yoav Goldberg
Eyal Kushnir
Jenny Lerner
Omri Soceanu
Itamar Zimerman
49
6
0
26 Apr 2023
State Spaces Aren't Enough: Machine Translation Needs Attention
Ali Vardasbi
Telmo Pires
Robin M. Schmidt
Stephan Peitz
24
9
0
25 Apr 2023
End-to-End Spatio-Temporal Action Localisation with Video Transformers
A. Gritsenko
Xuehan Xiong
Josip Djolonga
Mostafa Dehghani
Chen Sun
Mario Lucic
Cordelia Schmid
Anurag Arnab
ViT
37
13
0
24 Apr 2023
The Disharmony between BN and ReLU Causes Gradient Explosion, but is Offset by the Correlation between Activations
Inyoung Paik
Jaesik Choi
18
0
0
23 Apr 2023
Variance-Reduced Gradient Estimation via Noise-Reuse in Online Evolution Strategies
Oscar Li
James Harrison
Jascha Narain Sohl-Dickstein
Virginia Smith
Luke Metz
51
5
0
21 Apr 2023
Transformer-based models and hardware acceleration analysis in autonomous driving: A survey
J. Zhong
Zheng Liu
Xiangshan Chen
ViT
44
17
0
21 Apr 2023
LLIC: Large Receptive Field Transform Coding with Adaptive Weights for Learned Image Compression
Wei Jiang
Peirong Ning
Jiayu Yang
Yongqi Zhai
Feng Gao
Ronggang Wang
38
6
0
19 Apr 2023
CoPR: Towards Accurate Visual Localization With Continuous Place-descriptor Regression
Mubariz Zaffar
Liangliang Nan
Julian F. P. Kooij
22
2
0
14 Apr 2023
Reinforcement Learning Tutor Better Supported Lower Performers in a Math Task
S. Ruan
Allen Nie
William Steenbergen
Jiayu He
JQ Zhang
...
Kyle Dang Nguyen
Catherine Y Wang
Rui Ying
James A. Landay
Emma Brunskill
28
18
0
11 Apr 2023
Embodied Concept Learner: Self-supervised Learning of Concepts and Mapping through Instruction Following
Mingyu Ding
Yan Xu
Zhenfang Chen
David D. Cox
Ping Luo
J. Tenenbaum
Chuang Gan
LM&Ro
59
21
0
07 Apr 2023
On Efficient Training of Large-Scale Deep Learning Models: A Literature Review
Li Shen
Yan Sun
Zhiyuan Yu
Liang Ding
Xinmei Tian
Dacheng Tao
VLM
30
41
0
07 Apr 2023
ClothCombo: Modeling Inter-Cloth Interaction for Draping Multi-Layered Clothes
Dohae Lee
Hyun Kang
In-Kwon Lee
3DH
AI4CE
32
7
0
07 Apr 2023
Anomaly Detection via Gumbel Noise Score Matching
Ahsan Mahmood
Junier Oliva
Martin Styner
24
1
0
06 Apr 2023
Segment Anything
A. Kirillov
Eric Mintun
Nikhila Ravi
Hanzi Mao
Chloe Rolland
...
Spencer Whitehead
Alexander C. Berg
Wan-Yen Lo
Piotr Dollár
Ross B. Girshick
MLLM
VLM
60
6,822
0
05 Apr 2023
Industrial Anomaly Detection with Domain Shift: A Real-world Dataset and Masked Multi-scale Reconstruction
Zilong Zhang
Zhibin Zhao
Xingwu Zhang
Chuang Sun
Xuefeng Chen
27
50
0
05 Apr 2023
Blockwise Compression of Transformer-based Models without Retraining
Gaochen Dong
W. Chen
20
3
0
04 Apr 2023
TransPimLib: A Library for Efficient Transcendental Functions on Processing-in-Memory Systems
Maurus Item
Juan Gómez Luna
Yu-Yin Guo
Geraldo F. Oliveira
Mohammad Sadrosadati
O. Mutlu
37
4
0
03 Apr 2023
Transformer-based interpretable multi-modal data fusion for skin lesion classification
Theodor Cheslerean-Boghiu
Melia-Evelina Fleischmann
Theresa Willem
Tobias Lasser
ViT
MedIm
AI4CE
24
2
0
03 Apr 2023
CNNs with Multi-Level Attention for Domain Generalization
Aristotelis Ballas
Christos Diou
OOD
27
6
0
02 Apr 2023
Resolution-Invariant Image Classification based on Fourier Neural Operators
Samira Kabri
Tim Roith
Daniel Tenbrinck
Martin Burger
23
5
0
02 Apr 2023
Hierarchical Vision Transformers for Cardiac Ejection Fraction Estimation
Lhuqita Fazry
Asep Haryono
Nuzulul Khairu Nissa
Sunarno
Naufal Muhammad Hirzi
M. F. Rachmadi
W. Jatmiko
MedIm
16
16
0
31 Mar 2023
CodeGeeX: A Pre-Trained Model for Code Generation with Multilingual Benchmarking on HumanEval-X
Qinkai Zheng
Xiao Xia
Xu Zou
Yuxiao Dong
Shanshan Wang
...
Andi Wang
Yang Li
Teng Su
Zhilin Yang
Jie Tang
ELM
ALM
SyDa
57
317
0
30 Mar 2023
BloombergGPT: A Large Language Model for Finance
Shijie Wu
Ozan Irsoy
Steven Lu
Vadim Dabravolski
Mark Dredze
Sebastian Gehrmann
P. Kambadur
David S. Rosenberg
Gideon Mann
AIFin
76
786
0
30 Mar 2023
Ensemble weather forecast post-processing with a flexible probabilistic neural network approach
P. Mlakar
J. Merse
Jana Faganeli Pucer
22
4
0
29 Mar 2023
GNNBuilder: An Automated Framework for Generic Graph Neural Network Accelerator Generation, Simulation, and Optimization
Stefan Abi-Karam
Cong Hao
GNN
36
7
0
29 Mar 2023
InceptionNeXt: When Inception Meets ConvNeXt
Weihao Yu
Pan Zhou
Shuicheng Yan
Xinchao Wang
48
119
0
29 Mar 2023
Multi-modal learning for geospatial vegetation forecasting
V. Benson
Claire Robin
C. Requena-Mesa
Lazaro Alonso
Nuno Carvalhais
José A. Cortés
Zhihan Gao
Nora Linscheid
M. Weynants
Markus Reichstein
30
11
0
28 Mar 2023
SELF-VS: Self-supervised Encoding Learning For Video Summarization
Hojjat Mokhtarabadi
Kaveh Bahraman
M. Hosseinzadeh
M. Eftekhari
AI4TS
SSL
ViT
25
0
0
28 Mar 2023
Progressive Semantic-Visual Mutual Adaption for Generalized Zero-Shot Learning
Man Liu
Feng Li
Chunjie Zhang
Yunchao Wei
H. Bai
Yao-Min Zhao
47
39
0
27 Mar 2023
Troika: Multi-Path Cross-Modal Traction for Compositional Zero-Shot Learning
Siteng Huang
Biao Gong
Yutong Feng
Min Zhang
Yiliang Lv
Donglin Wang
CoGe
35
10
0
27 Mar 2023
Towards Better Dynamic Graph Learning: New Architecture and Unified Library
Le Yu
Leilei Sun
Bowen Du
Weifeng Lv
AI4CE
29
96
0
23 Mar 2023
Online Transformers with Spiking Neurons for Fast Prosthetic Hand Control
Nathan Leroux
Jan Finkbeiner
Emre Neftci
33
9
0
21 Mar 2023
GlueGen: Plug and Play Multi-modal Encoders for X-to-image Generation
Can Qin
Ning Yu
Chen Xing
Shu Zhen Zhang
Zeyuan Chen
Stefano Ermon
Yun Fu
Caiming Xiong
Ran Xu
DiffM
40
20
0
17 Mar 2023
MedNeXt: Transformer-driven Scaling of ConvNets for Medical Image Segmentation
Saikat Roy
Gregor Koehler
Constantin Ulrich
Michael Baumgartner
Jens Petersen
Fabian Isensee
Paul F. Jaeger
Klaus Maier-Hein
ViT
MedIm
35
138
0
17 Mar 2023
Block-wise Bit-Compression of Transformer-based Models
Gaochen Dong
W. Chen
24
0
0
16 Mar 2023
Graph Transformer GANs for Graph-Constrained House Generation
H. Tang
Zhenyu Zhang
Humphrey Shi
Bo-wen Li
Lin Shao
N. Sebe
Radu Timofte
Luc Van Gool
46
19
0
14 Mar 2023
Good Neighbors Are All You Need for Chinese Grapheme-to-Phoneme Conversion
Jungjun Kim
C. Han
Gyuhyeon Nam
Gyeongsu Chae
11
2
0
14 Mar 2023
ViM: Vision Middleware for Unified Downstream Transferring
Yutong Feng
Biao Gong
Jianwen Jiang
Yiliang Lv
Yujun Shen
Deli Zhao
Jingren Zhou
32
1
0
13 Mar 2023
Transformer Encoder with Multiscale Deep Learning for Pain Classification Using Physiological Signals
Zhenyu Lu
Burcu Ozek
S. Kamarthi
ViT
MedIm
29
14
0
13 Mar 2023