Gaussian Error Linear Units (GELUs)


27 June 2016
Dan Hendrycks
Kevin Gimpel

Papers citing "Gaussian Error Linear Units (GELUs)"

50 / 945 papers shown
Title
Improved Feature Distillation via Projector Ensemble
Yudong Chen
Sen Wang
Jiajun Liu
Xuwei Xu
Frank de Hoog
Zi Huang
39
37
0
27 Oct 2022
M$^3$ViT: Mixture-of-Experts Vision Transformer for Efficient Multi-task Learning with Model-Accelerator Co-design
Hanxue Liang
Zhiwen Fan
Rishov Sarkar
Ziyu Jiang
Tianlong Chen
Kai Zou
Yu Cheng
Cong Hao
Zhangyang Wang
MoE
42
81
0
26 Oct 2022
PredNAS: A Universal and Sample Efficient Neural Architecture Search Framework
Liuchun Yuan
Zehao Huang
Naiyan Wang
29
0
0
26 Oct 2022
Clinically-Inspired Multi-Agent Transformers for Disease Trajectory Forecasting from Multimodal Data
Huy Hoang Nguyen
Matthew B. Blaschko
S. Saarakkala
A. Tiulpin
MedIm
AI4CE
50
15
0
25 Oct 2022
MetaFormer Baselines for Vision
Weihao Yu
Chenyang Si
Pan Zhou
Mi Luo
Yichen Zhou
Jiashi Feng
Shuicheng Yan
Xinchao Wang
MoE
40
158
0
24 Oct 2022
A Continuous Convolutional Trainable Filter for Modelling Unstructured Data
Dario Coscia
L. Meneghetti
N. Demo
G. Stabile
G. Rozza
24
8
0
24 Oct 2022
CMU-Net: A Strong ConvMixer-based Medical Ultrasound Image Segmentation Network
Fenghe Tang
Lingtao Wang
C. Ning
Min Xian
Jianrui Ding
30
60
0
24 Oct 2022
Compressing multidimensional weather and climate data into neural networks
La-mei Huang
Torsten Hoefler
AI4CE
49
31
0
22 Oct 2022
Stochastic Adaptive Activation Function
Kyungsu Lee
Jaeseung Yang
Haeyun Lee
J. Y. Hwang
30
3
0
21 Oct 2022
Graphically Structured Diffusion Models
Christian D. Weilbach
William Harvey
Frank Wood
DiffM
40
7
0
20 Oct 2022
Coordinates Are NOT Lonely -- Codebook Prior Helps Implicit Neural 3D Representations
Fukun Yin
Wen Liu
Zilong Huang
Pei Cheng
Tao Chen
Gang Yu
22
19
0
20 Oct 2022
Dense but Efficient VideoQA for Intricate Compositional Reasoning
Jihyeon Janel Lee
Wooyoung Kang
Eun-Sol Kim
CoGe
24
3
0
19 Oct 2022
Nish: A Novel Negative Stimulated Hybrid Activation Function
Yildiray Anagün
Ş. Işık
27
2
0
17 Oct 2022
Scratching Visual Transformer's Back with Uniform Attention
Nam Hyeon-Woo
Kim Yu-Ji
Byeongho Heo
Dongyoon Han
Seong Joon Oh
Tae-Hyun Oh
366
23
0
16 Oct 2022
Hierarchical Approach for Joint Semantic, Plant Instance, and Leaf Instance Segmentation in the Agricultural Domain
Gianmarco Roggiolani
Matteo Sodano
Tiziano Guadagnino
Federico Magistri
Jens Behley
C. Stachniss
17
23
0
14 Oct 2022
Experiments on Turkish ASR with Self-Supervised Speech Representation Learning
Ali Safaya
E. Erzin
21
1
0
13 Oct 2022
CTL++: Evaluating Generalization on Never-Seen Compositional Patterns of Known Functions, and Compatibility of Neural Representations
Róbert Csordás
Kazuki Irie
Jürgen Schmidhuber
NAI
19
12
0
12 Oct 2022
Bridging the Gap Between Vision Transformers and Convolutional Neural Networks on Small Datasets
Zhiying Lu
Hongtao Xie
Chuanbin Liu
Yongdong Zhang
ViT
28
57
0
12 Oct 2022
Investigating the Failure Modes of the AUC metric and Exploring Alternatives for Evaluating Systems in Safety Critical Applications
Swaroop Mishra
Anjana Arunkumar
Chitta Baral
33
0
0
10 Oct 2022
Coded Residual Transform for Generalizable Deep Metric Learning
Shichao Kan
Yixiong Liang
Min Li
Yigang Cen
Jianxin Wang
Z. He
36
3
0
09 Oct 2022
A Transformer-based deep neural network model for SSVEP classification
Jianbo Chen
Yangsong Zhang
Yudong Pan
Peng Xu
Cuntai Guan
22
50
0
09 Oct 2022
Time-Space Transformers for Video Panoptic Segmentation
Andra Petrovai
S. Nedevschi
ViT
27
3
0
07 Oct 2022
GLM-130B: An Open Bilingual Pre-trained Model
Aohan Zeng
Xiao Liu
Zhengxiao Du
Zihan Wang
Hanyu Lai
...
Jidong Zhai
Wenguang Chen
Peng Zhang
Yuxiao Dong
Jie Tang
BDL
LRM
275
1,077
0
05 Oct 2022
Granularity-aware Adaptation for Image Retrieval over Multiple Tasks
Jon Almazán
ByungSoo Ko
Geonmo Gu
Diane Larlus
Yannis Kalantidis
ObjD
VLM
48
7
0
05 Oct 2022
Robust Fair Clustering: A Novel Fairness Attack and Defense Framework
Anshuman Chhabra
Peizhao Li
P. Mohapatra
Hongfu Liu
OOD
34
22
0
04 Oct 2022
MOAT: Alternating Mobile Convolution and Attention Brings Strong Vision Models
Chenglin Yang
Siyuan Qiao
Qihang Yu
Xiaoding Yuan
Yukun Zhu
Alan Yuille
Hartwig Adam
Liang-Chieh Chen
ViT
MoE
41
60
0
04 Oct 2022
The Effectiveness of Masked Language Modeling and Adapters for Factual Knowledge Injection
Sondre Wold
KELM
39
4
0
03 Oct 2022
CRISP: Curriculum based Sequential Neural Decoders for Polar Code Family
Ashwin Hebbar
Viraj Nadkarni
Ashok Vardhan Makkuva
S. Bhat
Sewoong Oh
Pramod Viswanath
30
6
0
01 Oct 2022
E-Branchformer: Branchformer with Enhanced merging for speech recognition
Kwangyoun Kim
Felix Wu
Yifan Peng
Jing Pan
Prashant Sridhar
Kyu Jeong Han
Shinji Watanabe
61
105
0
30 Sep 2022
BayesFT: Bayesian Optimization for Fault Tolerant Neural Network Architecture
Nanyang Ye
Jingbiao Mei
Zhicheng Fang
Yuwen Zhang
Ziqing Zhang
Huaying Wu
Xiaoyao Liang
OOD
33
5
0
30 Sep 2022
Towards Multi-spatiotemporal-scale Generalized PDE Modeling
Jayesh K. Gupta
Johannes Brandstetter
AI4CE
61
120
0
30 Sep 2022
Protein structure generation via folding diffusion
Kevin E. Wu
Kevin Kaichuang Yang
Rianne van den Berg
James Zou
Alex X. Lu
Ava P. Amini
DiffM
35
193
0
30 Sep 2022
State-specific protein-ligand complex structure prediction with a multi-scale deep generative model
Zhuoran Qiao
Weili Nie
Arash Vahdat
Thomas F. Miller
Anima Anandkumar
DiffM
39
84
0
30 Sep 2022
DreamFusion: Text-to-3D using 2D Diffusion
Ben Poole
Ajay Jain
Jonathan T. Barron
B. Mildenhall
85
2,323
0
29 Sep 2022
Continuous PDE Dynamics Forecasting with Implicit Neural Representations
Yuan Yin
Matthieu Kirchmeyer
Jean-Yves Franceschi
A. Rakotomamonjy
Patrick Gallinari
AI4CE
25
49
0
29 Sep 2022
Transfer Learning with Pretrained Remote Sensing Transformers
A. Fuller
K. Millard
J.R. Green
35
11
0
28 Sep 2022
Evolution TANN and the identification of internal variables and evolution equations in solid mechanics
Filippo Masi
I. Stefanou
AI4CE
31
30
0
27 Sep 2022
Rethinking Performance Gains in Image Dehazing Networks
Yuda Song
Yang Zhou
Hui Qian
Xin Du
SSeg
36
48
0
23 Sep 2022
Lightweight Transformers for Human Activity Recognition on Mobile Devices
Sannara Ek
François Portet
P. Lalanda
37
28
0
22 Sep 2022
DFX: A Low-latency Multi-FPGA Appliance for Accelerating Transformer-based Text Generation
Seongmin Hong
Seungjae Moon
Junsoo Kim
Sungjae Lee
Minsub Kim
Dongsoo Lee
Joo-Young Kim
72
77
0
22 Sep 2022
An Efficient End-to-End Transformer with Progressive Tri-modal Attention for Multi-modal Emotion Recognition
Yang Wu
Pai Peng
Zhenyu Zhang
Yanyan Zhao
Bing Qin
32
1
0
20 Sep 2022
LogGD:Detecting Anomalies from System Logs by Graph Neural Networks
Yongzhen Xie
Hongyu Zhang
M. Babar
AI4TS
23
20
0
16 Sep 2022
A Light Recipe to Train Robust Vision Transformers
Edoardo Debenedetti
Vikash Sehwag
Prateek Mittal
ViT
32
69
0
15 Sep 2022
Gromov-Wasserstein Autoencoders
Nao Nakagawa
Ren Togo
Takahiro Ogawa
Miki Haseyama
GAN
DRL
26
11
0
15 Sep 2022
On the interplay of adversarial robustness and architecture components: patches, convolution and attention
Francesco Croce
Matthias Hein
43
6
0
14 Sep 2022
Characterizing Graph Datasets for Node Classification: Homophily-Heterophily Dichotomy and Beyond
Oleg Platonov
Denis Kuznedelev
Artem Babenko
Liudmila Prokhorenkova
59
37
0
13 Sep 2022
On the Factory Floor: ML Engineering for Industrial-Scale Ads Recommendation Models
Rohan Anil
S. Gadanho
Danya Huang
Nijith Jacob
Zhuoshu Li
...
Cristina Pop
Kevin Regan
G. Shamir
Rakesh Shivanna
Qiqi Yan
3DV
29
41
0
12 Sep 2022
Spach Transformer: Spatial and Channel-wise Transformer Based on Local and Global Self-attentions for PET Image Denoising
Se-In Jang
T. Pan
Ye Li
P. Heidari
Junyu Chen
Quanzheng Li
Kuang Gong
ViT
MedIm
36
27
0
07 Sep 2022
Bag of Tricks for FGSM Adversarial Training
Zichao Li
Li Liu
Zeyu Wang
Yuyin Zhou
Cihang Xie
AAML
35
6
0
06 Sep 2022
How important are activation functions in regression and classification? A survey, performance comparison, and future directions
Ameya Dilip Jagtap
George Karniadakis
AI4CE
37
71
0
06 Sep 2022