Gaussian Error Linear Units (GELUs)

27 June 2016
Dan Hendrycks
Kevin Gimpel

Papers citing "Gaussian Error Linear Units (GELUs)"

50 / 886 papers shown
CITRIS: Causal Identifiability from Temporal Intervened Sequences
Phillip Lippe
Sara Magliacane
Sindy Löwe
Yuki M. Asano
Taco S. Cohen
E. Gavves
CML
43
101
0
07 Feb 2022
TIML: Task-Informed Meta-Learning for Agriculture
Gabriel Tseng
Hannah Kerner
David Rolnick
19
7
0
04 Feb 2022
Robust Training of Neural Networks Using Scale Invariant Architectures
Zhiyuan Li
Srinadh Bhojanapalli
Manzil Zaheer
Sashank J. Reddi
Sanjiv Kumar
29
27
0
02 Feb 2022
Designing Universal Causal Deep Learning Models: The Geometric (Hyper)Transformer
Beatrice Acciaio
Anastasis Kratsios
G. Pammer
OOD
52
20
0
31 Jan 2022
IGLUE: A Benchmark for Transfer Learning across Modalities, Tasks, and Languages
Emanuele Bugliarello
Fangyu Liu
Jonas Pfeiffer
Siva Reddy
Desmond Elliott
E. Ponti
Ivan Vulić
MLLM
VLM
ELM
50
62
0
27 Jan 2022
DSFormer: A Dual-domain Self-supervised Transformer for Accelerated Multi-contrast MRI Reconstruction
Bo Zhou
Neel Dey
Jo Schlemper
S. Salehi
Chi Liu
James S. Duncan
M. Sofka
MedIm
30
56
0
26 Jan 2022
Video Transformers: A Survey
Javier Selva
A. S. Johansen
Sergio Escalera
Kamal Nasrollahi
T. Moeslund
Albert Clapés
ViT
22
103
0
16 Jan 2022
Boundary-aware Self-supervised Learning for Video Scene Segmentation
Jonghwan Mun
Minchul Shin
Gunsoo Han
Sangho Lee
S. Ha
Joonseok Lee
Eun-Sol Kim
SSL
49
20
0
14 Jan 2022
Multiview Transformers for Video Recognition
Shen Yan
Xuehan Xiong
Anurag Arnab
Zhichao Lu
Mi Zhang
Chen Sun
Cordelia Schmid
ViT
26
212
0
12 Jan 2022
A ConvNet for the 2020s
Zhuang Liu
Hanzi Mao
Chao-Yuan Wu
Christoph Feichtenhofer
Trevor Darrell
Saining Xie
ViT
42
4,983
0
10 Jan 2022
Linear Variational State-Space Filtering
Daniel Pfrommer
Nikolai Matni
30
1
0
04 Jan 2022
Learning Operators with Coupled Attention
Georgios Kissas
Jacob H. Seidman
Leonardo Ferreira Guilhoto
V. Preciado
George J. Pappas
P. Perdikaris
32
110
0
04 Jan 2022
PyramidTNT: Improved Transformer-in-Transformer Baselines with Pyramid Architecture
Kai Han
Jianyuan Guo
Yehui Tang
Yunhe Wang
ViT
34
22
0
04 Jan 2022
EvoMoE: An Evolutional Mixture-of-Experts Training Framework via Dense-To-Sparse Gate
Xiaonan Nie
Xupeng Miao
Shijie Cao
Lingxiao Ma
Qibin Liu
Jilong Xue
Youshan Miao
Yi Liu
Zhi-Xin Yang
Bin Cui
MoMe
MoE
29
22
0
29 Dec 2021
Augmenting Convolutional networks with attention-based aggregation
Hugo Touvron
Matthieu Cord
Alaaeldin El-Nouby
Piotr Bojanowski
Armand Joulin
Gabriel Synnaeve
Hervé Jégou
ViT
38
47
0
27 Dec 2021
Learning Generative Vision Transformer with Energy-Based Latent Space for Saliency Prediction
Jing Zhang
Jianwen Xie
Nick Barnes
Ping Li
ViT
40
90
0
27 Dec 2021
Extending CLIP for Category-to-image Retrieval in E-commerce
Mariya Hendriksen
Maurits J. R. Bleeker
Svitlana Vakulenko
Nanne van Noord
E. Kuiper
Maarten de Rijke
VLM
11
30
0
21 Dec 2021
RepMLPNet: Hierarchical Vision MLP with Re-parameterized Locality
Xiaohan Ding
Honghao Chen
Xinming Zhang
Jungong Han
Guiguang Ding
17
71
0
21 Dec 2021
Lite Vision Transformer with Enhanced Self-Attention
Chenglin Yang
Yilin Wang
Jianming Zhang
He Zhang
Zijun Wei
Zhe-nan Lin
Alan Yuille
ViT
21
112
0
20 Dec 2021
Efficient Large Scale Language Modeling with Mixtures of Experts
Mikel Artetxe
Shruti Bhosale
Naman Goyal
Todor Mihaylov
Myle Ott
...
Jeff Wang
Luke Zettlemoyer
Mona T. Diab
Zornitsa Kozareva
Ves Stoyanov
MoE
61
188
0
20 Dec 2021
Decoupling and Recoupling Spatiotemporal Representation for RGB-D-based Motion Recognition
Benjia Zhou
Pichao Wang
Jun Wan
Yanyan Liang
Fan Wang
Du Zhang
Zhen Lei
Hao Li
Rong Jin
36
29
0
16 Dec 2021
Consistent Depth Prediction under Various Illuminations using Dilated Cross Attention
Zitian Zhang
Chuhua Xian
3DV
MDE
43
0
0
15 Dec 2021
Hformer: Hybrid CNN-Transformer for Fringe Order Prediction in Phase Unwrapping of Fringe Projection
Xinjun Zhu
Zhiqiang Han
Mengkai Yuan
Qinghua Guo
Hongyi Wang
22
4
0
13 Dec 2021
Measuring Complexity of Learning Schemes Using Hessian-Schatten Total Variation
Shayan Aziznejad
Joaquim Campos
M. Unser
27
9
0
12 Dec 2021
Perceptual Loss with Recognition Model for Single-Channel Enhancement and Robust ASR
Peter William VanHarn Plantinga
Deblin Bagchi
Eric Fosler-Lussier
46
10
0
11 Dec 2021
Graph Neural Networks Accelerated Molecular Dynamics
Zijie Li
Kazem Meidani
Prakarsh Yadav
A. Farimani
GNN
AI4CE
21
53
0
06 Dec 2021
Efficient Symptom Inquiring and Diagnosis via Adaptive Alignment of Reinforcement Learning and Classification
Hongyi Yuan
Sheng Yu
24
15
0
01 Dec 2021
MC-SSL0.0: Towards Multi-Concept Self-Supervised Learning
Sara Atito
Muhammad Awais
Ammarah Farooq
Zhenhua Feng
J. Kittler
19
17
0
30 Nov 2021
TransWeather: Transformer-based Restoration of Images Degraded by Adverse Weather Conditions
Jeya Maria Jose Valanarasu
R. Yasarla
Vishal M. Patel
ViT
54
276
0
29 Nov 2021
ClimART: A Benchmark Dataset for Emulating Atmospheric Radiative Transfer in Weather and Climate Models
Salva Rühling Cachay
Venkatesh Ramesh
J. Cole
H. Barker
David Rolnick
14
20
0
29 Nov 2021
diffConv: Analyzing Irregular Point Clouds with an Irregular View
Manxi Lin
Aasa Feragen
GNN
3DPC
33
11
0
29 Nov 2021
Mixed Precision Low-bit Quantization of Neural Network Language Models for Speech Recognition
Junhao Xu
Jianwei Yu
Shoukang Hu
Xunying Liu
Helen Meng
MQ
27
13
0
29 Nov 2021
Mixed Precision of Quantization of Transformer Language Models for Speech Recognition
Junhao Xu
Shoukang Hu
Jianwei Yu
Xunying Liu
Helen M. Meng
MQ
40
15
0
29 Nov 2021
Video Frame Interpolation Transformer
Zhihao Shi
Xiangyu Xu
Xiaohong Liu
Jun Chen
Ming-Hsuan Yang
ViT
17
159
0
27 Nov 2021
Scene Representation Transformer: Geometry-Free Novel View Synthesis Through Set-Latent Scene Representations
Mehdi S. M. Sajjadi
H. Meyer
Etienne Pot
Urs M. Bergmann
Klaus Greff
...
Daniel Duckworth
Alexey Dosovitskiy
Jakob Uszkoreit
Thomas Funkhouser
Andrea Tagliasacchi
ViT
49
184
0
25 Nov 2021
Pruning Self-attentions into Convolutional Layers in Single Path
Haoyu He
Jianfei Cai
Jing Liu
Zizheng Pan
Jing Zhang
Dacheng Tao
Bohan Zhuang
ViT
34
40
0
23 Nov 2021
Benchmarking Detection Transfer Learning with Vision Transformers
Yanghao Li
Saining Xie
Xinlei Chen
Piotr Dollar
Kaiming He
Ross B. Girshick
20
165
0
22 Nov 2021
PointMixer: MLP-Mixer for Point Cloud Understanding
Jaesung Choe
Chunghyun Park
François Rameau
Jaesik Park
In So Kweon
3DPC
45
98
0
22 Nov 2021
Mesa: A Memory-saving Training Framework for Transformers
Zizheng Pan
Peng Chen
Haoyu He
Jing Liu
Jianfei Cai
Bohan Zhuang
31
20
0
22 Nov 2021
Global and Local Alignment Networks for Unpaired Image-to-Image Translation
Guanglei Yang
H. Tang
Humphrey Shi
M. Ding
N. Sebe
Radu Timofte
Luc Van Gool
Elisa Ricci
18
1
0
19 Nov 2021
Restormer: Efficient Transformer for High-Resolution Image Restoration
Syed Waqas Zamir
Aditya Arora
Salman Khan
Munawar Hayat
Fahad Shahbaz Khan
Ming-Hsuan Yang
ViT
73
2,128
0
18 Nov 2021
Are Transformers More Robust Than CNNs?
Yutong Bai
Jieru Mei
Alan Yuille
Cihang Xie
ViT
AAML
192
258
0
10 Nov 2021
Data Augmentation Can Improve Robustness
Sylvestre-Alvise Rebuffi
Sven Gowal
D. A. Calian
Florian Stimberg
Olivia Wiles
Timothy A. Mann
AAML
22
270
0
09 Nov 2021
SMU: smooth activation function for deep networks using smoothing maximum technique
Koushik Biswas
Sandeep Kumar
Shilpak Banerjee
A. Pandey
36
32
0
08 Nov 2021
Are we ready for a new paradigm shift? A Survey on Visual Deep MLP
Ruiyang Liu
Hai-Tao Zheng
Li Tao
Dun Liang
Haitao Zheng
85
97
0
07 Nov 2021
AGGLIO: Global Optimization for Locally Convex Functions
Debojyoti Dey
B. Mukhoty
Purushottam Kar
14
2
0
06 Nov 2021
Hybrid Spectrogram and Waveform Source Separation
Alexandre Défossez
24
162
0
05 Nov 2021
Detecting Logical Relation In Contract Clauses
Alexandre Yukio Ichida
Felipe Meneguzzi
8
0
0
02 Nov 2021
Cross-Modality Fusion Transformer for Multispectral Object Detection
Q. Fang
D. Han
Zhaokui Wang
ViT
22
140
0
30 Oct 2021
BiC-Net: Learning Efficient Spatio-Temporal Relation for Text-Video Retrieval
Ning Han
Jingjing Chen
Chuhao Shi
Yawen Zeng
Guangyi Xiao
Hao Chen
22
10
0
29 Oct 2021