Gaussian Error Linear Units (GELUs)


27 June 2016
Dan Hendrycks
Kevin Gimpel

Papers citing "Gaussian Error Linear Units (GELUs)"

50 / 916 papers shown
Guaranteed Discovery of Control-Endogenous Latent States with Multi-Step Inverse Models
Alex Lamb
Riashat Islam
Yonathan Efroni
Aniket Didolkar
Dipendra Kumar Misra
Dylan J. Foster
Lekan Molu
Rajan Chari
A. Krishnamurthy
John Langford
46
24
0
17 Jul 2022
CATRE: Iterative Point Clouds Alignment for Category-level Object Pose Refinement
Xingyu Liu
Gu Wang
Yi Li
Xiangyang Ji
3DPC
34
28
0
17 Jul 2022
Explainable Sparse Knowledge Graph Completion via High-order Graph Reasoning Network
Weijia Chen
Yixin Cao
Fuli Feng
Xiangnan He
Yongdong Zhang
13
2
0
14 Jul 2022
Sub 8-Bit Quantization of Streaming Keyword Spotting Models for Embedded Chipsets
Lu Zeng
S. Parthasarathi
Yuzong Liu
Alex Escott
S. Cheekatmalla
N. Strom
S. Vitaladevuni
MQ
33
5
0
13 Jul 2022
Earthformer: Exploring Space-Time Transformers for Earth System Forecasting
Zhihan Gao
Xingjian Shi
Hao Wang
Yi Zhu
Yuyang Wang
Mu Li
Dit-Yan Yeung
AI4TS
42
150
0
12 Jul 2022
Vision Transformer for NeRF-Based View Synthesis from a Single Input Image
Kai-En Lin
Yen-Chen Lin
Wei-Sheng Lai
Nayeon Lee
Yichang Shih
R. Ramamoorthi
ViT
27
112
0
12 Jul 2022
Robust and efficient computation of retinal fractal dimension through deep approximation
Justin Engelmann
A. Villaplana-Velasco
Amos Storkey
Miguel O. Bernabeu
9
11
0
12 Jul 2022
The Cosmic Graph: Optimal Information Extraction from Large-Scale Structure using Catalogues
T. Lucas Makinen
Tom Charnock
Pablo Lemos
Natalia Porqueres
A. Heavens
Benjamin Dan Wandelt
22
26
0
11 Jul 2022
PI-Trans: Parallel-ConvMLP and Implicit-Transformation Based GAN for Cross-View Image Translation
Bin Ren
Hao Tang
Yiming Wang
Xia Li
Wei Wang
N. Sebe
ViT
24
5
0
09 Jul 2022
TFCNs: A CNN-Transformer Hybrid Network for Medical Image Segmentation
Zihan Li
Dihan Li
Cangbai Xu
Wei-Chien Wang
Qingqi Hong
Qingde Li
Jie Tian
ViT
MedIm
35
46
0
07 Jul 2022
Branchformer: Parallel MLP-Attention Architectures to Capture Local and Global Context for Speech Recognition and Understanding
Yifan Peng
Siddharth Dalmia
Ian Lane
Shinji Watanabe
30
143
0
06 Jul 2022
Pure Transformers are Powerful Graph Learners
Jinwoo Kim
Tien Dat Nguyen
Seonwoo Min
Sungjun Cho
Moontae Lee
Honglak Lee
Seunghoon Hong
43
189
0
06 Jul 2022
A Densely Interconnected Network for Deep Learning Accelerated MRI
Jon André Ottesen
M. Caan
I. Groote
A. Bjørnerud
49
9
0
05 Jul 2022
The Deep Ritz Method for Parametric $p$-Dirichlet Problems
A. Kaltenbach
Marius Zeinhofer
22
3
0
05 Jul 2022
I-ViT: Integer-only Quantization for Efficient Vision Transformer Inference
Zhikai Li
Qingyi Gu
MQ
57
95
0
04 Jul 2022
MotionMixer: MLP-based 3D Human Body Pose Forecasting
Arij Bouazizi
Adrian Holzbock
U. Kressel
Klaus C. J. Dietmayer
Vasileios Belagiannis
3DH
35
73
0
01 Jul 2022
LaRa: Latents and Rays for Multi-Camera Bird's-Eye-View Semantic Segmentation
Florent Bartoccioni
Éloi Zablocki
Andrei Bursuc
Patrick Pérez
Matthieu Cord
Alahari Karteek
51
33
0
27 Jun 2022
QbyE-MLPMixer: Query-by-Example Open-Vocabulary Keyword Spotting using MLPMixer
Jinmiao Huang
W. Gharbieh
Qianhui Wan
Han Suk Shim
Chul Lee
22
9
0
23 Jun 2022
Answer Fast: Accelerating BERT on the Tensor Streaming Processor
I. Ahmed
Sahil Parmar
Matthew Boyd
Michael Beidler
Kris Kang
Bill Liu
Kyle Roach
John Kim
D. Abts
LLMAG
20
6
0
22 Jun 2022
Surgical-VQA: Visual Question Answering in Surgical Scenes using Transformer
Lalithkumar Seenivasan
Mobarakol Islam
Adithya K. Krishna
Hongliang Ren
MedIm
21
45
0
22 Jun 2022
EdgeNeXt: Efficiently Amalgamated CNN-Transformer Architecture for Mobile Vision Applications
Muhammad Maaz
Abdelrahman M. Shaker
Hisham Cholakkal
Salman Khan
Syed Waqas Zamir
Rao Muhammad Anwer
Fahad Shahbaz Khan
ViT
40
184
0
21 Jun 2022
Robust SDE-Based Variational Formulations for Solving Linear PDEs via Deep Learning
Lorenz Richter
Julius Berner
27
19
0
21 Jun 2022
Transformers Improve Breast Cancer Diagnosis from Unregistered Multi-View Mammograms
Xuxin Chen
Keqin Zhang
Neman Abdoli
Patrik W. Gilley
Ximing Wang
Hong Liu
Bin Zheng
Y. Qiu
ViT
MedIm
19
51
0
21 Jun 2022
CMT-DeepLab: Clustering Mask Transformers for Panoptic Segmentation
Qihang Yu
Huiyu Wang
Dahun Kim
Siyuan Qiao
Maxwell D. Collins
Yukun Zhu
Hartwig Adam
Alan Yuille
Liang-Chieh Chen
ViT
MedIm
32
90
0
17 Jun 2022
SimA: Simple Softmax-free Attention for Vision Transformers
Soroush Abbasi Koohpayegani
Hamed Pirsiavash
24
25
0
17 Jun 2022
IRISformer: Dense Vision Transformers for Single-Image Inverse Rendering in Indoor Scenes
Rui Zhu
Zhengqin Li
J. Matai
Fatih Porikli
Manmohan Chandraker
ViT
43
46
0
16 Jun 2022
Wide Bayesian neural networks have a simple weight posterior: theory and accelerated sampling
Jiri Hron
Roman Novak
Jeffrey Pennington
Jascha Narain Sohl-Dickstein
UQCV
BDL
48
6
0
15 Jun 2022
Understanding the Generalization Benefit of Normalization Layers: Sharpness Reduction
Kaifeng Lyu
Zhiyuan Li
Sanjeev Arora
FAtt
45
71
0
14 Jun 2022
Multimodal Learning with Transformers: A Survey
P. Xu
Xiatian Zhu
David A. Clifton
ViT
72
528
0
13 Jun 2022
MLP-3D: A MLP-like 3D Architecture with Grouped Time Mixing
Zhaofan Qiu
Ting Yao
Chong-Wah Ngo
Tao Mei
ViT
37
15
0
13 Jun 2022
Bootstrapping Multi-view Representations for Fake News Detection
Qichao Ying
Xiaoxiao Hu
Yangming Zhou
Zhenxing Qian
Dan Zeng
Shiming Ge
24
45
0
12 Jun 2022
NOMAD: Nonlinear Manifold Decoders for Operator Learning
Jacob H. Seidman
Georgios Kissas
P. Perdikaris
George J. Pappas
AI4CE
31
68
0
07 Jun 2022
How to Dissect a Muppet: The Structure of Transformer Embedding Spaces
Timothee Mickus
Denis Paperno
Mathieu Constant
31
19
0
07 Jun 2022
Utility of Equivariant Message Passing in Cortical Mesh Segmentation
Dániel Unyi
F. Insalata
Petar Velickovic
Bálint Gyires-Tóth
22
0
0
07 Jun 2022
PEER: A Comprehensive and Multi-Task Benchmark for Protein Sequence Understanding
Minghao Xu
Zuobai Zhang
Jiarui Lu
Zhaocheng Zhu
Yang Zhang
Chang Ma
Runcheng Liu
Jian Tang
16
98
0
05 Jun 2022
Towards Improving the Generation Quality of Autoregressive Slot VAEs
Patrick Emami
Pan He
Sanjay Ranka
Anand Rangarajan
OCL
38
1
0
03 Jun 2022
Vision GNN: An Image is Worth Graph of Nodes
Kai Han
Yunhe Wang
Jianyuan Guo
Yehui Tang
Enhua Wu
GNN
3DH
19
356
0
01 Jun 2022
Inferring 3D change detection from bitemporal optical images
V. Marsocci
V. Coletta
R. Ravanelli
Simone Scardapane
M. Crespi
3DPC
33
19
0
31 May 2022
Exact Feature Collisions in Neural Networks
Utku Ozbulak
Manvel Gasparyan
Shodhan Rao
W. D. Neve
Arnout Van Messem
AAML
27
1
0
31 May 2022
NEWTS: A Corpus for News Topic-Focused Summarization
Seyed Ali Bahrainian
Sheridan Feucht
Carsten Eickhoff
74
24
0
31 May 2022
COFS: Controllable Furniture layout Synthesis
W. Para
Paul Guerrero
Niloy Mitra
Peter Wonka
3DV
42
16
0
29 May 2022
On the Symmetries of Deep Learning Models and their Internal Representations
Charles Godfrey
Davis Brown
Tegan H. Emerson
Henry Kvinge
28
40
0
27 May 2022
Transformer for Partial Differential Equations' Operator Learning
Zijie Li
Kazem Meidani
A. Farimani
42
144
0
26 May 2022
MemeTector: Enforcing deep focus for meme detection
C. Koutlis
Emmanouil Schinas
Symeon Papadopoulos
VLM
48
9
0
26 May 2022
ATTEMPT: Parameter-Efficient Multi-task Tuning via Attentional Mixtures of Soft Prompts
Akari Asai
Mohammadreza Salehi
Matthew E. Peters
Hannaneh Hajishirzi
130
100
0
24 May 2022
Simple Recurrence Improves Masked Language Models
Tao Lei
Ran Tian
Jasmijn Bastings
Ankur P. Parikh
85
4
0
23 May 2022
Super Vision Transformer
Mingbao Lin
Yonghong Tian
Yuxin Zhang
Yunhang Shen
Rongrong Ji
Liujuan Cao
ViT
46
20
0
23 May 2022
Time-series Transformer Generative Adversarial Networks
Padmanaba Srinivasan
William J. Knottenbelt
AI4TS
28
13
0
23 May 2022
Deep Digging into the Generalization of Self-Supervised Monocular Depth Estimation
Ji-Hoon Bae
Sungho Moon
Sunghoon Im
MDE
33
84
0
23 May 2022
Memorization Without Overfitting: Analyzing the Training Dynamics of Large Language Models
Kushal Tirumala
Aram H. Markosyan
Luke Zettlemoyer
Armen Aghajanyan
TDI
29
187
0
22 May 2022