ResearchTrend.AI
Gaussian Error Linear Units (GELUs)

27 June 2016
Dan Hendrycks
Kevin Gimpel

Papers citing "Gaussian Error Linear Units (GELUs)"

50 / 966 papers shown
Language-Driven Representation Learning for Robotics
Siddharth Karamcheti
Suraj Nair
Annie S. Chen
Thomas Kollar
Chelsea Finn
Dorsa Sadigh
Percy Liang
LM&Ro
SSL
47
145
0
24 Feb 2023
Adapting Pre-trained Language Models for Quantum Natural Language Processing
Qiuchi Li
Benyou Wang
Yudong Zhu
Christina Lioma
Qun Liu
AI4CE
37
4
0
24 Feb 2023
A critical look at the evaluation of GNNs under heterophily: Are we really making progress?
Oleg Platonov
Denis Kuznedelev
Michael Diskin
Artem Babenko
Liudmila Prokhorenkova
36
193
0
22 Feb 2023
A residual dense vision transformer for medical image super-resolution with segmentation-based perceptual loss fine-tuning
Jin Zhu
Guang Yang
Pietro Lio
ViT
MedIm
32
5
0
22 Feb 2023
Variational Autoencoding Neural Operators
Jacob H. Seidman
Georgios Kissas
George J. Pappas
P. Perdikaris
DRL
AI4CE
29
7
0
20 Feb 2023
Measuring Equality in Machine Learning Security Defenses: A Case Study in Speech Recognition
Luke E. Richards
Edward Raff
Cynthia Matuszek
AAML
16
2
0
17 Feb 2023
G-Signatures: Global Graph Propagation With Randomized Signatures
Bernhard Schafl
Lukas Gruber
Johannes Brandstetter
Sepp Hochreiter
27
2
0
17 Feb 2023
Synaptic Stripping: How Pruning Can Bring Dead Neurons Back To Life
Tim Whitaker
L. D. Whitley
CVBM
33
2
0
11 Feb 2023
Neural Capacitated Clustering
Jonas K. Falkner
Lars Schmidt-Thieme
32
1
0
10 Feb 2023
The Monge Gap: A Regularizer to Learn All Transport Maps
Théo Uscidda
Marco Cuturi
OT
55
27
0
09 Feb 2023
Better Diffusion Models Further Improve Adversarial Training
Zekai Wang
Tianyu Pang
Chao Du
Min Lin
Weiwei Liu
Shuicheng Yan
DiffM
26
210
0
09 Feb 2023
MTS-Mixers: Multivariate Time Series Forecasting via Factorized Temporal and Channel Mixing
Zhe Li
Zhongwen Rao
Lujia Pan
Zenglin Xu
AI4TS
33
62
0
09 Feb 2023
Climate Intervention Analysis using AI Model Guided by Statistical Physics Principles
S. K. Kim
Kalai Ramea
Salva Rühling Cachay
H. Hirasawa
Subhashis Hazarika
D. Hingmire
Peetak Mitra
P. Rasch
Hansi K. A. Singh
AI4CE
35
0
0
07 Feb 2023
On the Ideal Number of Groups for Isometric Gradient Propagation
Bum Jun Kim
Hyeyeon Choi
Hyeonah Jang
Sang Woo Kim
32
1
0
07 Feb 2023
GPS++: Reviving the Art of Message Passing for Molecular Property Prediction
Dominic Masters
Josef Dean
Kerstin Klaser
Zhiyi Li
Sam Maddrell-Mander
...
D. Beker
Andrew Fitzgibbon
Shenyang Huang
Ladislav Rampášek
Dominique Beaini
41
8
0
06 Feb 2023
Randomized prior wavelet neural operator for uncertainty quantification
Shailesh Garg
S. Chakraborty
UQCV
BDL
28
1
0
02 Feb 2023
FCB-SwinV2 Transformer for Polyp Segmentation
Kerr Fitzgerald
B. Matuszewski
ViT
MedIm
21
12
0
02 Feb 2023
An Enhanced V-cycle MgNet Model for Operator Learning in Numerical Partial Differential Equations
Jianqing Zhu
Juncai He
Qiumei Huang
40
4
0
02 Feb 2023
A Survey of Deep Learning: From Activations to Transformers
Johannes Schneider
Michalis Vlachos
ViT
MedIm
AI4TS
AI4CE
50
10
0
01 Feb 2023
Reverse Ordering Techniques for Attention-Based Channel Prediction
Valentina Rizzello
Benedikt Bock
M. Joham
Wolfgang Utschick
AI4TS
38
7
0
01 Feb 2023
Learning Functional Transduction
Mathieu Chalvidal
Thomas Serre
Rufin VanRullen
AI4CE
50
2
0
01 Feb 2023
Learning to Speak from Text: Zero-Shot Multilingual Text-to-Speech with Unsupervised Text Pretraining
Takaaki Saeki
Soumi Maiti
Xinjian Li
Shinji Watanabe
Shinnosuke Takamichi
Hiroshi Saruwatari
32
18
0
30 Jan 2023
Case-Based Reasoning with Language Models for Classification of Logical Fallacies
Zhivar Sourati
Filip Ilievski
Hông-Ân Sandlin
Alain Mermoud
LRM
26
12
0
27 Jan 2023
Boundary Aware U-Net for Glacier Segmentation
B. Aryal
K. Miles
S. V. Zesati
O. Fuentes
14
2
0
26 Jan 2023
Rigid Body Flows for Sampling Molecular Crystal Structures
Jonas Köhler
Michele Invernizzi
P. D. Haan
Frank Noé
AI4CE
46
27
0
26 Jan 2023
AQuaMaM: An Autoregressive, Quaternion Manifold Model for Rapidly Estimating Complex SO(3) Distributions
Michael A. Alcorn
35
0
0
21 Jan 2023
Holistically Explainable Vision Transformers
Moritz D Boehle
Mario Fritz
Bernt Schiele
ViT
41
9
0
20 Jan 2023
Human-Timescale Adaptation in an Open-Ended Task Space
Adaptive Agent Team
Jakob Bauer
Kate Baumli
Satinder Baveja
Feryal M. P. Behbahani
...
Jakub Sygnowski
K. Tuyls
Sarah York
Alexander Zacherl
Lei Zhang
LM&Ro
OffRL
AI4CE
LRM
40
110
0
18 Jan 2023
Ankh: Optimized Protein Language Model Unlocks General-Purpose Modelling
Ahmed Elnaggar
Hazem Essam
Wafaa Salah-Eldin
Walid Moustafa
Mohamed Elkerdawy
Charlotte Rochereau
B. Rost
172
87
0
16 Jan 2023
Efficient Activation Function Optimization through Surrogate Modeling
G. Bingham
Risto Miikkulainen
24
2
0
13 Jan 2023
LVRNet: Lightweight Image Restoration for Aerial Images under Low Visibility
Esha Pahwa
Achleshwar Luthra
Pratik Narang
39
4
0
13 Jan 2023
Interaction-Aware Trajectory Planning for Autonomous Vehicles with Analytic Integration of Neural Networks into Model Predictive Control
Piyush Gupta
David Isele
Donggun Lee
S. Bae
47
19
0
13 Jan 2023
Tracr: Compiled Transformers as a Laboratory for Interpretability
David Lindner
János Kramár
Sebastian Farquhar
Matthew Rahtz
Tom McGrath
Vladimir Mikulik
34
72
0
12 Jan 2023
ViTs for SITS: Vision Transformers for Satellite Image Time Series
Michail Tarasiou
Erik Chavez
S. Zafeiriou
ViT
26
50
0
12 Jan 2023
Evaluating the Transferability of Machine-Learned Force Fields for Material Property Modeling
Shaswat Mohanty
S. Yoo
K. Kang
W. Cai
31
2
0
10 Jan 2023
A Study on the Generality of Neural Network Structures for Monocular Depth Estimation
Ji-Hoon Bae
K. Hwang
Sunghoon Im
MDE
34
7
0
09 Jan 2023
"No, to the Right" -- Online Language Corrections for Robotic Manipulation via Shared Autonomy
Yuchen Cui
Siddharth Karamcheti
Raj Palleti
Nidhya Shivakumar
Percy Liang
Dorsa Sadigh
LM&Ro
43
76
0
06 Jan 2023
Skip-Attention: Improving Vision Transformers by Paying Less Attention
Shashanka Venkataramanan
Amir Ghodrati
Yuki M. Asano
Fatih Porikli
A. Habibian
ViT
23
25
0
05 Jan 2023
On the Geometry of Reinforcement Learning in Continuous State and Action Spaces
Saket Tiwari
Omer Gottesman
George Konidaris
29
0
0
29 Dec 2022
OVO: One-shot Vision Transformer Search with Online distillation
Zimian Wei
H. Pan
Xin-Yi Niu
Dongsheng Li
ViT
34
1
0
28 Dec 2022
Representation Separation for Semantic Segmentation with Vision Transformers
Yuanduo Hong
Huihui Pan
Weichao Sun
Xinghu Yu
Huijun Gao
ViT
28
5
0
28 Dec 2022
DAE-Former: Dual Attention-guided Efficient Transformer for Medical Image Segmentation
Reza Azad
René Arimond
Ehsan Khodapanah Aghdam
Amirhosein Kazerouni
Dorit Merhof
ViT
MedIm
32
78
0
27 Dec 2022
A Generalization of ViT/MLP-Mixer to Graphs
Xiaoxin He
Bryan Hooi
T. Laurent
Adam Perold
Yann LeCun
Xavier Bresson
49
88
0
27 Dec 2022
Training Integer-Only Deep Recurrent Neural Networks
V. Nia
Eyyub Sari
Vanessa Courville
M. Asgharian
MQ
53
2
0
22 Dec 2022
ReVISE: Self-Supervised Speech Resynthesis with Visual Input for Universal and Generalized Speech Enhancement
Wei-Ning Hsu
Tal Remez
Bowen Shi
Jacob Donley
Yossi Adi
DiffM
27
12
0
21 Dec 2022
Pretraining Without Attention
Junxiong Wang
J. Yan
Albert Gu
Alexander M. Rush
27
48
0
20 Dec 2022
Visual Transformers for Primates Classification and Covid Detection
Steffen Illium
Robert Muller
Andreas Sedlmeier
Claudia Linnhoff-Popien
38
11
0
20 Dec 2022
Constructing Organism Networks from Collaborative Self-Replicators
Steffen Illium
Maximilian Zorn
Cristian Lenta
Michael Kolle
Claudia Linnhoff-Popien
Thomas Gabor
21
0
0
20 Dec 2022
SegAugment: Maximizing the Utility of Speech Translation Data with Segmentation-based Augmentations
Ioannis Tsiamas
José A. R. Fonollosa
Marta R. Costa-jussá
46
6
0
19 Dec 2022
Latent Diffusion for Language Generation
Justin Lovelace
Varsha Kishore
Chao-gang Wan
Eliot Shekhtman
Kilian Q. Weinberger
DiffM
29
71
0
19 Dec 2022