Gaussian Error Linear Units (GELUs)

27 June 2016
Dan Hendrycks
Kevin Gimpel

Papers citing "Gaussian Error Linear Units (GELUs)"

50 / 966 papers shown
Inductive Attention for Video Action Anticipation
Tsung-Ming Tai
G. Fiameni
Cheng-Kuang Lee
Simon See
Oswald Lanz
39
1
0
17 Dec 2022
Geometry-aware Autoregressive Models for Calorimeter Shower Simulations
Junze Liu
A. Ghosh
Dylan Smith
Pierre Baldi
D. Whiteson
OOD
AI4CE
19
5
0
16 Dec 2022
Gradient-based Intra-attention Pruning on Pre-trained Language Models
Ziqing Yang
Yiming Cui
Xin Yao
Shijin Wang
VLM
42
8
0
15 Dec 2022
DeFT-AN: Dense Frequency-Time Attentive Network for Multichannel Speech Enhancement
Dongheon Lee
Jung-Woo Choi
32
25
0
15 Dec 2022
Efficient Self-supervised Learning with Contextualized Target Representations for Vision, Speech and Language
Alexei Baevski
Arun Babu
Wei-Ning Hsu
Michael Auli
VLM
SSL
36
92
0
14 Dec 2022
Reliable extrapolation of deep neural operators informed by physics or sparse observations
Min Zhu
Handi Zhang
Anran Jiao
George Karniadakis
Lu Lu
50
91
0
13 Dec 2022
Robust and Explainable Identification of Logical Fallacies in Natural Language Arguments
Zhivar Sourati
Vishnu Priya Prasanna Venkatesh
D. Deshpande
Himanshu Rawlani
Filip Ilievski
Hông-Ân Sandlin
Alain Mermoud
AAML
33
20
0
12 Dec 2022
SchNetPack 2.0: A neural network toolbox for atomistic machine learning
Kristof T. Schütt
Stefaan S. P. Hessmann
Niklas W. A. Gebauer
Jonas Lederer
M. Gastegger
32
59
0
11 Dec 2022
MAGVIT: Masked Generative Video Transformer
Lijun Yu
Yong Cheng
Kihyuk Sohn
José Lezama
Han Zhang
...
Alexander G. Hauptmann
Ming-Hsuan Yang
Yuan Hao
Irfan Essa
Lu Jiang
DiffM
VGen
38
228
0
10 Dec 2022
A Learned Born Series for Highly-Scattering Media
A. Stanziola
Simon Arridge
B. Cox
B. Treeby
26
4
0
09 Dec 2022
Deep Learning Methods for Partial Differential Equations and Related Parameter Identification Problems
Derick Nganyu Tanyu
Jianfeng Ning
Tom Freudenberg
Nick Heilenkötter
A. Rademacher
U. Iben
Peter Maass
AI4CE
23
34
0
06 Dec 2022
Parameter Efficient Transfer Learning for Various Speech Processing Tasks
Shinta Otake
Rei Kawakami
Nakamasa Inoue
24
16
0
06 Dec 2022
Fast Inference from Transformers via Speculative Decoding
Yaniv Leviathan
Matan Kalman
Yossi Matias
LRM
49
636
0
30 Nov 2022
Improving Cross-Modal Retrieval with Set of Diverse Embeddings
Dongwon Kim
Nam-Won Kim
Suha Kwak
26
37
0
30 Nov 2022
NoisyQuant: Noisy Bias-Enhanced Post-Training Activation Quantization for Vision Transformers
Yijiang Liu
Huanrui Yang
Zhen Dong
Kurt Keutzer
Li Du
Shanghang Zhang
MQ
33
47
0
29 Nov 2022
SegCLIP: Patch Aggregation with Learnable Centers for Open-Vocabulary Semantic Segmentation
Huaishao Luo
Junwei Bao
Youzheng Wu
Xiaodong He
Tianrui Li
VLM
32
146
0
27 Nov 2022
PaCMO: Partner Dependent Human Motion Generation in Dyadic Human Activity using Neural Operators
Md Ashiqur Rahman
Jasorsi Ghosh
Hrishikesh Viswanath
Kamyar Azizzadenesheli
Aniket Bera
32
8
0
25 Nov 2022
MPCViT: Searching for Accurate and Efficient MPC-Friendly Vision Transformer with Heterogeneous Attention
Wenyuan Zeng
Meng Li
Wenjie Xiong
Tong Tong
Wen-jie Lu
Jin Tan
Runsheng Wang
Ru Huang
29
21
0
25 Nov 2022
Towards Good Practices for Missing Modality Robust Action Recognition
Sangmin Woo
Sumin Lee
Yeonju Park
Muhammad Adi Nugroho
Changick Kim
32
44
0
25 Nov 2022
TetraDiffusion: Tetrahedral Diffusion Models for 3D Shape Generation
Nikolai Kalischek
T. Peters
Jan Dirk Wegner
Konrad Schindler
DiffM
30
12
0
23 Nov 2022
Boundary-aware Camouflaged Object Detection via Deformable Point Sampling
Minhyeok Lee
Suhwan Cho
Chaewon Park
Dogyoon Lee
Jungho Lee
Sangyoun Lee
26
3
0
22 Nov 2022
Rethinking Implicit Neural Representations for Vision Learners
Yiran Song
Qianyu Zhou
Lizhuang Ma
24
7
0
22 Nov 2022
Conv2Former: A Simple Transformer-Style ConvNet for Visual Recognition
Qibin Hou
Cheng Lu
Ming-Ming Cheng
Jiashi Feng
ViT
34
129
0
22 Nov 2022
Blur Interpolation Transformer for Real-World Motion from Blur
Zhihang Zhong
Ming Cao
Xiang Ji
Yinqiang Zheng
Imari Sato
ViT
41
25
0
21 Nov 2022
PS-Transformer: Learning Sparse Photometric Stereo Network using Self-Attention Mechanism
Satoshi Ikehata
ViT
3DPC
36
26
0
21 Nov 2022
Heterogenous Ensemble of Models for Molecular Property Prediction
Sajad Darabi
Shayan Fazeli
Jiwei Liu
Alexandre Milesi
Pawel Morkisz
Jean-François Puget
Gilberto Titericz
16
0
0
20 Nov 2022
An interpretable imbalanced semi-supervised deep learning framework for improving differential diagnosis of skin diseases
Futian Weng
Yuanting Ma
J. Sun
Shijun Shan
Qiyuan Li
Jianping Zhu
Yang Wang
Yan Xu
42
0
0
20 Nov 2022
TORE: Token Reduction for Efficient Human Mesh Recovery with Transformer
Zhiyang Dou
Qingxuan Wu
Chu-Hsing Lin
Zeyu Cao
Qiangqiang Wu
Weilin Wan
Taku Komura
Wenping Wang
29
39
0
19 Nov 2022
GPS++: An Optimised Hybrid MPNN/Transformer for Molecular Property Prediction
Dominic Masters
Josef Dean
Kerstin Klaser
Zhiyi Li
Sam Maddrell-Mander
Adam Sanders
Hatem Helal
D. Beker
Ladislav Rampášek
Dominique Beaini
34
23
0
18 Nov 2022
Stereo Image Rain Removal via Dual-View Mutual Attention
Yanyan Wei
Zhao Zhang
Zhong-zhong Zhao
Yang Zhao
Richang Hong
Yi Yang
26
3
0
18 Nov 2022
Efficient Transformers with Dynamic Token Pooling
Piotr Nawrot
J. Chorowski
Adrian Łańcucki
Edoardo Ponti
22
42
0
17 Nov 2022
VeLO: Training Versatile Learned Optimizers by Scaling Up
Luke Metz
James Harrison
C. Freeman
Amil Merchant
Lucas Beyer
...
Naman Agrawal
Ben Poole
Igor Mordatch
Adam Roberts
Jascha Narain Sohl-Dickstein
40
60
0
17 Nov 2022
Listen, Denoise, Action! Audio-Driven Motion Synthesis with Diffusion Models
Simon Alexanderson
Rajmund Nagy
Jonas Beskow
G. Henter
DiffM
VGen
29
166
0
17 Nov 2022
Cross-Modal Adapter for Text-Video Retrieval
Haojun Jiang
Jianke Zhang
Rui Huang
Chunjiang Ge
Zanlin Ni
Jiwen Lu
Jie Zhou
S. Song
Gao Huang
53
37
0
17 Nov 2022
UniFormerV2: Spatiotemporal Learning by Arming Image ViTs with Video UniFormer
Kunchang Li
Yali Wang
Yinan He
Yizhuo Li
Yi Wang
Limin Wang
Yu Qiao
ViT
30
107
0
17 Nov 2022
Multilayer Perceptron-based Surrogate Models for Finite Element Analysis
Lawson Oliveira Lima
Julien Rosenberger
E. Antier
Frédéric Magoulès
AI4CE
14
0
0
17 Nov 2022
Galactica: A Large Language Model for Science
Ross Taylor
Marcin Kardas
Guillem Cucurull
Thomas Scialom
Anthony Hartshorn
Elvis Saravia
Andrew Poulton
Viktor Kerkez
Robert Stojnic
ELM
ReLM
46
740
0
16 Nov 2022
HMOE: Hypernetwork-based Mixture of Experts for Domain Generalization
Jingang Qu
T. Faney
Zehao Wang
Patrick Gallinari
Soleiman Yousef
J. D. Hemptinne
OOD
24
7
0
15 Nov 2022
HeatViT: Hardware-Efficient Adaptive Token Pruning for Vision Transformers
Peiyan Dong
Mengshu Sun
Alec Lu
Yanyue Xie
Li-Yu Daisy Liu
...
Xin Meng
Zechao Li
Xue Lin
Zhenman Fang
Yanzhi Wang
ViT
36
63
0
15 Nov 2022
Multi-VQG: Generating Engaging Questions for Multiple Images
Min-Hsuan Yeh
Vincent Chen
Ting-Hao Huang
Lun-Wei Ku
CoGe
18
7
0
14 Nov 2022
MLIC: Multi-Reference Entropy Model for Learned Image Compression
Wei Jiang
Jiayu Yang
Yongqi Zhai
Peirong Ning
Feng Gao
Ronggang Wang
35
79
0
14 Nov 2022
Enhancing Few-shot Image Classification with Cosine Transformer
Quang-Huy Nguyen
Cuong Q. Nguyen
Dung D. Le
Hieu H. Pham
ViT
31
12
0
13 Nov 2022
Do Bayesian Neural Networks Need To Be Fully Stochastic?
Mrinank Sharma
Sebastian Farquhar
Eric T. Nalisnick
Tom Rainforth
BDL
23
52
0
11 Nov 2022
Deep equilibrium models as estimators for continuous latent variables
Russell Tsuchida
Cheng Soon Ong
35
8
0
11 Nov 2022
InternImage: Exploring Large-Scale Vision Foundation Models with Deformable Convolutions
Wenhai Wang
Jifeng Dai
Zhe Chen
Zhenhang Huang
Zhiqi Li
...
Tong Lu
Lewei Lu
Hongsheng Li
Xiaogang Wang
Yu Qiao
VLM
53
661
0
10 Nov 2022
Massively Multilingual ASR on 70 Languages: Tokenization, Architecture, and Generalization Capabilities
Andros Tjandra
Nayan Singhal
David C. Zhang
Ozlem Kalinli
Abdel-rahman Mohamed
Duc Le
M. Seltzer
40
12
0
10 Nov 2022
MGTCOM: Community Detection in Multimodal Graphs
E. Dmitriev
M. Chekol
S. Wang
25
0
0
10 Nov 2022
Simulation-Based Parallel Training
Lucas Meyer
Alejandro Ribés
Bruno Raffin
AI4CE
41
2
0
08 Nov 2022
Improving performance of real-time full-band blind packet-loss concealment with predictive network
Viet-Anh Nguyen
Anh H. T. Nguyen
Andy W. H. Khong
29
7
0
08 Nov 2022
ATCO2 corpus: A Large-Scale Dataset for Research on Automatic Speech Recognition and Natural Language Understanding of Air Traffic Control Communications
Juan Pablo Zuluaga
Karel Veselý
Igor Szöke
Alexander Blatt
P. Motlíček
...
Claudia Cevenini
Pavel Kolcárek
Allan Tart
J. Černocký
Dietrich Klakow
34
23
0
08 Nov 2022