arXiv:1606.08415
Gaussian Error Linear Units (GELUs)
27 June 2016
Dan Hendrycks
Kevin Gimpel
Papers citing
"Gaussian Error Linear Units (GELUs)"
50 / 829 papers shown
Title
Simple Semi-supervised Knowledge Distillation from Vision-Language Models via Dual-Head Optimization
Seongjae Kang
Dong Bok Lee
Hyungjoon Jang
Sung Ju Hwang
VLM
57
0
0
12 May 2025
Emotion-Qwen: Training Hybrid Experts for Unified Emotion and General Vision-Language Understanding
Dawei Huang
Qing Li
Chuan Yan
Zebang Cheng
Jiaming Ji
Xiang Li
Yangqiu Song
X. U. Wang
Zheng Lian
Xiaojiang Peng
29
0
0
10 May 2025
Attention Is Not All You Need: The Importance of Feedforward Networks in Transformer Models
Isaac Gerber
31
0
0
10 May 2025
FedTDP: A Privacy-Preserving and Unified Framework for Trajectory Data Preparation via Federated Learning
Zhihao Zeng
Ziquan Fang
Wei Shao
Lu Chen
Yunjun Gao
FedML
51
0
0
08 May 2025
An Enhanced YOLOv8 Model for Real-Time and Accurate Pothole Detection and Measurement
Mustafa Yurdakul
Şakir Tasdemir
47
0
0
07 May 2025
Mamba-Diffusion Model with Learnable Wavelet for Controllable Symbolic Music Generation
Jincheng Zhang
Gyorgy Fazekas
C. Saitis
53
0
0
06 May 2025
MergeGuard: Efficient Thwarting of Trojan Attacks in Machine Learning Models
Soheil Zibakhsh Shabgahi
Yaman Jandali
F. Koushanfar
MoMe
AAML
57
0
0
06 May 2025
DualReal: Adaptive Joint Training for Lossless Identity-Motion Fusion in Video Customization
Wenchuan Wang
Mengqi Huang
Yijing Tu
Zhendong Mao
VGen
69
0
0
04 May 2025
Vision Mamba in Remote Sensing: A Comprehensive Survey of Techniques, Applications and Outlook
Muyi Bao
Shuchang Lyu
Zhaoyang Xu
Huiyu Zhou
Jinchang Ren
Shiming Xiang
Xiaomeng Li
Guangliang Cheng
Mamba
87
0
0
01 May 2025
JFlow: Model-Independent Spherical Jeans Analysis using Equivariant Continuous Normalizing Flows
Sung Hak Lim
Kohei Hayashi
Shun'ichi Horigome
Shigeki Matsumoto
M. Nojiri
24
0
0
01 May 2025
Enhancing Tropical Cyclone Path Forecasting with an Improved Transformer Network
Nguyen Van Thanh
Nguyen Dang Huynh
Nguyen Ngoc Tan
Nguyen Thai Minh
Nguyen Nam Hoang
21
0
0
01 May 2025
MemeBLIP2: A novel lightweight multimodal system to detect harmful memes
Jiaqi Liu
Ran Tong
Aowei Shen
Shuzheng Li
Changlin Yang
Lisha Xu
VLM
77
1
0
29 Apr 2025
R-Sparse: Rank-Aware Activation Sparsity for Efficient LLM Inference
Zhenyu (Allen) Zhang
Zechun Liu
Yuandong Tian
Harshit Khaitan
Zhilin Wang
Steven Li
57
0
0
28 Apr 2025
Towards Robust Multimodal Physiological Foundation Models: Handling Arbitrary Missing Modalities
Xi Fu
Wei-Bang Jiang
Yi Ding
Cuntai Guan
46
0
0
28 Apr 2025
DISCO: learning to DISCover an evolution Operator for multi-physics-agnostic prediction
Rudy Morel
Jiequn Han
Edouard Oyallon
AI4CE
56
0
0
28 Apr 2025
TeleSparse: Practical Privacy-Preserving Verification of Deep Neural Networks
Mohammad Maheri
Hamed Haddadi
Alex Davidson
71
0
0
27 Apr 2025
Generative Adversarial Network based Voice Conversion: Techniques, Challenges, and Recent Advancements
Sandipan Dhar
N. D. Jana
Swagatam Das
45
0
0
27 Apr 2025
Reliable and Efficient Inverse Analysis using Physics-Informed Neural Networks with Distance Functions and Adaptive Weight Tuning
Shota Deguchi
Mitsuteru Asai
PINN
AI4CE
81
0
0
25 Apr 2025
Text-to-Decision Agent: Learning Generalist Policies from Natural Language Supervision
Shilin Zhang
Zican Hu
Wenhao Wu
Xinyi Xie
Jianxiang Tang
Chunlin Chen
Daoyi Dong
Yu Cheng
Zhenhong Sun
Zhi Wang
OffRL
139
0
0
21 Apr 2025
The Geometry of Self-Verification in a Task-Specific Reasoning Model
Andrew Lee
Lihao Sun
Chris Wendler
Fernanda Viégas
Martin Wattenberg
LRM
31
0
0
19 Apr 2025
SC3EF: A Joint Self-Correlation and Cross-Correspondence Estimation Framework for Visible and Thermal Image Registration
Xi Tong
Xing Luo
Jiangxin Yang
Yanpeng Cao
34
0
0
17 Apr 2025
Hadamard product in deep learning: Introduction, Advances and Challenges
Grigorios G. Chrysos
Yongtao Wu
Razvan Pascanu
Philip Torr
V. Cevher
AAML
98
0
0
17 Apr 2025
Tree-NeRV: A Tree-Structured Neural Representation for Efficient Non-Uniform Video Encoding
Jiancheng Zhao
Yifan Zhan
Qingtian Zhu
Mingze Ma
Muyao Niu
Zunian Wan
Xiang Ji
Yinqiang Zheng
32
0
0
17 Apr 2025
CSPLADE: Learned Sparse Retrieval with Causal Language Models
Zhichao Xu
Aosong Feng
Yijun Tian
Haibo Ding
Lin Lee Cheong
RALM
40
0
0
15 Apr 2025
FANeRV: Frequency Separation and Augmentation based Neural Representation for Video
Li Yu
Zhihui Li
Chao Yao
Jimin Xiao
Moncef Gabbouj
35
0
0
09 Apr 2025
Spline-based Transformers
Prashanth Chandran
Agon Serifi
Markus Gross
Moritz Bächer
41
0
0
03 Apr 2025
Adaptive Rank Allocation: Speeding Up Modern Transformers with RaNA Adapters
Roberto Garcia
Jerry Liu
Daniel Sorvisto
Sabri Eyuboglu
90
0
0
23 Mar 2025
GraspCorrect: Robotic Grasp Correction via Vision-Language Model-Guided Feedback
Sungjae Lee
Yeonjoo Hong
Kwang In Kim
48
0
0
19 Mar 2025
Core-Periphery Principle Guided State Space Model for Functional Connectome Classification
Minheng Chen
Xiaowei Yu
Jing Zhang
Tong Chen
Chao-Yang Cao
Yan Zhuang
Yanjun Lyu
Lu Zhang
Tianming Liu
D. Zhu
Mamba
58
0
0
18 Mar 2025
Learning Shape-Independent Transformation via Spherical Representations for Category-Level Object Pose Estimation
Huan Ren
Wenfei Yang
Xiang Liu
Shifeng Zhang
Tianzhu Zhang
69
2
0
18 Mar 2025
Self-Supervised Pretraining for Fine-Grained Plankton Recognition
Joona Kareinen
T. Eerola
K. Kraft
L. Lensu
S. Suikkanen
Heikki Kälviäinen
SSL
174
0
0
14 Mar 2025
End-to-End Action Segmentation Transformer
Tieqiao Wang
Sinisa Todorovic
ViT
39
0
0
08 Mar 2025
A Real-time Multimodal Transformer Neural Network-powered Wildfire Forecasting System
Qijun Chen
Shaofan Li
48
0
0
07 Mar 2025
Neural Configuration-Space Barriers for Manipulation Planning and Control
Kehan Long
Ki Myung Brian Lee
Nikola Raicevic
Niyas Attasseri
Melvin Leok
Nikolay Atanasov
74
0
0
06 Mar 2025
FANformer: Improving Large Language Models Through Effective Periodicity Modeling
Yihong Dong
Bernard Ghanem
Xue Jiang
Yongding Tao
Kechi Zhang
...
Huanyu Liu
Jiazheng Ding
Jia Li
Jinliang Deng
Hong Mei
AI4TS
41
0
0
28 Feb 2025
Towards Lossless Implicit Neural Representation via Bit Plane Decomposition
Woo Kyoung Han
Byeonghun Lee
Hyunmin Cho
Sunghoon Im
Kyong Hwan Jin
MQ
147
0
0
28 Feb 2025
MIM-Refiner: A Contrastive Learning Boost from Intermediate Pre-Trained Representations
Benedikt Alkin
Lukas Miklautz
Sepp Hochreiter
Johannes Brandstetter
VLM
71
8
0
24 Feb 2025
VaViM and VaVAM: Autonomous Driving through Video Generative Modeling
Florent Bartoccioni
Elias Ramzi
Victor Besnier
Shashanka Venkataramanan
Tuan-Hung Vu
...
Mickael Chen
Éloi Zablocki
Andrei Bursuc
Eduardo Valle
Matthieu Cord
VGen
86
1
0
24 Feb 2025
Evolving Form and Function: Dual-Objective Optimization in Neural Symbolic Regression Networks
Amanda Bertschinger
James P. Bagrow
Joshua Bongard
82
1
0
24 Feb 2025
The Empirical Impact of Reducing Symmetries on the Performance of Deep Ensembles and MoE
Andrei Chernov
Oleg Novitskij
48
0
0
24 Feb 2025
Int2Int: a framework for mathematics with transformers
François Charton
ViT
46
0
0
22 Feb 2025
Robin3D: Improving 3D Large Language Model via Robust Instruction Tuning
Weitai Kang
Haifeng Huang
Yuzhang Shang
Mubarak Shah
Yan Yan
46
7
0
21 Feb 2025
NEAR: A Training-Free Pre-Estimator of Machine Learning Model Performance
Raphael T. Husistein
Markus Reiher
Marco Eckhoff
142
1
0
20 Feb 2025
Contrastive Localized Language-Image Pre-Training
Hong-You Chen
Zhengfeng Lai
H. Zhang
Xuben Wang
Marcin Eichner
Keen You
Meng Cao
Bowen Zhang
Yuqing Yang
Zhe Gan
CLIP
VLM
68
7
0
20 Feb 2025
video-SALMONN-o1: Reasoning-enhanced Audio-visual Large Language Model
Guangzhi Sun
Yudong Yang
Jimin Zhuang
Changli Tang
Yongqian Li
W. Li
Z. Ma
Chao Zhang
LRM
MLLM
VLM
64
4
0
17 Feb 2025
Generative Adversarial Networks for High-Dimensional Item Factor Analysis: A Deep Adversarial Learning Algorithm
Nanyu Luo
Feng Ji
DRL
41
0
0
15 Feb 2025
TLOB: A Novel Transformer Model with Dual Attention for Price Trend Prediction with Limit Order Book Data
Leonardo Berti
Gjergji Kasneci
AI4TS
42
0
0
12 Feb 2025
Kolmogorov-Arnold Fourier Networks
Jusheng Zhang
Yijia Fan
Kaitong Cai
Keze Wang
68
0
0
09 Feb 2025
High-Fidelity Simultaneous Speech-To-Speech Translation
Tom Labiausse
Laurent Mazaré
Edouard Grave
P. Pérez
Alexandre Défossez
Neil Zeghidour
171
0
0
05 Feb 2025
Learnable polynomial, trigonometric, and tropical activations
Ismail Khalfaoui-Hassani
Stefan Kesselheim
64
0
0
03 Feb 2025