On the Relationship between Self-Attention and Convolutional Layers
Jean-Baptiste Cordonnier, Andreas Loukas, Martin Jaggi
arXiv:1911.03584 · 8 November 2019

Papers citing "On the Relationship between Self-Attention and Convolutional Layers" (50 of 269 papers shown):
Chasing Sparsity in Vision Transformers: An End-to-End Exploration
  Tianlong Chen, Yu Cheng, Zhe Gan, Lu Yuan, Lei Zhang, Zhangyang Wang · ViT · 216 citations · 08 Jun 2021
On the Connection between Local Attention and Dynamic Depth-wise Convolution
  Qi Han, Zejia Fan, Qi Dai, Lei-huan Sun, Ming-Ming Cheng, Jiaying Liu, Jingdong Wang · ViT · 105 citations · 08 Jun 2021
On the Expressive Power of Self-Attention Matrices
  Valerii Likhosherstov, K. Choromanski, Adrian Weller · 34 citations · 07 Jun 2021
Convolutional Neural Networks with Gated Recurrent Connections
  Jianfeng Wang, Xiaolin Hu · ObjD · 40 citations · 05 Jun 2021
Detect the Interactions that Matter in Matter: Geometric Attention for Many-Body Systems
  Thorben Frank, Stefan Chmiela · 3 citations · 04 Jun 2021
X-volution: On the unification of convolution and self-attention
  Xuanhong Chen, Hang Wang, Bingbing Ni · ViT · 24 citations · 04 Jun 2021
Transformers are Deep Infinite-Dimensional Non-Mercer Binary Kernel Machines
  Matthew A. Wright, Joseph E. Gonzalez · 20 citations · 02 Jun 2021
Less is More: Pay Less Attention in Vision Transformers
  Zizheng Pan, Bohan Zhuang, Haoyu He, Jing Liu, Jianfei Cai · ViT · 82 citations · 29 May 2021
KVT: k-NN Attention for Boosting Vision Transformers
  Pichao Wang, Xue Wang, F. Wang, Ming Lin, Shuning Chang, Hao Li, R. L. Jin · ViT · 105 citations · 28 May 2021
Nested Hierarchical Transformer: Towards Accurate, Data-Efficient and Interpretable Visual Understanding
  Zizhao Zhang, Han Zhang, Long Zhao, Ting Chen, Sercan Ö. Arik, Tomas Pfister · ViT · 169 citations · 26 May 2021
Are Convolutional Neural Networks or Transformers more like human vision?
  Shikhar Tuli, Ishita Dasgupta, Erin Grant, Thomas L. Griffiths · ViT, FaML · 182 citations · 15 May 2021
Graph Attention Networks with Positional Embeddings
  Liheng Ma, Reihaneh Rabbany, Adriana Romero Soriano · GNN · 235 citations · 09 May 2021
Attention-based Stylisation for Exemplar Image Colourisation
  Marc Górriz Blanch, Issa Khalifeh, Alan F. Smeaton, Noel E. O'Connor, M. Mrak · 4 citations · 04 May 2021
MLP-Mixer: An all-MLP Architecture for Vision
  Ilya O. Tolstikhin, N. Houlsby, Alexander Kolesnikov, Lucas Beyer, Xiaohua Zhai, …, Andreas Steiner, Daniel Keysers, Jakob Uszkoreit, Mario Lucic, Alexey Dosovitskiy · 2,606 citations · 04 May 2021
Deformable TDNN with adaptive receptive fields for speech recognition
  Keyu An, Yi Zhang, Zhijian Ou · 5 citations · 30 Apr 2021
Higher-Order Attribute-Enhancing Heterogeneous Graph Neural Networks
  Jianxin Li, Hao Peng, Yuwei Cao, Yingtong Dou, Hekai Zhang, Philip S. Yu, Lifang He · 79 citations · 16 Apr 2021
GAttANet: Global attention agreement for convolutional neural networks
  R. V. Rullen, A. Alamia · ViT · 2 citations · 12 Apr 2021
Differentiable Patch Selection for Image Recognition
  Jean-Baptiste Cordonnier, Aravindh Mahendran, Alexey Dosovitskiy, Dirk Weissenborn, Jakob Uszkoreit, Thomas Unterthiner · 93 citations · 07 Apr 2021
LeViT: a Vision Transformer in ConvNet's Clothing for Faster Inference
  Ben Graham, Alaaeldin El-Nouby, Hugo Touvron, Pierre Stock, Armand Joulin, Hervé Jégou, Matthijs Douze · ViT · 770 citations · 02 Apr 2021
Frozen in Time: A Joint Video and Image Encoder for End-to-End Retrieval
  Max Bain, Arsha Nagrani, Gül Varol, Andrew Zisserman · VGen · 1,128 citations · 01 Apr 2021
Motion Guided Attention Fusion to Recognize Interactions from Videos
  Tae Soo Kim, Jonathan D. Jones, Gregory Hager · 15 citations · 01 Apr 2021
On the Robustness of Vision Transformers to Adversarial Examples
  Kaleel Mahmood, Rigel Mahmood, Marten van Dijk · ViT · 217 citations · 31 Mar 2021
Understanding Robustness of Transformers for Image Classification
  Srinadh Bhojanapalli, Ayan Chakrabarti, Daniel Glasner, Daliang Li, Thomas Unterthiner, Andreas Veit · ViT · 378 citations · 26 Mar 2021
An Image is Worth 16x16 Words, What is a Video Worth?
  Gilad Sharir, Asaf Noy, Lihi Zelnik-Manor · ViT · 120 citations · 25 Mar 2021
Scaling Local Self-Attention for Parameter Efficient Visual Backbones
  Ashish Vaswani, Prajit Ramachandran, A. Srinivas, Niki Parmar, Blake A. Hechtman, Jonathon Shlens · 395 citations · 23 Mar 2021
ConViT: Improving Vision Transformers with Soft Convolutional Inductive Biases
  Stéphane d'Ascoli, Hugo Touvron, Matthew L. Leavitt, Ari S. Morcos, Giulio Biroli, Levent Sagun · ViT · 805 citations · 19 Mar 2021
Scalable Vision Transformers with Hierarchical Pooling
  Zizheng Pan, Bohan Zhuang, Jing Liu, Haoyu He, Jianfei Cai · ViT · 126 citations · 19 Mar 2021
Involution: Inverting the Inherence of Convolution for Visual Recognition
  Duo Li, Jie Hu, Changhu Wang, Xiangtai Li, Qi She, Lei Zhu, Tong Zhang, Qifeng Chen · BDL · 304 citations · 10 Mar 2021
Lipschitz Normalization for Self-Attention Layers with Application to Graph Neural Networks
  George Dasoulas, Kevin Scaman, Aladin Virmaux · GNN · 40 citations · 08 Mar 2021
Attention is Not All You Need: Pure Attention Loses Rank Doubly Exponentially with Depth
  Yihe Dong, Jean-Baptiste Cordonnier, Andreas Loukas · 373 citations · 05 Mar 2021
Perceiver: General Perception with Iterative Attention
  Andrew Jaegle, Felix Gimeno, Andrew Brock, Andrew Zisserman, Oriol Vinyals, João Carreira · VLM, ViT, MDE · 976 citations · 04 Mar 2021
Generative Adversarial Transformers
  Drew A. Hudson, C. L. Zitnick · ViT · 179 citations · 01 Mar 2021
Conditional Positional Encodings for Vision Transformers
  Xiangxiang Chu, Zhi Tian, Bo-Wen Zhang, Xinlong Wang, Chunhua Shen · ViT · 605 citations · 22 Feb 2021
UniT: Multimodal Multitask Learning with a Unified Transformer
  Ronghang Hu, Amanpreet Singh · ViT · 295 citations · 22 Feb 2021
LambdaNetworks: Modeling Long-Range Interactions Without Attention
  Irwan Bello · 179 citations · 17 Feb 2021
Is Space-Time Attention All You Need for Video Understanding?
  Gedas Bertasius, Heng Wang, Lorenzo Torresani · ViT · 1,984 citations · 09 Feb 2021
Relaxed Transformer Decoders for Direct Action Proposal Generation
  Jing Tan, Jiaqi Tang, Limin Wang, Gangshan Wu · ViT · 178 citations · 03 Feb 2021
CNN with large memory layers
  R. Karimov, Yury Malkov, Karim Iskakov, Victor Lempitsky · 0 citations · 27 Jan 2021
Spectral Leakage and Rethinking the Kernel Size in CNNs
  Nergis Tomen, Jan van Gemert · AAML · 18 citations · 25 Jan 2021
Transformers in Vision: A Survey
  Salman Khan, Muzammal Naseer, Munawar Hayat, Syed Waqas Zamir, Fahad Shahbaz Khan, M. Shah · ViT · 2,431 citations · 04 Jan 2021
Training data-efficient image transformers & distillation through attention
  Hugo Touvron, Matthieu Cord, Matthijs Douze, Francisco Massa, Alexandre Sablayrolles, Hervé Jégou · ViT · 6,567 citations · 23 Dec 2020
ShineOn: Illuminating Design Choices for Practical Video-based Virtual Clothing Try-on
  Gaurav Kuppa, Andrew Jong, Vera Liu, Ziwei Liu, Teng-Sheng Moh · CVBM · 19 citations · 18 Dec 2020
Toward Transformer-Based Object Detection
  Josh Beal, Eric Kim, Eric Tzeng, Dong Huk Park, Andrew Zhai, Dmitry Kislyuk · ViT · 209 citations · 17 Dec 2020
GTA: Global Temporal Attention for Video Action Understanding
  Bo He, Xitong Yang, Zuxuan Wu, Hao Chen, Ser-Nam Lim, Abhinav Shrivastava · ViT · 27 citations · 15 Dec 2020
Convolutional LSTM Neural Networks for Modeling Wildland Fire Dynamics
  J. Burge, M. Bonanni, M. Ihme, Lily Hu · 19 citations · 11 Dec 2020
SAFCAR: Structured Attention Fusion for Compositional Action Recognition
  Tae Soo Kim, Gregory Hager · CoGe · 10 citations · 03 Dec 2020
Attention Aware Cost Volume Pyramid Based Multi-view Stereo Network for 3D Reconstruction
  Anzhu Yu, Wenyue Guo, Bing Liu, Xin Chen, Xin Wang, Xuefeng Cao, Bingchuan Jiang · 3DV · 64 citations · 25 Nov 2020
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
  Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, …, Matthias Minderer, G. Heigold, Sylvain Gelly, Jakob Uszkoreit, N. Houlsby · ViT · 39,330 citations · 22 Oct 2020
Group Equivariant Stand-Alone Self-Attention For Vision
  David W. Romero, Jean-Baptiste Cordonnier · MDE · 57 citations · 02 Oct 2020
Multi-timescale Representation Learning in LSTM Language Models
  Shivangi Mahto, Vy A. Vo, Javier S. Turek, Alexander G. Huth · 29 citations · 27 Sep 2020