Transformers in Vision: A Survey

4 January 2021
Salman Khan
Muzammal Naseer
Munawar Hayat
Syed Waqas Zamir
Fahad Shahbaz Khan
M. Shah
    ViT

Papers citing "Transformers in Vision: A Survey"

50 / 263 papers shown
Self-Attention Generative Adversarial Networks
Han Zhang
Ian Goodfellow
Dimitris N. Metaxas
Augustus Odena
GAN
151
3,732
0
21 May 2018
Image Super-Resolution via Dual-State Recurrent Networks
Wei Han
Shiyu Chang
Ding Liu
Mo Yu
Michael Witbrock
Thomas S. Huang
SupR
61
215
0
07 May 2018
Look into Person: Joint Body Parsing & Pose Estimation Network and A New Benchmark
Xiaodan Liang
Ke Gong
Xiaohui Shen
Liang Lin
3DH
186
353
0
05 Apr 2018
End-to-End Dense Video Captioning with Masked Transformer
Luowei Zhou
Yingbo Zhou
Jason J. Corso
R. Socher
Caiming Xiong
94
530
0
03 Apr 2018
Stacked Cross Attention for Image-Text Matching
Kuang-Huei Lee
Xi Chen
G. Hua
Houdong Hu
Xiaodong He
109
1,158
0
21 Mar 2018
Unsupervised Representation Learning by Predicting Image Rotations
Spyros Gidaris
Praveer Singh
N. Komodakis
OOD, SSL, DRL
267
3,300
0
21 Mar 2018
Self-Attention with Relative Position Representations
Peter Shaw
Jakob Uszkoreit
Ashish Vaswani
184
2,299
0
06 Mar 2018
Image Transformer
Niki Parmar
Ashish Vaswani
Jakob Uszkoreit
Lukasz Kaiser
Noam M. Shazeer
Alexander Ku
Dustin Tran
ViT
149
1,687
0
15 Feb 2018
Efficient Neural Architecture Search via Parameter Sharing
Hieu H. Pham
M. Guan
Barret Zoph
Quoc V. Le
J. Dean
117
2,770
0
09 Feb 2018
Learning Image Representations by Completing Damaged Jigsaw Puzzles
Dahun Kim
Donghyeon Cho
Donggeun Yoo
In So Kweon
SSL
74
152
0
06 Feb 2018
Panoptic Segmentation
Alexander Kirillov
Kaiming He
Ross B. Girshick
Carsten Rother
Piotr Dollár
132
1,448
0
03 Jan 2018
AttnGAN: Fine-Grained Text to Image Generation with Attentional Generative Adversarial Networks
Tao Xu
Pengchuan Zhang
Qiuyuan Huang
Han Zhang
Zhe Gan
Xiaolei Huang
Xiaodong He
GAN, ViT
120
1,722
0
28 Nov 2017
Non-local Neural Networks
Xiaolong Wang
Ross B. Girshick
Abhinav Gupta
Kaiming He
OffRL
307
8,918
0
21 Nov 2017
StackGAN++: Realistic Image Synthesis with Stacked Generative Adversarial Networks
Han Zhang
Tao Xu
Hongsheng Li
Shaoting Zhang
Xiaogang Wang
Xiaolei Huang
Dimitris N. Metaxas
GAN
114
1,062
0
19 Oct 2017
Unsupervised Representation Learning by Sorting Sequences
Hsin-Ying Lee
Jia-Bin Huang
Maneesh Kumar Singh
Ming-Hsuan Yang
SSL, DRL
94
536
0
03 Aug 2017
Enhanced Deep Residual Networks for Single Image Super-Resolution
Bee Lim
Sanghyun Son
Heewon Kim
Seungjun Nah
Kyoung Mu Lee
SupR
185
5,924
0
10 Jul 2017
Attention Is All You Need
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
3DV
805
132,725
0
12 Jun 2017
PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space
C. Qi
L. Yi
Hao Su
Leonidas Guibas
3DPC, 3DV
366
11,154
0
07 Jun 2017
Look, Listen and Learn
Relja Arandjelović
Andrew Zisserman
SSL
127
906
0
23 May 2017
The Kinetics Human Action Video Dataset
W. Kay
João Carreira
Karen Simonyan
Brian Zhang
Chloe Hillier
...
Tim Green
T. Back
Apostol Natsev
Mustafa Suleyman
Andrew Zisserman
260
3,816
0
19 May 2017
Dense-Captioning Events in Videos
Ranjay Krishna
Kenji Hata
F. Ren
Li Fei-Fei
Juan Carlos Niebles
152
1,251
0
02 May 2017
Towards Automatic Learning of Procedures from Web Instructional Videos
Luowei Zhou
Chenliang Xu
Jason J. Corso
EgoV
79
831
0
28 Mar 2017
Mask R-CNN
Kaiming He
Georgia Gkioxari
Piotr Dollár
Ross B. Girshick
ObjD
381
27,253
0
20 Mar 2017
Deformable Convolutional Networks
Jifeng Dai
Haozhi Qi
Yuwen Xiong
Yi Li
Guodong Zhang
Han Hu
Yichen Wei
230
5,339
0
17 Mar 2017
Deep Sets
Manzil Zaheer
Satwik Kottur
Siamak Ravanbakhsh
Barnabás Póczós
Ruslan Salakhutdinov
Alex Smola
429
2,480
0
10 Mar 2017
MARTA GANs: Unsupervised Representation Learning for Remote Sensing Image Classification
Daoyu Lin
Kun Fu
Yang Wang
Guangluan Xu
Xian Sun
GAN
66
174
0
28 Dec 2016
EnhanceNet: Single Image Super-Resolution Through Automated Texture Synthesis
Mehdi S. M. Sajjadi
Bernhard Schölkopf
M. Hirsch
SupR
79
972
0
23 Dec 2016
StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks
Han Zhang
Tao Xu
Hongsheng Li
Shaoting Zhang
Xiaogang Wang
Xiaolei Huang
Dimitris N. Metaxas
GAN
125
2,728
0
10 Dec 2016
Feature Pyramid Networks for Object Detection
Tsung-Yi Lin
Piotr Dollár
Ross B. Girshick
Kaiming He
Bharath Hariharan
Serge J. Belongie
ObjD
491
22,158
0
09 Dec 2016
Making the V in VQA Matter: Elevating the Role of Image Understanding in Visual Question Answering
Yash Goyal
Tejas Khot
D. Summers-Stay
Dhruv Batra
Devi Parikh
CoGe
355
3,273
0
02 Dec 2016
Aggregated Residual Transformations for Deep Neural Networks
Saining Xie
Ross B. Girshick
Piotr Dollár
Zhuowen Tu
Kaiming He
524
10,351
0
16 Nov 2016
Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network
C. Ledig
Lucas Theis
Ferenc Huszár
Jose Caballero
Andrew Cunningham
...
Andrew P. Aitken
Alykhan Tejani
J. Totz
Zehan Wang
Wenzhe Shi
GAN
244
10,717
0
15 Sep 2016
Semi-Supervised Classification with Graph Convolutional Networks
Thomas Kipf
Max Welling
GNN, SSL
679
29,183
0
09 Sep 2016
Modeling Context in Referring Expressions
Licheng Yu
Patrick Poirson
Shan Yang
Alexander C. Berg
Tamara L. Berg
133
1,277
0
31 Jul 2016
Layer Normalization
Jimmy Lei Ba
J. Kiros
Geoffrey E. Hinton
435
10,541
0
21 Jul 2016
Conditional Image Generation with PixelCNN Decoders
Aaron van den Oord
Nal Kalchbrenner
Oriol Vinyals
L. Espeholt
Alex Graves
Koray Kavukcuoglu
VLM
232
2,519
0
16 Jun 2016
Towards a Neural Statistician
Harrison Edwards
Amos Storkey
BDL
99
427
0
07 Jun 2016
Generative Adversarial Text to Image Synthesis
Scott E. Reed
Zeynep Akata
Xinchen Yan
Lajanugen Logeswaran
Bernt Schiele
Honglak Lee
GAN
209
3,149
0
17 May 2016
Context Encoders: Feature Learning by Inpainting
Deepak Pathak
Philipp Krahenbuhl
Jeff Donahue
Trevor Darrell
Alexei A. Efros
SSL
69
5,300
0
25 Apr 2016
NTU RGB+D: A Large Scale Dataset for 3D Human Activity Analysis
Amir Shahroudy
Jun Liu
T. Ng
G. Wang
258
2,496
0
11 Apr 2016
Hollywood in Homes: Crowdsourcing Data Collection for Activity Understanding
Gunnar Sigurdsson
Gül Varol
Xiaolong Wang
Ali Farhadi
Ivan Laptev
Abhinav Gupta
VGen
117
1,247
0
06 Apr 2016
The Cityscapes Dataset for Semantic Urban Scene Understanding
Marius Cordts
Mohamed Omran
Sebastian Ramos
Timo Rehfeld
Markus Enzweiler
Rodrigo Benenson
Uwe Franke
Stefan Roth
Bernt Schiele
1.1K
11,654
0
06 Apr 2016
Unsupervised Learning of Visual Representations by Solving Jigsaw Puzzles
M. Noroozi
Paolo Favaro
SSL
180
2,986
0
30 Mar 2016
Colorful Image Colorization
Richard Y. Zhang
Phillip Isola
Alexei A. Efros
149
3,534
0
28 Mar 2016
Perceptual Losses for Real-Time Style Transfer and Super-Resolution
Justin Johnson
Alexandre Alahi
Li Fei-Fei
SupR
262
10,267
0
27 Mar 2016
Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations
Ranjay Krishna
Yuke Zhu
Oliver Groth
Justin Johnson
Kenji Hata
...
Yannis Kalantidis
Li-Jia Li
David A. Shamma
Michael S. Bernstein
Fei-Fei Li
234
5,765
0
23 Feb 2016
Deep Residual Learning for Image Recognition
Kaiming He
Xiangyu Zhang
Shaoqing Ren
Jian Sun
MedIm
2.3K
194,510
0
10 Dec 2015
ShapeNet: An Information-Rich 3D Model Repository
Angel X. Chang
Thomas Funkhouser
Leonidas Guibas
Pat Hanrahan
Qi-Xing Huang
...
Shuran Song
Hao Su
Jianxiong Xiao
L. Yi
Feng Yu
3DV
176
5,538
0
09 Dec 2015
SSD: Single Shot MultiBox Detector
Wei Liu
Dragomir Anguelov
D. Erhan
Christian Szegedy
Scott E. Reed
Cheng-Yang Fu
Alexander C. Berg
ObjD, BDL
257
29,879
0
08 Dec 2015
Rethinking the Inception Architecture for Computer Vision
Christian Szegedy
Vincent Vanhoucke
Sergey Ioffe
Jonathon Shlens
Z. Wojna
3DV, BDL
886
27,427
0
02 Dec 2015