ResearchTrend.AI
Image Transformer

15 February 2018
Niki Parmar
Ashish Vaswani
Jakob Uszkoreit
Lukasz Kaiser
Noam M. Shazeer
Alexander Ku
Dustin Tran
    ViT

Papers citing "Image Transformer"

50 / 837 papers shown
Retrieval meets Long Context Large Language Models
Peng Xu
Ming-Yu Liu
Xianchao Wu
Lawrence C. McAfee
Chen Zhu
Zihan Liu
Sandeep Subramanian
Evelina Bakhturina
Mohammad Shoeybi
Bryan Catanzaro
RALM LRM
95
86
0
04 Oct 2023
Pixel-Inconsistency Modeling for Image Manipulation Localization
Chenqi Kong
Anwei Luo
Shiqi Wang
Haoliang Li
Anderson de Rezende Rocha
Alex C. Kot
AAML
88
17
0
30 Sep 2023
Robust Sequential DeepFake Detection
R. Shao
Tianxing Wu
Ziwei Liu
ViT AAML
68
8
0
26 Sep 2023
ADU-Depth: Attention-based Distillation with Uncertainty Modeling for Depth Estimation
Zizhang Wu
Zhuozheng Li
Zhi-Gang Fan
Yunzhe Wu
Xiaoquan Wang
Rui Tang
Jian Pu
79
2
0
26 Sep 2023
Algorithms for Object Detection in Substations
Bingying Jin
Yadong Liu
Qinlin Qian
13
1
0
23 Sep 2023
Vision Transformers for Computer Go
Amani Sagri
Tristan Cazenave
Jérôme Arjonilla
Abdallah Saffidine
ViT
32
2
0
22 Sep 2023
Fully Transformer-Equipped Architecture for End-to-End Referring Video Object Segmentation
P. Li
Yu Zhang
L. Yuan
Xianghua Xu
VOS
55
9
0
21 Sep 2023
Localize, Retrieve and Fuse: A Generalized Framework for Free-Form Question Answering over Tables
Wenting Zhao
Ye Liu
Yao Wan
Yibo Wang
Zhongfen Deng
Philip S. Yu
RALM LMTD
80
7
0
20 Sep 2023
Attention-Only Transformers and Implementing MLPs with Attention Heads
R. Huben
Valerie Morris
32
0
0
15 Sep 2023
Learning Objective-Specific Active Learning Strategies with Attentive Neural Processes
Tim Bakker
H. V. Hoof
Max Welling
79
2
0
11 Sep 2023
MapPrior: Bird's-Eye View Map Layout Estimation with Generative Models
Xiyue Zhu
Vlas Zyrianov
Zhijian Liu
Shenlong Wang
86
12
0
24 Aug 2023
How Much Temporal Long-Term Context is Needed for Action Segmentation?
Emad Bahrami Rad
Gianpiero Francesca
Juergen Gall
ViT
89
27
0
22 Aug 2023
LDCSF: Local depth convolution-based Swim framework for classifying multi-label histopathology images
Liangrui Pan
Yutao Dou
Zhichao Feng
Liwen Xu
Shaoliang Peng
MedIm
46
3
0
21 Aug 2023
FashionLOGO: Prompting Multimodal Large Language Models for Fashion Logo Embeddings
Yulin Su
Min Yang
Minghui Qiu
Jing Wang
Tao Wang
VLM
79
0
0
17 Aug 2023
Graph-Segmenter: Graph Transformer with Boundary-aware Attention for Semantic Segmentation
Zizhang Wu
Yuanzhu Gan
Tianhao Xu
Fan Wang
ViT
66
8
0
15 Aug 2023
RestoreFormer++: Towards Real-World Blind Face Restoration from Undegraded Key-Value Pairs
Zhouxia Wang
Jiawei Zhang
Tianshui Chen
Wenping Wang
Ping Luo
98
20
0
14 Aug 2023
Bayesian Flow Networks
Alex Graves
R. Srivastava
Timothy James Atkinson
Faustino J. Gomez
BDL
139
45
0
14 Aug 2023
Category Feature Transformer for Semantic Segmentation
Quan Tang
Chuanjian Liu
Fagui Liu
Yifan Liu
Jun Jiang
Bowen Zhang
Kai Han
Yunhe Wang
ViT
103
2
0
10 Aug 2023
Efficient Bayesian Optimization with Deep Kernel Learning and Transformer Pre-trained on Multiple Heterogeneous Datasets
Wenlong Lyu
Shoubo Hu
Jie Chuai
Zhitang Chen
32
2
0
09 Aug 2023
Don't be so negative! Score-based Generative Modeling with Oracle-assisted Guidance
Saeid Naderiparizi
Xiaoxuan Liang
Berend Zwartsenberg
Frank Wood
DiffM
74
5
0
31 Jul 2023
Generative AI for Medical Imaging: extending the MONAI Framework
W. H. Pinaya
M. Graham
E. Kerfoot
Petru-Daniel Tudosiu
J. Dafflon
...
Andrew Feng
Marc Modat
P. Nachev
Sebastien Ourselin
M. Jorge Cardoso
SyDa MedIm
105
72
0
27 Jul 2023
Adaptive Local Basis Functions for Shape Completion
Hui Ying
Tianjia Shao
He Wang
Yifan Yang
Kun Zhou
3DPC
77
4
0
17 Jul 2023
Video-FocalNets: Spatio-Temporal Focal Modulation for Video Action Recognition
Syed Talal Wasim
Muhammad Uzair Khattak
Muzammal Naseer
Salman Khan
M. Shah
Fahad Shahbaz Khan
ViT
118
21
0
13 Jul 2023
Separate-and-Aggregate: A Transformer-based Patch Refinement Model for Knowledge Graph Completion
Chen Chen
Yufei Wang
Yang Zhang
Quan Z. Sheng
Kwok-Yan Lam
KELM
137
3
0
11 Jul 2023
The Ethical Implications of Generative Audio Models: A Systematic Literature Review
J. Barnett
86
32
0
07 Jul 2023
Cross-Spatial Pixel Integration and Cross-Stage Feature Fusion Based Transformer Network for Remote Sensing Image Super-Resolution
Yuting Lu
Lingtong Min
Binglu Wang
Le Zheng
Xiaoxu Wang
Yongqiang Zhao
Teng Long
59
8
0
06 Jul 2023
Efficient Contextformer: Spatio-Channel Window Attention for Fast Context Modeling in Learned Image Compression
A. B. Koyuncu
Panqi Jia
Atanas Boev
Elena Alshina
Eckehard Steinbach
73
17
0
25 Jun 2023
Waypoint Transformer: Reinforcement Learning via Supervised Learning with Intermediate Targets
Anirudhan Badrinath
Yannis Flet-Berliac
Allen Nie
Emma Brunskill
OffRL
102
19
0
24 Jun 2023
Efficient Online Processing with Deep Neural Networks
Lukas Hedegaard
56
0
0
23 Jun 2023
RXFOOD: Plug-in RGB-X Fusion for Object of Interest Detection
Jin Ma
Jinlong Li
Qing Guo
Tianyu Zhang
Yuewei Lin
Hongkai Yu
72
0
0
22 Jun 2023
Instant Soup: Cheap Pruning Ensembles in A Single Pass Can Draw Lottery Tickets from Large Models
A. Jaiswal
Shiwei Liu
Tianlong Chen
Ying Ding
Zhangyang Wang
VLM
115
21
0
18 Jun 2023
Efficient HDR Reconstruction from Real-World Raw Images
Qirui Yang
Yihao Liu
Qihua Chen
Huanjing Yue
Kun Li
Jingyu Yang
3DV
69
2
0
17 Jun 2023
Training-free Diffusion Model Adaptation for Variable-Sized Text-to-Image Synthesis
Zhiyu Jin
Xuli Shen
Bin Li
Xiangyang Xue
82
38
0
14 Jun 2023
A Comprehensive Survey on Applications of Transformers for Deep Learning Tasks
Saidul Islam
Hanae Elmekki
Ahmed Elsebai
Jamal Bentahar
Najat Drawel
Gaith Rjoub
Witold Pedrycz
ViT MedIm
94
212
0
11 Jun 2023
InvPT++: Inverted Pyramid Multi-Task Transformer for Visual Scene Understanding
Hanrong Ye
Dan Xu
ViT
110
13
0
08 Jun 2023
Object Detection with Transformers: A Review
Tahira Shehzadi
K. Hashmi
D. Stricker
Muhammad Zeshan Afzal
ViT MU
104
29
0
07 Jun 2023
The Emergence of Essential Sparsity in Large Pre-trained Models: The Weights that Matter
Ajay Jaiswal
Shiwei Liu
Tianlong Chen
Zhangyang Wang
VLM
77
34
0
06 Jun 2023
A Unified Framework to Super-Resolve Face Images of Varied Low Resolutions
Qiuyu Peng
Zifei Jiang
Yan Huang
Jingliang Peng
CVBM SupR
77
0
0
06 Jun 2023
Hierarchical Attention Encoder Decoder
Asier Mujika
BDL
62
3
0
01 Jun 2023
UniDiff: Advancing Vision-Language Models with Generative and Discriminative Learning
Xiao Dong
Runhu Huang
Xiaoyong Wei
Zequn Jie
Jianxing Yu
Jian Yin
Xiaodan Liang
VLM DiffM
72
1
0
01 Jun 2023
Self-supervised Vision Transformers for 3D Pose Estimation of Novel Objects
S. Thalhammer
Jean-Baptiste Weibel
Markus Vincze
Jose Garcia-Rodriguez
ViT
96
10
0
31 May 2023
Visual Affordance Prediction for Guiding Robot Exploration
Homanga Bharadhwaj
Abhi Gupta
Shubham Tulsiani
122
15
0
28 May 2023
Parameter Estimation in DAGs from Incomplete Data via Optimal Transport
Vy Vo
Trung Le
L. Vuong
He Zhao
Edwin V. Bonilla
Dinh Q. Phung
OT
73
4
0
25 May 2023
Beyond Individual Input for Deep Anomaly Detection on Tabular Data
Hugo Thimonier
Fabrice Popineau
Arpad Rimmel
Bich-Liên Doan
85
6
0
24 May 2023
Not All Image Regions Matter: Masked Vector Quantization for Autoregressive Image Generation
Mengqi Huang
Zhendong Mao
Quang Wang
Yongdong Zhang
VGen DiffM
127
24
0
23 May 2023
FIT: Far-reaching Interleaved Transformers
Ting-Li Chen
Lala Li
108
13
0
22 May 2023
Graph Propagation Transformer for Graph Representation Learning
Zhe Chen
Hao Hao Tan
Tao Wang
Tianrun Shen
Tong Lu
Qiuying Peng
Cheng Cheng
Yue Qi
82
13
0
19 May 2023
Deep Multiple Instance Learning with Distance-Aware Self-Attention
Georg Wolflein
Lucie Charlotte Magister
Pietro Lio
David J. Harrison
Ognjen Arandjelovic
65
3
0
17 May 2023
CageViT: Convolutional Activation Guided Efficient Vision Transformer
Hao Zheng
Jinbao Wang
Xiantong Zhen
Hao Chen
Jingkuan Song
Feng Zheng
ViT
80
0
0
17 May 2023
MMoT: Mixture-of-Modality-Tokens Transformer for Composed Multimodal Conditional Image Synthesis
Jinsheng Zheng
Daqing Liu
Chaoyue Wang
Minghui Hu
Zuopeng Yang
Changxing Ding
Dacheng Tao
72
1
0
10 May 2023