VITA: Video Instance Segmentation via Object Token Association

9 June 2022
Miran Heo
Sukjun Hwang
Seoung Wug Oh
Joon-Young Lee
Seon Joo Kim
    VOS
ArXiv (abs) | PDF | HTML | Github (100★)

Papers citing "VITA: Video Instance Segmentation via Object Token Association"

34 / 34 papers shown
From Slices to Sequences: Autoregressive Tracking Transformer for Cohesive and Consistent 3D Lymph Node Detection in CT Scans
Qinji Yu
Yirui Wang
K. Yan
Dandan Zheng
Dashan Ai
...
N. Shen
Xiaowei Ding
Le Lu
X. Ye
Dakai Jin
ViT, MedIm
157
0
0
11 Mar 2025
ViLLa: Video Reasoning Segmentation with Large Language Model
Rongkun Zheng
Lu Qi
Xi Chen
Yi Wang
Kun Wang
Yu Qiao
Hengshuang Zhao
VOS, LRM
104
5
0
18 Jul 2024
Audio-Visual Instance Segmentation
Ruohao Guo
Yaru Chen
Yanyu Qi
Wenzhen Yue
Dantong Niu
...
Wenzhen Yue
Ji Shi
Qixun Wang
Peiliang Zhang
Buwen Liang
VLM, VOS
77
2
0
28 Oct 2023
Cannot See the Forest for the Trees: Aggregating Multiple Viewpoints to Better Classify Objects in Videos
Sukjun Hwang
Miran Heo
Seoung Wug Oh
Seon Joo Kim
VOT
123
6
0
05 Jun 2022
Temporally Efficient Vision Transformer for Video Instance Segmentation
Shusheng Yang
Xinggang Wang
Yu Li
Yuxin Fang
Jiemin Fang
Wenyu Liu
Xun Zhao
Ying Shan
ViT
51
66
0
18 Apr 2022
Global Tracking Transformers
Xingyi Zhou
Tianwei Yin
V. Koltun
Philipp Krähenbühl
VOT
83
137
0
24 Mar 2022
Efficient Video Instance Segmentation via Tracklet Query and Proposal
Jialian Wu
Sudhir Yarram
Hui Liang
Tian Lan
Junsong Yuan
J. Eledath
Gérard Medioni
62
37
0
03 Mar 2022
Mask2Former for Video Instance Segmentation
Bowen Cheng
Anwesa Choudhuri
Ishan Misra
Alexander Kirillov
Rohit Girdhar
Alex Schwing
VOS
98
169
0
20 Dec 2021
VISOLO: Grid-Based Space-Time Aggregation for Efficient Online Video Instance Segmentation
Sunggeun Han
Sukjun Hwang
Seoung Wug Oh
Yeonchool Park
Hyunwoo J. Kim
Minjung Kim
Seon Joo Kim
42
30
0
08 Dec 2021
Masked-attention Mask Transformer for Universal Image Segmentation
Bowen Cheng
Ishan Misra
Alex Schwing
Alexander Kirillov
Rohit Girdhar
ISeg
248
2,364
0
02 Dec 2021
Occluded Video Instance Segmentation: Dataset and ICCV 2021 Challenge
Jiyang Qi
Yan Gao
Yao Hu
Xinggang Wang
Xiaoyu Liu
Xiang Bai
Serge Belongie
Alan Yuille
Philip Torr
S. Bai
VOS
45
6
0
15 Nov 2021
Fast Convergence of DETR with Spatially Modulated Co-Attention
Peng Gao
Minghang Zheng
Xiaogang Wang
Jifeng Dai
Hongsheng Li
ViT
70
307
0
05 Aug 2021
Prototypical Cross-Attention Networks for Multiple Object Tracking and Segmentation
Lei Ke
Xia Li
Martin Danelljan
Yu-Wing Tai
Chi-Keung Tang
Feng Yu
VOS
57
74
0
22 Jun 2021
Video Instance Segmentation using Inter-Frame Communication Transformers
Sukjun Hwang
Miran Heo
Seoung Wug Oh
Seon Joo Kim
ViT
112
137
0
07 Jun 2021
Crossover Learning for Fast Online Video Instance Segmentation
Shusheng Yang
Yuxin Fang
Xinggang Wang
Yu Li
Chen Fang
Ying Shan
Bin Feng
Wenyu Liu
89
105
0
13 Apr 2021
Spatial Feature Calibration and Temporal Fusion for Effective One-stage Video Instance Segmentation
Minghan Li
Shuai Li
Lida Li
Lei Zhang
VOS
68
52
0
06 Apr 2021
Swin Transformer: Hierarchical Vision Transformer using Shifted Windows
Ze Liu
Yutong Lin
Yue Cao
Han Hu
Yixuan Wei
Zheng Zhang
Stephen Lin
B. Guo
ViT
450
21,439
0
25 Mar 2021
SG-Net: Spatial Granularity Network for One-Stage Video Instance Segmentation
Dongfang Liu
Yiming Cui
Wenbo Tan
Yingjie Chen
75
132
0
18 Mar 2021
Learning a Proposal Classifier for Multiple Object Tracking
Peng Dai
Renliang Weng
Wongun Choi
Changshui Zhang
Zhangping He
Wei Ding
VOT
64
89
0
14 Mar 2021
CompFeat: Comprehensive Feature Aggregation for Video Instance Segmentation
Yang Fu
Linjie Yang
Ding Liu
Thomas S. Huang
Humphrey Shi
VOS
66
71
0
07 Dec 2020
End-to-End Video Instance Segmentation with Transformers
Yuqing Wang
Zhaoliang Xu
Xinlong Wang
Chunhua Shen
Baoshan Cheng
Hao Shen
Huaxia Xia
ViT
79
691
0
30 Nov 2020
Rethinking Transformer-based Set Prediction for Object Detection
Zhiqing Sun
Shengcao Cao
Yiming Yang
Kris Kitani
ViT
120
322
0
21 Nov 2020
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
Alexey Dosovitskiy
Lucas Beyer
Alexander Kolesnikov
Dirk Weissenborn
Xiaohua Zhai
...
Matthias Minderer
G. Heigold
Sylvain Gelly
Jakob Uszkoreit
N. Houlsby
ViT
657
41,103
0
22 Oct 2020
Deformable DETR: Deformable Transformers for End-to-End Object Detection
Xizhou Zhu
Weijie Su
Lewei Lu
Bin Li
Xiaogang Wang
Jifeng Dai
ViT
221
5,080
0
08 Oct 2020
SipMask: Spatial Information Preservation for Fast Image and Video Instance Segmentation
Jiale Cao
Rao Muhammad Anwer
Hisham Cholakkal
Fahad Shahbaz Khan
Yanwei Pang
Ling Shao
ISeg
54
171
0
29 Jul 2020
End-to-End Object Detection with Transformers
Nicolas Carion
Francisco Massa
Gabriel Synnaeve
Nicolas Usunier
Alexander Kirillov
Sergey Zagoruyko
ViT, 3DV, PINN
415
13,048
0
26 May 2020
STEm-Seg: Spatio-temporal Embeddings for Instance Segmentation in Videos
A. Athar
Sabarinath Mahadevan
Aljosa Osep
Laura Leal-Taixé
Bastian Leibe
VOS
88
171
0
18 Mar 2020
Learning a Neural Solver for Multiple Object Tracking
Guillem Brasó
Laura Leal-Taixé
VOT
83
400
0
16 Dec 2019
Classifying, Segmenting, and Tracking Object Instances in Video with Mask Propagation
Gedas Bertasius
Lorenzo Torresani
56
179
0
10 Dec 2019
Video Instance Segmentation
Linjie Yang
Yuchen Fan
N. Xu
VOS, ISeg
85
508
0
12 May 2019
Attention Is All You Need
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
3DV
701
131,652
0
12 Jun 2017
Mask R-CNN
Kaiming He
Georgia Gkioxari
Piotr Dollár
Ross B. Girshick
ObjD
352
27,195
0
20 Mar 2017
Deep Residual Learning for Image Recognition
Kaiming He
Xiangyu Zhang
Shaoqing Ren
Jian Sun
MedIm
2.2K
194,020
0
10 Dec 2015
Microsoft COCO: Common Objects in Context
Tsung-Yi Lin
Michael Maire
Serge J. Belongie
Lubomir Bourdev
Ross B. Girshick
James Hays
Pietro Perona
Deva Ramanan
C. L. Zitnick
Piotr Dollár
ObjD
413
43,667
0
01 May 2014