ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2407.06871
  4. Cited By
Rethinking Image-to-Video Adaptation: An Object-centric Perspective

Rethinking Image-to-Video Adaptation: An Object-centric Perspective

9 July 2024
Rui Qian
Shuangrui Ding
Dahua Lin
    OCL
ArXivPDFHTML

Papers citing "Rethinking Image-to-Video Adaptation: An Object-centric Perspective"

31 / 31 papers shown
Title
Learning to Visually Connect Actions and their Effects
Learning to Visually Connect Actions and their Effects
Eric Peh
Paritosh Parmar
Basura Fernando
68
2
0
19 Jan 2024
Self-supervised Object-Centric Learning for Videos
Self-supervised Object-Centric Learning for Videos
Görkay Aydemir
Weidi Xie
Fatma Guney
OCL
VOS
SSL
42
26
0
10 Oct 2023
VideoCutLER: Surprisingly Simple Unsupervised Video Instance
  Segmentation
VideoCutLER: Surprisingly Simple Unsupervised Video Instance Segmentation
Xudong Wang
Ishan Misra
Ziyun Zeng
Rohit Girdhar
Trevor Darrell
58
16
0
28 Aug 2023
Prune Spatio-temporal Tokens by Semantic-aware Temporal Accumulation
Prune Spatio-temporal Tokens by Semantic-aware Temporal Accumulation
Shuangrui Ding
Peisen Zhao
Xiaopeng Zhang
Rui Qian
H. Xiong
Qi Tian
ViT
34
16
0
08 Aug 2023
Expanding Language-Image Pretrained Models for General Video Recognition
Expanding Language-Image Pretrained Models for General Video Recognition
Bolin Ni
Houwen Peng
Minghao Chen
Songyang Zhang
Gaofeng Meng
Jianlong Fu
Shiming Xiang
Haibin Ling
VLM
CLIP
ViT
61
319
0
04 Aug 2022
Dual Contrastive Learning for Spatio-temporal Representation
Dual Contrastive Learning for Spatio-temporal Representation
Shuangrui Ding
Rui Qian
H. Xiong
AI4TS
SSL
46
21
0
12 Jul 2022
ST-Adapter: Parameter-Efficient Image-to-Video Transfer Learning
ST-Adapter: Parameter-Efficient Image-to-Video Transfer Learning
Junting Pan
Ziyi Lin
Xiatian Zhu
Jing Shao
Hongsheng Li
34
197
0
27 Jun 2022
Visual Prompt Tuning
Visual Prompt Tuning
Menglin Jia
Luming Tang
Bor-Chun Chen
Claire Cardie
Serge Belongie
Bharath Hariharan
Ser-Nam Lim
VLM
VPVLM
66
1,576
0
23 Mar 2022
VL-Adapter: Parameter-Efficient Transfer Learning for
  Vision-and-Language Tasks
VL-Adapter: Parameter-Efficient Transfer Learning for Vision-and-Language Tasks
Yi-Lin Sung
Jaemin Cho
Joey Tianyi Zhou
VLM
VPVLM
58
345
0
13 Dec 2021
Prompting Visual-Language Models for Efficient Video Understanding
Prompting Visual-Language Models for Efficient Video Understanding
Chen Ju
Tengda Han
Kunhao Zheng
Ya Zhang
Weidi Xie
VPVLM
VLM
47
371
0
08 Dec 2021
MViTv2: Improved Multiscale Vision Transformers for Classification and
  Detection
MViTv2: Improved Multiscale Vision Transformers for Classification and Detection
Yanghao Li
Chaoxia Wu
Haoqi Fan
K. Mangalam
Bo Xiong
Jitendra Malik
Christoph Feichtenhofer
ViT
91
683
0
02 Dec 2021
Conditional Object-Centric Learning from Video
Conditional Object-Centric Learning from Video
Thomas Kipf
Gamaleldin F. Elsayed
Aravindh Mahendran
Austin Stone
S. Sabour
G. Heigold
Rico Jonschkowski
Alexey Dosovitskiy
Klaus Greff
OCL
61
215
0
24 Nov 2021
iBOT: Image BERT Pre-Training with Online Tokenizer
iBOT: Image BERT Pre-Training with Online Tokenizer
Jinghao Zhou
Chen Wei
Huiyu Wang
Wei Shen
Cihang Xie
Alan Yuille
Tao Kong
45
722
0
15 Nov 2021
Motion-aware Contrastive Video Representation Learning via
  Foreground-background Merging
Motion-aware Contrastive Video Representation Learning via Foreground-background Merging
Shuangrui Ding
Maomao Li
Tianyu Yang
Rui Qian
Haohang Xu
Qingyi Chen
Jue Wang
Hongkai Xiong
SSL
35
51
0
30 Sep 2021
BitFit: Simple Parameter-efficient Fine-tuning for Transformer-based
  Masked Language-models
BitFit: Simple Parameter-efficient Fine-tuning for Transformer-based Masked Language-models
Elad Ben-Zaken
Shauli Ravfogel
Yoav Goldberg
118
1,191
0
18 Jun 2021
LoRA: Low-Rank Adaptation of Large Language Models
LoRA: Low-Rank Adaptation of Large Language Models
J. E. Hu
Yelong Shen
Phillip Wallis
Zeyuan Allen-Zhu
Yuanzhi Li
Shean Wang
Lu Wang
Weizhu Chen
OffRL
AI4TS
AI4CE
ALM
AIMat
149
9,946
0
17 Jun 2021
The Power of Scale for Parameter-Efficient Prompt Tuning
The Power of Scale for Parameter-Efficient Prompt Tuning
Brian Lester
Rami Al-Rfou
Noah Constant
VPVLM
413
3,952
0
18 Apr 2021
Self-supervised Video Object Segmentation by Motion Grouping
Self-supervised Video Object Segmentation by Motion Grouping
Charig Yang
Hala Lamdouar
Erika Lu
Andrew Zisserman
Weidi Xie
VOS
OCL
42
158
0
15 Apr 2021
ViViT: A Video Vision Transformer
ViViT: A Video Vision Transformer
Anurag Arnab
Mostafa Dehghani
G. Heigold
Chen Sun
Mario Lucic
Cordelia Schmid
ViT
76
2,119
0
29 Mar 2021
Scaling Up Visual and Vision-Language Representation Learning With Noisy
  Text Supervision
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
Chao Jia
Yinfei Yang
Ye Xia
Yi-Ting Chen
Zarana Parekh
Hieu H. Pham
Quoc V. Le
Yun-hsuan Sung
Zhen Li
Tom Duerig
VLM
CLIP
372
3,778
0
11 Feb 2021
Is Space-Time Attention All You Need for Video Understanding?
Is Space-Time Attention All You Need for Video Understanding?
Gedas Bertasius
Heng Wang
Lorenzo Torresani
ViT
299
2,016
0
09 Feb 2021
Prefix-Tuning: Optimizing Continuous Prompts for Generation
Prefix-Tuning: Optimizing Continuous Prompts for Generation
Xiang Lisa Li
Percy Liang
136
4,167
0
01 Jan 2021
Open-Vocabulary Object Detection Using Captions
Open-Vocabulary Object Detection Using Captions
Alireza Zareian
Kevin Dela Rosa
Derek Hao Hu
Shih-Fu Chang
VLM
ObjD
87
423
0
20 Nov 2020
Object-Centric Learning with Slot Attention
Object-Centric Learning with Slot Attention
Francesco Locatello
Dirk Weissenborn
Thomas Unterthiner
Aravindh Mahendran
G. Heigold
Jakob Uszkoreit
Alexey Dosovitskiy
Thomas Kipf
OCL
130
832
0
26 Jun 2020
Contrastive Learning for Weakly Supervised Phrase Grounding
Contrastive Learning for Weakly Supervised Phrase Grounding
Tanmay Gupta
Arash Vahdat
Gal Chechik
Xiaodong Yang
Jan Kautz
Derek Hoiem
ObjD
SSL
101
141
0
17 Jun 2020
AdapterFusion: Non-Destructive Task Composition for Transfer Learning
AdapterFusion: Non-Destructive Task Composition for Transfer Learning
Jonas Pfeiffer
Aishwarya Kamath
Andreas Rucklé
Kyunghyun Cho
Iryna Gurevych
CLL
MoMe
85
837
0
01 May 2020
Exploiting Spatial Invariance for Scalable Unsupervised Object Tracking
Exploiting Spatial Invariance for Scalable Unsupervised Object Tracking
Eric Crawford
Joelle Pineau
70
66
0
20 Nov 2019
The 2019 DAVIS Challenge on VOS: Unsupervised Multi-Object Segmentation
The 2019 DAVIS Challenge on VOS: Unsupervised Multi-Object Segmentation
Sergi Caelles
Jordi Pont-Tuset
Federico Perazzi
Alberto Montes
Kevis-Kokitsi Maninis
Luc Van Gool
VOS
43
147
0
02 May 2019
Video Action Transformer Network
Video Action Transformer Network
Rohit Girdhar
João Carreira
Carl Doersch
Andrew Zisserman
ViT
101
706
0
06 Dec 2018
AVA: A Video Dataset of Spatio-temporally Localized Atomic Visual
  Actions
AVA: A Video Dataset of Spatio-temporally Localized Atomic Visual Actions
Chunhui Gu
Chen Sun
David A. Ross
Carl Vondrick
C. Pantofaru
...
G. Toderici
Susanna Ricco
Rahul Sukthankar
Cordelia Schmid
Jitendra Malik
VGen
80
1,021
0
23 May 2017
Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset
Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset
João Carreira
Andrew Zisserman
174
7,961
0
22 May 2017
1