ResearchTrend.AI
Unsupervised Open-Vocabulary Object Localization in Videos

18 September 2023
Ke Fan
Zechen Bai
Tianjun Xiao
Dominik Zietlow
Max Horn
Zixu Zhao
Carl-Johann Simon-Gabriel
Mike Zheng Shou
Francesco Locatello
Bernt Schiele
Thomas Brox
Zheng-Wei Zhang
Yanwei Fu
Tong He

Papers citing "Unsupervised Open-Vocabulary Object Localization in Videos"

8 / 8 papers shown
CTRL-O: Language-Controllable Object-Centric Visual Representation Learning
Aniket Didolkar
Andrii Zadaianchuk
Rabiul Awal
Maximilian Seitzer
E. Gavves
Aishwarya Agrawal
OCL
VLM
89
2
0
27 Mar 2025
CountGD: Multi-Modal Open-World Counting
Niki Amini-Naieni
Tengda Han
Andrew Zisserman
ObjD
56
7
0
05 Jul 2024
Hallucination of Multimodal Large Language Models: A Survey
Zechen Bai
Pichao Wang
Tianjun Xiao
Tong He
Zongbo Han
Zheng Zhang
Mike Zheng Shou
VLM
LRM
95
139
0
29 Apr 2024
GroupViT: Semantic Segmentation Emerges from Text Supervision
Jiarui Xu
Shalini De Mello
Sifei Liu
Wonmin Byeon
Thomas Breuel
Jan Kautz
X. Wang
ViT
VLM
189
499
0
22 Feb 2022
End-to-End Video Object Detection with Spatial-Temporal Transformers
Lu He
Qianyu Zhou
Xiangtai Li
Li Niu
Guangliang Cheng
Xiao Li
Wenxuan Liu
Yu Tong
Lizhuang Ma
Liqing Zhang
ViT
49
96
0
23 May 2021
Emerging Properties in Self-Supervised Vision Transformers
Mathilde Caron
Hugo Touvron
Ishan Misra
Hervé Jégou
Julien Mairal
Piotr Bojanowski
Armand Joulin
314
5,775
0
29 Apr 2021
Memory Enhanced Global-Local Aggregation for Video Object Detection
Yihong Chen
Yue Cao
Han Hu
Liwei Wang
112
261
0
26 Mar 2020
ImageNet Large Scale Visual Recognition Challenge
Olga Russakovsky
Jia Deng
Hao Su
J. Krause
S. Satheesh
...
A. Karpathy
A. Khosla
Michael S. Bernstein
Alexander C. Berg
Li Fei-Fei
VLM
ObjD
296
39,194
0
01 Sep 2014