2202.11094
Cited By
GroupViT: Semantic Segmentation Emerges from Text Supervision
22 February 2022
Jiarui Xu
Shalini De Mello
Sifei Liu
Wonmin Byeon
Thomas Breuel
Jan Kautz
Xinyu Wang
ViT
VLM
Papers citing
"GroupViT: Semantic Segmentation Emerges from Text Supervision"
50 / 126 papers shown
Title
Open3DIS: Open-Vocabulary 3D Instance Segmentation with 2D Mask Guidance
P. Nguyen
T.D. Ngo
E. Kalogerakis
Chuang Gan
Anh Tran
Cuong Pham
Khoi Duc Minh Nguyen
ISeg
57
51
0
17 Dec 2023
CLIP-guided Federated Learning on Heterogeneous and Long-Tailed Data
Jiangming Shi
Shanshan Zheng
Xiangbo Yin
Yang Lu
Yuan Xie
Yanyun Qu
VLM
FedML
71
10
0
14 Dec 2023
MmAP: Multi-modal Alignment Prompt for Cross-domain Multi-task Learning
Yi Xin
Junlong Du
Qiang Wang
Ke Yan
Shouhong Ding
VLM
51
49
0
14 Dec 2023
ZePT: Zero-Shot Pan-Tumor Segmentation via Query-Disentangling and Self-Prompting
Yankai Jiang
Zhongzhen Huang
Rongzhao Zhang
Xiaofan Zhang
Shaoting Zhang
VLM
56
11
0
07 Dec 2023
Auto-Vocabulary Semantic Segmentation
Osman Ülger
Maksymilian Kulicki
Yuki M. Asano
Martin R. Oswald
VLM
58
2
0
07 Dec 2023
Emergent Open-Vocabulary Semantic Segmentation from Off-the-shelf Vision-Language Models
Jiayun Luo
Siddhesh Khandelwal
Leonid Sigal
Boyang Albert Li
MLLM
VLM
77
7
0
28 Nov 2023
OpenForest: A data catalogue for machine learning in forest monitoring
Arthur Ouaknine
T. Kattenborn
Etienne Laliberté
David Rolnick
100
6
0
01 Nov 2023
TESTA: Temporal-Spatial Token Aggregation for Long-form Video-Language Understanding
Shuhuai Ren
Sishuo Chen
Shicheng Li
Xu Sun
Lu Hou
ViT
56
29
0
29 Oct 2023
SILC: Improving Vision Language Pretraining with Self-Distillation
Muhammad Ferjad Naeem
Yongqin Xian
Xiaohua Zhai
Lukas Hoyer
Luc Van Gool
F. Tombari
VLM
38
33
0
20 Oct 2023
SAIR: Learning Semantic-aware Implicit Representation
Canyu Zhang
Xiaoguang Li
Qing Guo
Song Wang
41
4
0
13 Oct 2023
TextPSG: Panoptic Scene Graph Generation from Textual Descriptions
Chengyang Zhao
Songlin Yang
Zhenfang Chen
Mingyu Ding
Chuang Gan
64
16
0
10 Oct 2023
CLIP-Hand3D: Exploiting 3D Hand Pose Estimation via Context-Aware Prompting
Shaoxiang Guo
Qing Cai
Lin Qi
Junyu Dong
3DH
54
8
0
28 Sep 2023
Object-Centric Open-Vocabulary Image-Retrieval with Aggregated Features
Hila Levi
Guy Heller
Dan Levi
Ethan Fetaya
OCL
VLM
42
3
0
26 Sep 2023
Which Tokens to Use? Investigating Token Reduction in Vision Transformers
Joakim Bruslund Haurum
Sergio Escalera
Graham W. Taylor
T. Moeslund
ViT
51
34
0
09 Aug 2023
Convolutions Die Hard: Open-Vocabulary Segmentation with Single Frozen Convolutional CLIP
Qihang Yu
Ju He
XueQing Deng
Xiaohui Shen
Liang-Chieh Chen
VLM
CLIP
47
139
0
04 Aug 2023
Transferable Decoding with Visual Entities for Zero-Shot Image Captioning
Junjie Fei
Teng Wang
Jinrui Zhang
Zhenyu He
Chengjie Wang
Feng Zheng
VLM
36
34
0
31 Jul 2023
Foundational Models Defining a New Era in Vision: A Survey and Outlook
Muhammad Awais
Muzammal Naseer
Salman Khan
Rao Muhammad Anwer
Hisham Cholakkal
M. Shah
Ming-Hsuan Yang
Fahad Shahbaz Khan
VLM
57
120
0
25 Jul 2023
An Intelligent Remote Sensing Image Quality Inspection System
Yi Yu
Tao Wang
Kang Ran
Changjiang Li
Hao Wu
29
1
0
22 Jul 2023
Language-based Action Concept Spaces Improve Video Self-Supervised Learning
Kanchana Ranasinghe
Michael S. Ryoo
SSL
VLM
67
12
0
20 Jul 2023
Does Visual Pretraining Help End-to-End Reasoning?
Chen Sun
Calvin Luo
Xingyi Zhou
Anurag Arnab
Cordelia Schmid
OCL
LRM
ViT
50
3
0
17 Jul 2023
Multi-Modal Prototypes for Open-World Semantic Segmentation
Yu-Hao Yang
Chaofan Ma
Chen Ju
Fei Zhang
Jiangchao Yao
Ya Zhang
Yanfeng Wang
VLM
61
9
0
05 Jul 2023
Hierarchical Open-vocabulary Universal Image Segmentation
Xudong Wang
Shufang Li
Konstantinos Kallidromitis
Yu Kato
Kazuki Kozuka
Trevor Darrell
VLM
OCL
53
37
0
03 Jul 2023
VisoGender: A dataset for benchmarking gender bias in image-text pronoun resolution
S. Hall
F. G. Abrantes
Hanwen Zhu
Grace A. Sodunke
Aleksandar Shtedritski
Hannah Rose Kirk
CoGe
47
42
0
21 Jun 2023
Using Unreliable Pseudo-Labels for Label-Efficient Semantic Segmentation
Haochen Wang
Yuchao Wang
Yujun Shen
Junsong Fan
Yuxi Wang
Zhaoxiang Zhang
UQCV
53
10
0
04 Jun 2023
PuMer: Pruning and Merging Tokens for Efficient Vision Language Models
Qingqing Cao
Bhargavi Paranjape
Hannaneh Hajishirzi
MLLM
VLM
33
22
0
27 May 2023
Interactive Segment Anything NeRF with Feature Imitation
Xiaokang Chen
Jiaxiang Tang
Diwen Wan
Jingbo Wang
Gang Zeng
67
22
0
25 May 2023
Adapt and Align to Improve Zero-Shot Sketch-Based Image Retrieval
Shiyin Dong
Mingrui Zhu
N. Wang
Xinbo Gao
VLM
46
3
0
09 May 2023
CLIP-S^4: Language-Guided Self-Supervised Semantic Segmentation
Wenbin He
Suphanut Jamonnak
Liangke Gou
Liu Ren
VLM
65
32
0
01 May 2023
VicTR: Video-conditioned Text Representations for Activity Recognition
Kumara Kahatapitiya
Anurag Arnab
Arsha Nagrani
Michael S. Ryoo
58
20
0
05 Apr 2023
Blind Image Quality Assessment via Vision-Language Correspondence: A Multitask Learning Perspective
Weixia Zhang
Guangtao Zhai
Ying Wei
Xiaokang Yang
Kede Ma
VLM
46
175
0
27 Mar 2023
Top-Down Visual Attention from Analysis by Synthesis
Baifeng Shi
Trevor Darrell
Xin Eric Wang
35
30
0
23 Mar 2023
A Simple Framework for Open-Vocabulary Segmentation and Detection
Hao Zhang
Feng Li
Xueyan Zou
Siyi Liu
Chun-yue Li
Jianfeng Gao
Jianwei Yang
Lei Zhang
ObjD
VLM
40
153
0
14 Mar 2023
CLIP-FO3D: Learning Free Open-world 3D Scene Representations from 2D Dense CLIP
Junbo Zhang
Runpei Dong
Kaisheng Ma
CLIP
VLM
45
77
0
08 Mar 2023
CLIPER: A Unified Vision-Language Framework for In-the-Wild Facial Expression Recognition
Hanting Li
Hongjing Niu
Zhaoqing Zhu
Feng Zhao
VLM
CLIP
36
26
0
01 Mar 2023
Learning Visual Representations via Language-Guided Sampling
Mohamed El Banani
Karan Desai
Justin Johnson
SSL
VLM
59
28
0
23 Feb 2023
Rejecting Cognitivism: Computational Phenomenology for Deep Learning
P. Beckmann
G. Köstner
Ines Hipólito
56
4
0
16 Feb 2023
SimCon Loss with Multiple Views for Text Supervised Semantic Segmentation
Yash J. Patel
Yusheng Xie
Yi Zhu
Srikar Appalaraju
R. Manmatha
42
4
0
07 Feb 2023
ViewCo: Discovering Text-Supervised Segmentation Masks via Multi-View Semantic Consistency
Pengzhen Ren
Changlin Li
Hang Xu
Yi Zhu
Guangrun Wang
Jian-zhuo Liu
Xiaojun Chang
Xiaodan Liang
54
43
0
31 Jan 2023
Class Enhancement Losses with Pseudo Labels for Zero-shot Semantic Segmentation
S. D. Dao
Hengcan Shi
Dinh Q. Phung
Jianfei Cai
VLM
39
0
0
18 Jan 2023
Causal Triplet: An Open Challenge for Intervention-centric Causal Representation Learning
Yuejiang Liu
Alexandre Alahi
Chris Russell
Max Horn
Dominik Zietlow
Bernhard Schölkopf
Francesco Locatello
CML
69
22
0
12 Jan 2023
Improving self-supervised representation learning via sequential adversarial masking
Dylan Sam
Min Bai
Tristan McKinney
Li Erran Li
SSL
55
0
0
16 Dec 2022
GPViT: A High Resolution Non-Hierarchical Vision Transformer with Group Propagation
Chenhongyi Yang
Jiarui Xu
Shalini De Mello
Elliot J. Crowley
Xinyu Wang
ViT
48
22
0
13 Dec 2022
Open Vocabulary Semantic Segmentation with Patch Aligned Contrastive Learning
Jishnu Mukhoti
Tsung-Yu Lin
Omid Poursaeed
Rui Wang
Ashish Shah
Philip Torr
Ser-Nam Lim
VLM
60
81
0
09 Dec 2022
SegCLIP: Patch Aggregation with Learnable Centers for Open-Vocabulary Semantic Segmentation
Huaishao Luo
Junwei Bao
Youzheng Wu
Xiaodong He
Tianrui Li
VLM
43
147
0
27 Nov 2022
Peekaboo: Text to Image Diffusion Models are Zero-Shot Segmentors
R. Burgert
Kanchana Ranasinghe
Xiang Li
Michael S. Ryoo
DiffM
VLM
48
37
0
23 Nov 2022
Texts as Images in Prompt Tuning for Multi-Label Image Recognition
Zixian Guo
Bowen Dong
Zhilong Ji
Jinfeng Bai
Yiwen Guo
W. Zuo
VLM
VPVLM
44
59
0
23 Nov 2022
Hybrid Transformer Based Feature Fusion for Self-Supervised Monocular Depth Estimation
S. Tomar
Maitreya Suin
A. N. Rajagopalan
ViT
MDE
52
4
0
20 Nov 2022
OneFormer: One Transformer to Rule Universal Image Segmentation
Jitesh Jain
Jiacheng Li
M. Chiu
Ali Hassani
Nikita Orlov
Humphrey Shi
ViT
31
335
0
10 Nov 2022
Token Merging: Your ViT But Faster
Daniel Bolya
Cheng-Yang Fu
Xiaoliang Dai
Peizhao Zhang
Christoph Feichtenhofer
Judy Hoffman
MoMe
51
436
0
17 Oct 2022
Novel 3D Scene Understanding Applications From Recurrence in a Single Image
Shimian Zhang
Skanda Bharadwaj
Keaton Kraiger
Yashasvi Asthana
Hong Zhang
R. Collins
Yanxi Liu
64
1
0
14 Oct 2022