GroupViT: Semantic Segmentation Emerges from Text Supervision


22 February 2022
Jiarui Xu
Shalini De Mello
Sifei Liu
Wonmin Byeon
Thomas Breuel
Jan Kautz
Xiaolong Wang
    ViT
    VLM

Papers citing "GroupViT: Semantic Segmentation Emerges from Text Supervision"

26 / 126 papers shown
Multi-Granularity Cross-modal Alignment for Generalized Medical Visual Representation Learning
Fuying Wang
Yuyin Zhou
Shujun Wang
V. Vardhanabhuti
Lequan Yu
42
139
0
12 Oct 2022
CLIP-Diffusion-LM: Apply Diffusion Model on Image Captioning
Shi-You Xu
VLM
DiffM
42
12
0
10 Oct 2022
Open-Vocabulary Semantic Segmentation with Mask-adapted CLIP
Feng Liang
Bichen Wu
Xiaoliang Dai
Kunpeng Li
Yinan Zhao
Hang Zhang
Peizhao Zhang
Peter Vajda
Diana Marculescu
CLIP
VLM
49
436
0
09 Oct 2022
Unsupervised Multi-View Object Segmentation Using Radiance Field Propagation
Xinhang Liu
Jiaben Chen
Huai Yu
Yu-Wing Tai
Chi-Keung Tang
97
28
0
02 Oct 2022
Learning Hierarchical Image Segmentation For Recognition and By Recognition
Tsung-Wei Ke
Sangwoo Mo
Stella X. Yu
VLM
65
10
0
01 Oct 2022
Bridging the Gap to Real-World Object-Centric Learning
Maximilian Seitzer
Max Horn
Andrii Zadaianchuk
Dominik Zietlow
Tianjun Xiao
...
Tong He
Zheng Zhang
Bernhard Schölkopf
Thomas Brox
Francesco Locatello
OCL
56
141
0
29 Sep 2022
FreeSeg: Free Mask from Interpretable Contrastive Language-Image Pretraining for Semantic Segmentation
Yi Li
Huifeng Yao
Hualiang Wang
Xuelong Li
ISeg
VLM
49
2
0
27 Sep 2022
Visual Recognition by Request
Chufeng Tang
Lingxi Xie
Xiaopeng Zhang
Xiaolin Hu
Qi Tian
VLM
26
15
0
28 Jul 2022
Unsupervised Semantic Segmentation with Self-supervised Object-centric Representations
Andrii Zadaianchuk
Matthaeus Kleindessner
Yi Zhu
Francesco Locatello
Thomas Brox
48
49
0
11 Jul 2022
LViT: Language meets Vision Transformer in Medical Image Segmentation
Zihan Li
Yunxiang Li
Qingde Li
Puyang Wang
Dazhou Guo
Le Lu
D. Jin
You Zhang
Qingqi Hong
VLM
MedIm
69
136
0
29 Jun 2022
MixGen: A New Multi-Modal Data Augmentation
Xiaoshuai Hao
Yi Zhu
Srikar Appalaraju
Aston Zhang
Wanqian Zhang
Boyang Li
Mu Li
VLM
35
85
0
16 Jun 2022
SAVi++: Towards End-to-End Object-Centric Learning from Real-World Videos
Gamaleldin F. Elsayed
Aravindh Mahendran
Sjoerd van Steenkiste
Klaus Greff
Michael C. Mozer
Thomas Kipf
VOS
OCL
62
141
0
15 Jun 2022
HCFormer: Unified Image Segmentation with Hierarchical Clustering
Teppei Suzuki
38
0
0
20 May 2022
Weakly-supervised segmentation of referring expressions
Robin Strudel
Ivan Laptev
Cordelia Schmid
29
21
0
10 May 2022
Scaling Open-Vocabulary Image Segmentation with Image-Level Labels
Golnaz Ghiasi
Xiuye Gu
Yin Cui
Tsung-Yi Lin
VLM
74
373
0
22 Dec 2021
Simpler is Better: Few-shot Semantic Segmentation with Classifier Weight Transformer
Zhihe Lu
Sen He
Xiatian Zhu
Li Zhang
Yi-Zhe Song
Tao Xiang
ViT
175
174
0
06 Aug 2021
MLP-Mixer: An all-MLP Architecture for Vision
Ilya O. Tolstikhin
N. Houlsby
Alexander Kolesnikov
Lucas Beyer
Xiaohua Zhai
...
Andreas Steiner
Daniel Keysers
Jakob Uszkoreit
Mario Lucic
Alexey Dosovitskiy
326
2,626
0
04 May 2021
Emerging Properties in Self-Supervised Vision Transformers
Mathilde Caron
Hugo Touvron
Ishan Misra
Hervé Jégou
Julien Mairal
Piotr Bojanowski
Armand Joulin
436
5,899
0
29 Apr 2021
AAformer: Auto-Aligned Transformer for Person Re-Identification
Kuan Zhu
Haiyun Guo
Shiliang Zhang
Yaowei Wang
Jing Liu
Jinqiao Wang
Ming Tang
ViT
40
112
0
02 Apr 2021
Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions
Wenhai Wang
Enze Xie
Xiang Li
Deng-Ping Fan
Kaitao Song
Ding Liang
Tong Lu
Ping Luo
Ling Shao
ViT
328
3,653
0
24 Feb 2021
Conceptual 12M: Pushing Web-Scale Image-Text Pre-Training To Recognize Long-Tail Visual Concepts
Soravit Changpinyo
P. Sharma
Nan Ding
Radu Soricut
VLM
328
1,096
0
17 Feb 2021
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
Chao Jia
Yinfei Yang
Ye Xia
Yi-Ting Chen
Zarana Parekh
Hieu H. Pham
Quoc V. Le
Yun-hsuan Sung
Zhen Li
Tom Duerig
VLM
CLIP
365
3,760
0
11 Feb 2021
Is Space-Time Attention All You Need for Video Understanding?
Gedas Bertasius
Heng Wang
Lorenzo Torresani
ViT
288
2,006
0
09 Feb 2021
DEAL: Difficulty-aware Active Learning for Semantic Segmentation
Shuai Xie
Zunlei Feng
Ying Chen
Songtao Sun
Chao Ma
Xiuming Zhang
VLM
131
51
0
17 Oct 2020
Unified Vision-Language Pre-Training for Image Captioning and VQA
Luowei Zhou
Hamid Palangi
Lei Zhang
Houdong Hu
Jason J. Corso
Jianfeng Gao
MLLM
VLM
300
929
0
24 Sep 2019
Learning Pixel-level Semantic Affinity with Image-level Supervision for Weakly Supervised Semantic Segmentation
Jiwoon Ahn
Suha Kwak
259
743
0
28 Mar 2018