ResearchTrend.AI
arXiv: 2207.14227
Visual Recognition by Request

28 July 2022
Chufeng Tang
Lingxi Xie
Xiaopeng Zhang
Xiaolin Hu
Qi Tian
    VLM

Papers citing "Visual Recognition by Request"

50 / 64 papers shown
Title
VisRL: Intention-Driven Visual Perception via Reinforced Reasoning
Zhangquan Chen
Xufang Luo
Dongsheng Li
OffRL
LRM
99
3
0
10 Mar 2025
SAM-CP: Marrying SAM with Composable Prompts for Versatile Segmentation
Pengfei Chen
Lingxi Xie
Xinyue Huo
Xuehui Yu
Xiaopeng Zhang
Yingfei Sun
Zhenjun Han
Qi Tian
VLM
125
1
0
23 Jul 2024
Simple Open-Vocabulary Object Detection with Vision Transformers
Matthias Minderer
A. Gritsenko
Austin Stone
Maxim Neumann
Dirk Weissenborn
...
Zhuoran Shen
Tianlin Li
Xiaohua Zhai
Thomas Kipf
N. Houlsby
ObjD
CLIP
VLM
ViT
OCL
81
310
0
12 May 2022
Panoptic-PartFormer: Learning a Unified Model for Panoptic Part Segmentation
Xiangtai Li
Shilin Xu
Yibo Yang
Guangliang Cheng
Yunhai Tong
Dacheng Tao
ViT
34
46
0
10 Apr 2022
GroupViT: Semantic Segmentation Emerges from Text Supervision
Jiarui Xu
Shalini De Mello
Sifei Liu
Wonmin Byeon
Thomas Breuel
Jan Kautz
Xinyu Wang
ViT
VLM
276
517
0
22 Feb 2022
Language-driven Semantic Segmentation
Boyi Li
Kilian Q. Weinberger
Serge Belongie
V. Koltun
René Ranftl
VLM
94
610
0
10 Jan 2022
Mask2Former for Video Instance Segmentation
Bowen Cheng
Anwesa Choudhuri
Ishan Misra
Alexander Kirillov
Rohit Girdhar
Alex Schwing
VOS
84
167
0
20 Dec 2021
Grounded Language-Image Pre-training
Liunian Harold Li
Pengchuan Zhang
Haotian Zhang
Jianwei Yang
Chunyuan Li
...
Lu Yuan
Lei Zhang
Lei Li
Kai-Wei Chang
Jianfeng Gao
ObjD
VLM
69
1,047
0
07 Dec 2021
PartImageNet: A Large, High-Quality Dataset of Parts
Ju He
Shuo Yang
Shaokang Yang
Adam Kortylewski
Xiaoding Yuan
Jieneng Chen
Shuai Liu
Cheng Yang
Qihang Yu
Alan Yuille
3DV
MLLM
3DH
VLM
79
97
0
02 Dec 2021
Swin Transformer V2: Scaling Up Capacity and Resolution
Ze Liu
Han Hu
Yutong Lin
Zhuliang Yao
Zhenda Xie
...
Yue Cao
Zheng Zhang
Li Dong
Furu Wei
B. Guo
ViT
182
1,783
0
18 Nov 2021
CLIP-Adapter: Better Vision-Language Models with Feature Adapters
Peng Gao
Shijie Geng
Renrui Zhang
Teli Ma
Rongyao Fang
Yongfeng Zhang
Hongsheng Li
Yu Qiao
VLM
CLIP
206
1,011
0
09 Oct 2021
Pix2seq: A Language Modeling Framework for Object Detection
Ting-Li Chen
Saurabh Saxena
Lala Li
David J. Fleet
Geoffrey E. Hinton
MLLM
ViT
VLM
260
346
0
22 Sep 2021
Learning to Prompt for Vision-Language Models
Kaiyang Zhou
Jingkang Yang
Chen Change Loy
Ziwei Liu
VPVLM
CLIP
VLM
443
2,340
0
02 Sep 2021
Conditional DETR for Fast Training Convergence
Depu Meng
Xiaokang Chen
Zejia Fan
Gang Zeng
Houqiang Li
Yuhui Yuan
Lei-huan Sun
Jingdong Wang
ViT
48
615
0
13 Aug 2021
Per-Pixel Classification is Not All You Need for Semantic Segmentation
Bowen Cheng
Alex Schwing
Alexander Kirillov
VLM
ViT
147
1,517
0
13 Jul 2021
K-Net: Towards Unified Image Segmentation
Wenwei Zhang
Jiangmiao Pang
Kai-xiang Chen
Chen Change Loy
ISeg
53
361
0
28 Jun 2021
Part-aware Panoptic Segmentation
Daan de Geus
Panagiotis Meletis
Chenyang Lu
Xiaoxiao Wen
Gijs Dubbelman
80
61
0
11 Jun 2021
Scaling Vision Transformers
Xiaohua Zhai
Alexander Kolesnikov
N. Houlsby
Lucas Beyer
ViT
114
1,074
0
08 Jun 2021
SOLQ: Segmenting Objects by Learning Queries
Bin Dong
Fangao Zeng
Tiancai Wang
Xinming Zhang
Yichen Wei
ISeg
41
117
0
04 Jun 2021
SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers
Enze Xie
Wenhai Wang
Zhiding Yu
Anima Anandkumar
J. Álvarez
Ping Luo
ViT
165
4,934
0
31 May 2021
What Is Considered Complete for Visual Recognition?
Lingxi Xie
Xiaopeng Zhang
Longhui Wei
Jianlong Chang
Qi Tian
VLM
63
4
0
28 May 2021
Segmenter: Transformer for Semantic Segmentation
Robin Strudel
Ricardo Garcia Pinel
Ivan Laptev
Cordelia Schmid
ViT
142
1,442
0
12 May 2021
Instances as Queries
Yuxin Fang
Shusheng Yang
Xinggang Wang
Yu Li
Chen Fang
Ying Shan
Bin Feng
Wenyu Liu
ISeg
64
258
0
05 May 2021
Open-vocabulary Object Detection via Vision and Language Knowledge Distillation
Xiuye Gu
Nayeon Lee
Weicheng Kuo
Huayu Chen
VLM
ObjD
252
906
0
28 Apr 2021
Points as Queries: Weakly Semi-supervised Object Detection by Points
Liangyu Chen
Tong Yang
Xinming Zhang
Wei Zhang
Jian Sun
48
84
0
15 Apr 2021
Look Closer to Segment Better: Boundary Patch Refinement for Instance Segmentation
Chufeng Tang
Hang Chen
Xiao-Li Li
Jianmin Li
Zhaoxiang Zhang
Xiaolin Hu
ISeg
52
79
0
12 Apr 2021
Hypercorrelation Squeeze for Few-Shot Segmentation
Juhong Min
Dahyun Kang
Minsu Cho
67
294
0
04 Apr 2021
Towards Open World Object Detection
K. J. Joseph
Salman Khan
Fahad Shahbaz Khan
V. Balasubramanian
ObjD
61
453
0
03 Mar 2021
Learning Transferable Visual Models From Natural Language Supervision
Alec Radford
Jong Wook Kim
Chris Hallacy
Aditya A. Ramesh
Gabriel Goh
...
Amanda Askell
Pamela Mishkin
Jack Clark
Gretchen Krueger
Ilya Sutskever
CLIP
VLM
696
28,659
0
26 Feb 2021
Conceptual 12M: Pushing Web-Scale Image-Text Pre-Training To Recognize Long-Tail Visual Concepts
Soravit Changpinyo
P. Sharma
Nan Ding
Radu Soricut
VLM
391
1,103
0
17 Feb 2021
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
Chao Jia
Yinfei Yang
Ye Xia
Yi-Ting Chen
Zarana Parekh
Hieu H. Pham
Quoc V. Le
Yun-hsuan Sung
Zhen Li
Tom Duerig
VLM
CLIP
397
3,778
0
11 Feb 2021
Sparse R-CNN: End-to-End Object Detection with Learnable Proposals
Pei Sun
Rufeng Zhang
Yi Jiang
Tao Kong
Chenfeng Xu
...
Masayoshi Tomizuka
Lei Li
Zehuan Yuan
Changhu Wang
Ping Luo
ObjD
84
1,085
0
25 Nov 2020
Open-Vocabulary Object Detection Using Captions
Alireza Zareian
Kevin Dela Rosa
Derek Hao Hu
Shih-Fu Chang
VLM
ObjD
114
426
0
20 Nov 2020
Deformable DETR: Deformable Transformers for End-to-End Object Detection
Xizhou Zhu
Weijie Su
Lewei Lu
Bin Li
Xiaogang Wang
Jifeng Dai
ViT
169
4,993
0
08 Oct 2020
Part-aware Prototype Network for Few-shot Semantic Segmentation
Yongfei Liu
Xiangyi Zhang
Songyang Zhang
Xuming He
39
320
0
13 Jul 2020
End-to-End Object Detection with Transformers
Nicolas Carion
Francisco Massa
Gabriel Synnaeve
Nicolas Usunier
Alexander Kirillov
Sergey Zagoruyko
ViT
3DV
PINN
310
12,906
0
26 May 2020
Cityscapes-Panoptic-Parts and PASCAL-Panoptic-Parts datasets for Scene Understanding
Panagiotis Meletis
Xiaoxiao Wen
Chenyang Lu
Daan de Geus
Gijs Dubbelman
3DPC
20
16
0
16 Apr 2020
EfficientPS: Efficient Panoptic Segmentation
Rohit Mohan
Abhinav Valada
64
233
0
05 Apr 2020
Conditional Convolutions for Instance Segmentation
Zhi Tian
Chunhua Shen
Hao Chen
ISeg
220
602
0
12 Mar 2020
PolyTransform: Deep Polygon Transformer for Instance Segmentation
Justin Liang
N. Homayounfar
Wei-Chiu Ma
Yuwen Xiong
Rui Hu
R. Urtasun
ViT
ISeg
48
175
0
05 Dec 2019
PANet: Few-Shot Image Semantic Segmentation with Prototype Alignment
Kaixin Wang
Jun Hao Liew
Yingtian Zou
Daquan Zhou
Jiashi Feng
VLM
47
1,058
0
18 Aug 2019
The Fishyscapes Benchmark: Measuring Blind Spots in Semantic Segmentation
Hermann Blum
Paul-Edouard Sarlin
Juan I. Nieto
Roland Siegwart
Cesar Cadena
UQCV
43
157
0
05 Apr 2019
FCOS: Fully Convolutional One-Stage Object Detection
Zhi Tian
Chunhua Shen
Hao Chen
Tong He
ObjD
100
4,969
0
02 Apr 2019
Attention-guided Unified Network for Panoptic Segmentation
Yanwei Li
Xinze Chen
Zheng Zhu
Lingxi Xie
Guan Huang
Dalong Du
Xingang Wang
41
279
0
10 Dec 2018
Learning to Fuse Things and Stuff
Jie Li
Allan Raventos
Arjun Bhargava
Takaaki Tagawa
Adrien Gaidon
64
104
0
04 Dec 2018
From Recognition to Cognition: Visual Commonsense Reasoning
Rowan Zellers
Yonatan Bisk
Ali Farhadi
Yejin Choi
LRM
BDL
OCL
ReLM
131
873
0
27 Nov 2018
The Open Images Dataset V4: Unified image classification, object detection, and visual relationship detection at scale
Alina Kuznetsova
H. Rom
N. Alldrin
J. Uijlings
Ivan Krasin
...
S. Popov
Matteo Malloci
Alexander Kolesnikov
Tom Duerig
V. Ferrari
ObjD
VLM
89
1,340
0
02 Nov 2018
A Comprehensive Survey of Deep Learning for Image Captioning
Md Zakir Hossain
Ferdous Sohel
M. Shiratuddin
Hamid Laga
VLM
3DV
72
769
0
06 Oct 2018
Panoptic Segmentation
Alexander Kirillov
Kaiming He
Ross B. Girshick
Carsten Rother
Piotr Dollár
97
1,425
0
03 Jan 2018
IQA: Visual Question Answering in Interactive Environments
Daniel Gordon
Aniruddha Kembhavi
Mohammad Rastegari
Joseph Redmon
Dieter Fox
Ali Farhadi
LM&Ro
64
388
0
09 Dec 2017