2210.06366
Cited By
A Generalist Framework for Panoptic Segmentation of Images and Videos
12 October 2022
Ting Chen
Lala Li
Saurabh Saxena
Geoffrey E. Hinton
David J. Fleet
VGen
MLLM
Papers citing
"A Generalist Framework for Panoptic Segmentation of Images and Videos"
50 / 76 papers shown
Title
A Comprehensive Survey on Video Scene Parsing: Advances, Challenges, and Prospects
Guohuan Xie
Syed Ariff Syed Hesham
Wenya Guo
Bing Li
Ming-Ming Cheng
Guolei Sun
Yun-Hai Liu
13
0
0
16 Jun 2025
Lay-Your-Scene: Natural Scene Layout Generation with Diffusion Transformers
Divyansh Srivastava
Xiang Zhang
He Wen
Chenru Wen
Zhuowen Tu
DiffM
77
0
0
07 May 2025
DIVE: Inverting Conditional Diffusion Models for Discriminative Tasks
Yinqi Li
Hong Chang
Ruibing Hou
Shiguang Shan
Xilin Chen
DiffM
93
0
0
24 Apr 2025
SD-ReID: View-aware Stable Diffusion for Aerial-Ground Person Re-Identification
Xiang Hu
Pingping Zhang
Yuhao Wang
Bin Yan
Huchuan Lu
58
0
0
13 Apr 2025
Studying Image Diffusion Features for Zero-Shot Video Object Segmentation
Thanos Delatolas
Vicky S. Kalogeiton
Dim P. Papadopoulos
DiffM
VOS
128
2
0
07 Apr 2025
Segment Any-Quality Images with Generative Latent Space Enhancement
Guangqian Guo
Yoong Guo
Xuehui Yu
Wenbo Li
Yaoxing Wang
Shan Gao
VLM
191
0
0
16 Mar 2025
VRMDiff: Text-Guided Video Referring Matting Generation of Diffusion
Lehan Yang
Jincen Song
Tianlong Wang
Daiqing Qi
Weili Shi
Yuheng Liu
Sheng Li
DiffM
VOS
VGen
131
0
0
11 Mar 2025
Diffusion Suction Grasping with Large-Scale Parcel Dataset
Ding-Tao Huang
Xinyi He
Debei Hua
Dongfang Yu
En-Te Lin
Long Zeng
DiffM
87
0
0
11 Feb 2025
A Comprehensive Review on Noise Control of Diffusion Model
Zhehao Guo
Jiedong Lang
Shuyu Huang
Yunfei Gao
Xintong Ding
DiffM
70
0
0
07 Feb 2025
VisionLLM v2: An End-to-End Generalist Multimodal Large Language Model for Hundreds of Vision-Language Tasks
Jiannan Wu
Muyan Zhong
Sen Xing
Zeqiang Lai
Zhaoyang Liu
...
Lewei Lu
Tong Lu
Ping Luo
Yu Qiao
Jifeng Dai
MLLM
VLM
LRM
334
59
0
03 Jan 2025
PSDiff: Diffusion Model for Person Search with Iterative and Collaborative Refinement
Chengyou Jia
Minnan Luo
Zhuohang Dang
Guangwen Dai
Xiao Chang
Jiangming Wang
DiffM
134
1
0
31 Dec 2024
Schedule Your Edit: A Simple yet Effective Diffusion Noise Schedule for Image Editing
Haonan Lin
Mengmeng Wang
Jiahao Wang
Wenbin An
Yan Chen
Yong Liu
Feng Tian
Guang Dai
Jingdong Wang
Qianying Wang
DiffM
87
12
0
24 Oct 2024
TinyVLA: Towards Fast, Data-Efficient Vision-Language-Action Models for Robotic Manipulation
Junjie Wen
Yinlin Zhu
Jinming Li
Minjie Zhu
Kun Wu
...
Ran Cheng
Yaxin Peng
Chaomin Shen
Feifei Feng
Jian Tang
LM&Ro
173
70
0
19 Sep 2024
Resolving Inconsistent Semantics in Multi-Dataset Image Segmentation
Qilong Zhangli
Di Liu
Abhishek Aich
Dimitris Metaxas
S. Schulter
65
0
0
15 Sep 2024
A Simple and Generalist Approach for Panoptic Segmentation
Nedyalko Prisadnikov
Wouter Van Gansbeke
Danda Pani Paudel
Luc Van Gool
VLM
116
0
0
29 Aug 2024
Image Segmentation in Foundation Model Era: A Survey
Tianfei Zhou
Fei Zhang
Boyu Chang
Wenguan Wang
Ye Yuan
E. Konukoglu
Daniel Cremers
VLM
140
12
0
23 Aug 2024
CatFree3D: Category-agnostic 3D Object Detection with Diffusion
Wenjing Bian
Zirui Wang
Andrea Vedaldi
91
1
0
22 Aug 2024
Diff3DETR: Agent-based Diffusion Model for Semi-supervised 3D Object Detection
Sachin Pathiyan Cherumanal
Jiahao Lu
Damiano Spina
DiffM
85
4
0
01 Aug 2024
Gated Temporal Diffusion for Stochastic Long-Term Dense Anticipation
Olga Zatsarynna
Emad Bahrami
Yazan Abu Farha
Gianpiero Francesca
Juergen Gall
130
2
0
16 Jul 2024
IDOL: Unified Dual-Modal Latent Diffusion for Human-Centric Joint Video-Depth Generation
Yuanhao Zhai
Kevin Qinghong Lin
Linjie Li
Chung-Ching Lin
Jianfeng Wang
Zhengyuan Yang
David Doermann
Junsong Yuan
Zicheng Liu
Lijuan Wang
DiffM
VGen
81
6
0
15 Jul 2024
Self-supervised Monocular Depth Estimation Based on Hierarchical Feature-Guided Diffusion
Runze Liu
Dongchen Zhu
Guanghui Zhang
Yue Xu
Wenjun Shi
DiffM
MDE
68
0
0
14 Jun 2024
SemFlow: Binding Semantic Segmentation and Image Synthesis via Rectified Flow
Chaoyang Wang
Xiangtai Li
Lu Qi
Henghui Ding
Yunhai Tong
Ming-Hsuan Yang
DiffM
134
7
0
30 May 2024
Chameleon: A Data-Efficient Generalist for Dense Visual Prediction in the Wild
Donggyun Kim
Seongwoong Cho
Semin Kim
Chong Luo
Seunghoon Hong
VLM
80
3
0
29 Apr 2024
OccGen: Generative Multi-modal 3D Occupancy Prediction for Autonomous Driving
Guoqing Wang
Zhongdao Wang
Pin Tang
Jilai Zheng
Xiangxuan Ren
Bailan Feng
Chao Ma
DiffM
98
19
0
23 Apr 2024
Implicit and Explicit Language Guidance for Diffusion-based Visual Perception
Hefeng Wang
Jiale Cao
Jin Xie
Aiping Yang
Yanwei Pang
VLM
DiffM
110
2
0
11 Apr 2024
Investigating the Effectiveness of Cross-Attention to Unlock Zero-Shot Editing of Text-to-Video Diffusion Models
Saman Motamed
Wouter Van Gansbeke
Luc Van Gool
VGen
DiffM
83
1
0
08 Apr 2024
DepthFM: Fast Monocular Depth Estimation with Flow Matching
Ming Gui
Johannes S. Fischer
Ulrich Prestel
Pingchuan Ma
Dmytro Kotovenko
Olga Grebenkova
S. A. Baumann
Vincent Tao Hu
Bjorn Ommer
MDE
104
59
0
20 Mar 2024
Explore In-Context Segmentation via Latent Diffusion Models
Chaoyang Wang
Xiangtai Li
Henghui Ding
Lu Qi
Jiangning Zhang
Yunhai Tong
Chen Change Loy
Shuicheng Yan
DiffM
156
7
0
14 Mar 2024
ACC-ViT: Atrous Convolution's Comeback in Vision Transformers
Nabil Ibtehaz
Ning Yan
Masood S. Mortazavi
Daisuke Kihara
ViT
92
3
0
07 Mar 2024
Sora: A Review on Background, Technology, Limitations, and Opportunities of Large Vision Models
Yixin Liu
Kai Zhang
Yuan Li
Zhiling Yan
Chujie Gao
...
Yue Huang
Hanchi Sun
Jianfeng Gao
Lifang He
Lichao Sun
VLM
VGen
EGVM
178
300
0
27 Feb 2024
Open-Vocabulary Segmentation with Unpaired Mask-Text Supervision
Zhaoqing Wang
Xiaobo Xia
Ziye Chen
Xiao He
Yandong Guo
Biwei Huang
Tongliang Liu
VLM
98
13
0
14 Feb 2024
A Simple Latent Diffusion Approach for Panoptic Segmentation and Mask Inpainting
Wouter Van Gansbeke
Bert De Brabandere
DiffM
123
11
0
18 Jan 2024
Zero-Shot Metric Depth with a Field-of-View Conditioned Diffusion Model
Saurabh Saxena
Junhwa Hur
Charles Herrmann
Deqing Sun
David J. Fleet
DiffM
97
29
0
20 Dec 2023
SegRefiner: Towards Model-Agnostic Segmentation Refinement with Discrete Diffusion Process
Meng Wang
Henghui Ding
Jun Hao Liew
Jiajun Liu
Yao-Min Zhao
Yunchao Wei
DiffM
106
19
0
19 Dec 2023
PCRDiffusion: Diffusion Probabilistic Models for Point Cloud Registration
Yue Wu
Yongzhe Yuan
Xiaolong Fan
Xiaoshui Huang
Maoguo Gong
Qiguang Miao
DiffM
100
3
0
11 Dec 2023
Diffusion for Natural Image Matting
Yihan Hu
Yiheng Lin
Wei Wang
Yao-Min Zhao
Yunchao Wei
Humphrey Shi
103
9
0
10 Dec 2023
Diffusion-SS3D: Diffusion Model for Semi-supervised 3D Object Detection
Cheng-Ju Ho
Chen-Hsuan Tai
Yen-Yu Lin
Ming-Hsuan Yang
Yi-Hsuan Tsai
DiffM
117
11
0
05 Dec 2023
UniGS: Unified Representation for Image Generation and Segmentation
Lu Qi
Lehan Yang
Weidong Guo
Yu-Syuan Xu
Bo Du
Varun Jampani
Ming-Hsuan Yang
91
19
0
04 Dec 2023
DiffusionMat: Alpha Matting as Sequential Refinement Learning
Yangyang Xu
Shengfeng He
Wenqi Shao
Kwan-Yee K. Wong
Yu Qiao
Ping Luo
DiffM
62
3
0
22 Nov 2023
Using Stochastic Gradient Descent to Smooth Nonconvex Functions: Analysis of Implicit Graduated Optimization with Optimal Noise Scheduling
Naoki Sato
Hideaki Iiduka
63
3
0
15 Nov 2023
MonoDiffusion: Self-Supervised Monocular Depth Estimation Using Diffusion Model
Shuwei Shao
Zhongcai Pei
Weihai Chen
Dingchi Sun
Peter C. Y. Chen
Zhengguo Li
MDE
DiffM
72
8
0
13 Nov 2023
Towards A Unified Neural Architecture for Visual Recognition and Reasoning
Calvin Luo
Boqing Gong
Ting Chen
Chen Sun
OCL
ObjD
52
1
0
10 Nov 2023
3DiffTection: 3D Object Detection with Geometry-Aware Diffusion Features
Chenfeng Xu
Huan Ling
Sanja Fidler
Or Litany
90
15
0
07 Nov 2023
SegGen: Supercharging Segmentation Models with Text2Mask and Mask2Img Synthesis
Hanrong Ye
Jason Kuen
Qing Liu
Zhe Lin
Brian L. Price
Dan Xu
VLM
118
12
0
06 Nov 2023
CogVLM: Visual Expert for Pretrained Language Models
Weihan Wang
Qingsong Lv
Wenmeng Yu
Wenyi Hong
Ji Qi
...
Bin Xu
Juanzi Li
Yuxiao Dong
Ming Ding
Jie Tang
VLM
MLLM
150
517
0
06 Nov 2023
Towards Generic Semi-Supervised Framework for Volumetric Medical Image Segmentation
Haonan Wang
Xiaomeng Li
85
33
0
17 Oct 2023
A Survey on Video Diffusion Models
Zhen Xing
Qijun Feng
Haoran Chen
Qi Dai
Hang-Rui Hu
Hang Xu
Zuxuan Wu
Yu-Gang Jiang
EGVM
VGen
174
138
0
16 Oct 2023
FreeReg: Image-to-Point Cloud Registration Leveraging Pretrained Diffusion Models and Monocular Depth Estimators
Haiping Wang
Yuan Liu
Bing Wang
Yujing Sun
Zhenchao Dong
Wenping Wang
Bisheng Yang
DiffM
68
12
0
05 Oct 2023
Diffusion-based 3D Object Detection with Random Boxes
Xin Zhou
Jinghua Hou
Tingting Yao
Dingkang Liang
Zhe Liu
Zhikang Zou
Xiaoqing Ye
Jianwei Cheng
Xiang Bai
DiffM
63
8
0
05 Sep 2023
A Recycling Training Strategy for Medical Image Segmentation with Diffusion Denoising Models
Yunguan Fu
Yiwen Li
Shaheer U. Saeed
Matthew J Clarkson
Yipeng Hu
DiffM
MedIm
86
6
0
30 Aug 2023