2303.02153
Cited By
Unleashing Text-to-Image Diffusion Models for Visual Perception
3 March 2023
Wenliang Zhao
Yongming Rao
Zuyan Liu
Benlin Liu
Jie Zhou
Jiwen Lu
ObjD
VLM
MDE
Papers citing
"Unleashing Text-to-Image Diffusion Models for Visual Perception"
39 / 39 papers shown
Title
Knowledge-Aligned Counterfactual-Enhancement Diffusion Perception for Unsupervised Cross-Domain Visual Emotion Recognition
Wen Yin
Yong Wang
Guiduo Duan
Dongyang Zhang
Xin Hu
Yuan-Fang Li
Tao He
74
0
0
26 May 2025
Diffusion Classifiers Understand Compositionality, but Conditions Apply
Yujin Jeong
Arnas Uselis
Seong Joon Oh
Anna Rohrbach
DiffM
CoGe
413
0
3
23 May 2025
SynRES: Towards Referring Expression Segmentation in the Wild via Synthetic Data
Dong-Hee Kim
Hyunjee Song
Donghyun Kim
142
0
0
23 May 2025
Segment Anyword: Mask Prompt Inversion for Open-Set Grounded Segmentation
Zhihua Liu
Amrutha Saseendran
Lei Tong
Xilin He
Fariba Yousefi
...
Dino Oglic
Tom Diethe
Philip Teare
Huiyu Zhou
Chen Jin
VLM
274
0
0
23 May 2025
3D Visual Illusion Depth Estimation
Chengtang Yao
Zhidan Liu
Jiaxi Zeng
Lidong Yu
Yuwei Wu
Yunde Jia
MDE
61
0
0
19 May 2025
Aligning Generative Denoising with Discriminative Objectives Unleashes Diffusion for Visual Perception
Ziqi Pang
Xin Xu
Yu-Xiong Wang
DiffM
126
0
0
15 Apr 2025
Diffusion Meets Few-shot Class Incremental Learning
Junsu Kim
Yunhoe Ku
Dongyoon Han
Seungryul Baek
DiffM
CLL
117
0
0
30 Mar 2025
Segment Any-Quality Images with Generative Latent Space Enhancement
Guangqian Guo
Yoong Guo
Xuehui Yu
Wenbo Li
Yaoxing Wang
Shan Gao
VLM
121
0
0
16 Mar 2025
What's in a Latent? Leveraging Diffusion Latent Space for Domain Generalization
Xavier Thomas
Deepti Ghadiyaram
DiffM
130
0
0
09 Mar 2025
LaRE^2: Latent Reconstruction Error Based Method for Diffusion-Generated Image Detection
Yunpeng Luo
Junlong Du
Ke Yan
Shouhong Ding
DiffM
170
22
0
24 Feb 2025
ControlText: Unlocking Controllable Fonts in Multilingual Text Rendering without Font Annotations
Bowen Jiang
Yuan Yuan
Xinyi Bai
Zhuoqun Hao
Alyson Yin
Yaojie Hu
Wenyu Liao
Lyle Ungar
Camillo J Taylor
DiffM
78
2
0
16 Feb 2025
Leveraging Stable Diffusion for Monocular Depth Estimation via Image Semantic Encoding
Jingming Xia
Guanqun Cao
Guang Ma
Yiben Luo
Qinzhao Li
John Oyekan
MDE
79
0
0
01 Feb 2025
DPBridge: Latent Diffusion Bridge for Dense Prediction
Haorui Ji
Taojun Lin
Hongdong Li
DiffM
157
1
0
29 Dec 2024
PriorDiffusion: Leverage Language Prior in Diffusion Models for Monocular Depth Estimation
Ziyao Zeng
Jingcheng Ni
Daniel Wang
Patrick Rim
Younjoon Chung
Fengyu Yang
Byung-Woo Hong
A. Wong
DiffM
MDE
155
2
0
24 Nov 2024
InvSeg: Test-Time Prompt Inversion for Semantic Segmentation
Jiayi Lin
Jiabo Huang
Jian Hu
S. Gong
DiffM
VLM
76
0
0
15 Oct 2024
A Simple Approach to Unifying Diffusion-based Conditional Generation
Xirui Li
Charles Herrmann
Kelvin C.K. Chan
Yinxiao Li
Deqing Sun
Chao Ma
Ming-Hsuan Yang
DiffM
VLM
65
1
0
15 Oct 2024
DiffGAD: A Diffusion-based Unsupervised Graph Anomaly Detector
Jinghan Li
Yuan Gao
Jinda Lu
Sihang Li
Congcong Wen
Hui Lin
Xiang Wang
73
2
0
09 Oct 2024
Fine-Tuning Image-Conditional Diffusion Models is Easier than You Think
Gonzalo Martin Garcia
Karim Abou Zeid
Christian Schmidt
Daan de Geus
Alexander Hermans
Bastian Leibe
69
27
0
17 Sep 2024
Lexicon3D: Probing Visual Foundation Models for Complex 3D Scene Understanding
Yunze Man
Shuhong Zheng
Zhipeng Bao
M. Hebert
Liang-Yan Gui
Yu-Xiong Wang
97
16
0
05 Sep 2024
Zero-Shot Medical Phrase Grounding with Off-the-shelf Diffusion Models
Konstantinos Vilouras
Pedro Sanchez
Alison Q. O'Neil
Sotirios A. Tsaftaris
MedIm
101
2
0
19 Apr 2024
GazeHTA: End-to-end Gaze Target Detection with Head-Target Association
Zhi-Yi Lin
Jouh Yeong Chew
Jan van Gemert
Xucong Zhang
93
3
0
16 Apr 2024
Explore In-Context Segmentation via Latent Diffusion Models
Chaoyang Wang
Xiangtai Li
Henghui Ding
Lu Qi
Jiangning Zhang
Yunhai Tong
Chen Change Loy
Shuicheng Yan
DiffM
95
6
0
14 Mar 2024
All in Tokens: Unifying Output Space of Visual Tasks via Soft Token
Jia Ning
Chen Li
Zheng Zhang
Zigang Geng
Qi Dai
Kun He
Han Hu
71
45
0
05 Jan 2023
DiffusionDet: Diffusion Model for Object Detection
Shoufa Chen
Pei Sun
Yibing Song
Ping Luo
81
450
0
17 Nov 2022
BEiT v2: Masked Image Modeling with Vector-Quantized Visual Tokenizers
Zhiliang Peng
Li Dong
Hangbo Bao
QiXiang Ye
Furu Wei
45
316
0
12 Aug 2022
Prompt-to-Prompt Image Editing with Cross Attention Control
Amir Hertz
Ron Mokady
J. Tenenbaum
Kfir Aberman
Yael Pritch
Daniel Cohen-Or
DiffM
123
1,746
0
02 Aug 2022
DPM-Solver: A Fast ODE Solver for Diffusion Probabilistic Model Sampling in Around 10 Steps
Cheng Lu
Yuhao Zhou
Fan Bao
Jianfei Chen
Chongxuan Li
Jun Zhu
DiffM
134
1,394
0
02 Jun 2022
Hierarchical Text-Conditional Image Generation with CLIP Latents
Aditya A. Ramesh
Prafulla Dhariwal
Alex Nichol
Casey Chu
Mark Chen
VLM
DiffM
282
6,768
0
13 Apr 2022
Pseudo Numerical Methods for Diffusion Models on Manifolds
Luping Liu
Yi Ren
Zhijie Lin
Zhou Zhao
DiffM
80
640
0
20 Feb 2022
SegDiff: Image Segmentation with Diffusion Probabilistic Models
Tomer Amit
Tal Shaharbany
Eliya Nachmani
Lior Wolf
DiffM
55
299
0
01 Dec 2021
Swin Transformer V2: Scaling Up Capacity and Resolution
Ze Liu
Han Hu
Yutong Lin
Zhuliang Yao
Zhenda Xie
...
Yue Cao
Zheng Zhang
Li Dong
Furu Wei
B. Guo
ViT
180
1,783
0
18 Nov 2021
CLIP-Adapter: Better Vision-Language Models with Feature Adapters
Peng Gao
Shijie Geng
Renrui Zhang
Teli Ma
Rongyao Fang
Yongfeng Zhang
Hongsheng Li
Yu Qiao
VLM
CLIP
202
1,011
0
09 Oct 2021
Learning to Prompt for Vision-Language Models
Kaiyang Zhou
Jingkang Yang
Chen Change Loy
Ziwei Liu
VPVLM
CLIP
VLM
438
2,340
0
02 Sep 2021
BEiT: BERT Pre-Training of Image Transformers
Hangbo Bao
Li Dong
Songhao Piao
Furu Wei
ViT
170
2,790
0
15 Jun 2021
Vision Transformers for Dense Prediction
René Ranftl
Alexey Bochkovskiy
V. Koltun
ViT
MDE
109
1,696
0
24 Mar 2021
Multi-task Collaborative Network for Joint Referring Expression Comprehension and Segmentation
Gen Luo
Yiyi Zhou
Xiaoshuai Sun
Liujuan Cao
Chenglin Wu
Cheng Deng
Rongrong Ji
ObjD
222
288
0
19 Mar 2020
Momentum Contrast for Unsupervised Visual Representation Learning
Kaiming He
Haoqi Fan
Yuxin Wu
Saining Xie
Ross B. Girshick
SSL
111
12,007
0
13 Nov 2019
Panoptic Feature Pyramid Networks
Alexander Kirillov
Ross B. Girshick
Kaiming He
Piotr Dollár
ISeg
SSeg
94
1,280
0
08 Jan 2019
Auto-Encoding Variational Bayes
Diederik P. Kingma
Max Welling
BDL
362
16,962
0
20 Dec 2013