ResearchTrend.AI
Unleashing Text-to-Image Diffusion Models for Visual Perception


3 March 2023
Wenliang Zhao
Yongming Rao
Zuyan Liu
Benlin Liu
Jie Zhou
Jiwen Lu
    ObjD
    VLM
    MDE

Papers citing "Unleashing Text-to-Image Diffusion Models for Visual Perception"

39 / 39 papers shown
Knowledge-Aligned Counterfactual-Enhancement Diffusion Perception for Unsupervised Cross-Domain Visual Emotion Recognition
Wen Yin
Yong Wang
Guiduo Duan
Dongyang Zhang
Xin Hu
Yuan-Fang Li
Tao He
74
0
0
26 May 2025
Diffusion Classifiers Understand Compositionality, but Conditions Apply
Yujin Jeong
Arnas Uselis
Seong Joon Oh
Anna Rohrbach
DiffM
CoGe
413
0
3
23 May 2025
SynRES: Towards Referring Expression Segmentation in the Wild via Synthetic Data
Dong-Hee Kim
Hyunjee Song
Donghyun Kim
142
0
0
23 May 2025
Segment Anyword: Mask Prompt Inversion for Open-Set Grounded Segmentation
Zhihua Liu
Amrutha Saseendran
Lei Tong
Xilin He
Fariba Yousefi
...
Dino Oglic
Tom Diethe
Philip Teare
Huiyu Zhou
Chen Jin
VLM
274
0
0
23 May 2025
3D Visual Illusion Depth Estimation
Chengtang Yao
Zhidan Liu
Jiaxi Zeng
Lidong Yu
Yuwei Wu
Yunde Jia
MDE
61
0
0
19 May 2025
Aligning Generative Denoising with Discriminative Objectives Unleashes Diffusion for Visual Perception
Ziqi Pang
Xin Xu
Yu-Xiong Wang
DiffM
126
0
0
15 Apr 2025
Diffusion Meets Few-shot Class Incremental Learning
Junsu Kim
Yunhoe Ku
Dongyoon Han
Seungryul Baek
DiffM
CLL
117
0
0
30 Mar 2025
Segment Any-Quality Images with Generative Latent Space Enhancement
Guangqian Guo
Yoong Guo
Xuehui Yu
Wenbo Li
Yaoxing Wang
Shan Gao
VLM
121
0
0
16 Mar 2025
What's in a Latent? Leveraging Diffusion Latent Space for Domain Generalization
Xavier Thomas
Deepti Ghadiyaram
DiffM
130
0
0
09 Mar 2025
LaRE$^2$: Latent Reconstruction Error Based Method for Diffusion-Generated Image Detection
Yunpeng Luo
Junlong Du
Ke Yan
Shouhong Ding
DiffM
170
22
0
24 Feb 2025
ControlText: Unlocking Controllable Fonts in Multilingual Text Rendering without Font Annotations
Bowen Jiang
Yuan Yuan
Xinyi Bai
Zhuoqun Hao
Alyson Yin
Yaojie Hu
Wenyu Liao
Lyle Ungar
Camillo J Taylor
DiffM
78
2
0
16 Feb 2025
Leveraging Stable Diffusion for Monocular Depth Estimation via Image Semantic Encoding
Jingming Xia
Guanqun Cao
Guang Ma
Yiben Luo
Qinzhao Li
John Oyekan
MDE
79
0
0
01 Feb 2025
DPBridge: Latent Diffusion Bridge for Dense Prediction
Haorui Ji
Taojun Lin
Hongdong Li
DiffM
157
1
0
29 Dec 2024
PriorDiffusion: Leverage Language Prior in Diffusion Models for Monocular Depth Estimation
Ziyao Zeng
Jingcheng Ni
Daniel Wang
Patrick Rim
Younjoon Chung
Fengyu Yang
Byung-Woo Hong
A. Wong
DiffM
MDE
155
2
0
24 Nov 2024
InvSeg: Test-Time Prompt Inversion for Semantic Segmentation
Jiayi Lin
Jiabo Huang
Jian Hu
S. Gong
DiffM
VLM
76
0
0
15 Oct 2024
A Simple Approach to Unifying Diffusion-based Conditional Generation
Xirui Li
Charles Herrmann
Kelvin C.K. Chan
Yinxiao Li
Deqing Sun
Chao Ma
Ming-Hsuan Yang
DiffM
VLM
65
1
0
15 Oct 2024
DiffGAD: A Diffusion-based Unsupervised Graph Anomaly Detector
Jinghan Li
Yuan Gao
Jinda Lu
Sihang Li
Congcong Wen
Hui Lin
Xiang Wang
73
2
0
09 Oct 2024
Fine-Tuning Image-Conditional Diffusion Models is Easier than You Think
Gonzalo Martin Garcia
Karim Abou Zeid
Christian Schmidt
Daan de Geus
Alexander Hermans
Bastian Leibe
69
27
0
17 Sep 2024
Lexicon3D: Probing Visual Foundation Models for Complex 3D Scene Understanding
Yunze Man
Shuhong Zheng
Zhipeng Bao
M. Hebert
Liang-Yan Gui
Yu-Xiong Wang
97
16
0
05 Sep 2024
Zero-Shot Medical Phrase Grounding with Off-the-shelf Diffusion Models
Konstantinos Vilouras
Pedro Sanchez
Alison Q. O'Neil
Sotirios A. Tsaftaris
MedIm
101
2
0
19 Apr 2024
GazeHTA: End-to-end Gaze Target Detection with Head-Target Association
Zhi-Yi Lin
Jouh Yeong Chew
Jan van Gemert
Xucong Zhang
93
3
0
16 Apr 2024
Explore In-Context Segmentation via Latent Diffusion Models
Chaoyang Wang
Xiangtai Li
Henghui Ding
Lu Qi
Jiangning Zhang
Yunhai Tong
Chen Change Loy
Shuicheng Yan
DiffM
95
6
0
14 Mar 2024
All in Tokens: Unifying Output Space of Visual Tasks via Soft Token
Jia Ning
Chen Li
Zheng Zhang
Zigang Geng
Qi Dai
Kun He
Han Hu
71
45
0
05 Jan 2023
DiffusionDet: Diffusion Model for Object Detection
Shoufa Chen
Pei Sun
Yibing Song
Ping Luo
81
450
0
17 Nov 2022
BEiT v2: Masked Image Modeling with Vector-Quantized Visual Tokenizers
Zhiliang Peng
Li Dong
Hangbo Bao
QiXiang Ye
Furu Wei
45
316
0
12 Aug 2022
Prompt-to-Prompt Image Editing with Cross Attention Control
Amir Hertz
Ron Mokady
J. Tenenbaum
Kfir Aberman
Yael Pritch
Daniel Cohen-Or
DiffM
123
1,746
0
02 Aug 2022
DPM-Solver: A Fast ODE Solver for Diffusion Probabilistic Model Sampling in Around 10 Steps
Cheng Lu
Yuhao Zhou
Fan Bao
Jianfei Chen
Chongxuan Li
Jun Zhu
DiffM
134
1,394
0
02 Jun 2022
Hierarchical Text-Conditional Image Generation with CLIP Latents
Aditya A. Ramesh
Prafulla Dhariwal
Alex Nichol
Casey Chu
Mark Chen
VLM
DiffM
282
6,768
0
13 Apr 2022
Pseudo Numerical Methods for Diffusion Models on Manifolds
Luping Liu
Yi Ren
Zhijie Lin
Zhou Zhao
DiffM
80
640
0
20 Feb 2022
SegDiff: Image Segmentation with Diffusion Probabilistic Models
Tomer Amit
Tal Shaharbany
Eliya Nachmani
Lior Wolf
DiffM
55
299
0
01 Dec 2021
Swin Transformer V2: Scaling Up Capacity and Resolution
Ze Liu
Han Hu
Yutong Lin
Zhuliang Yao
Zhenda Xie
...
Yue Cao
Zheng Zhang
Li Dong
Furu Wei
B. Guo
ViT
180
1,783
0
18 Nov 2021
CLIP-Adapter: Better Vision-Language Models with Feature Adapters
Peng Gao
Shijie Geng
Renrui Zhang
Teli Ma
Rongyao Fang
Yongfeng Zhang
Hongsheng Li
Yu Qiao
VLM
CLIP
202
1,011
0
09 Oct 2021
Learning to Prompt for Vision-Language Models
Kaiyang Zhou
Jingkang Yang
Chen Change Loy
Ziwei Liu
VPVLM
CLIP
VLM
438
2,340
0
02 Sep 2021
BEiT: BERT Pre-Training of Image Transformers
Hangbo Bao
Li Dong
Songhao Piao
Furu Wei
ViT
170
2,790
0
15 Jun 2021
Vision Transformers for Dense Prediction
René Ranftl
Alexey Bochkovskiy
V. Koltun
ViT
MDE
109
1,696
0
24 Mar 2021
Multi-task Collaborative Network for Joint Referring Expression Comprehension and Segmentation
Gen Luo
Yiyi Zhou
Xiaoshuai Sun
Liujuan Cao
Chenglin Wu
Cheng Deng
Rongrong Ji
ObjD
222
288
0
19 Mar 2020
Momentum Contrast for Unsupervised Visual Representation Learning
Kaiming He
Haoqi Fan
Yuxin Wu
Saining Xie
Ross B. Girshick
SSL
111
12,007
0
13 Nov 2019
Panoptic Feature Pyramid Networks
Alexander Kirillov
Ross B. Girshick
Kaiming He
Piotr Dollár
ISeg
SSeg
94
1,280
0
08 Jan 2019
Auto-Encoding Variational Bayes
Diederik P. Kingma
Max Welling
BDL
362
16,962
0
20 Dec 2013