2204.06125
Hierarchical Text-Conditional Image Generation with CLIP Latents
13 April 2022
Aditya A. Ramesh
Prafulla Dhariwal
Alex Nichol
Casey Chu
Mark Chen
VLM
DiffM
Papers citing
"Hierarchical Text-Conditional Image Generation with CLIP Latents"
50 / 4,897 papers shown
Title
MAVOS-DD: Multilingual Audio-Video Open-Set Deepfake Detection Benchmark
Florinel-Alin Croitoru
Vlad Hondru
Marius Popescu
Radu Tudor Ionescu
Fahad Shahbaz Khan
Mubarak Shah
105
0
0
16 May 2025
IMAGE-ALCHEMY: Advancing subject fidelity in personalised text-to-image generation
Amritanshu Tiwari
Cherish Puniani
Kaustubh Sharma
Ojasva Nema
DiffM
101
0
0
15 May 2025
Exploring the Deep Fusion of Large Language Models and Diffusion Transformers for Text-to-Image Synthesis
Bingda Tang
Boyang Zheng
Xichen Pan
Sayak Paul
Saining Xie
78
0
0
15 May 2025
Generative AI for Urban Planning: Synthesizing Satellite Imagery via Diffusion Models
Qingyi Wang
Yuxuan Liang
Yunhan Zheng
Kaiyuan Xu
Jinhua Zhao
Shenhao Wang
59
0
0
13 May 2025
Visually Guided Decoding: Gradient-Free Hard Prompt Inversion with Language Models
Donghoon Kim
Minji Bae
Kyuhong Shim
B. Shim
75
1
0
13 May 2025
Large Language Models for Computer-Aided Design: A Survey
Licheng Zhang
Bach Le
Naveed Akhtar
Siew-Kei Lam
Tuan Ngo
3DV
AI4CE
126
1
0
13 May 2025
IntrinsicEdit: Precise generative image manipulation in intrinsic space
Linjie Lyu
Valentin Deschaintre
Yannick Hold-Geoffroy
Jian Yang
Jae Shin Yoon
Thomas Leimkuhler
Christian Theobalt
Iliyan Georgiev
DiffM
70
0
0
13 May 2025
Unsupervised Raindrop Removal from a Single Image using Conditional Diffusion Models
Lhuqita Fazry
Valentino Vito
DiffM
65
0
0
13 May 2025
You Only Look One Step: Accelerating Backpropagation in Diffusion Sampling with Gradient Shortcuts
Hongkun Dou
Zeyu Li
Xingyu Jiang
Haoyang Li
Lijun Yang
Wen Yao
Yue Deng
DiffM
232
0
0
12 May 2025
ShotAdapter: Text-to-Multi-Shot Video Generation with Diffusion Models
Ozgur Kara
Krishna Kumar Singh
Feng Liu
Duygu Ceylan
James M. Rehg
Tobias Hinz
DiffM
VGen
82
0
0
12 May 2025
Addressing degeneracies in latent interpolation for diffusion models
Erik Landolsi
Fredrik Kahl
DiffM
112
0
0
12 May 2025
Towards SFW sampling for diffusion models via external conditioning
Camilo Carvajal Reyes
J. Fontbona
Felipe A. Tobar
DiffM
88
0
0
12 May 2025
Pixel Motion as Universal Representation for Robot Control
Kanchana Ranasinghe
Xiang Li
Cristina Mata
J. Park
Michael S. Ryoo
VGen
71
0
0
12 May 2025
TokenProber: Jailbreaking Text-to-image Models via Fine-grained Word Impact Analysis
Longtian Wang
Xiaofei Xie
Tianlin Li
Yuhan Zhi
Chao Shen
60
0
0
11 May 2025
Whitened CLIP as a Likelihood Surrogate of Images and Captions
Roy Betser
Meir Yossef Levi
Guy Gilboa
55
0
0
11 May 2025
Unsupervised Learning for Class Distribution Mismatch
Pan Du
Wangbo Zhao
Xinai Lu
Nian Liu
Zechao Li
...
Suyun Zhao
H. Chen
Cuiping Li
Kai Wang
Yang You
52
0
0
11 May 2025
Learning Graph Representation of Agent Diffusers
Youcef Djenouri
Nassim Belmecheri
Tomasz Michalak
Jan Dubiński
Ahmed Nabil Belbachir
Anis Yazidi
AI4CE
214
0
0
10 May 2025
ProFashion: Prototype-guided Fashion Video Generation with Multiple Reference Images
Xianghao Kong
Qiaosong Qi
Yuanbin Wang
Anyi Rao
Biaolong Chen
Aixi Zhang
Si Liu
Hao Jiang
DiffM
VGen
65
1
0
10 May 2025
HCMA: Hierarchical Cross-model Alignment for Grounded Text-to-Image Generation
Hang Wang
Zhi-Qi Cheng
Chenhao Lin
Chao Shen
Lei Zhang
DiffM
140
0
0
10 May 2025
MAGE: A Multi-stage Avatar Generator with Sparse Observations
Fangyu Du
Yang Yang
Xuehao Gao
Hongye Hou
VGen
54
0
0
09 May 2025
Demystifying Diffusion Policies: Action Memorization and Simple Lookup Table Alternatives
Chengyang He
Xu Liu
Gadiel Sznaier Camps
Guillaume Sartoretti
Mac Schwager
67
1
0
09 May 2025
Computationally Efficient Diffusion Models in Medical Imaging: A Comprehensive Review
Abdullah
Wei Chen
Ickjai Lee
Euijoon Ahn
MedIm
75
0
0
09 May 2025
Automated Learning of Semantic Embedding Representations for Diffusion Models
Limai Jiang
Yunpeng Cai
DiffM
56
0
0
09 May 2025
MDE-Edit: Masked Dual-Editing for Multi-Object Image Editing via Diffusion Models
Hongyang Zhu
Haipeng Liu
Bo Fu
Yang Wang
DiffM
129
0
0
08 May 2025
Epistemic Artificial Intelligence is Essential for Machine Learning Models to Truly 'Know When They Do Not Know'
Shireen Kudukkil Manchingal
Andrew Bradley
Julian F. P. Kooij
Keivan Shariatmadar
Neil Yorke-Smith
Fabio Cuzzolin
151
1
0
08 May 2025
PIDiff: Image Customization for Personalized Identities with Diffusion Models
Jinyu Gu
Haipeng Liu
M. Y. Wang
Yijiao Wang
140
0
0
08 May 2025
ItDPDM: Information-Theoretic Discrete Poisson Diffusion Model
Sagnik Bhattacharya
Abhiram Gorle
Ahmed Mohsin
Ahsan Bilal
Connor Ding
Amit Kumar Singh Yadav
DiffM
217
1
0
08 May 2025
Flow-GRPO: Training Flow Matching Models via Online RL
Jie Liu
Gongye Liu
Jiajun Liang
Yongqian Li
Jiaheng Liu
Xinyu Wang
Pengfei Wan
Di Zhang
Wanli Ouyang
AI4CE
221
5
0
08 May 2025
Denoising Diffusion Probabilistic Models for Coastal Inundation Forecasting
Kazi Ashik Islam
Zakaria Mehrab
Mahantesh Halappanavar
H. Mortveit
Sridhar Katragadda
Jon Derek Loftis
Madhav V. Marathe
DiffM
AI4CE
72
0
0
08 May 2025
FLAM: Frame-Wise Language-Audio Modeling
Yusong Wu
Christos Tsirigotis
Ke Chen
Cheng-Zhi Anna Huang
Rameswar Panda
Oriol Nieto
Prem Seetharaman
Justin Salamon
80
1
0
08 May 2025
FG-CLIP: Fine-Grained Visual and Textual Alignment
Chunyu Xie
Bin Wang
Fanjing Kong
Jincheng Li
Dawei Liang
Gengshen Zhang
Dawei Leng
Yuhui Yin
CLIP
VLM
180
1
0
08 May 2025
Prompt to Polyp: Medical Text-Conditioned Image Synthesis with Diffusion Models
Mikhail Chaichuk
Sushant Gautam
Steven A. Hicks
Elena Tutubalina
DiffM
MedIm
120
0
0
08 May 2025
ELGAR: Expressive Cello Performance Motion Generation for Audio Rendition
Zhiping Qiu
Yitong Jin
Yijiao Wang
Yi Shi
Changbo Wang
Chao Tan
Xiaobing Li
Feng Yu
Tao Yu
Qionghai Dai
66
0
0
07 May 2025
Distribution-Conditional Generation: From Class Distribution to Creative Generation
Fu Feng
Yucheng Xie
Xu Yang
Jing Wang
Xin Geng
DiffM
76
0
0
06 May 2025
PiCo: Enhancing Text-Image Alignment with Improved Noise Selection and Precise Mask Control in Diffusion Models
Chang Xie
Chenyi Zhuang
Pan Gao
VLM
70
0
0
06 May 2025
Robustness in AI-Generated Detection: Enhancing Resistance to Adversarial Attacks
Sun Haoxuan
Hong Yan
Zhan Jiahui
Chen Haoxing
Lan Jun
Zhu Huijia
Wang Weiqiang
Zhang Liqing
Zhang Jianfu
AAML
406
0
0
06 May 2025
FLUX-Text: A Simple and Advanced Diffusion Transformer Baseline for Scene Text Editing
Rui Lan
Y. Bai
Xu Duan
Mingxing Li
Lei Sun
Xiaowen Chu
DiffM
409
0
0
06 May 2025
Deepfakes on Demand: the rise of accessible non-consensual deepfake image generators
Will Hawkins
Chris Russell
Brent Mittelstadt
DiffM
412
0
0
06 May 2025
SuperEdit: Rectifying and Facilitating Supervision for Instruction-Based Image Editing
Ming Li
Xin Gu
Fan Chen
X. Xing
Longyin Wen
Chong Chen
Sijie Zhu
DiffM
259
2
0
05 May 2025
Ming-Lite-Uni: Advancements in Unified Architecture for Natural Multimodal Interaction
Biao Gong
Cheng Zou
Dandan Zheng
Hu Yu
Jingdong Chen
...
Qingpei Guo
Rui Liu
Weilong Chai
Xinyu Xiao
Ziyuan Huang
MLLM
208
3
0
05 May 2025
Towards Dataset Copyright Evasion Attack against Personalized Text-to-Image Diffusion Models
Kuofeng Gao
Yufei Zhu
Yiming Li
Jiawang Bai
Yong-Liang Yang
Zerui Li
Shu-Tao Xia
84
0
0
05 May 2025
MCCD: Multi-Agent Collaboration-based Compositional Diffusion for Complex Text-to-Image Generation
Mingcheng Li
Xiaolu Hou
Ziyang Liu
Dingkang Yang
Ziyun Qian
Jiawei Chen
Jinjie Wei
Yiheng Jiang
Qingyao Xu
Li Zhang
DiffM
488
0
0
05 May 2025
Efficient Multi Subject Visual Reconstruction from fMRI Using Aligned Representations
Christos Zangos
Danish Ebadulla
Thomas C. Sprague
Ambuj Singh
101
0
0
03 May 2025
Where's the liability in the Generative Era? Recovery-based Black-Box Detection of AI-Generated Content
Haoyue Bai
Yiyou Sun
Wei Cheng
Haifeng Chen
AAML
96
0
0
02 May 2025
Provable Efficiency of Guidance in Diffusion Models for General Data Distribution
Gen Li
Yuchen Jiao
84
2
0
02 May 2025
Any-to-Any Vision-Language Model for Multimodal X-ray Imaging and Radiological Report Generation
Daniele Molino
Francesco Di Feola
Linlin Shen
Paolo Soda
V. Guarrasi
MedIm
LM&MA
125
1
0
02 May 2025
InstructAttribute: Fine-grained Object Attributes editing with Instruction
Xingxi Yin
Jingfeng Zhang
Zhi Li
You Li
Yanzhe Zhang
Yin Zhang
DiffM
453
1
0
01 May 2025
The Dual Power of Interpretable Token Embeddings: Jailbreaking Attacks and Defenses for Diffusion Model Unlearning
Siyi Chen
Yimeng Zhang
Sijia Liu
Q. Qu
AAML
422
0
0
30 Apr 2025
Capturing Conditional Dependence via Auto-regressive Diffusion Models
Xunpeng Huang
Yujin Han
Difan Zou
Yian Ma
Tong Zhang
DiffM
104
0
0
30 Apr 2025
AGHI-QA: A Subjective-Aligned Dataset and Metric for AI-Generated Human Images
Yunhao Li
Sijing Wu
Wei Sun
Zhichao Zhang
Yucheng Zhu
Zicheng Zhang
Huiyu Duan
Xiongkuo Min
Guangtao Zhai
EGVM
138
0
0
30 Apr 2025