Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2204.06125
Cited By
Hierarchical Text-Conditional Image Generation with CLIP Latents
13 April 2022
Aditya A. Ramesh
Prafulla Dhariwal
Alex Nichol
Casey Chu
Mark Chen
VLM
DiffM
Re-assign community
ArXiv (abs)
PDF
HTML
Papers citing
"Hierarchical Text-Conditional Image Generation with CLIP Latents"
50 / 4,897 papers shown
Title
OFTSR: One-Step Flow for Image Super-Resolution with Tunable Fidelity-Realism Trade-offs
Yuanzhi Zhu
R. Wang
Shilin Lu
Junnan Li
Hanshu Yan
Peng Sun
SupR
184
5
0
12 Dec 2024
UFO: Enhancing Diffusion-Based Video Generation with a Uniform Frame Organizer
Delong Liu
Zhaohui Hou
Mingjie Zhan
Shihao Han
Zhicheng Zhao
Fei Su
VGen
105
0
0
12 Dec 2024
Illusion3D: 3D Multiview Illusion with 2D Diffusion Priors
Yue Feng
Vaibhav Sanjay
Spencer Lutz
Badour Albahar
Songwei Ge
Jia-Bin Huang
148
1
0
12 Dec 2024
SVGFusion: Scalable Text-to-SVG Generation via Vector Space Diffusion
Ximing Xing
Juncheng Hu
Jing Zhang
Dong Xu
Qian Yu
216
4
0
11 Dec 2024
FireFlow: Fast Inversion of Rectified Flow for Image Semantic Editing
Yingying Deng
Xiangyu He
Changwang Mei
Peisong Wang
Fan Tang
124
9
0
10 Dec 2024
Buster: Implanting Semantic Backdoor into Text Encoder to Mitigate NSFW Content Generation
Xin Zhao
Xiaojun Chen
Yuexin Xuan
Zhendong Zhao
Xiaojun Jia
Xinfeng Li
Xiaofeng Wang
118
1
0
10 Dec 2024
FIRE: Robust Detection of Diffusion-Generated Images via Frequency-Guided Reconstruction Error
Beilin Chu
Xuan Xu
Xin Wang
Yanzhe Zhang
Weike You
Linna Zhou
DiffM
161
4
0
10 Dec 2024
ArtFormer: Controllable Generation of Diverse 3D Articulated Objects
Jiayi Su
Youhe Feng
Zheng Li
Jinhua Song
Yangfan He
Botao Ren
Botian Xu
AI4CE
156
3
0
10 Dec 2024
Sound2Vision: Generating Diverse Visuals from Audio through Cross-Modal Latent Alignment
Kim Sung-Bin
Arda Senocak
Hyunwoo Ha
Tae-Hyun Oh
DiffM
219
0
0
09 Dec 2024
Nested Diffusion Models Using Hierarchical Latent Priors
Xiao Zhang
Ruoxi Jiang
Rebecca Willett
Michael Maire
BDL
DiffM
118
1
0
08 Dec 2024
Evaluating Hallucination in Text-to-Image Diffusion Models with Scene-Graph based Question-Answering Agent
Ziyuan Qin
D. Cheng
Haoyu Wang
Huahui Yi
Yuting Shao
Zhiyuan Fan
Kang Li
Qicheng Lao
EGVM
MLLM
467
0
0
07 Dec 2024
Combining Genre Classification and Harmonic-Percussive Features with Diffusion Models for Music-Video Generation
Leonardo Pina
Yongmin Li
VGen
DiffM
95
0
0
07 Dec 2024
SMIC: Semantic Multi-Item Compression based on CLIP dictionary
Tom Bachard
Thomas Maugey
116
0
0
06 Dec 2024
HumanEdit: A High-Quality Human-Rewarded Dataset for Instruction-based Image Editing
Jinbin Bai
Wei Chow
L. Yang
Hefei Ling
Juncheng Billy Li
Hao Zhang
Shuicheng Yan
187
10
0
05 Dec 2024
Multi-view Image Diffusion via Coordinate Noise and Fourier Attention
Justin D. Theiss
Norman Müller
Daeil Kim
Aayush Prakash
102
0
0
04 Dec 2024
MV-Adapter: Multi-view Consistent Image Generation Made Easy
Zehuan Huang
Yu Guo
Haoran Wang
Ran Yi
Lizhuang Ma
Yan-Pei Cao
Lu Sheng
169
18
0
04 Dec 2024
Implicit Priors Editing in Stable Diffusion via Targeted Token Adjustment
Feng He
Chao Zhang
Zhixue Zhao
181
0
0
04 Dec 2024
DynamicControl: Adaptive Condition Selection for Improved Text-to-Image Generation
Qu He
Jinlong Peng
P. Xu
Boyuan Jiang
Xiaobin Hu
...
Yang Liu
Yun Wang
Chengjie Wang
Xuelong Li
Jing Zhang
DiffM
212
1
0
04 Dec 2024
ShapeWords: Guiding Text-to-Image Synthesis with 3D Shape-Aware Prompts
Dmitry Petrov
Pradyumn Goyal
Divyansh Shivashok
Yuanming Tao
Melinos Averkiou
E. Kalogerakis
127
0
0
03 Dec 2024
SyncFlow: Toward Temporally Aligned Joint Audio-Video Generation from Text
Haohe Liu
Gaël Le Lan
Xinhao Mei
Zhaoheng Ni
Anurag Kumar
Varun K. Nagaraja
Wenwu Wang
Mark D. Plumbley
Yangyang Shi
Vikas Chandra
VGen
157
1
0
03 Dec 2024
FoundHand: Large-Scale Domain-Specific Learning for Controllable Hand Image Generation
Kefan Chen
Chaerin Min
Linguang Zhang
Shreyas Hampali
Cem Keskin
Srinath Sridhar
140
0
0
03 Dec 2024
Diffusion models learn distributions generated by complex Langevin dynamics
Diaa E. Habibi
Gert Aarts
Lei Wang
K. Zhou
DiffM
132
2
0
02 Dec 2024
CTRL-D: Controllable Dynamic 3D Scene Editing with Personalized 2D Diffusion
Kai He
Chin-Hsuan Wu
Igor Gilitschenski
DiffM
3DGS
127
0
0
02 Dec 2024
Gen-SIS: Generative Self-augmentation Improves Self-supervised Learning
Varun Belagali
Srikar Yellapragada
Alexandros Graikos
S. Kapse
Zilinghan Li
Tarak Nandi
Ravi K. Madduri
Prateek Prasanna
Joel H. Saltz
Dimitris Samaras
DiffM
144
2
0
02 Dec 2024
CopyrightShield: Spatial Similarity Guided Backdoor Defense against Copyright Infringement in Diffusion Models
Zhixiang Guo
Siyuan Liang
Aishan Liu
Dacheng Tao
AAML
135
3
0
02 Dec 2024
MFTF: Mask-free Training-free Object Level Layout Control Diffusion Model
Shan Yang
DiffM
85
0
0
02 Dec 2024
PainterNet: Adaptive Image Inpainting with Actual-Token Attention and Diverse Mask Control
Ruichen Wang
Junliang Zhang
Qingsong Xie
Chen Chen
H. Lu
DiffM
127
1
0
02 Dec 2024
Unleashing In-context Learning of Autoregressive Models for Few-shot Image Manipulation
Bolin Lai
F. Xu
Miao Liu
Xiaoliang Dai
Nikhil Mehta
...
Zeyi Huang
James M. Rehg
Sangmin Lee
Ning Zhang
Tong Xiao
136
3
0
02 Dec 2024
Schedule On the Fly: Diffusion Time Prediction for Faster and Better Image Generation
Zilyu Ye
Zhiyang Chen
Tiancheng Li
Zemin Huang
Weijian Luo
Guo-Jun Qi
DiffM
132
6
0
02 Dec 2024
SerialGen: Personalized Image Generation by First Standardization Then Personalization
Cong Xie
Han Zou
Ruiqi Yu
Yan Zhang
Zhenpeng Zhan
146
1
0
02 Dec 2024
DiffPatch: Generating Customizable Adversarial Patches using Diffusion Models
Zhixiang Wang
Guangnan Ye
Xinyu Wang
Siheng Chen
Ziyi Wang
Xingjun Ma
Yu-Gang Jiang
AAML
DiffM
195
0
0
02 Dec 2024
MuLan: Adapting Multilingual Diffusion Models for Hundreds of Languages with Negligible Cost
Sen Xing
Muyan Zhong
Zeqiang Lai
Liangchen Li
Jing Liu
Yaohui Wang
Jifeng Dai
Wenhai Wang
205
2
0
02 Dec 2024
STEVE-Audio: Expanding the Goal Conditioning Modalities of Embodied Agents in Minecraft
Nicholas Lenzen
Amogh Raut
Andrew Melnik
VGen
113
0
0
01 Dec 2024
Advancing Myopia To Holism: Fully Contrastive Language-Image Pre-training
Haicheng Wang
Chen Ju
Weixiong Lin
Shuai Xiao
Mengting Chen
...
Mingshuai Yao
Jinsong Lan
Ying Chen
Qingwen Liu
Yanfeng Wang
VLM
CLIP
121
4
0
30 Nov 2024
Deepfake Media Generation and Detection in the Generative AI Era: A Survey and Outlook
Florinel-Alin Croitoru
Andrei Iulian Hiji
Vlad Hondru
Nicolae-Cătălin Ristea
Paul Irofti
Marius Popescu
Cristian Rusu
Radu Tudor Ionescu
Fahad Shahbaz Khan
Mubarak Shah
135
5
0
29 Nov 2024
DreamBlend: Advancing Personalized Fine-tuning of Text-to-Image Diffusion Models
Shwetha Ram
T. Neiman
Qianli Feng
Andrew Stuart
S. D. Tran
Trishul Chilimbi
128
2
0
28 Nov 2024
Timestep Embedding Tells: It's Time to Cache for Video Diffusion Model
Feng Liu
Shiwei Zhang
Xiaofeng Wang
Yujie Wei
Haonan Qiu
Yuzhong Zhao
Yingya Zhang
Qixiang Ye
Fang Wan
VGen
AI4TS
216
30
0
28 Nov 2024
Enhancing Few-Shot Vision-Language Classification with Large Multimodal Model Features
Chancharik Mitra
Brandon Huang
Tianning Chai
Zhiqiu Lin
Assaf Arbelle
Rogerio Feris
Leonid Karlinsky
Trevor Darrell
Deva Ramanan
Roei Herzig
VLM
391
4
0
28 Nov 2024
Any-Resolution AI-Generated Image Detection by Spectral Learning
Dimitrios Karageorgiou
Symeon Papadopoulos
I. Kompatsiaris
Efstratios Gavves
176
1
0
28 Nov 2024
Orthus: Autoregressive Interleaved Image-Text Generation with Modality-Specific Heads
Siqi Kou
Jiachun Jin
Chang Liu
Ye Ma
Jian Jia
Quan Chen
Peng Jiang
Zhijie Deng
Zhijie Deng
DiffM
VGen
VLM
243
12
0
28 Nov 2024
Self-Cross Diffusion Guidance for Text-to-Image Synthesis of Similar Subjects
Weimin Qiu
Jieke Wang
Meng Tang
DiffM
185
1
0
28 Nov 2024
FaithDiff: Unleashing Diffusion Priors for Faithful Image Super-resolution
Junyang Chen
Jinshan Pan
Jiangxin Dong
111
2
0
27 Nov 2024
Steering Rectified Flow Models in the Vector Field for Controlled Image Generation
Maitreya Patel
Song Wen
Dimitris N. Metaxas
Yezhou Yang
DiffM
196
6
0
27 Nov 2024
Diffusion Self-Distillation for Zero-Shot Customized Image Generation
Shengqu Cai
Eric Ryan Chan
Yunzhi Zhang
Leonidas Guibas
Jiajun Wu
Gordon Wetzstein
132
13
0
27 Nov 2024
Enhancing MMDiT-Based Text-to-Image Models for Similar Subject Generation
Tianyi Wei
Dongdong Chen
Yifan Zhou
Xingang Pan
EGVM
137
3
0
27 Nov 2024
Diffusion Autoencoders for Few-shot Image Generation in Hyperbolic Space
Lingxiao Li
Kaixuan Fan
Boqing Gong
Xiangyu Yue
DiffM
124
0
0
27 Nov 2024
Pan-protein Design Learning Enables Task-adaptive Generalization for Low-resource Enzyme Design
Jiangbin Zheng
Ge Wang
Han Zhang
Stan Z. Li
117
0
0
26 Nov 2024
Reward Incremental Learning in Text-to-Image Generation
Maorong Wang
Jiafeng Mao
Xueting Wang
Toshihiko Yamasaki
EGVM
127
0
0
26 Nov 2024
LiteVAR: Compressing Visual Autoregressive Modelling with Efficient Attention and Quantization
Rui Xie
Tianchen Zhao
Zhihang Yuan
Rui Wan
Wenxi Gao
Zhenhua Zhu
Xuefei Ning
Yu Wang
VGen
MQ
92
4
0
26 Nov 2024
Relations, Negations, and Numbers: Looking for Logic in Generative Text-to-Image Models
C. Conwell
Rupert Tawiah-Quashie
T. Ullman
123
3
0
26 Nov 2024
Previous
1
2
3
...
13
14
15
...
96
97
98
Next