Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2204.06125
Cited By
Hierarchical Text-Conditional Image Generation with CLIP Latents
13 April 2022
Aditya A. Ramesh
Prafulla Dhariwal
Alex Nichol
Casey Chu
Mark Chen
VLM
DiffM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Hierarchical Text-Conditional Image Generation with CLIP Latents"
50 / 4,739 papers shown
Title
DynASyn: Multi-Subject Personalization Enabling Dynamic Action Synthesis
Yongjin Choi
Chanhun Park
Seung Jun Baek
DiffM
51
0
0
22 Mar 2025
Towards Transformer-Based Aligned Generation with Self-Coherence Guidance
Shulei Wang
Wang Lin
Hai Huang
Hanting Wang
Sihang Cai
...
Tao Jin
Jingyuan Chen
Jiacheng Sun
Jieming Zhu
Zhou Zhao
DiffM
55
2
0
22 Mar 2025
ARFlow: Human Action-Reaction Flow Matching with Physical Guidance
Wentao Jiang
Jingya Wang
Haotao Lu
Kaiyang Ji
Baoxiong Jia
Siyuan Huang
Ye-ling Shi
44
0
0
21 Mar 2025
Not Only Text: Exploring Compositionality of Visual Representations in Vision-Language Models
Davide Berasi
Matteo Farina
Massimiliano Mancini
Elisa Ricci
Nicola Strisciuglio
CoGe
68
0
0
21 Mar 2025
R2LDM: An Efficient 4D Radar Super-Resolution Framework Leveraging Diffusion Model
Boyuan Zheng
Shouyi Lu
Renbo Huang
Minqing Huang
Fan Lu
Wei Tian
G. Zhuo
Lu Xiong
70
1
0
21 Mar 2025
DermDiff: Generative Diffusion Model for Mitigating Racial Biases in Dermatology Diagnosis
Nusrat Munia
Abdullah-Al-Zubaer Imran
MedIm
47
1
0
21 Mar 2025
Enabling Versatile Controls for Video Diffusion Models
Xu Zhang
Hao Zhou
Haoming Qin
Xiaobin Lu
Jiaxing Yan
Guanzhong Wang
Zeyu Chen
Yi Liu
DiffM
VGen
65
0
0
21 Mar 2025
What's Producible May Not Be Reachable: Measuring the Steerability of Generative Models
Keyon Vafa
Sarah Bentley
Jon M. Kleinberg
S. Mullainathan
38
0
0
21 Mar 2025
PRIMAL: Physically Reactive and Interactive Motor Model for Avatar Learning
Yan Zhang
Yao Feng
Alpár Cseke
Nitin Saini
Nathan Bajandas
Nicolas Heron
M. Black
DiffM
VGen
66
0
0
21 Mar 2025
AnimatePainter: A Self-Supervised Rendering Framework for Reconstructing Painting Process
J. Hu
Shuyong Gao
Qianyu Guo
Yan Wang
Qishan Wang
Yuang Feng
Wenqiang Zhang
DiffM
VGen
47
0
0
21 Mar 2025
PoseTraj: Pose-Aware Trajectory Control in Video Diffusion
Longbin Ji
Lei Zhong
Pengfei Wei
Changjian Li
DiffM
VGen
43
0
0
20 Mar 2025
Scale-wise Distillation of Diffusion Models
Nikita Starodubcev
Denis Kuznedelev
Artem Babenko
Dmitry Baranchuk
DiffM
53
0
0
20 Mar 2025
LaPIG: Cross-Modal Generation of Paired Thermal and Visible Facial Images
Leyang Wang
Joice Lin
DiffM
63
0
0
20 Mar 2025
MagicMotion: Controllable Video Generation with Dense-to-Sparse Trajectory Guidance
Quanhao Li
Zhen Xing
Rui Wang
Hui Zhang
Qi Dai
Zuxuan Wu
VGen
66
0
0
20 Mar 2025
FreeFlux: Understanding and Exploiting Layer-Specific Roles in RoPE-Based MMDiT for Versatile Image Editing
Tianyi Wei
Yifan Zhou
Dongdong Chen
Xingang Pan
77
0
0
20 Mar 2025
Bezier Distillation
Ling Feng
SK Yang
44
0
0
20 Mar 2025
Visual Persona: Foundation Model for Full-Body Human Customization
Jisu Nam
Soowon Son
Zhan Xu
Jing Shi
Difan Liu
Feng Liu
Aashish Misraa
Seungryong Kim
Yang Zhou
DiffM
51
0
0
19 Mar 2025
How to Train Your Dragon: Automatic Diffusion-Based Rigging for Characters with Diverse Topologies
Zeqi Gu
Difan Liu
Timothy Langlois
Matthew Fisher
Abe Davis
DiffM
3DH
62
0
0
19 Mar 2025
Machine Unlearning in Hyperbolic vs. Euclidean Multimodal Contrastive Learning: Adapting Alignment Calibration to MERU
Àlex Pujol Vidal
Sergio Escalera
Kamal Nasrollahi
T. Moeslund
MU
64
0
0
19 Mar 2025
Detect-and-Guide: Self-regulation of Diffusion Models for Safe Text-to-Image Generation via Guideline Token Optimization
Feifei Li
Mi Zhang
Yiming Sun
Min Yang
DiffM
56
1
0
19 Mar 2025
Efficient Personalization of Quantized Diffusion Model without Backpropagation
H. Seo
Wongi Jeong
Kyungryeol Lee
Se Young Chun
DiffM
MQ
78
0
0
19 Mar 2025
CRCE: Coreference-Retention Concept Erasure in Text-to-Image Diffusion Models
Yuyang Xue
Edward Moroshko
Feng Chen
Steven G. McDonagh
Sotirios A. Tsaftaris
Sotirios A. Tsaftaris
56
1
0
18 Mar 2025
Diffusion-based Facial Aesthetics Enhancement with 3D Structure Guidance
Lisha Li
Jingwen Hou
Weide Liu
Yuming Fang
Jiebin Yan
DiffM
56
1
0
18 Mar 2025
DPImageBench: A Unified Benchmark for Differentially Private Image Synthesis
Chen Gong
Kecen Li
Zinan Lin
Tianhao Wang
61
3
0
18 Mar 2025
The Power of Context: How Multimodality Improves Image Super-Resolution
Kangfu Mei
Hossein Talebi
Mojtaba Ardakani
Vishal M. Patel
P. Milanfar
M. Delbracio
DiffM
85
1
0
18 Mar 2025
Edit Transfer: Learning Image Editing via Vision In-Context Relations
Lan Chen
Qi Mao
Yuchao Gu
Mike Zheng Shou
56
1
0
17 Mar 2025
BlobCtrl: A Unified and Flexible Framework for Element-level Image Generation and Editing
Yaowei Li
Lingen Li
Zhaoyang Zhang
Xiaoyu Li
Guangzhi Wang
Hongxiang Li
Xiaodong Cun
Ying Shan
Yuexian Zou
DiffM
67
1
0
17 Mar 2025
TextInVision: Text and Prompt Complexity Driven Visual Text Generation Benchmark
Forouzan Fallah
Maitreya Patel
Agneet Chatterjee
Vlad I. Morariu
Chitta Baral
Yezhou Yang
CoGe
61
0
0
17 Mar 2025
DreamRenderer: Taming Multi-Instance Attribute Control in Large-Scale Text-to-Image Models
Dewei Zhou
Mingwei Li
Zongxin Yang
Yi Yang
94
0
0
17 Mar 2025
Adams Bashforth Moulton Solver for Inversion and Editing in Rectified Flow
Yongjia Ma
Donglin Di
Xuan Liu
Xiaokai Chen
Lei Fan
Wei Chen
Tonghua Su
47
0
0
17 Mar 2025
BalancedDPO: Adaptive Multi-Metric Alignment
Dipesh Tamboli
Souradip Chakraborty
Aditya Malusare
B. Banerjee
Amrit Singh Bedi
Vaneet Aggarwal
EGVM
67
0
0
16 Mar 2025
VRsketch2Gaussian: 3D VR Sketch Guided 3D Object Generation with Gaussian Splatting
Songen Gu
Haoxuan Song
Binjie Liu
Qian Yu
Sanyi Zhang
Haiyong Jiang
Jin Huang
Feng Tian
3DGS
3DV
54
0
0
16 Mar 2025
Personalize Anything for Free with Diffusion Transformer
Haoran Feng
Zehuan Huang
Lin Li
Hairong Lv
Lu Sheng
DiffM
87
1
0
16 Mar 2025
UniVG: A Generalist Diffusion Model for Unified Image Generation and Editing
Tsu-jui Fu
Yusu Qian
Chen Chen
Wenze Hu
Zhe Gan
Yuqing Yang
100
1
0
16 Mar 2025
Cross-Modal Diffusion for Biomechanical Dynamical Systems Through Local Manifold Alignment
S. Dey
Sarath Ravindran Nair
DiffM
75
0
0
15 Mar 2025
LIAM: Multimodal Transformer for Language Instructions, Images, Actions and Semantic Maps
Yihao Wang
Raphael Memmesheimer
Sven Behnke
LM&Ro
53
0
0
15 Mar 2025
CHOrD: Generation of Collision-Free, House-Scale, and Organized Digital Twins for 3D Indoor Scenes with Controllable Floor Plans and Optimal Layouts
Chong Su
Yingbin Fu
Zheyuan Hu
Jing Yang
Param Hanji
Shaojun Wang
Xuan Zhao
Cengiz Öztireli
Fangcheng Zhong
3DV
56
0
0
15 Mar 2025
Threefold model for AI Readiness: A Case Study with Finnish Healthcare SMEs
Mohammed Alnajjar
Khalid Alnajjar
Mika Hämäläinen
41
0
0
15 Mar 2025
Safe Vision-Language Models via Unsafe Weights Manipulation
Moreno DÍncà
E. Peruzzo
Xingqian Xu
Humphrey Shi
N. Sebe
Massimiliano Mancini
MU
55
0
0
14 Mar 2025
Neurons: Emulating the Human Visual Cortex Improves Fidelity and Interpretability in fMRI-to-Video Reconstruction
Haonan Wang
Qixiang Zhang
Lehan Wang
Xuanqi Huang
Xiaomeng Li
VOS
VGen
62
0
0
14 Mar 2025
Safe-VAR: Safe Visual Autoregressive Model for Text-to-Image Generative Watermarking
Ziyi Wang
Songbai Tan
Gang Xu
Xuerui Qiu
Hongbin Xu
Xin Meng
Ming Li
Fei Richard Yu
WIGM
63
0
0
14 Mar 2025
TikZero: Zero-Shot Text-Guided Graphics Program Synthesis
Jonas Belouadi
Eddy Ilg
M. Keuper
Hideki Tanaka
Masao Utiyama
Raj Dabre
Steffen Eger
Simone Paolo Ponzetto
52
0
0
14 Mar 2025
Upcycling Text-to-Image Diffusion Models for Multi-Task Capabilities
Ruchika Chavhan
Abhinav Mehrotra
Malcolm Chadwick
Alberto Gil C. P. Ramos
Luca Morreale
Mehdi Noroozi
Sourav Bhattacharya
49
0
0
14 Mar 2025
Noise Synthesis for Low-Light Image Denoising with Diffusion Models
Liying Lu
Raphaël Achddou
Sabine Süsstrunk
DiffM
57
0
0
14 Mar 2025
DriveGEN: Generalized and Robust 3D Detection in Driving via Controllable Text-to-Image Diffusion Generation
Hongbin Lin
Zilu Guo
Y. Zhang
Shuaicheng Niu
Yafeng Li
R. Zhang
Shuguang Cui
Zhen Li
DiffM
53
0
0
14 Mar 2025
Exploring Typographic Visual Prompts Injection Threats in Cross-Modality Generation Models
Hao-Ran Cheng
Erjia Xiao
Yichi Wang
Kaidi Xu
Mengshu Sun
Jindong Gu
Renjing Xu
38
0
0
14 Mar 2025
EmoDiffusion: Enhancing Emotional 3D Facial Animation with Latent Diffusion Models
Yixuan Zhang
Qing Chang
Yuxi Wang
Guang Chen
Z. Zhang
Junran Peng
47
0
0
14 Mar 2025
On the Generalization Properties of Diffusion Models
Puheng Li
Zhong Li
Huishuai Zhang
Jiang Bian
74
29
0
13 Mar 2025
FlowTok: Flowing Seamlessly Across Text and Image Tokens
Ju He
Qihang Yu
Qihao Liu
Liang-Chieh Chen
71
0
0
13 Mar 2025
GoT: Unleashing Reasoning Capability of Multimodal Large Language Model for Visual Generation and Editing
Rongyao Fang
Chengqi Duan
Kun Wang
Linjiang Huang
Hao Li
...
Xingyu Zeng
R. Zhao
Jifeng Dai
Xihui Liu
Hongsheng Li
MLLM
ReLM
LRM
112
5
0
13 Mar 2025
Previous
1
2
3
4
5
6
...
93
94
95
Next