2204.06125
Hierarchical Text-Conditional Image Generation with CLIP Latents
13 April 2022
Aditya A. Ramesh
Prafulla Dhariwal
Alex Nichol
Casey Chu
Mark Chen
VLM
DiffM
Papers citing
"Hierarchical Text-Conditional Image Generation with CLIP Latents"
50 / 4,897 papers shown
Knowledge-Aligned Counterfactual-Enhancement Diffusion Perception for Unsupervised Cross-Domain Visual Emotion Recognition
Wen Yin
Yong Wang
Guiduo Duan
Dongyang Zhang
Xin Hu
Yuan-Fang Li
Tao He
125
0
0
26 May 2025
FUDOKI: Discrete Flow-based Unified Understanding and Generation via Kinetic-Optimal Velocities
Jin Wang
Yao Lai
Aoxue Li
Shifeng Zhang
Jiacheng Sun
Ning Kang
Chengyue Wu
Zhenguo Li
Ping Luo
63
2
0
26 May 2025
Progressive Scaling Visual Object Tracking
Jack Hong
Shilin Yan
Zehao Xiao
Jiayin Cai
Xiaolong Jiang
Yao Hu
Henghui Ding
73
0
0
26 May 2025
MultLFG: Training-free Multi-LoRA composition using Frequency-domain Guidance
Aniket Roy
Maitreya Suin
Ketul Shah
Rama Chellappa
69
1
0
26 May 2025
Adaptive Diffusion Guidance via Stochastic Optimal Control
Iskander Azangulov
Peter Potaptchik
Qinyu Li
Eddie Aamari
George Deligiannidis
Judith Rousseau
7
0
0
25 May 2025
GhostPrompt: Jailbreaking Text-to-image Generative Models based on Dynamic Optimization
Zixuan Chen
Hao Lin
Ke Xu
Xinghao Jiang
Tanfeng Sun
41
0
0
25 May 2025
Concept Reachability in Diffusion Models: Beyond Dataset Constraints
Marta Aparicio Rodriguez
Xenia Miscouridou
Anastasia Borovykh
43
0
0
25 May 2025
From Generation to Detection: A Multimodal Multi-Task Dataset for Benchmarking Health Misinformation
Zhihao Zhang
Yiran Zhang
Xiyue Zhou
Liting Huang
Imran Razzak
Preslav Nakov
Usman Naseem
11
0
0
24 May 2025
OmniGenBench: A Benchmark for Omnipotent Multimodal Generation across 50+ Tasks
Jiayu Wang
Yang Jiao
Yue Yu
Tianwen Qian
Shaoxiang Chen
Jingjing Chen
Yu Jiang
MLLM
LM&MA
ELM
110
0
0
24 May 2025
So-Fake: Benchmarking and Explaining Social Media Image Forgery Detection
Zhenglin Huang
Tianxiao Li
Xiangtai Li
Haiquan Wen
Yiwei He
...
Hao Fei
Xi Yang
Xiaowei Huang
Bei Peng
Guangliang Cheng
69
0
0
24 May 2025
Diffusion Blend: Inference-Time Multi-Preference Alignment for Diffusion Models
Min Cheng
Fatemeh Doudi
D. Kalathil
Mohammad Ghavamzadeh
P. R. Kumar
57
0
0
24 May 2025
DVD-Quant: Data-free Video Diffusion Transformers Quantization
Zhiteng Li
Hanxuan Li
Junyi Wu
Kai Liu
Linghe Kong
Guihai Chen
Yulun Zhang
Xiaokang Yang
MQ
VGen
72
0
0
24 May 2025
Variational Autoencoding Discrete Diffusion with Enhanced Dimensional Correlations Modeling
Tianyu Xie
Shuchen Xue
Zijin Feng
Tianyang Hu
Jiacheng Sun
Zhenguo Li
Cheng Zhang
DiffM
779
0
0
23 May 2025
A Coreset Selection of Coreset Selection Literature: Introduction and Recent Advances
Brian B. Moser
Arundhati S. Shanbhag
Stanislav Frolov
Federico Raue
Joachim Folz
Andreas Dengel
252
0
0
23 May 2025
Diffusion Classifiers Understand Compositionality, but Conditions Apply
Yujin Jeong
Arnas Uselis
Seong Joon Oh
Anna Rohrbach
DiffM
CoGe
564
0
3
23 May 2025
CAMME: Adaptive Deepfake Image Detection with Multi-Modal Cross-Attention
Naseem Khan
Tuan Nguyen
Amine Bermak
Issa Khalil
270
0
0
23 May 2025
ComfyMind: Toward General-Purpose Generation via Tree-Based Planning and Reactive Feedback
Litao Guo
Xinli Xu
Luozhou Wang
Jiantao Lin
Jinsong Zhou
Zixin Zhang
Bolan Su
Ying-Cong Chen
LLMAG
LRM
84
1
0
23 May 2025
Slot-MLLM: Object-Centric Visual Tokenization for Multimodal LLM
Donghwan Chi
Hyomin Kim
Yoonjin Oh
Yongjin Kim
Donghoon Lee
DaeJin Jo
Jongmin Kim
Junyeob Baek
Sungjin Ahn
Sungwoong Kim
MLLM
VLM
461
0
0
23 May 2025
CONCORD: Concept-Informed Diffusion for Dataset Distillation
Jianyang Gu
Haonan Wang
Ruoxi Jia
Saeed Vahidian
Vyacheslav Kungurtsev
Wei Jiang
Yiran Chen
DiffM
DD
922
0
0
23 May 2025
Learning Shared Representations from Unpaired Data
Amitai Yacobi
Nir Ben-Ari
Ronen Talmon
Uri Shaham
SSL
80
0
0
23 May 2025
Alignment and Safety of Diffusion Models via Reinforcement Learning and Reward Modeling: A Survey
Preeti Lamba
Kiran Ravish
Ankita Kushwaha
Pawan Kumar
EGVM
MedIm
107
0
0
23 May 2025
Erased or Dormant? Rethinking Concept Erasure Through Reversibility
Ping Liu
Chi Zhang
KELM
65
0
0
22 May 2025
M2SVid: End-to-End Inpainting and Refinement for Monocular-to-Stereo Video Conversion
Nina Shvetsova
Goutam Bhat
Prune Truong
Hilde Kuehne
Federico Tombari
DiffM
VGen
MDE
95
0
0
22 May 2025
DetailMaster: Can Your Text-to-Image Model Handle Long Prompts?
Qirui Jiao
Daoyuan Chen
Yilun Huang
Xika Lin
Ying Shen
Yaliang Li
VLM
58
0
0
22 May 2025
Style Transfer with Diffusion Models for Synthetic-to-Real Domain Adaptation
Estelle Chigot
Dennis G. Wilson
Meriem Ghrib
Thomas Oberlin
DiffM
54
0
0
22 May 2025
SEED: Speaker Embedding Enhancement Diffusion Model
KiHyun Nam
Jungwoo Heo
Jee-weon Jung
Gangin Park
Chaeyoung Jung
Ha-Jin Yu
Joon Son Chung
DiffM
59
0
0
22 May 2025
DOVE: Efficient One-Step Diffusion Model for Real-World Video Super-Resolution
Zheng Chen
Zichen Zou
Kewei Zhang
Xiongfei Su
Xin Yuan
Yong Guo
Yulun Zhang
DiffM
VGen
89
0
0
22 May 2025
Bigger Isn't Always Memorizing: Early Stopping Overparameterized Diffusion Models
Alessandro Favero
Antonio Sclocchi
Matthieu Wyart
DiffM
79
0
0
22 May 2025
A collaborative constrained graph diffusion model for the generation of realistic synthetic molecules
Manuel Ruiz-Botella
Marta Sales-Pardo
Roger Guimerà
22
0
0
22 May 2025
Toward Theoretical Insights into Diffusion Trajectory Distillation via Operator Merging
Weiguo Gao
Ming Li
DiffM
50
0
0
21 May 2025
Angle Domain Guidance: Latent Diffusion Requires Rotation Rather Than Extrapolation
Cheng Jin
Zhenyu Xiao
Chutao Liu
Yuantao Gu
DiffM
19
2
0
21 May 2025
MMaDA: Multimodal Large Diffusion Language Models
Ling Yang
Ye Tian
Bowen Li
Xinchen Zhang
Ke Shen
Yunhai Tong
Mengdi Wang
VLM
LRM
136
6
0
21 May 2025
Challenges and Limitations in the Synthetic Generation of mHealth Sensor Data
Flavio Di Martino
Franca Delmastro
70
0
0
20 May 2025
AKRMap: Adaptive Kernel Regression for Trustworthy Visualization of Cross-Modal Embeddings
Yilin Ye
Junchao Huang
Xingchen Zeng
Jiazhi Xia
Wei Zeng
149
0
0
20 May 2025
CURE: Concept Unlearning via Orthogonal Representation Editing in Diffusion Models
Shristi Das Biswas
Arani Roy
Kaushik Roy
DiffM
117
0
0
19 May 2025
Restoration Score Distillation: From Corrupted Diffusion Pretraining to One-Step High-Quality Generation
Yasi Zhang
Tianyu Chen
Zhendong Wang
Ying Nian Wu
Mingyuan Zhou
Oscar Leong
DiffM
77
1
0
19 May 2025
Few-Step Diffusion via Score identity Distillation
Mingyuan Zhou
Yi Gu
Zhendong Wang
94
1
0
19 May 2025
Seeing the Unseen: How EMoE Unveils Bias in Text-to-Image Diffusion Models
Lucas Berry
Axel Brando
Wei-Di Chang
Juan Camilo Gamboa Higuera
David Meger
DiffM
58
0
0
19 May 2025
Uniformity First: Uniformity-aware Test-time Adaptation of Vision-language Models against Image Corruption
Kazuki Adachi
Shin'ya Yamaguchi
Tomoki Hamagami
VLM
60
0
0
19 May 2025
FinePhys: Fine-grained Human Action Generation by Explicitly Incorporating Physical Laws for Effective Skeletal Guidance
Dian Shao
Mingfei Shi
Shengda Xu
Haodong Chen
Yongle Huang
Binglu Wang
3DH
63
0
0
19 May 2025
ViEEG: Hierarchical Neural Coding with Cross-Modal Progressive Enhancement for EEG-Based Visual Decoding
Minxu Liu
Donghai Guan
Chuhang Zheng
Chunwei Tian
Jie Wen
Qi Zhu
75
0
0
18 May 2025
Robust Planning for Autonomous Driving via Mixed Adversarial Diffusion Predictions
Albert Zhao
Stefano Soatto
DiffM
136
0
0
18 May 2025
CompBench: Benchmarking Complex Instruction-guided Image Editing
Bohan Jia
Wenxuan Huang
Yuntian Tang
Junbo Qiao
Jincheng Liao
...
Lin Chen
Fei Zhao
Zihan Wang
Yuan Xie
Shaohui Lin
CoGe
144
1
0
18 May 2025
Video-GPT via Next Clip Diffusion
Shaobin Zhuang
Zhipeng Huang
Ying Zhang
Fangyikang Wang
Canmiao Fu
Binxin Yang
Chong Sun
Chen Li
Yali Wang
DiffM
VGen
239
0
0
18 May 2025
GenZSL: Generative Zero-Shot Learning Via Inductive Variational Autoencoder
Shiming Chen
Dingjie Fu
Salman Khan
Fahad Shahbaz Khan
VLM
124
0
0
17 May 2025
Towards Robust and Controllable Text-to-Motion via Masked Autoregressive Diffusion
Zongye Zhang
Bohan Kong
Qingjie Liu
Yanjie Wang
DiffM
98
0
0
16 May 2025
DiCo: Revitalizing ConvNets for Scalable and Efficient Diffusion Modeling
Yuang Ai
Qihang Fan
Xuefeng Hu
Zhenheng Yang
Ran He
Huaibo Huang
DiffM
88
0
0
16 May 2025
MAVOS-DD: Multilingual Audio-Video Open-Set Deepfake Detection Benchmark
Florinel-Alin Croitoru
Vlad Hondru
Marius Popescu
Radu Tudor Ionescu
Fahad Shahbaz Khan
Mubarak Shah
105
0
0
16 May 2025
NeuSEditor: From Multi-View Images to Text-Guided Neural Surface Edits
Nail Ibrahimli
Julian F. P. Kooij
Liangliang Nan
55
0
0
16 May 2025
Spatiotemporal Field Generation Based on Hybrid Mamba-Transformer with Physics-informed Fine-tuning
Peimian Du
Jiabin Liu
Xiaowei Jin
Mengwang Zuo
Hui Li
AI4CE
124
0
0
16 May 2025