ResearchTrend.AI

Hierarchical Text-Conditional Image Generation with CLIP Latents

13 April 2022
Aditya A. Ramesh
Prafulla Dhariwal
Alex Nichol
Casey Chu
Mark Chen
VLM, DiffM
arXiv: 2204.06125

Papers citing "Hierarchical Text-Conditional Image Generation with CLIP Latents"

50 / 4,897 papers shown
Neurons: Emulating the Human Visual Cortex Improves Fidelity and Interpretability in fMRI-to-Video Reconstruction
Haonan Wang
Qixiang Zhang
Lehan Wang
Xuanqi Huang
Xiaomeng Li
VOS, VGen
102
0
0
14 Mar 2025
Safe-VAR: Safe Visual Autoregressive Model for Text-to-Image Generative Watermarking
Ziyi Wang
Songbai Tan
Gang Xu
Xuerui Qiu
Hongbin Xu
Xin Meng
Ming Li
Fei Richard Yu
WIGM
126
0
0
14 Mar 2025
Noise Synthesis for Low-Light Image Denoising with Diffusion Models
Liying Lu
Raphaël Achddou
Sabine Süsstrunk
DiffM
80
0
0
14 Mar 2025
AudioX: Diffusion Transformer for Anything-to-Audio Generation
Zeyue Tian
Yizhu Jin
Zhaoyang Liu
Ruibin Yuan
Xu Tan
Qifeng Chen
Wei Xue
Yu Guo
114
6
0
13 Mar 2025
On the Generalization Properties of Diffusion Models
Puheng Li
Zhong Li
Huishuai Zhang
Jiang Bian
238
39
0
13 Mar 2025
Spatial-Temporal Graph Diffusion Policy with Kinematic Modeling for Bimanual Robotic Manipulation
Qi Lv
Hao Li
Xiang Deng
Rui Shao
Yinchuan Li
Haifeng Zhang
Longxiang Gao
Michael Yu Wang
Liqiang Nie
118
2
0
13 Mar 2025
Piece it Together: Part-Based Concepting with IP-Priors
Elad Richardson
Kfir Goldberg
Yuval Alaluf
Daniel Cohen-Or
DiffM
102
0
0
13 Mar 2025
FlowTok: Flowing Seamlessly Across Text and Image Tokens
Ju He
Qihang Yu
Qihao Liu
Liang-Chieh Chen
146
1
0
13 Mar 2025
Probability-Flow ODE in Infinite-Dimensional Function Spaces
Kunwoo Na
Junghyun Lee
Se-Young Yun
Sungbin Lim
79
0
0
13 Mar 2025
DiT-Air: Revisiting the Efficiency of Diffusion Model Architecture Design in Text to Image Generation
Chen Chen
Rui Qian
Wenze Hu
Tsu-Jui Fu
Jialing Tong
...
Lezhi Li
Bowen Zhang
Alex Schwing
Wei Liu
Yue Yang
143
0
0
13 Mar 2025
GoT: Unleashing Reasoning Capability of Multimodal Large Language Model for Visual Generation and Editing
Rongyao Fang
Chengqi Duan
Kun Wang
Linjiang Huang
Hao Li
...
Xingyu Zeng
R. Zhao
Jifeng Dai
Xihui Liu
Hongsheng Li
MLLM, ReLM, LRM
163
23
0
13 Mar 2025
MACS: Multi-source Audio-to-image Generation with Contextual Significance and Semantic Alignment
Hao Zhou
Xiaobao Guo
Yuzhe Zhu
A. Kong
DiffM
142
1
0
13 Mar 2025
UVE: Are MLLMs Unified Evaluators for AI-Generated Videos?
Yuanxin Liu
Rui Zhu
Shuhuai Ren
Jiacong Wang
Haoyuan Guo
Xu Sun
Lu Jiang
372
1
0
13 Mar 2025
Membership Inference Attacks fueled by Few-Short Learning to detect privacy leakage tackling data integrity
D. López
Nuria Rodríguez Barroso
M. V. Luzón
Francisco Herrera
110
0
0
12 Mar 2025
InteractEdit: Zero-Shot Editing of Human-Object Interactions in Images
Jiun Tian Hoe
Weipeng Hu
Wei Zhou
Chao Xie
Ziwei Wang
Chee Seng Chan
Xudong Jiang
Y. Tan
119
0
0
12 Mar 2025
RewardSDS: Aligning Score Distillation via Reward-Weighted Sampling
Itay Chachy
Guy Yariv
Sagie Benaim
459
0
0
12 Mar 2025
Training Data Provenance Verification: Did Your Model Use Synthetic Data from My Generative Model for Training?
Yuechen Xie
Jie Song
Huiqiong Wang
Mingli Song
82
0
0
12 Mar 2025
PromptMap: An Alternative Interaction Style for AI-Based Image Generation
Krzysztof Adamkiewicz
Paweł W. Woźniak
Julia Dominiak
Andrzej Romanowski
Jakob Karolus
Stanislav Frolov
133
1
0
12 Mar 2025
Long-horizon Visual Instruction Generation with Logic and Attribute Self-reflection
Yucheng Suo
Fan Ma
Kaixin Shen
Linchao Zhu
Yi Yang
VLM
86
0
0
12 Mar 2025
MGHanD: Multi-modal Guidance for authentic Hand Diffusion
Taehyeon Eum
Jieun Choi
Tae-Kyun Kim
83
1
0
11 Mar 2025
Exploring Bias in over 100 Text-to-Image Generative Models
Jordan Vice
Naveed Akhtar
Leonid Sigal
Ajmal Mian
EGVM
108
4
0
11 Mar 2025
Controlling Latent Diffusion Using Latent CLIP
Jason Becker
Chris Wendler
Peter Baylies
Robert West
Christian Wressnegger
DiffM, VLM
85
0
0
11 Mar 2025
Accelerated Distributed Optimization with Compression and Error Feedback
Yuan Gao
Anton Rodomanov
Jeremy Rack
Sebastian U. Stich
89
0
0
11 Mar 2025
DexGrasp Anything: Towards Universal Robotic Dexterous Grasping with Physics Awareness
Yiming Zhong
Qi Jiang
Jingyi Yu
Yuexin Ma
181
4
0
11 Mar 2025
Few-Shot Class-Incremental Model Attribution Using Learnable Representation From CLIP-ViT Features
Hanbyul Lee
Juneho Yi
DiffM
109
0
0
11 Mar 2025
Generalizable AI-Generated Image Detection Based on Fractal Self-Similarity in the Spectrum
Shengpeng Xiao
Yuanfang Guo
Heqi Peng
Zeming Liu
Liang Yang
Yanjie Wang
114
0
0
11 Mar 2025
PRISM: Privacy-Preserving Improved Stochastic Masking for Federated Generative Models
Kyeongkook Seo
Dong-Jun Han
Jaejun Yoo
143
1
0
11 Mar 2025
NullFace: Training-Free Localized Face Anonymization
Han-Wei Kung
Tuomas Varanka
Terence Sim
N. Sebe
DiffM, PICV
109
0
0
11 Mar 2025
"Principal Components" Enable A New Language of Images
Xin Wen
Bingchen Zhao
Ismail Elezi
Jiankang Deng
Xiaojuan Qi
114
1
0
11 Mar 2025
Pathology-Aware Adaptive Watermarking for Text-Driven Medical Image Synthesis
Chanyoung Kim
Dayun Ju
Jinyeong Kim
Woojung Han
Roberto Alcover-Couso
Seong Jae Hwang
MedIm
103
0
0
11 Mar 2025
Modular Customization of Diffusion Models via Blockwise-Parameterized Low-Rank Adaptation
Mingkang Zhu
Xi Chen
Ziyi Wang
Bei Yu
Hengshuang Zhao
Jiaya Jia
MoMe
89
0
0
11 Mar 2025
OmniMamba: Efficient and Unified Multimodal Understanding and Generation via State Space Models
Jialv Zou
Bencheng Liao
Qian Zhang
Wenyu Liu
Xinggang Wang
Mamba, MLLM
139
1
0
11 Mar 2025
Identity Preserving Latent Diffusion for Brain Aging Modeling
Gexin Huang
Zhangsihao Yang
Yalin Wang
Guido Gerig
Mengwei Ren
Xiaoxiao Li
MedIm, DiffM
150
0
0
11 Mar 2025
WISE: A World Knowledge-Informed Semantic Evaluation for Text-to-Image Generation
Yuwei Niu
Munan Ning
Mengren Zheng
Weiyang Jin
Bin Lin
...
Jiaqi Liao
Chaoran Feng
Kunpeng Ning
Bin Zhu
Li Yuan
EGVM
147
26
0
10 Mar 2025
TimeStep Master: Asymmetrical Mixture of Timestep LoRA Experts for Versatile and Efficient Diffusion Models in Vision
Shaobin Zhuang
Yiwei Guo
Yanbo Ding
Kunchang Li
Xinyuan Chen
Yaohui Wang
Fangyikang Wang
Ying Zhang
Chen Li
Yijiao Wang
84
1
0
10 Mar 2025
RayFlow: Instance-Aware Diffusion Acceleration via Adaptive Flow Trajectories
Huiyang Shao
Xin Xia
Yanting Yang
Yuxi Ren
Xing Wang
Xuefeng Xiao
93
4
0
10 Mar 2025
PersonaBooth: Personalized Text-to-Motion Generation
Boeun Kim
Hea In Jeong
JungHoon Sung
Yihua Cheng
Jeongmin Lee
...
Sang-Il Choi
Younggeun Choi
Saim Shin
Jungho Kim
Hyung Jin Chang
DiffM, VGen
139
0
0
10 Mar 2025
LBM: Latent Bridge Matching for Fast Image-to-Image Translation
Clement Chadebec
O. Tasar
Sanjeev Sreetharan
Benjamin Aubin
148
0
0
10 Mar 2025
AttenST: A Training-Free Attention-Driven Style Transfer Framework with Pre-Trained Diffusion Models
Bo Huang
Wenlun Xu
Qizhuo Han
Haodong Jing
Ying Li
DiffM
94
0
0
10 Mar 2025
Recovering Partially Corrupted Major Objects through Tri-modality Based Image Completion
Yongle Zhang
Yimin Liu
Qiang Wu
DiffM
90
0
0
10 Mar 2025
Efficient Distillation of Classifier-Free Guidance using Adapters
Cristian Perez Jensen
Seyedmorteza Sadat
96
1
0
10 Mar 2025
Post-Training Quantization for Diffusion Transformer via Hierarchical Timestep Grouping
Ning Ding
Jing Han
Yuchuan Tian
Chao Xu
Kai Han
Yehui Tang
MQ
153
0
0
10 Mar 2025
Unleashing the Potential of Large Language Models for Text-to-Image Generation through Autoregressive Representation Alignment
Xing Xie
Jiawei Liu
Ziyue Lin
Huijie Fan
Zhi Han
Yandong Tang
Liangqiong Qu
113
0
0
10 Mar 2025
Denoising Score Distillation: From Noisy Diffusion Pretraining to One-Step High-Quality Generation
Tianyu Chen
Yasi Zhang
Ziyi Wang
Ying Nian Wu
Oscar Leong
Mingyuan Zhou
DiffM
155
2
0
10 Mar 2025
LatexBlend: Scaling Multi-concept Customized Generation with Latent Textual Blending
Jian Jin
Zhenbo Yu
Yang Shen
Zhenyong Fu
Jian Yang
DiffM
110
1
0
10 Mar 2025
AnomalyPainter: Vision-Language-Diffusion Synergy for Zero-Shot Realistic and Diverse Industrial Anomaly Synthesis
Zhangyu Lai
Yilin Lu
Xinyang Li
Jianghang Lin
Yansong Qu
Liujuan Cao
Ming Li
Rongrong Ji
DiffM
463
0
0
10 Mar 2025
FaceID-6M: A Large-Scale, Open-Source FaceID Customization Dataset
Shuhe Wang
Xiaoya Li
Jiwei Li
G. Wang
Xiaofei Sun
...
Han Qiu
Mo Yu
Shengjie Shen
Tianwei Zhang
Eduard H. Hovy
VLM
124
1
0
10 Mar 2025
VidBot: Learning Generalizable 3D Actions from In-the-Wild 2D Human Videos for Zero-Shot Robotic Manipulation
Hanzhi Chen
Boyang Sun
Anran Zhang
Marc Pollefeys
Stefan Leutenegger
LM&Ro
159
0
0
10 Mar 2025
Color Alignment in Diffusion
Ka Chun Shum
Binh-Son Hua
Duc Thanh Nguyen
Sai-Kit Yeung
78
0
0
09 Mar 2025
Fine-Grained Alignment and Noise Refinement for Compositional Text-to-Image Generation
Amir Mohammad Izadi
Seyed Mohammad Hadi Hosseini
Soroush Vafaie Tabar
Ali Abdollahi
Armin Saghafian
M. Baghshah
EGVM
86
1
0
09 Mar 2025