ResearchTrend.AI
arXiv: 2204.06125
Hierarchical Text-Conditional Image Generation with CLIP Latents

13 April 2022
Aditya A. Ramesh
Prafulla Dhariwal
Alex Nichol
Casey Chu
Mark Chen
    VLM
    DiffM

Papers citing "Hierarchical Text-Conditional Image Generation with CLIP Latents"

50 / 4,739 papers shown
ORAL: Prompting Your Large-Scale LoRAs via Conditional Recurrent Diffusion
Rana Muhammad Shahroz Khan
Dongwen Tang
Pingzhi Li
Kai Wang
Tianlong Chen
AI4CE
142
0
0
31 Mar 2025
Biologically Inspired Spiking Diffusion Model with Adaptive Lateral Selection Mechanism
Linghao Feng
Dongcheng Zhao
Sicheng Shen
Yi Zeng
67
0
0
31 Mar 2025
Pre-training with 3D Synthetic Data: Learning 3D Point Cloud Instance Segmentation from 3D Synthetic Scenes
Daichi Otsuka
Shinichi Mae
Ryosuke Yamada
Hirokatsu Kataoka
3DPC
37
0
0
31 Mar 2025
MuseFace: Text-driven Face Editing via Diffusion-based Mask Generation Approach
Xin Zhang
Siting Huang
Xiangyang Luo
Yifan Xie
Weijiang Yu
Heng Chang
Fei Ma
Fei Richard Yu
DiffM
46
0
0
31 Mar 2025
Training-Free Text-Guided Image Editing with Visual Autoregressive Model
Yufei Wang
Lanqing Guo
Z. Li
Jiaxing Huang
Pichao Wang
Bihan Wen
J. Wang
DiffM
65
1
0
31 Mar 2025
Effective Cloud Removal for Remote Sensing Images by an Improved Mean-Reverting Denoising Model with Elucidated Design Space
Yi Liu
Wengen Li
Jihong Guan
S. Kevin Zhou
Yichao Zhang
DiffM
51
1
0
31 Mar 2025
TextCrafter: Accurately Rendering Multiple Texts in Complex Visual Scenes
Nikai Du
Zhennan Chen
Z. Chen
Shan Gao
Xi Chen
Zhengkai Jiang
Jian Yang
Ying Tai
DiffM
43
0
0
30 Mar 2025
Object Isolated Attention for Consistent Story Visualization
Xiangyang Luo
Junhao Cheng
Yifan Xie
Xin Zhang
Tao Feng
Ziqiang Liu
Fei Ma
Fei Richard Yu
DiffM
47
1
0
30 Mar 2025
Evaluating Compositional Scene Understanding in Multimodal Generative Models
Shuhao Fu
Andrew Jun Lee
Anna Wang
Ida Momennejad
Trevor Bihl
Hongjing Lu
Taylor Webb
CoGe
OCL
109
1
0
29 Mar 2025
Spatial Transport Optimization by Repositioning Attention Map for Training-Free Text-to-Image Synthesis
Woojung Han
Yeonkyung Lee
Chanyoung Kim
Kwanghyun Park
Seong Jae Hwang
DiffM
62
0
0
28 Mar 2025
Semantix: An Energy Guided Sampler for Semantic Style Transfer
Huiang He
Minghui Hu
C. Zheng
Chaoyue Wang
Tat-Jen Cham
DiffM
48
0
0
28 Mar 2025
Scenario Dreamer: Vectorized Latent Diffusion for Generating Driving Simulation Environments
Luke Rowe
Roger Girgis
Anthony Gosselin
Liam Paull
C. Pal
Felix Heide
DiffM
VGen
43
1
0
28 Mar 2025
Concept-Aware LoRA for Domain-Aligned Segmentation Dataset Generation
Minho Park
S. Park
Jungsoo Lee
Hyojin Park
Kyuwoong Hwang
Fatih Porikli
Jaegul Choo
Sungha Choi
39
0
0
28 Mar 2025
Harnessing uncertainty when learning through Equilibrium Propagation in neural networks
Jonathan Peters
Philippe Talatchian
39
0
0
28 Mar 2025
AGILE: A Diffusion-Based Attention-Guided Image and Label Translation for Efficient Cross-Domain Plant Trait Identification
Earl Ranario
Lars Lundqvist
Heesup Yun
Brian N Bailey
J. M. Earles
VLM
40
0
0
27 Mar 2025
Data Poisoning in Deep Learning: A Survey
Pinlong Zhao
Weiyao Zhu
Pengfei Jiao
Di Gao
Ou Wu
AAML
39
0
0
27 Mar 2025
Can Video Diffusion Model Reconstruct 4D Geometry?
Jinjie Mai
Wenxuan Zhu
Haozhe Liu
Bing Li
Cheng Zheng
Jürgen Schmidhuber
Bernard Ghanem
VGen
MDE
74
0
0
27 Mar 2025
LOCATEdit: Graph Laplacian Optimized Cross Attention for Localized Text-Guided Image Editing
Achint Soni
Meet Soni
Sirisha Rambhatla
DiffM
63
0
0
27 Mar 2025
Harmonizing Visual Representations for Unified Multimodal Understanding and Generation
Size Wu
W. Zhang
Lumin Xu
Sheng Jin
Zhonghua Wu
Qingyi Tao
Wentao Liu
Wei Li
Chen Change Loy
VGen
153
2
0
27 Mar 2025
A Unified Image-Dense Annotation Generation Model for Underwater Scenes
Hongkai Lin
Dingkang Liang
Zhenghao Qi
X. Bai
DiffM
41
0
0
27 Mar 2025
Evaluating Text-to-Image Synthesis with a Conditional Fréchet Distance
Jaywon Koo
J. Hernandez
Moayed Haji-Ali
Ziyan Yang
Vicente Ordonez
EGVM
72
0
0
27 Mar 2025
SyncSDE: A Probabilistic Framework for Diffusion Synchronization
Hyunjun Lee
Hyunsoo Lee
Sookwan Han
DiffM
48
0
0
27 Mar 2025
EditCLIP: Representation Learning for Image Editing
Qian Wang
Aleksandar Cvejic
Abdelrahman Eldesokey
Peter Wonka
67
0
0
26 Mar 2025
MMGen: Unified Multi-modal Image Generation and Understanding in One Go
Jiepeng Wang
Zhaoqing Wang
H. Pan
Yuan Liu
Dongdong Yu
Changhu Wang
Wenping Wang
DiffM
80
0
0
26 Mar 2025
Eyes Tell the Truth: GazeVal Highlights Shortcomings of Generative AI in Medical Imaging
David Wong
Bin Wang
Gorkem Durak
Marouane Tliba
Akshay S. Chaudhari
...
Eric Hart
Drew A Torigian
J. Udupa
Elizabeth A. Krupinski
Ulas Bagci
MedIm
34
0
0
26 Mar 2025
VPO: Aligning Text-to-Video Generation Models with Prompt Optimization
Jiale Cheng
Ruiliang Lyu
Xiaotao Gu
Xiao-Chang Liu
Jiazheng Xu
...
Zhuoyi Yang
Yuxiao Dong
Jie Tang
Hairu Wang
Minlie Huang
VGen
89
0
0
26 Mar 2025
Contrastive Learning Guided Latent Diffusion Model for Image-to-Image Translation
Qi Si
Bo Wang
Zhao Zhang
73
0
0
26 Mar 2025
Forensic Self-Descriptions Are All You Need for Zero-Shot Detection, Open-Set Source Attribution, and Clustering of AI-generated Images
Tai D. Nguyen
Aref Azizpour
Matthew C. Stamm
46
1
0
26 Mar 2025
EfficientMT: Efficient Temporal Adaptation for Motion Transfer in Text-to-Video Diffusion Models
Yufei Cai
Hu Han
Yuxiang Wei
Shiguang Shan
Xilin Chen
DiffM
VGen
65
0
0
25 Mar 2025
Scaling Vision Pre-Training to 4K Resolution
Baifeng Shi
Boyi Li
Han Cai
Yaojie Lu
Sifei Liu
...
Jan Kautz
Enze Xie
Trevor Darrell
Pavlo Molchanov
Hongxu Yin
CLIP
139
0
0
25 Mar 2025
ICE: Intrinsic Concept Extraction from a Single Image via Diffusion Models
Fernando Julio Cendra
Kai Han
VLM
58
0
0
25 Mar 2025
Learning Hazing to Dehazing: Towards Realistic Haze Generation for Real-World Image Dehazing
Ruiyi Wang
Yushuo Zheng
Zicheng Zhang
Chunyi Li
Shuaicheng Liu
Guangtao Zhai
Xiaohong Liu
DiffM
49
0
0
25 Mar 2025
Quantifying the Ease of Reproducing Training Data in Unconditional Diffusion Models
Masaya Hasegawa
Koji Yasuda
39
0
0
25 Mar 2025
IPGO: Indirect Prompt Gradient Optimization for Parameter-Efficient Prompt-level Fine-Tuning on Text-to-Image Models
Jianping Ye
Michel Wedel
Kunpeng Zhang
39
0
0
25 Mar 2025
SITA: Structurally Imperceptible and Transferable Adversarial Attacks for Stylized Image Generation
Jingdan Kang
Haoxin Yang
Yan Cai
Huaidong Zhang
Xuemiao Xu
Yong Du
Shengfeng He
AAML
49
0
0
25 Mar 2025
ImageGen-CoT: Enhancing Text-to-Image In-context Learning with Chain-of-Thought Reasoning
Jiaqi Liao
Z. Yang
Linjie Li
Dianqi Li
Kevin Qinghong Lin
Yu-Xi Cheng
Lijuan Wang
MLLM
LRM
62
0
0
25 Mar 2025
Fine-Grained Erasure in Text-to-Image Diffusion-based Foundation Models
K. Thakral
Tamar Glaser
Tal Hassner
Mayank Vatsa
Richa Singh
49
2
0
25 Mar 2025
Scaling Down Text Encoders of Text-to-Image Diffusion Models
Lifu Wang
Daqing Liu
Xinchen Liu
Xiaodong He
VLM
46
0
0
25 Mar 2025
Latent Space Super-Resolution for Higher-Resolution Image Generation with Diffusion Models
Jinho Jeong
Sangmin Han
Jinwoo Kim
Seon Joo Kim
42
0
0
24 Mar 2025
Training-free Diffusion Acceleration with Bottleneck Sampling
Ye Tian
Xin Xia
Yuxi Ren
Shanchuan Lin
Xing Wang
Xuefeng Xiao
Yunhai Tong
L. Yang
Bin Cui
60
0
0
24 Mar 2025
Latent Embedding Adaptation for Human Preference Alignment in Diffusion Planners
Wen Zheng Terence Ng
Jianda Chen
Yuan Xu
Tianwei Zhang
41
0
0
24 Mar 2025
DiffusedWrinkles: A Diffusion-Based Model for Data-Driven Garment Animation
R. Vidaurre
Elena Garces
Dan Casas
DiffM
AI4CE
81
1
0
24 Mar 2025
InPO: Inversion Preference Optimization with Reparametrized DDIM for Efficient Diffusion Model Alignment
Yaojie Lu
Qichao Wang
H. Cao
Xierui Wang
Xiaoyin Xu
Min Zhang
61
0
0
24 Mar 2025
FDS: Frequency-Aware Denoising Score for Text-Guided Latent Diffusion Image Editing
Yufan Ren
Zicong Jiang
Tong Zhang
Søren Forchhammer
Sabine Süsstrunk
DiffM
61
0
0
24 Mar 2025
Resource-Efficient Motion Control for Video Generation via Dynamic Mask Guidance
Sicong Feng
Jielong Yang
Li Peng
DiffM
VGen
53
0
0
24 Mar 2025
Diffusion-4K: Ultra-High-Resolution Image Synthesis with Latent Diffusion Models
Jinjin Zhang
Qiuyu Huang
Junjie Liu
Xiefan Guo
Di Huang
59
2
0
24 Mar 2025
OmnimatteZero: Training-free Real-time Omnimatte with Pre-trained Video Diffusion Models
Dvir Samuel
Matan Levy
N. Darshan
Gal Chechik
Rami Ben-Ari
DiffM
67
0
0
23 Mar 2025
SimMotionEdit: Text-Based Human Motion Editing with Motion Similarity Prediction
Zhengyuan Li
Kai Cheng
Anindita Ghosh
Uttaran Bhattacharya
Liangyan Gui
Aniket Bera
DiffM
VGen
44
0
0
23 Mar 2025
InstructVEdit: A Holistic Approach for Instructional Video Editing
Chi Zhang
C. Feng
Feng Yan
Qiming Zhang
Mingjin Zhang
Yujie Zhong
Jing Zhang
Lin Ma
DiffM
VGen
44
0
0
22 Mar 2025
DynASyn: Multi-Subject Personalization Enabling Dynamic Action Synthesis
Yongjin Choi
Chanhun Park
Seung Jun Baek
DiffM
51
0
0
22 Mar 2025