ResearchTrend.AI
Hierarchical Text-Conditional Image Generation with CLIP Latents

13 April 2022
Aditya A. Ramesh
Prafulla Dhariwal
Alex Nichol
Casey Chu
Mark Chen
    VLM
    DiffM

Papers citing "Hierarchical Text-Conditional Image Generation with CLIP Latents"

50 / 4,750 papers shown
Enhancing MMDiT-Based Text-to-Image Models for Similar Subject Generation
Tianyi Wei
Dongdong Chen
Yifan Zhou
Xingang Pan
EGVM
90
2
0
27 Nov 2024
Diffusion Autoencoders for Few-shot Image Generation in Hyperbolic Space
Lingxiao Li
Kaixuan Fan
Boqing Gong
Xiangyu Yue
DiffM
75
0
0
27 Nov 2024
Pan-protein Design Learning Enables Task-adaptive Generalization for Low-resource Enzyme Design
Jiangbin Zheng
Ge Wang
Han Zhang
Stan Z. Li
68
0
0
26 Nov 2024
Reward Incremental Learning in Text-to-Image Generation
Maorong Wang
Jiafeng Mao
Xueting Wang
Toshihiko Yamasaki
EGVM
103
0
0
26 Nov 2024
Relations, Negations, and Numbers: Looking for Logic in Generative Text-to-Image Models
C. Conwell
Rupert Tawiah-Quashie
T. Ullman
74
2
0
26 Nov 2024
Noise Diffusion for Enhancing Semantic Faithfulness in Text-to-Image Synthesis
Boming Miao
C. Li
Xiaobei Wang
Andi Zhang
Rui Sun
Zizhe Wang
Yao Zhu
DiffM
81
0
0
25 Nov 2024
Characterized Diffusion Networks for Enhanced Autonomous Driving Trajectory Prediction
Haoming Li
79
0
0
25 Nov 2024
Privacy Protection in Personalized Diffusion Models via Targeted Cross-Attention Adversarial Attack
Xide Xu
Muhammad Atif Butt
Sandesh Kamath
Bogdan Raducanu
DiffM
AAML
86
1
0
25 Nov 2024
SMGDiff: Soccer Motion Generation using diffusion probabilistic models
Hongdi Yang
Chengyang Li
Zhenxuan Wu
Gaozheng Li
Jingya Wang
Jingyi Yu
Zhuo Su
Lan Xu
DiffM
VGen
80
1
0
25 Nov 2024
In-Context Experience Replay Facilitates Safety Red-Teaming of Text-to-Image Diffusion Models
Zhi-Yi Chin
Kuan-Chen Mu
Mario Fritz
Pin-Yu Chen
DiffM
90
0
0
25 Nov 2024
Imagine and Seek: Improving Composed Image Retrieval with an Imagined Proxy
Y. Li
Fan Ma
Yi Yang
143
2
0
24 Nov 2024
AnySynth: Harnessing the Power of Image Synthetic Data Generation for Generalized Vision-Language Tasks
Y. Li
Fan Ma
Yi Yang
DiffM
154
2
0
24 Nov 2024
Fixing the Perspective: A Critical Examination of Zero-1-to-3
Jack Yu
Xueying Jia
Charlie Sun
Prince Wang
DiffM
76
0
0
24 Nov 2024
AnyEdit: Mastering Unified High-Quality Image Editing for Any Idea
Qifan Yu
Wei Chow
Zhongqi Yue
Kaihang Pan
Yang Wu
Xiaoyang Wan
Juncheng Billy Li
Siliang Tang
Hao Zhang
Yueting Zhuang
DiffM
110
17
0
24 Nov 2024
Gaussian Scenes: Pose-Free Sparse-View Scene Reconstruction using Depth-Enhanced Diffusion Priors
Soumava Paul
Prakhar Kaushik
Alan Yuille
3DGS
DiffM
239
0
0
24 Nov 2024
PriorDiffusion: Leverage Language Prior in Diffusion Models for Monocular Depth Estimation
Ziyao Zeng
Jingcheng Ni
Daniel Wang
Patrick Rim
Younjoon Chung
Fengyu Yang
Byung-Woo Hong
A. Wong
DiffM
MDE
108
2
0
24 Nov 2024
Semantic Shield: Defending Vision-Language Models Against Backdooring and Poisoning via Fine-grained Knowledge Alignment
Alvi Md Ishmam
Christopher Thomas
AAML
124
3
0
23 Nov 2024
Automatic Evaluation for Text-to-image Generation: Task-decomposed Framework, Distilled Training, and Meta-evaluation Benchmark
Rong-Cheng Tu
Zi-Ao Ma
Tian Lan
Yuehao Zhao
Heyan Huang
Xian-Ling Mao
MLLM
VLM
EGVM
106
4
0
23 Nov 2024
Large-Scale Text-to-Image Model with Inpainting is a Zero-Shot Subject-Driven Image Generator
Chaehun Shin
Jooyoung Choi
Heeseung Kim
Sungroh Yoon
DiffM
89
8
0
23 Nov 2024
Revelio: Interpreting and leveraging semantic information in diffusion models
Dahye Kim
Xavier Thomas
Deepti Ghadiyaram
91
4
0
23 Nov 2024
Gotta Hear Them All: Sound Source Aware Vision to Audio Generation
Wei Guo
Heng Wang
Jianbo Ma
Weidong Cai
DiffM
93
3
0
23 Nov 2024
TPIE: Topology-Preserved Image Editing With Text Instructions
Nivetha Jayakumar
Srivardhan Reddy Gadila
Tonmoy Hossain
Yangfeng Ji
Miaomiao Zhang
DiffM
MedIm
95
0
0
22 Nov 2024
Zero-Shot Coreset Selection: Efficient Pruning for Unlabeled Data
Brent A. Griffin
Jacob Marks
Jason J. Corso
VLM
79
2
0
22 Nov 2024
LocRef-Diffusion:Tuning-Free Layout and Appearance-Guided Generation
Fan Deng
Yaguang Wu
Xinyang Yu
Xiangjun Huang
Jian Yang
Guangyu Yan
Qiang Xu
DiffM
94
0
0
22 Nov 2024
AnyText2: Visual Text Generation and Editing With Customizable Attributes
Yuxiang Tuo
Yifeng Geng
Liefeng Bo
VLM
93
6
0
22 Nov 2024
Reward Fine-Tuning Two-Step Diffusion Models via Learning Differentiable Latent-Space Surrogate Reward
Zhiwei Jia
Yuesong Nan
Huixi Zhao
Gengdai Liu
EGVM
91
0
0
22 Nov 2024
Text Embedding is Not All You Need: Attention Control for Text-to-Image Semantic Alignment with Text Self-Attention Maps
Jeeyung Kim
Erfan Esmaeili
Qiang Qiu
DiffM
90
1
0
21 Nov 2024
On the Fairness, Diversity and Reliability of Text-to-Image Generative Models
J. Vice
Naveed Akhtar
Richard I. Hartley
Ajmal Mian
EGVM
71
0
0
21 Nov 2024
Test-Time Adaptation of 3D Point Clouds via Denoising Diffusion Models
Hamidreza Dastmalchi
Aijun An
A. Cheraghian
Shafin Rahman
Sameera Ramasinghe
DiffM
TTA
89
1
0
21 Nov 2024
Safety Without Semantic Disruptions: Editing-free Safe Image Generation via Context-preserving Dual Latent Reconstruction
J. Vice
Naveed Akhtar
Richard I. Hartley
Ajmal Mian
DiffM
89
0
0
21 Nov 2024
Quantum-Brain: Quantum-Inspired Neural Network Approach to Vision-Brain Understanding
Hoang-Quan Nguyen
Xuan-Bac Nguyen
Hugh Churchill
Arabinda Kumar Choudhary
Pawan Sinha
S. Khan
Khoa Luu
75
1
0
20 Nov 2024
RAW-Diffusion: RGB-Guided Diffusion Models for High-Fidelity RAW Image Generation
Christoph Reinders
Radu Berdan
Beril Besbinar
Junji Otsuka
Daisuke Iso
81
2
0
20 Nov 2024
From Text to Pose to Image: Improving Diffusion Model Control and Quality
Clément Bonnet
Ariel N. Lee
Franck Wertel
Antoine Tamano
Tanguy Cizain
Pablo Ducru
DiffM
71
0
0
19 Nov 2024
CDI: Copyrighted Data Identification in Diffusion Models
Jan Dubiński
Antoni Kowalczuk
Franziska Boenisch
Adam Dziedzic
72
1
0
19 Nov 2024
Decoupling Training-Free Guided Diffusion by ADMM
Youyuan Zhang
Zehua Liu
Zenan Li
Zhaoyu Li
James J. Clark
X. Si
80
0
0
18 Nov 2024
Alien Recombination: Exploring Concept Blends Beyond Human Cognitive Availability in Visual Art
Alejandro Hernandez
Levin Brinkmann
Ignacio Serna
Nasim Rahaman
Hassan Abu Alhaija
Hiromu Yakura
Mar Canet Sola
Bernhard Schölkopf
Iyad Rahwan
82
0
0
18 Nov 2024
Teaching Video Diffusion Model with Latent Physical Phenomenon Knowledge
Qinglong Cao
Ding Wang
Xirui Li
Yuntian Chen
Chao Ma
Xiaokang Yang
DiffM
VGen
118
2
0
18 Nov 2024
LaVin-DiT: Large Vision Diffusion Transformer
Zhaoqing Wang
Xiaobo Xia
Runnan Chen
Dongdong Yu
Changhu Wang
Mingming Gong
Tongliang Liu
100
6
0
18 Nov 2024
Oscillation Inversion: Understand the structure of Large Flow Model through the Lens of Inversion Method
Yan Zheng
Zhenxiao Liang
Xiaoyan Cong
Lanqing Guo
Yuehao Wang
Peihao Wang
Zihan Wang
DiffM
35
2
0
17 Nov 2024
Constrained Diffusion with Trust Sampling
William Huang
Yifeng Jiang
Tom Van Wouwe
Chenxi Liu
40
3
0
17 Nov 2024
Test-time Conditional Text-to-Image Synthesis Using Diffusion Models
Tripti Shukla
Srikrishna Karanam
Balaji Vasan Srinivasan
DiffM
41
0
0
16 Nov 2024
ViBe: A Text-to-Video Benchmark for Evaluating Hallucination in Large Multimodal Models
Vipula Rawte
Sarthak Jain
Aarush Sinha
Garv Kaushik
Aman Bansal
...
Aishwarya N. Reganti
Vinija Jain
Aman Chadha
A. Sheth
A. Das
VLM
MLLM
52
0
0
16 Nov 2024
GSEditPro: 3D Gaussian Splatting Editing with Attention-based Progressive Localization
Yanhao Sun
RunZe Tian
Xiao Han
XinYao Liu
Yan Zhang
Kai Xu
3DGS
DiffM
55
2
0
15 Nov 2024
Boundary Attention Constrained Zero-Shot Layout-To-Image Generation
Huancheng Chen
Jingtao Li
Weiming Zhuang
H. Vikalo
Lingjuan Lyu
DiffM
38
0
0
15 Nov 2024
ColorEdit: Training-free Image-Guided Color editing with diffusion model
Xingxi Yin
Zhi Li
Jingfeng Zhang
Chenglin Li
Yin Zhang
DiffM
54
0
0
15 Nov 2024
Inconsistencies In Consistency Models: Better ODE Solving Does Not Imply Better Samples
Noël Vouitsis
Rasa Hosseinzadeh
Brendan Leigh Ross
Valentin Villecroze
S. Gorti
Jesse C. Cresswell
G. Loaiza-Ganem
DiffM
48
0
0
13 Nov 2024
Physics Informed Distillation for Diffusion Models
Joshua Tian Jin Tee
Kang Zhang
Hee Suk Yoon
Dhananjaya N. Gowda
Chanwoo Kim
Chang D. Yoo
DiffM
70
3
0
13 Nov 2024
Latent Space Disentanglement in Diffusion Transformers Enables Precise Zero-shot Semantic Editing
Zitao Shuai
Chenwei Wu
Zhengxu Tang
Bowen Song
Liyue Shen
DiffM
70
0
0
12 Nov 2024
Evaluating the Generation of Spatial Relations in Text and Image Generative Models
Shang Hong Sim
Clarence Lee
A. Tan
Cheston Tan
EGVM
41
2
0
12 Nov 2024
Add-it: Training-Free Object Insertion in Images With Pretrained Diffusion Models
Yoad Tewel
Rinon Gal
Dvir Samuel
Y. Atzmon
Lior Wolf
Gal Chechik
VLM
59
6
0
11 Nov 2024