ResearchTrend.AI
Hierarchical Text-Conditional Image Generation with CLIP Latents


13 April 2022
Aditya A. Ramesh
Prafulla Dhariwal
Alex Nichol
Casey Chu
Mark Chen
    VLM, DiffM
arXiv: 2204.06125

Papers citing "Hierarchical Text-Conditional Image Generation with CLIP Latents"

50 / 4,897 papers shown
Title
OFTSR: One-Step Flow for Image Super-Resolution with Tunable Fidelity-Realism Trade-offs
Yuanzhi Zhu
R. Wang
Shilin Lu
Junnan Li
Hanshu Yan
Peng Sun
SupR
184
5
0
12 Dec 2024
UFO: Enhancing Diffusion-Based Video Generation with a Uniform Frame Organizer
Delong Liu
Zhaohui Hou
Mingjie Zhan
Shihao Han
Zhicheng Zhao
Fei Su
VGen
105
0
0
12 Dec 2024
Illusion3D: 3D Multiview Illusion with 2D Diffusion Priors
Yue Feng
Vaibhav Sanjay
Spencer Lutz
Badour Albahar
Songwei Ge
Jia-Bin Huang
148
1
0
12 Dec 2024
SVGFusion: Scalable Text-to-SVG Generation via Vector Space Diffusion
Ximing Xing
Juncheng Hu
Jing Zhang
Dong Xu
Qian Yu
216
4
0
11 Dec 2024
FireFlow: Fast Inversion of Rectified Flow for Image Semantic Editing
Yingying Deng
Xiangyu He
Changwang Mei
Peisong Wang
Fan Tang
124
9
0
10 Dec 2024
Buster: Implanting Semantic Backdoor into Text Encoder to Mitigate NSFW Content Generation
Xin Zhao
Xiaojun Chen
Yuexin Xuan
Zhendong Zhao
Xiaojun Jia
Xinfeng Li
Xiaofeng Wang
118
1
0
10 Dec 2024
FIRE: Robust Detection of Diffusion-Generated Images via Frequency-Guided Reconstruction Error
Beilin Chu
Xuan Xu
Xin Wang
Yanzhe Zhang
Weike You
Linna Zhou
DiffM
161
4
0
10 Dec 2024
ArtFormer: Controllable Generation of Diverse 3D Articulated Objects
Jiayi Su
Youhe Feng
Zheng Li
Jinhua Song
Yangfan He
Botao Ren
Botian Xu
AI4CE
156
3
0
10 Dec 2024
Sound2Vision: Generating Diverse Visuals from Audio through Cross-Modal Latent Alignment
Kim Sung-Bin
Arda Senocak
Hyunwoo Ha
Tae-Hyun Oh
DiffM
219
0
0
09 Dec 2024
Nested Diffusion Models Using Hierarchical Latent Priors
Xiao Zhang
Ruoxi Jiang
Rebecca Willett
Michael Maire
BDL, DiffM
118
1
0
08 Dec 2024
Evaluating Hallucination in Text-to-Image Diffusion Models with Scene-Graph based Question-Answering Agent
Ziyuan Qin
D. Cheng
Haoyu Wang
Huahui Yi
Yuting Shao
Zhiyuan Fan
Kang Li
Qicheng Lao
EGVM, MLLM
467
0
0
07 Dec 2024
Combining Genre Classification and Harmonic-Percussive Features with Diffusion Models for Music-Video Generation
Leonardo Pina
Yongmin Li
VGen, DiffM
95
0
0
07 Dec 2024
SMIC: Semantic Multi-Item Compression based on CLIP dictionary
Tom Bachard
Thomas Maugey
116
0
0
06 Dec 2024
HumanEdit: A High-Quality Human-Rewarded Dataset for Instruction-based Image Editing
Jinbin Bai
Wei Chow
L. Yang
Hefei Ling
Juncheng Billy Li
Hao Zhang
Shuicheng Yan
187
10
0
05 Dec 2024
Multi-view Image Diffusion via Coordinate Noise and Fourier Attention
Justin D. Theiss
Norman Müller
Daeil Kim
Aayush Prakash
102
0
0
04 Dec 2024
MV-Adapter: Multi-view Consistent Image Generation Made Easy
Zehuan Huang
Yu Guo
Haoran Wang
Ran Yi
Lizhuang Ma
Yan-Pei Cao
Lu Sheng
169
18
0
04 Dec 2024
Implicit Priors Editing in Stable Diffusion via Targeted Token Adjustment
Feng He
Chao Zhang
Zhixue Zhao
181
0
0
04 Dec 2024
DynamicControl: Adaptive Condition Selection for Improved Text-to-Image Generation
Qu He
Jinlong Peng
P. Xu
Boyuan Jiang
Xiaobin Hu
...
Yang Liu
Yun Wang
Chengjie Wang
Xuelong Li
Jing Zhang
DiffM
212
1
0
04 Dec 2024
ShapeWords: Guiding Text-to-Image Synthesis with 3D Shape-Aware Prompts
Dmitry Petrov
Pradyumn Goyal
Divyansh Shivashok
Yuanming Tao
Melinos Averkiou
E. Kalogerakis
127
0
0
03 Dec 2024
SyncFlow: Toward Temporally Aligned Joint Audio-Video Generation from Text
Haohe Liu
Gaël Le Lan
Xinhao Mei
Zhaoheng Ni
Anurag Kumar
Varun K. Nagaraja
Wenwu Wang
Mark D. Plumbley
Yangyang Shi
Vikas Chandra
VGen
157
1
0
03 Dec 2024
FoundHand: Large-Scale Domain-Specific Learning for Controllable Hand Image Generation
Kefan Chen
Chaerin Min
Linguang Zhang
Shreyas Hampali
Cem Keskin
Srinath Sridhar
140
0
0
03 Dec 2024
Diffusion models learn distributions generated by complex Langevin dynamics
Diaa E. Habibi
Gert Aarts
Lei Wang
K. Zhou
DiffM
132
2
0
02 Dec 2024
CTRL-D: Controllable Dynamic 3D Scene Editing with Personalized 2D Diffusion
Kai He
Chin-Hsuan Wu
Igor Gilitschenski
DiffM, 3DGS
127
0
0
02 Dec 2024
Gen-SIS: Generative Self-augmentation Improves Self-supervised Learning
Varun Belagali
Srikar Yellapragada
Alexandros Graikos
S. Kapse
Zilinghan Li
Tarak Nandi
Ravi K. Madduri
Prateek Prasanna
Joel H. Saltz
Dimitris Samaras
DiffM
144
2
0
02 Dec 2024
CopyrightShield: Spatial Similarity Guided Backdoor Defense against Copyright Infringement in Diffusion Models
Zhixiang Guo
Siyuan Liang
Aishan Liu
Dacheng Tao
AAML
135
3
0
02 Dec 2024
MFTF: Mask-free Training-free Object Level Layout Control Diffusion Model
Shan Yang
DiffM
85
0
0
02 Dec 2024
PainterNet: Adaptive Image Inpainting with Actual-Token Attention and Diverse Mask Control
Ruichen Wang
Junliang Zhang
Qingsong Xie
Chen Chen
H. Lu
DiffM
127
1
0
02 Dec 2024
Unleashing In-context Learning of Autoregressive Models for Few-shot Image Manipulation
Bolin Lai
F. Xu
Miao Liu
Xiaoliang Dai
Nikhil Mehta
...
Zeyi Huang
James M. Rehg
Sangmin Lee
Ning Zhang
Tong Xiao
136
3
0
02 Dec 2024
Schedule On the Fly: Diffusion Time Prediction for Faster and Better Image Generation
Zilyu Ye
Zhiyang Chen
Tiancheng Li
Zemin Huang
Weijian Luo
Guo-Jun Qi
DiffM
132
6
0
02 Dec 2024
SerialGen: Personalized Image Generation by First Standardization Then Personalization
Cong Xie
Han Zou
Ruiqi Yu
Yan Zhang
Zhenpeng Zhan
146
1
0
02 Dec 2024
DiffPatch: Generating Customizable Adversarial Patches using Diffusion Models
Zhixiang Wang
Guangnan Ye
Xinyu Wang
Siheng Chen
Ziyi Wang
Xingjun Ma
Yu-Gang Jiang
AAML, DiffM
195
0
0
02 Dec 2024
MuLan: Adapting Multilingual Diffusion Models for Hundreds of Languages with Negligible Cost
Sen Xing
Muyan Zhong
Zeqiang Lai
Liangchen Li
Jing Liu
Yaohui Wang
Jifeng Dai
Wenhai Wang
205
2
0
02 Dec 2024
STEVE-Audio: Expanding the Goal Conditioning Modalities of Embodied Agents in Minecraft
Nicholas Lenzen
Amogh Raut
Andrew Melnik
VGen
113
0
0
01 Dec 2024
Advancing Myopia To Holism: Fully Contrastive Language-Image Pre-training
Haicheng Wang
Chen Ju
Weixiong Lin
Shuai Xiao
Mengting Chen
...
Mingshuai Yao
Jinsong Lan
Ying Chen
Qingwen Liu
Yanfeng Wang
VLM, CLIP
121
4
0
30 Nov 2024
Deepfake Media Generation and Detection in the Generative AI Era: A Survey and Outlook
Florinel-Alin Croitoru
Andrei Iulian Hiji
Vlad Hondru
Nicolae-Cătălin Ristea
Paul Irofti
Marius Popescu
Cristian Rusu
Radu Tudor Ionescu
Fahad Shahbaz Khan
Mubarak Shah
135
5
0
29 Nov 2024
DreamBlend: Advancing Personalized Fine-tuning of Text-to-Image Diffusion Models
Shwetha Ram
T. Neiman
Qianli Feng
Andrew Stuart
S. D. Tran
Trishul Chilimbi
128
2
0
28 Nov 2024
Timestep Embedding Tells: It's Time to Cache for Video Diffusion Model
Feng Liu
Shiwei Zhang
Xiaofeng Wang
Yujie Wei
Haonan Qiu
Yuzhong Zhao
Yingya Zhang
Qixiang Ye
Fang Wan
VGen, AI4TS
216
30
0
28 Nov 2024
Enhancing Few-Shot Vision-Language Classification with Large Multimodal Model Features
Chancharik Mitra
Brandon Huang
Tianning Chai
Zhiqiu Lin
Assaf Arbelle
Rogerio Feris
Leonid Karlinsky
Trevor Darrell
Deva Ramanan
Roei Herzig
VLM
391
4
0
28 Nov 2024
Any-Resolution AI-Generated Image Detection by Spectral Learning
Dimitrios Karageorgiou
Symeon Papadopoulos
I. Kompatsiaris
Efstratios Gavves
176
1
0
28 Nov 2024
Orthus: Autoregressive Interleaved Image-Text Generation with Modality-Specific Heads
Siqi Kou
Jiachun Jin
Chang Liu
Ye Ma
Jian Jia
Quan Chen
Peng Jiang
Zhijie Deng
DiffM, VGen, VLM
243
12
0
28 Nov 2024
Self-Cross Diffusion Guidance for Text-to-Image Synthesis of Similar Subjects
Weimin Qiu
Jieke Wang
Meng Tang
DiffM
185
1
0
28 Nov 2024
FaithDiff: Unleashing Diffusion Priors for Faithful Image Super-resolution
Junyang Chen
Jinshan Pan
Jiangxin Dong
111
2
0
27 Nov 2024
Steering Rectified Flow Models in the Vector Field for Controlled Image Generation
Maitreya Patel
Song Wen
Dimitris N. Metaxas
Yezhou Yang
DiffM
196
6
0
27 Nov 2024
Diffusion Self-Distillation for Zero-Shot Customized Image Generation
Shengqu Cai
Eric Ryan Chan
Yunzhi Zhang
Leonidas Guibas
Jiajun Wu
Gordon Wetzstein
132
13
0
27 Nov 2024
Enhancing MMDiT-Based Text-to-Image Models for Similar Subject Generation
Tianyi Wei
Dongdong Chen
Yifan Zhou
Xingang Pan
EGVM
137
3
0
27 Nov 2024
Diffusion Autoencoders for Few-shot Image Generation in Hyperbolic Space
Lingxiao Li
Kaixuan Fan
Boqing Gong
Xiangyu Yue
DiffM
124
0
0
27 Nov 2024
Pan-protein Design Learning Enables Task-adaptive Generalization for Low-resource Enzyme Design
Jiangbin Zheng
Ge Wang
Han Zhang
Stan Z. Li
117
0
0
26 Nov 2024
Reward Incremental Learning in Text-to-Image Generation
Maorong Wang
Jiafeng Mao
Xueting Wang
Toshihiko Yamasaki
EGVM
127
0
0
26 Nov 2024
LiteVAR: Compressing Visual Autoregressive Modelling with Efficient Attention and Quantization
Rui Xie
Tianchen Zhao
Zhihang Yuan
Rui Wan
Wenxi Gao
Zhenhua Zhu
Xuefei Ning
Yu Wang
VGen, MQ
92
4
0
26 Nov 2024
Relations, Negations, and Numbers: Looking for Logic in Generative Text-to-Image Models
C. Conwell
Rupert Tawiah-Quashie
T. Ullman
123
3
0
26 Nov 2024