Hierarchical Text-Conditional Image Generation with CLIP Latents

13 April 2022
Aditya A. Ramesh
Prafulla Dhariwal
Alex Nichol
Casey Chu
Mark Chen
    VLM
    DiffM

Papers citing "Hierarchical Text-Conditional Image Generation with CLIP Latents"

50 / 4,750 papers shown
Dual-Schedule Inversion: Training- and Tuning-Free Inversion for Real Image Editing
Jiancheng Huang
Yi Huang
Jianzhuang Liu
Donghao Zhou
Yong-Jin Liu
Shifeng Chen
DiffM
109
0
0
15 Dec 2024
SHMT: Self-supervised Hierarchical Makeup Transfer via Latent Diffusion Models
Zhaoyang Sun
Shengwu Xiong
Yaxiong Chen
Fei Du
Weihua Chen
Fan Wang
Yi Rong
DiffM
74
1
0
15 Dec 2024
Diffusion Model from Scratch
Wang Zhen
Dong Yunyun
DiffM
70
0
0
14 Dec 2024
Video Diffusion Transformers are In-Context Learners
Zhengcong Fei
Di Qiu
Changqian Yu
Debang Li
Mingyuan Fan
VGen
DiffM
211
2
0
14 Dec 2024
Aspen Open Jets: Unlocking LHC Data for Foundation Models in Particle Physics
Oz Amram
Luca Anzalone
Joschka Birk
D. Faroughy
Anna Hallin
Gregor Kasieczka
Michael Krämer
Ian Pang
H. Reyes-González
David Shih
AI4CE
82
5
0
13 Dec 2024
Illusion3D: 3D Multiview Illusion with 2D Diffusion Priors
Yue Feng
Vaibhav Sanjay
Spencer Lutz
Badour Albahar
Songwei Ge
Jia-Bin Huang
77
1
0
12 Dec 2024
Video Seal: Open and Efficient Video Watermarking
Pierre Fernandez
Hady ElSahar
I. Zeki Yalniz
Alexandre Mourachko
VLM
92
5
0
12 Dec 2024
OFTSR: One-Step Flow for Image Super-Resolution with Tunable Fidelity-Realism Trade-offs
Yuanzhi Zhu
R. Wang
Shilin Lu
Junnan Li
Hanshu Yan
Peng Sun
SupR
89
3
0
12 Dec 2024
UFO: Enhancing Diffusion-Based Video Generation with a Uniform Frame Organizer
Delong Liu
Zhaohui Hou
Mingjie Zhan
Shihao Han
Zhicheng Zhao
VGen
93
0
0
12 Dec 2024
SVGFusion: Scalable Text-to-SVG Generation via Vector Space Diffusion
Ximing Xing
Juncheng Hu
Jing Zhang
Dong Xu
Qian Yu
89
1
0
11 Dec 2024
FireFlow: Fast Inversion of Rectified Flow for Image Semantic Editing
Yingying Deng
Xiangyu He
Changwang Mei
Peisong Wang
Fan Tang
86
8
0
10 Dec 2024
Buster: Implanting Semantic Backdoor into Text Encoder to Mitigate NSFW Content Generation
Xin Zhao
Xiaojun Chen
Yuexin Xuan
Zhendong Zhao
Xiaojun Jia
Xinfeng Li
Xiaofeng Wang
80
0
0
10 Dec 2024
ArtFormer: Controllable Generation of Diverse 3D Articulated Objects
Jiayi Su
Youhe Feng
Zheng Li
Jinhua Song
Yangfan He
Botao Ren
Botian Xu
AI4CE
91
2
0
10 Dec 2024
FIRE: Robust Detection of Diffusion-Generated Images via Frequency-Guided Reconstruction Error
Beilin Chu
Xuan Xu
Xin Wang
Wenjie Qu
Weike You
Linna Zhou
DiffM
100
1
0
10 Dec 2024
Sound2Vision: Generating Diverse Visuals from Audio through Cross-Modal Latent Alignment
Kim Sung-Bin
Arda Senocak
Hyunwoo Ha
Tae-Hyun Oh
DiffM
83
0
0
09 Dec 2024
Evaluating Hallucination in Text-to-Image Diffusion Models with Scene-Graph based Question-Answering Agent
Ziyuan Qin
D. Cheng
Haoyu Wang
Huahui Yi
Yuting Shao
Zhiyuan Fan
Kang Li
Qicheng Lao
EGVM
MLLM
229
0
0
07 Dec 2024
Combining Genre Classification and Harmonic-Percussive Features with Diffusion Models for Music-Video Generation
Leonardo Pina
Yongmin Li
VGen
DiffM
81
0
0
07 Dec 2024
SMIC: Semantic Multi-Item Compression based on CLIP dictionary
Tom Bachard
Thomas Maugey
81
0
0
06 Dec 2024
Coordinate In and Value Out: Training Flow Transformers in Ambient Space
Yuyang Wang
Anurag Ranjan
J. Susskind
Miguel Angel Bautista
3DPC
81
0
0
05 Dec 2024
HumanEdit: A High-Quality Human-Rewarded Dataset for Instruction-based Image Editing
Jinbin Bai
Wei Chow
L. Yang
Hefei Ling
Juncheng Billy Li
Hao Zhang
Shuicheng Yan
103
3
0
05 Dec 2024
Multi-view Image Diffusion via Coordinate Noise and Fourier Attention
Justin D. Theiss
Norman Müller
Daeil Kim
Aayush Prakash
73
0
0
04 Dec 2024
MV-Adapter: Multi-view Consistent Image Generation Made Easy
Zehuan Huang
Yu Guo
Haoran Wang
Ran Yi
Lizhuang Ma
Yan-Pei Cao
Lu Sheng
107
9
0
04 Dec 2024
Implicit Priors Editing in Stable Diffusion via Targeted Token Adjustment
Feng He
Chao Zhang
Zhixue Zhao
84
0
0
04 Dec 2024
DynamicControl: Adaptive Condition Selection for Improved Text-to-Image Generation
Q. He
Jinlong Peng
P. Xu
Boyuan Jiang
Xiaobin Hu
...
Yong-Jin Liu
Yishuo Wang
Chengjie Wang
Xuelong Li
Jingyang Zhang
DiffM
122
1
0
04 Dec 2024
ShapeWords: Guiding Text-to-Image Synthesis with 3D Shape-Aware Prompts
Dmitry Petrov
Pradyumn Goyal
Divyansh Shivashok
Yuanming Tao
Melinos Averkiou
E. Kalogerakis
66
0
0
03 Dec 2024
SyncFlow: Toward Temporally Aligned Joint Audio-Video Generation from Text
Haohe Liu
Gaël Le Lan
Xinhao Mei
Zhaoheng Ni
Anurag Kumar
Varun K. Nagaraja
Wenwu Wang
Mark D. Plumbley
Yangyang Shi
Vikas Chandra
VGen
64
1
0
03 Dec 2024
FoundHand: Large-Scale Domain-Specific Learning for Controllable Hand Image Generation
Kefan Chen
Chaerin Min
Linguang Zhang
Shreyas Hampali
Cem Keskin
Srinath Sridhar
77
0
0
03 Dec 2024
Diffusion models learn distributions generated by complex Langevin dynamics
Diaa E. Habibi
Gert Aarts
Lei Wang
K. Zhou
DiffM
93
1
0
02 Dec 2024
CTRL-D: Controllable Dynamic 3D Scene Editing with Personalized 2D Diffusion
Kai He
Chin-Hsuan Wu
Igor Gilitschenski
DiffM
3DGS
83
0
0
02 Dec 2024
Gen-SIS: Generative Self-augmentation Improves Self-supervised Learning
Varun Belagali
Srikar Yellapragada
Alexandros Graikos
S. Kapse
Zilinghan Li
Tarak Nandi
Ravi K. Madduri
Prateek Prasanna
Joel H. Saltz
Dimitris Samaras
DiffM
85
1
0
02 Dec 2024
CopyrightShield: Spatial Similarity Guided Backdoor Defense against Copyright Infringement in Diffusion Models
Zhixiang Guo
Siyuan Liang
Aishan Liu
Dacheng Tao
AAML
84
1
0
02 Dec 2024
MFTF: Mask-free Training-free Object Level Layout Control Diffusion Model
Shan Yang
DiffM
76
0
0
02 Dec 2024
MuLan: Adapting Multilingual Diffusion Models for Hundreds of Languages with Negligible Cost
Sen Xing
Muyan Zhong
Zeqiang Lai
Liangchen Li
Jun Liu
Yaohui Wang
Jifeng Dai
Wenhai Wang
83
1
0
02 Dec 2024
PainterNet: Adaptive Image Inpainting with Actual-Token Attention and Diverse Mask Control
Ruichen Wang
Junliang Zhang
Qingsong Xie
Chen Chen
H. Lu
DiffM
95
1
0
02 Dec 2024
Unleashing In-context Learning of Autoregressive Models for Few-shot Image Manipulation
Bolin Lai
F. Xu
Miao Liu
Xiaoliang Dai
Nikhil Mehta
...
Zeyi Huang
James M. Rehg
Sangmin Lee
Ning Zhang
Tong Xiao
73
2
0
02 Dec 2024
Schedule On the Fly: Diffusion Time Prediction for Faster and Better Image Generation
Zilyu Ye
Zhiyang Chen
Tiancheng Li
Zemin Huang
Weijian Luo
Guo-jun Qi
DiffM
83
5
0
02 Dec 2024
DiffPatch: Generating Customizable Adversarial Patches using Diffusion Models
Zhixiang Wang
Guangnan Ye
Xueliang Wang
Siheng Chen
Zihan Wang
Xingjun Ma
Yu-Gang Jiang
AAML
DiffM
98
0
0
02 Dec 2024
SerialGen: Personalized Image Generation by First Standardization Then Personalization
Cong Xie
Han Zou
Ruiqi Yu
Yan Zhang
Zhenpeng Zhan
74
1
0
02 Dec 2024
STEVE-Audio: Expanding the Goal Conditioning Modalities of Embodied Agents in Minecraft
Nicholas Lenzen
Amogh Raut
Andrew Melnik
VGen
74
0
0
01 Dec 2024
Advancing Myopia To Holism: Fully Contrastive Language-Image Pre-training
Haicheng Wang
Chen Ju
Weixiong Lin
Shuai Xiao
Mengting Chen
...
Mingshuai Yao
Jinsong Lan
Ying Chen
Qingwen Liu
Yanfeng Wang
VLM
CLIP
80
4
0
30 Nov 2024
Deepfake Media Generation and Detection in the Generative AI Era: A Survey and Outlook
Florinel-Alin Croitoru
Andrei Iulian Hiji
Vlad Hondru
Nicolae-Cătălin Ristea
Paul Irofti
Marius Popescu
Cristian Rusu
Radu Tudor Ionescu
Fahad Shahbaz Khan
Mubarak Shah
89
3
0
29 Nov 2024
DreamBlend: Advancing Personalized Fine-tuning of Text-to-Image Diffusion Models
Shwetha Ram
T. Neiman
Qianli Feng
Andrew Stuart
S. D. Tran
Trishul Chilimbi
77
1
0
28 Nov 2024
Sparse Attention Vectors: Generative Multimodal Model Features Are Discriminative Vision-Language Classifiers
Chancharik Mitra
Brandon Huang
Tianning Chai
Zhiqiu Lin
Assaf Arbelle
Rogerio Feris
Leonid Karlinsky
Trevor Darrell
Deva Ramanan
Roei Herzig
VLM
134
4
0
28 Nov 2024
Orthus: Autoregressive Interleaved Image-Text Generation with Modality-Specific Heads
Siqi Kou
Jiachun Jin
Chang Liu
Ye Ma
Jian Jia
Quan Chen
Peng Jiang
Zhijie Deng
DiffM
VGen
VLM
135
6
0
28 Nov 2024
Any-Resolution AI-Generated Image Detection by Spectral Learning
Dimitrios Karageorgiou
Symeon Papadopoulos
I. Kompatsiaris
Efstratios Gavves
103
0
0
28 Nov 2024
Self-Cross Diffusion Guidance for Text-to-Image Synthesis of Similar Subjects
Weimin Qiu
Jieke Wang
Meng Tang
DiffM
82
0
0
28 Nov 2024
Timestep Embedding Tells: It's Time to Cache for Video Diffusion Model
Feng Liu
Shiwei Zhang
Xiaofeng Wang
Yujie Wei
Haonan Qiu
Yuzhong Zhao
Yingya Zhang
Qixiang Ye
Fang Wan
VGen
AI4TS
99
11
0
28 Nov 2024
FaithDiff: Unleashing Diffusion Priors for Faithful Image Super-resolution
Junyang Chen
Jinshan Pan
Jiangxin Dong
78
0
0
27 Nov 2024
Steering Rectified Flow Models in the Vector Field for Controlled Image Generation
Maitreya Patel
Song Wen
Dimitris N. Metaxas
Yezhou Yang
DiffM
116
4
0
27 Nov 2024
Diffusion Self-Distillation for Zero-Shot Customized Image Generation
Shengqu Cai
Eric Ryan Chan
Yunzhi Zhang
Leonidas J. Guibas
Jiajun Wu
Gordon Wetzstein
83
8
0
27 Nov 2024