Hierarchical Text-Conditional Image Generation with CLIP Latents

13 April 2022
Aditya A. Ramesh
Prafulla Dhariwal
Alex Nichol
Casey Chu
Mark Chen
VLM, DiffM
arXiv: 2204.06125

Papers citing "Hierarchical Text-Conditional Image Generation with CLIP Latents"

50 / 4,897 papers shown
AGHI-QA: A Subjective-Aligned Dataset and Metric for AI-Generated Human Images
Yunhao Li
Sijing Wu
Wei Sun
Zhichao Zhang
Yucheng Zhu
Zicheng Zhang
Huiyu Duan
Xiongkuo Min
Guangtao Zhai
EGVM
138
0
0
30 Apr 2025
Partitioned Memory Storage Inspired Few-Shot Class-Incremental learning
Renye Zhang
Yimin Yin
Jinghua Zhang
CLL
94
0
0
29 Apr 2025
Efficient Listener: Dyadic Facial Motion Synthesis via Action Diffusion
Zehua Wang
Alexandre Bruckert
P. Le Callet
Guangtao Zhai
VGen
56
0
0
29 Apr 2025
Evaluating Generative Models for Tabular Data: Novel Metrics and Benchmarking
Dayananda Herurkar
Ahmad Ali
Andreas Dengel
70
0
0
29 Apr 2025
EarthMapper: Visual Autoregressive Models for Controllable Bidirectional Satellite-Map Translation
Zhe Dong
Yuzhe Sun
Tianzhu Liu
Wangmeng Zuo
Yanfeng Gu
90
0
0
28 Apr 2025
SynergyAmodal: Deocclude Anything with Text Control
Xinyang Li
Chengjie Yi
Jiawei Lai
Mingbao Lin
Yansong Qu
Shengchuan Zhang
Liujuan Cao
DiffM
135
0
0
28 Apr 2025
Open-set Anomaly Segmentation in Complex Scenarios
Song Xia
Yi Yu
Henghui Ding
Wenhan Yang
Shixuan Liu
Alex C. Kot
Xudong Jiang
DiffM
85
0
0
28 Apr 2025
Masked Language Prompting for Generative Data Augmentation in Few-shot Fashion Style Recognition
Yuki Hirakawa
Ryotaro Shimizu
102
0
0
28 Apr 2025
CapsFake: A Multimodal Capsule Network for Detecting Instruction-Guided Deepfakes
Tuan Nguyen
Naseem Khan
Issa Khalil
AAML
165
0
0
27 Apr 2025
REED-VAE: RE-Encode Decode Training for Iterative Image Editing with Diffusion Models
Gal Almog
Ariel Shamir
Ohad Fried
DiffM
75
0
0
26 Apr 2025
Optimizing Multi-Round Enhanced Training in Diffusion Models for Improved Preference Understanding
Kun Li
Jiadong Wang
Yangfan He
Xinyuan Song
Ruoyu Wang
...
Keqin Li
Sida Li
Miao Zhang
Tianyu Shi
Xueqian Wang
102
0
0
25 Apr 2025
Seeing Soundscapes: Audio-Visual Generation and Separation from Soundscapes Using Audio-Visual Separator
Minjae Kang
Martim Brandão
90
0
0
25 Apr 2025
Token-Shuffle: Towards High-Resolution Image Generation with Autoregressive Models
Xu Ma
Peize Sun
Haoyu Ma
Hao Tang
Chih-Yao Ma
...
Matt Feiszli
Peizhao Zhang
Peter Vajda
Sam S. Tsai
Y. Fu
177
4
0
24 Apr 2025
Text-to-Image Alignment in Denoising-Based Models through Step Selection
P. Grimal
Hervé Le Borgne
Olivier Ferret
DiffM, EGVM
90
0
0
24 Apr 2025
DreamO: A Unified Framework for Image Customization
Chong Mou
Yanze Wu
Wenxu Wu
Zinan Guo
Pengze Zhang
...
Shaojin Wu
Songtao Zhao
Jian Zhang
Qian He
Xinglong Wu
174
3
0
23 Apr 2025
ePBR: Extended PBR Materials in Image Synthesis
Yu Guo
Zhiqiang Lao
Xiyun Song
Yubin Zhou
Zongfang Lin
Heather Yu
78
0
0
23 Apr 2025
FreeGraftor: Training-Free Cross-Image Feature Grafting for Subject-Driven Text-to-Image Generation
Zebin Yao
Lujie Niu
Huixing Jiang
Chen Wei
Fangkun Zhao
Ruifan Li
Fangxiang Feng
DiffM
178
0
0
22 Apr 2025
Twin Co-Adaptive Dialogue for Progressive Image Generation
Jun Wang
Yangfan He
Yan Zhong
Xinyuan Song
Jiayi Su
...
Miao Zhang
Keqin Li
Jiaqi Chen
Tianyu Shi
Xueqian Wang
50
0
0
21 Apr 2025
Solving New Tasks by Adapting Internet Video Knowledge
Calvin Luo
Zilai Zeng
Yilun Du
Chen Sun
111
6
0
21 Apr 2025
Acquire and then Adapt: Squeezing out Text-to-Image Model for Image Restoration
Junyuan Deng
Xinyi Wu
Yongxing Yang
Congchao Zhu
Song Wang
Zhenyao Wu
86
0
0
21 Apr 2025
"I Know It When I See It": Mood Spaces for Connecting and Expressing Visual Concepts
Huzheng Yang
Katherine Xu
Michael D. Grossberg
Yutong Bai
Jianbo Shi
78
0
0
21 Apr 2025
MirrorVerse: Pushing Diffusion Models to Realistically Reflect the World
Ankit Dhiman
Manan Shah
R. V. Babu
66
0
0
21 Apr 2025
DreamID: High-Fidelity and Fast diffusion-based Face Swapping via Triplet ID Group Learning
Fulong Ye
Miao Hua
Pengze Zhang
Xinghui Li
Qichao Sun
Mingcong Liu
Qian He
Xinglong Wu
190
0
0
20 Apr 2025
Generative Multimodal Pretraining with Discrete Diffusion Timestep Tokens
Kaihang Pan
Wang Lin
Zhongqi Yue
Tenglong Ao
Liyu Jia
Wei Zhao
Juncheng Billy Li
Siliang Tang
Hanwang Zhang
101
8
0
20 Apr 2025
REDEditing: Relationship-Driven Precise Backdoor Poisoning on Text-to-Image Diffusion Models
Chongye Guo
Jinhu Fu
Sihang Li
Kun Wang
Guorui Feng
140
0
0
20 Apr 2025
Teach Me How to Denoise: A Universal Framework for Denoising Multi-modal Recommender Systems via Guided Calibration
Haoyang Li
Hanwen Du
You Li
Junchen Fu
Chunxiao Li
Ziyi Zhuang
Jiakang Li
Yongxin Ni
AI4TS
102
0
0
19 Apr 2025
Learning Joint ID-Textual Representation for ID-Preserving Image Synthesis
Zichuan Liu
Liming Jiang
Qing Yan
Yumin Jia
Hao Kang
Xin Lu
DiffM
140
0
0
19 Apr 2025
Towards NSFW-Free Text-to-Image Generation via Safety-Constraint Direct Preference Optimization
Shouwei Ruan
Zhenyu Wu
Yao Huang
Ruochen Zhang
Yitong Sun
Caixin Kang
Xingxing Wei
EGVM
112
0
0
19 Apr 2025
Exploring Language Patterns of Prompts in Text-to-Image Generation and Their Impact on Visual Diversity
Maria-Teresa De Rosa Palmini
Eva Cetinic
60
0
0
19 Apr 2025
LLM-Enabled Style and Content Regularization for Personalized Text-to-Image Generation
Anran Yu
Wei Feng
Yanzhe Zhang
Xiang Li
Lei Meng
Lei Wu
Xiangxu Meng
DiffM
57
0
0
19 Apr 2025
Point-Driven Interactive Text and Image Layer Editing Using Diffusion Models
Zhenyu Yu
Mohd Yamani Idna Idris
Pei Wang
Yuelong Xia
DiffM
70
1
0
18 Apr 2025
ESPLoRA: Enhanced Spatial Precision with Low-Rank Adaption in Text-to-Image Diffusion Models for High-Definition Synthesis
Andrea Rigo
Luca Stornaiuolo
Mauro Martino
Bruno Lepri
N. Sebe
85
0
0
18 Apr 2025
Multi-modal Knowledge Graph Generation with Semantics-enriched Prompts
Yajing Xu
Zhiqiang Liu
Jiaoyan Chen
Mingchen Tu
Z. Chen
Jeff Z. Pan
Yichi Zhang
Yushan Zhu
Wen Zhang
Ningyu Zhang
71
0
0
18 Apr 2025
UniEdit-Flow: Unleashing Inversion and Editing in the Era of Flow Models
Guanlong Jiao
Biqing Huang
Kuan-Chieh Wang
Renjie Liao
DiffM
137
0
0
17 Apr 2025
Perception Encoder: The best visual embeddings are not at the output of the network
Daniel Bolya
Po-Yao (Bernie) Huang
Peize Sun
Jang Hyun Cho
Andrea Madotto
...
Shiyu Dong
Nikhila Ravi
Daniel Li
Piotr Dollár
Christoph Feichtenhofer
ObjD, VOS
327
9
0
17 Apr 2025
Set You Straight: Auto-Steering Denoising Trajectories to Sidestep Unwanted Concepts
Leyang Li
Shilin Lu
Yan Ren
A. Kong
DiffM
113
4
0
17 Apr 2025
Post-pre-training for Modality Alignment in Vision-Language Foundation Models
Shin'ya Yamaguchi
Dewei Feng
Sekitoshi Kanai
Kazuki Adachi
Daiki Chijiwa
VLM
88
2
0
17 Apr 2025
Image Editing with Diffusion Models: A Survey
Jia Wang
Jie Hu
Xiaoqi Ma
Hanghang Ma
Xiaoming Wei
Enhua Wu
144
1
0
17 Apr 2025
ICAS: IP Adapter and ControlNet-based Attention Structure for Multi-Subject Style Transfer Optimization
Fuwei Liu
DiffM
113
0
0
17 Apr 2025
Image-Editing Specialists: An RLAIF Approach for Diffusion Models
Elior Benarous
Yilun Du
Heng Yang
60
0
0
17 Apr 2025
Towards Forceful Robotic Foundation Models: a Literature Survey
William Xie
N. Correll
OffRL
132
4
0
16 Apr 2025
DMM: Building a Versatile Image Generation Model via Distillation-Based Model Merging
Tianhui Song
Weixin Feng
Shuai Wang
Xinfeng Li
Tiezheng Ge
Bo Zheng
Limin Wang
MoMe
132
1
0
16 Apr 2025
Instruction-augmented Multimodal Alignment for Image-Text and Element Matching
Xinli Yue
Jianhui Sun
Junda Lu
Liangchao Yao
Fan Xia
Tianyi Wang
Fengyun Rao
Jing Lyu
Yuetang Deng
83
2
0
16 Apr 2025
The Devil is in the Prompts: Retrieval-Augmented Prompt Optimization for Text-to-Video Generation
Bingjie Gao
Xinyu Gao
Xiaoxue Wu
Yujie Zhou
Yu Qiao
Li Niu
Xinyuan Chen
Yaohui Wang
173
1
0
16 Apr 2025
Securing the Skies: A Comprehensive Survey on Anti-UAV Methods, Benchmarking, and Future Directions
Yifei Dong
Fengyi Wu
Sanjian Zhang
Guangyu Chen
Yuzhi Hu
...
Jingdong Sun
Siyu Huang
Feng Liu
Qi Dai
Zhi-Qi Cheng
121
0
0
16 Apr 2025
PCDiff: Proactive Control for Ownership Protection in Diffusion Models with Watermark Compatibility
Keke Gai
Ziyue Shen
Jiahao Yu
Liehuang Zhu
Qi Wu
WIGM
114
0
0
16 Apr 2025
Anti-Aesthetics: Protecting Facial Privacy against Customized Text-to-Image Synthesis
Songping Wang
Yueming Lyu
Shiqi Liu
Ning Li
Tong Tong
Hao Sun
Caifeng Shan
PICV
137
0
0
16 Apr 2025
Omni$^2$: Unifying Omnidirectional Image Generation and Editing in an Omni Model
Liu Yang
Huiyu Duan
Yucheng Zhu
Xiaohong Liu
Lu Liu
Zitong Xu
Guangji Ma
Xiongkuo Min
Guangtao Zhai
P. Callet
VLM, VGen
441
2
0
15 Apr 2025
Aligning Generative Denoising with Discriminative Objectives Unleashes Diffusion for Visual Perception
Ziqi Pang
Xin Xu
Yu-Xiong Wang
DiffM
193
0
0
15 Apr 2025
Token-Level Constraint Boundary Search for Jailbreaking Text-to-Image Models
Qingbin Liu
Zhaoxin Wang
Handing Wang
Cong Tian
Yaochu Jin
54
1
0
15 Apr 2025