ResearchTrend.AI
Hierarchical Text-Conditional Image Generation with CLIP Latents

13 April 2022
Aditya A. Ramesh
Prafulla Dhariwal
Alex Nichol
Casey Chu
Mark Chen
    VLM
    DiffM

Papers citing "Hierarchical Text-Conditional Image Generation with CLIP Latents"

50 / 4,750 papers shown
SOEDiff: Efficient Distillation for Small Object Editing
Yiming Wu
Qihe Pan
Zhen Zhao
Zicheng Wang
Sifan Long
Ronghua Liang
DiffM
70
0
0
03 Jan 2025
Cached Adaptive Token Merging: Dynamic Token Reduction and Redundant Computation Elimination in Diffusion Model
Omid Saghatchian
Atiyeh Gh. Moghadam
Ahmad Nickabadi
MoMe
49
1
0
03 Jan 2025
Text2midi: Generating Symbolic Music from Captions
Keshav Bhandari
Abhinaba Roy
Kyra Wang
Geeta Puri
Simon Colton
Dorien Herremans
77
4
0
03 Jan 2025
Nested Attention: Semantic-aware Attention Values for Concept Personalization
Or Patashnik
Rinon Gal
Daniil Ostashev
Sergey Tulyakov
Kfir Aberman
Daniel Cohen-Or
DiffM
46
5
0
03 Jan 2025
Neural Network Diffusion
Kaili Wang
Dongwen Tang
Boya Zeng
Yida Yin
Zhaopan Xu
Yukun Zhou
Zelin Zang
Trevor Darrell
Zhuang Liu
Yang You
DiffM
60
5
0
03 Jan 2025
Population Aware Diffusion for Time Series Generation
Yang Li
Han Meng
Zhenyu Bi
Ingolv T. Urnes
Haipeng Chen
AI4TS
49
0
0
03 Jan 2025
Adapting to Unknown Low-Dimensional Structures in Score-Based Diffusion Models
Gen Li
Yuling Yan
DiffM
44
18
0
03 Jan 2025
DuMo: Dual Encoder Modulation Network for Precise Concept Erasure
Feng Han
Kai-xiang Chen
Chao Gong
Zhipeng Wei
Jingjing Chen
Yu-Gang Jiang
49
2
0
03 Jan 2025
RealCustom++: Representing Images as Real-Word for Real-Time Customization
Zhendong Mao
Mengqi Huang
Fei Ding
Mingcong Liu
Qian He
Xiaojun Chang
DiffM
78
6
0
03 Jan 2025
Unleashing Text-to-Image Diffusion Prior for Zero-Shot Image Captioning
Jianjie Luo
Jingwen Chen
Yehao Li
Yingwei Pan
Jianlin Feng
Hongyang Chao
Ting Yao
DiffM
VLM
53
0
0
03 Jan 2025
GeoDiffuser: Geometry-Based Image Editing with Diffusion Models
Rahul Sajnani
Jeroen Vanbaar
Jie Min
Kapil D. Katyal
Srinath Sridhar
DiffM
59
11
0
03 Jan 2025
Ethical-Lens: Curbing Malicious Usages of Open-Source Text-to-Image Models
Yuzhu Cai
Sheng Yin
Yuxi Wei
Chenxin Xu
Weibo Mao
Felix Juefei Xu
Siheng Chen
Yanfeng Wang
EGVM
91
3
0
03 Jan 2025
TexAVi: Generating Stereoscopic VR Video Clips from Text Descriptions
Vriksha Srihari
R. Bhavya
Shruti Jayaraman
V. Mary Anita Rajam
DiffM
VGen
34
0
0
02 Jan 2025
MAKIMA: Tuning-free Multi-Attribute Open-domain Video Editing via Mask-Guided Attention Modulation
Haoyu Zheng
Wenqiao Zhang
Zheqi Lv
Yu Zhong
Yang Dai
...
Yongliang Shen
Juncheng Billy Li
Dongping Zhang
Siliang Tang
Yueting Zhuang
DiffM
VGen
57
0
0
31 Dec 2024
ConsistentID: Portrait Generation with Multimodal Fine-Grained Identity Preserving
Jiehui Huang
Xiao Dong
Wenhui Song
Zheng Chong
Zhiqiang Zhang
...
Long Chen
Hanhui Li
Yiqiang Yan
Shengcai Liao
Xiaodan Liang
DiffM
50
19
0
31 Dec 2024
Grid Diffusion Models for Text-to-Video Generation
Taegyeong Lee
Soyeong Kwon
Taehwan Kim
56
5
0
31 Dec 2024
AdaDiff: Adaptive Step Selection for Fast Diffusion Models
Hui Zhang
Zuxuan Wu
Zhen Xing
Jie Shao
Yu-Gang Jiang
58
9
0
31 Dec 2024
Multi-Modality Driven LoRA for Adverse Condition Depth Estimation
Guanglei Yang
Rui Tian
Yongqiang Zhang
Zhun Zhong
Yongqiang Li
Wangmeng Zuo
37
0
0
31 Dec 2024
Is Your Image a Good Storyteller?
Xiujie Song
Xiaoyi Pang
Haifeng Tang
Mengyue Wu
Kenny Q. Zhu
48
0
0
29 Dec 2024
Provable Uncertainty Decomposition via Higher-Order Calibration
Gustaf Ahdritz
Aravind Gollakota
Parikshit Gopalan
Charlotte Peale
Udi Wieder
UD
UQCV
PER
52
1
0
25 Dec 2024
Protective Perturbations against Unauthorized Data Usage in Diffusion-based Image Generation
Sen Peng
Jijia Yang
Mingyue Wang
Jianfei He
Xiaohua Jia
DiffM
35
0
0
25 Dec 2024
PC Agent: While You Sleep, AI Works -- A Cognitive Journey into Digital World
Yanheng He
Jiahe Jin
Shijie Xia
Jiadi Su
Runze Fan
Haoyang Zou
Xiangkun Hu
Pengfei Liu
LLMAG
43
2
0
23 Dec 2024
Enhancing Multi-Text Long Video Generation Consistency without Tuning: Time-Frequency Analysis, Prompt Alignment, and Theory
Xingyao Li
Fengzhuo Zhang
Jiachun Pan
Yunlong Hou
Vincent Y. F. Tan
Zhuoran Yang
DiffM
VGen
47
0
0
23 Dec 2024
CharGen: High Accurate Character-Level Visual Text Generation Model with MultiModal Encoder
Lichen Ma
Tiezhu Yue
Pei Fu
Yujie Zhong
Kai Zhou
Xiaoming Wei
Jie Hu
DiffM
78
2
0
23 Dec 2024
D-Judge: How Far Are We? Evaluating the Discrepancies Between AI-synthesized Images and Natural Images through Multimodal Guidance
Renyang Liu
Ziyu Lyu
Wei Zhou
See-Kiong Ng
EGVM
38
0
0
23 Dec 2024
RealisID: Scale-Robust and Fine-Controllable Identity Customization via Local and Global Complementation
Zhaoyang Sun
Fei Du
Weihua Chen
Fan Wang
Yaxiong Chen
Yi Rong
Shengwu Xiong
DiffM
83
1
0
22 Dec 2024
From Creation to Curriculum: Examining the role of generative AI in Arts Universities
Atticus Sims
76
1
0
21 Dec 2024
Follow-Your-MultiPose: Tuning-Free Multi-Character Text-to-Video Generation via Pose Guidance
Beiyuan Zhang
Yue Ma
Chunlei Fu
Xinyang Song
Zhenan Sun
Ziqiang Li
DiffM
VGen
89
1
0
21 Dec 2024
Mapping the Mind of an Instruction-based Image Editing using SMILE
Zeinab Dehghani
Koorosh Aslansefat
Adil Khan
Adín Ramirez Rivera
Franky George
Muhammad Khalid
DiffM
88
0
0
20 Dec 2024
Reframing Image Difference Captioning with BLIP2IDC and Synthetic Augmentation
Gautier Evennou
Antoine Chaffin
Vivien Chappelier
Ewa Kijak
DiffM
79
0
0
20 Dec 2024
Diffusion-Based Conditional Image Editing through Optimized Inference with Guidance
Hyunsoo Lee
Minsoo Kang
Bohyung Han
79
1
0
20 Dec 2024
AI-generated Image Quality Assessment in Visual Communication
Yu Tian
Yixuan Li
Baoliang Chen
Hanwei Zhu
Shiqi Wang
Sam Kwong
89
0
0
20 Dec 2024
GCA-3D: Towards Generalized and Consistent Domain Adaptation of 3D Generators
Hengjia Li
Yang Liu
Yibo Zhao
Haoran Cheng
Yang Yang
...
Qibo Qiu
Boxi Wu
Tu Zheng
Zheng Yang
D. Cai
96
0
0
20 Dec 2024
Dataset Augmentation by Mixing Visual Concepts
Abdullah Al Rahat
Hemanth Venkateswara
DiffM
81
0
0
19 Dec 2024
Next Patch Prediction for Autoregressive Visual Generation
Yatian Pang
Peng Jin
Shuo Yang
Bin Lin
Bin Zhu
...
Liuhan Chen
Francis E. H. Tay
Ser-Nam Lim
Harry Yang
Li Yuan
129
9
0
19 Dec 2024
Joint Co-Speech Gesture and Expressive Talking Face Generation using Diffusion with Adapters
S. Hogue
Chenxu Zhang
Yapeng Tian
Xiaohu Guo
DiffM
76
0
0
18 Dec 2024
What makes a good metric? Evaluating automatic metrics for text-to-image consistency
Candace Ross
Melissa Hall
Adriana Romero Soriano
Adina Williams
95
3
0
18 Dec 2024
Data-Efficient Inference of Neural Fluid Fields via SciML Foundation Model
Yuqiu Liu
Jingxuan Xu
Mauricio Soroco
Yunchao Wei
Wuyang Chen
AI4CE
84
2
0
18 Dec 2024
Self-control: A Better Conditional Mechanism for Masked Autoregressive Model
Qiaoying Qu
Shiyu Shen
DiffM
81
0
0
18 Dec 2024
F-Bench: Rethinking Human Preference Evaluation Metrics for Benchmarking Face Generation, Customization, and Restoration
Lu Liu
Huiyu Duan
Qiang Hu
Liu Yang
Chunlei Cai
Tianxiao Ye
Huayu Liu
Xiaoyun Zhang
Guangtao Zhai
EGVM
99
1
0
17 Dec 2024
Prompt Augmentation for Self-supervised Text-guided Image Manipulation
Rumeysa Bodur
Binod Bhattarai
Tae-Kyun Kim
DiffM
68
2
0
17 Dec 2024
Optimized two-stage AI-based Neural Decoding for Enhanced Visual Stimulus Reconstruction from fMRI Data
Lorenzo Veronese
Andrea Moglia
Luca Mainardi
Pietro Cerveri
DiffM
71
0
0
17 Dec 2024
Unsupervised Region-Based Image Editing of Denoising Diffusion Models
ZeLin Li
Yue Song
R. Tao
Xiaohong Jia
Yao Zhao
Wei Wang
DiffM
84
0
0
17 Dec 2024
Efficient Scaling of Diffusion Transformers for Text-to-Image Generation
Hao Li
Shamit Lal
Zhiheng Li
Yusheng Xie
Ying Wang
...
R. Manmatha
Zhuowen Tu
Stefano Ermon
Stefano Soatto
A. Swaminathan
86
0
0
16 Dec 2024
OmniPrism: Learning Disentangled Visual Concept for Image Generation
Yangyang Li
Daqing Liu
Wu Liu
Allen He
Xinchen Liu
Yongdong Zhang
Guoqing Jin
DiffM
CoGe
83
0
0
16 Dec 2024
IDEA-Bench: How Far are Generative Models from Professional Designing?
C. Liang
Lianghua Huang
Jingwu Fang
Huanzhang Dou
Wei Wang
Zhi-Fan Wu
Yupeng Shi
Junge Zhang
Xin Zhao
Yu Liu
3DV
77
1
0
16 Dec 2024
StrandHead: Text to Strand-Disentangled 3D Head Avatars Using Hair Geometric Priors
Xiaokun Sun
Zeyu Cai
Zhenyu Zhang
Ying Tai
Jian Yang
78
0
0
16 Dec 2024
Can video generation replace cinematographers? Research on the cinematic language of generated video
Xuelong Li
Kai Wu
Siyi Yang
YiZhan Qu
Guohua Zhang
...
Mingliang Xiong
Hao Deng
Qingwen Liu
Gang Li
Bin He
VGen
DiffM
90
1
0
16 Dec 2024
EditSplat: Multi-View Fusion and Attention-Guided Optimization for View-Consistent 3D Scene Editing with 3D Gaussian Splatting
Dong In Lee
Hyeongcheol Park
Jiyoung Seo
Eunbyung Park
Hyunje Park
Ha Dam Baek
Shin Sangheon
Sangmin Kim
Sangpil Kim
3DGS
102
1
0
16 Dec 2024
Detecting Daily Living Gait Amid Huntington's Disease Chorea using a Foundation Deep Learning Model
Dafna Schwartz
Lori Quinn
Nora E. Fritz
Lisa M. Muratori
Jeffery M. Hausdorff
Ran Gilad Bachrach
74
0
0
15 Dec 2024