ResearchTrend.AI
Hierarchical Text-Conditional Image Generation with CLIP Latents


13 April 2022
Aditya A. Ramesh
Prafulla Dhariwal
Alex Nichol
Casey Chu
Mark Chen
    VLM
    DiffM

Papers citing "Hierarchical Text-Conditional Image Generation with CLIP Latents"

50 / 4,739 papers shown
Separate to Collaborate: Dual-Stream Diffusion Model for Coordinated Piano Hand Motion Synthesis
Zihao Liu
Mingwen Ou
Zunnan Xu
Jiaqi Huang
Haonan Han
Ronghui Li
X. Li
DiffM
28
0
0
14 Apr 2025
SD-ReID: View-aware Stable Diffusion for Aerial-Ground Person Re-Identification
Xiang Hu
Pingping Zhang
Yuhao Wang
Bin Yan
Huchuan Lu
25
0
0
13 Apr 2025
Automatic Detection of Intro and Credits in Video using CLIP and Multihead Attention
Vasilii Korolkov
Andrey Yanchenko
VLM
40
0
0
13 Apr 2025
Scalable Motion In-betweening via Diffusion and Physics-Based Character Adaptation
Jia Qin
DiffM
VGen
38
0
0
13 Apr 2025
D$^2$iT: Dynamic Diffusion Transformer for Accurate Image Generation
Weinan Jia
Mengqi Huang
Nan Chen
Lei Zhang
Zhendong Mao
29
0
0
13 Apr 2025
Towards Explainable Partial-AIGC Image Quality Assessment
Jiaying Qian
Ziheng Jia
Zicheng Zhang
Zeyu Zhang
Guangtao Zhai
Xiongkuo Min
40
0
0
12 Apr 2025
UniFlowRestore: A General Video Restoration Framework via Flow Matching and Prompt Guidance
Shri Kiran Srinivasan
Yu Zhang
Chen Wu
Dianjie Lu
Guijuan Zhan
Yang Weng
Zhuoran Zheng
DiffM
VGen
28
0
0
12 Apr 2025
Diffusion Models for Robotic Manipulation: A Survey
Rosa Wolf
Yitian Shi
Sheng Liu
Rania Rayyes
51
1
0
11 Apr 2025
AGENT: An Aerial Vehicle Generation and Design Tool Using Large Language Models
Colin Samplawski
Adam Cobb
Susmit Jha
LLMAG
AI4CE
60
0
0
11 Apr 2025
Generating Fine Details of Entity Interactions
Xinyi Gu
Jiayuan Mao
32
0
0
11 Apr 2025
POEM: Precise Object-level Editing via MLLM control
Marco Schouten
Mehmet Onurcan Kaya
Serge Belongie
Dim P. Papadopoulos
DiffM
77
0
0
10 Apr 2025
Marmot: Multi-Agent Reasoning for Multi-Object Self-Correcting in Improving Image-Text Alignment
Jiayang Sun
H. Wang
Jie Cao
Huaibo Huang
Ran He
DiffM
73
0
0
10 Apr 2025
Teaching Humans Subtle Differences with DIFFusion
Mia Chiquier
Orr Avrech
Yossi Gandelsman
Berthy T. Feng
Katherine L. Bouman
Carl Vondrick
DiffM
51
0
0
10 Apr 2025
PixelFlow: Pixel-Space Generative Models with Flow
Shoufa Chen
Chongjian Ge
Shilong Zhang
Peize Sun
Ping Luo
VLM
DRL
37
0
0
10 Apr 2025
Compass Control: Multi Object Orientation Control for Text-to-Image Generation
Rishubh Parihar
Vaibhav Agrawal
Sachidanand VS
R. V. Babu
DiffM
36
0
0
09 Apr 2025
IGG: Image Generation Informed by Geodesic Dynamics in Deformation Spaces
Nian Wu
Nivetha Jayakumar
Jiarui Xing
Miaomiao Zhang
26
0
0
09 Apr 2025
A Unified Agentic Framework for Evaluating Conditional Image Generation
Jifang Wang
Xue Yang
Longyue Wang
Zhenran Xu
Yixuan Wang
Yaowei Wang
Weihua Luo
Kaifu Zhang
Baotian Hu
Min Zhang
EGVM
DiffM
72
0
0
09 Apr 2025
A Meaningful Perturbation Metric for Evaluating Explainability Methods
Danielle Cohen
Hila Chefer
Lior Wolf
AAML
25
0
0
09 Apr 2025
CDM-QTA: Quantized Training Acceleration for Efficient LoRA Fine-Tuning of Diffusion Model
Jinming Lu
Minghao She
Wendong Mao
Zhongfeng Wang
MQ
38
0
0
08 Apr 2025
Tuning-Free Image Editing with Fidelity and Editability via Unified Latent Diffusion Model
Qi Mao
L. Chen
Yuchao Gu
Mike Zheng Shou
Ming-Hsuan Yang
DiffM
39
0
0
08 Apr 2025
Reinforced Multi-teacher Knowledge Distillation for Efficient General Image Forgery Detection and Localization
Zeqin Yu
Jiangqun Ni
Jian Zhang
Haoyi Deng
Yuzhen Lin
28
0
0
07 Apr 2025
CADCrafter: Generating Computer-Aided Design Models from Unconstrained Images
Cheng Chen
Jiacheng Wei
Tianrun Chen
Chi Zhang
Xiaofeng Yang
...
Bingchen Yang
Chuan-Sheng Foo
Guosheng Lin
Qixing Huang
Fayao Liu
44
1
0
07 Apr 2025
Dimension-Free Convergence of Diffusion Models for Approximate Gaussian Mixtures
Gen Li
Changxiao Cai
Yuting Wei
DiffM
36
1
0
07 Apr 2025
DiCoTTA: Domain-invariant Learning for Continual Test-time Adaptation
Sohyun Lee
N. Kim
Juwon Kang
Seong Joon Oh
Suha Kwak
91
0
0
07 Apr 2025
SCAM: A Real-World Typographic Robustness Evaluation for Multimodal Foundation Models
Justus Westerhoff
Erblina Purellku
Jakob Hackstein
Jonas Loos
Leo Pinetzki
Lorenz Hufe
AAML
28
0
0
07 Apr 2025
PartStickers: Generating Parts of Objects for Rapid Prototyping
Mo Zhou
Josh Myers-Dean
Danna Gurari
25
0
0
07 Apr 2025
Enhancing Compositional Reasoning in Vision-Language Models with Synthetic Preference Data
Samarth Mishra
Kate Saenko
Venkatesh Saligrama
CoGe
LRM
37
0
0
07 Apr 2025
UniToken: Harmonizing Multimodal Understanding and Generation through Unified Visual Encoding
Yang Jiao
Haibo Qiu
Zequn Jie
S. Chen
Jingjing Chen
Lin Ma
Yu Jiang
34
2
0
06 Apr 2025
Multi-identity Human Image Animation with Structural Video Diffusion
Zhenzhi Wang
Yongqian Li
Yanhong Zeng
Yuwei Guo
Dahua Lin
Tianfan Xue
Bo Dai
VGen
24
0
0
05 Apr 2025
Can You Count to Nine? A Human Evaluation Benchmark for Counting Limits in Modern Text-to-Video Models
Xuyang Guo
Zekai Huang
Jiayan Huo
Yingyu Liang
Zhenmei Shi
Zhao-quan Song
Jiahao Zhang
ALM
VGen
96
2
0
05 Apr 2025
Structured Knowledge Accumulation: The Principle of Entropic Least Action in Forward-Only Neural Learning
Bouarfa Mahi Quantiota
38
0
0
04 Apr 2025
Fine-Tuning Visual Autoregressive Models for Subject-Driven Generation
Jiwoo Chung
Sangeek Hyun
Hyunjun Kim
Eunseo Koh
MinKyu Lee
Jae-Pil Heo
33
0
0
03 Apr 2025
VARGPT-v1.1: Improve Visual Autoregressive Large Unified Model via Iterative Instruction Tuning and Reinforcement Learning
Xianwei Zhuang
Yuxin Xie
Yufan Deng
Dongchao Yang
Liming Liang
Jinghan Ru
Yuguo Yin
Yuexian Zou
71
2
0
03 Apr 2025
GPT-ImgEval: A Comprehensive Benchmark for Diagnosing GPT4o in Image Generation
Zhiyuan Yan
Junyan Ye
Weijia Li
Zilong Huang
Shenghai Yuan
Xiangyang He
Kaiqing Lin
Jun-Jian He
Conghui He
Li Yuan
MLLM
EGVM
88
8
0
03 Apr 2025
Scene Splatter: Momentum 3D Scene Generation from Single Image with Video Diffusion Model
Shengjun Zhang
Jinzhao Li
Xin Fei
Hao Liu
Yueqi Duan
DiffM
3DGS
VGen
73
0
0
03 Apr 2025
MD-ProjTex: Texturing 3D Shapes with Multi-Diffusion Projection
Ahmet Burak Yildirim
Mustafa Utku Aydogdu
Duygu Ceylan
Aysegül Dündar
DiffM
48
1
0
03 Apr 2025
MultiNeRF: Multiple Watermark Embedding for Neural Radiance Fields
Yash Kulthe
Andrew Gilbert
John Collomosse
41
0
0
03 Apr 2025
Multi-party Collaborative Attention Control for Image Customization
Han Yang
Chuanguang Yang
Qiuli Wang
Zhulin An
Weilun Feng
Libo Huang
Yongjun Xu
DiffM
35
0
0
02 Apr 2025
Random Conditioning with Distillation for Data-Efficient Diffusion Model Compression
Dohyun Kim
S. Park
Geonhee Han
Seung Wook Kim
Paul Hongsuck Seo
DiffM
55
0
0
02 Apr 2025
FlowMotion: Target-Predictive Conditional Flow Matching for Jitter-Reduced Text-Driven Human Motion Generation
Manolo Canales Cuba
Vinícius do Carmo Melício
João Paulo Gois
3DH
52
0
0
02 Apr 2025
Pro-DG: Procedural Diffusion Guidance for Architectural Facade Generation
Aleksander Plocharski
Jan Swidzinski
Przemyslaw Musialski
DiffM
41
0
0
02 Apr 2025
Implicit Bias Injection Attacks against Text-to-Image Diffusion Models
Huayang Huang
Xiangye Jin
Jiaxu Miao
Yu Wu
31
0
0
02 Apr 2025
FreSca: Unveiling the Scaling Space in Diffusion Models
Chao Huang
Susan Liang
Yunlong Tang
Li Ma
Yapeng Tian
Chenliang Xu
DiffM
48
0
0
02 Apr 2025
Less-to-More Generalization: Unlocking More Controllability by In-Context Generation
Shaojin Wu
Mengqi Huang
Wenxu Wu
Yufeng Cheng
Fei Ding
Qian He
DiffM
55
4
0
02 Apr 2025
Prompting Forgetting: Unlearning in GANs via Textual Guidance
Piyush Nagasubramaniam
Neeraj Karamchandani
Chen Wu
Sencun Zhu
DiffM
AILaw
MU
54
0
0
01 Apr 2025
Spingarn's Method and Progressive Decoupling Beyond Elicitable Monotonicity
B. Evens
P. Latafat
Panagiotis Patrinos
48
0
0
01 Apr 2025
Beyond Static Scenes: Camera-controllable Background Generation for Human Motion
Mingshuai Yao
Mengting Chen
Qinye Zhou
Yuyao Zhang
Ming-Yu Liu
...
Chen Ju
Shuai Xiao
Qingwen Liu
Jinsong Lan
Wangmeng Zuo
DiffM
VGen
48
1
0
01 Apr 2025
IntrinsiX: High-Quality PBR Generation using Image Priors
Peter Kocsis
Lukas Höllein
Matthias Nießner
39
0
0
01 Apr 2025
Training-Free Text-Guided Image Editing with Visual Autoregressive Model
Yufei Wang
Lanqing Guo
Z. Li
Jiaxing Huang
Pichao Wang
Bihan Wen
J. Wang
DiffM
65
1
0
31 Mar 2025
Pre-training with 3D Synthetic Data: Learning 3D Point Cloud Instance Segmentation from 3D Synthetic Scenes
Daichi Otsuka
Shinichi Mae
Ryosuke Yamada
Hirokatsu Kataoka
3DPC
37
0
0
31 Mar 2025