ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2204.06125
  4. Cited By
Hierarchical Text-Conditional Image Generation with CLIP Latents

Hierarchical Text-Conditional Image Generation with CLIP Latents

13 April 2022
Aditya A. Ramesh
Prafulla Dhariwal
Alex Nichol
Casey Chu
Mark Chen
    VLMDiffM
ArXiv (abs)PDFHTML

Papers citing "Hierarchical Text-Conditional Image Generation with CLIP Latents"

50 / 4,897 papers shown
Title
Token-Level Constraint Boundary Search for Jailbreaking Text-to-Image Models
Token-Level Constraint Boundary Search for Jailbreaking Text-to-Image Models
Qingbin Liu
Zhaoxin Wang
Handing Wang
Cong Tian
Yaochu Jin
54
1
0
15 Apr 2025
Aligning Generative Denoising with Discriminative Objectives Unleashes Diffusion for Visual Perception
Aligning Generative Denoising with Discriminative Objectives Unleashes Diffusion for Visual Perception
Ziqi Pang
Xin Xu
Yu-Xiong Wang
DiffM
193
0
0
15 Apr 2025
Separate to Collaborate: Dual-Stream Diffusion Model for Coordinated Piano Hand Motion Synthesis
Separate to Collaborate: Dual-Stream Diffusion Model for Coordinated Piano Hand Motion Synthesis
Zihao Liu
Mingwen Ou
Zunnan Xu
Jiaqi Huang
Haonan Han
Ronghui Li
Xiaochen Li
DiffM
108
0
0
14 Apr 2025
InstructEngine: Instruction-driven Text-to-Image Alignment
InstructEngine: Instruction-driven Text-to-Image Alignment
Xingyu Lu
Yihan Hu
Yuanxing Zhang
Kaiyu Jiang
Changyi Liu
...
Bin Wen
C. Yuan
Fan Yang
Yan Li
Di Zhang
127
0
0
14 Apr 2025
GeoUni: A Unified Model for Generating Geometry Diagrams, Problems and Problem Solutions
GeoUni: A Unified Model for Generating Geometry Diagrams, Problems and Problem Solutions
Jo-Ku Cheng
Zeren Zhang
Ran Chen
Jingyang Deng
Ziran Qin
Jinwen Ma
100
1
0
14 Apr 2025
An Image is Worth $K$ Topics: A Visual Structural Topic Model with Pretrained Image Embeddings
An Image is Worth KKK Topics: A Visual Structural Topic Model with Pretrained Image Embeddings
Matías Piqueras
Alexandra Segerberg
Matteo Magnani
Måns Magnusson
Nataša Sladoje
101
0
0
14 Apr 2025
Efficient Generative Model Training via Embedded Representation Warmup
Efficient Generative Model Training via Embedded Representation Warmup
Deyuan Liu
Peng Sun
Xufeng Li
Tao Lin
72
0
0
14 Apr 2025
Omni-Dish: Photorealistic and Faithful Image Generation and Editing for Arbitrary Chinese Dishes
Omni-Dish: Photorealistic and Faithful Image Generation and Editing for Arbitrary Chinese Dishes
Huijie Liu
Bingcan Wang
Jie Hu
Xiaoming Wei
Guoliang Kang
134
0
0
14 Apr 2025
Automatic Detection of Intro and Credits in Video using CLIP and Multihead Attention
Automatic Detection of Intro and Credits in Video using CLIP and Multihead Attention
Vasilii Korolkov
Andrey Yanchenko
VLM
76
1
0
13 Apr 2025
D$^2$iT: Dynamic Diffusion Transformer for Accurate Image Generation
D2^22iT: Dynamic Diffusion Transformer for Accurate Image Generation
Weinan Jia
Mengqi Huang
Nan Chen
Lei Zhang
Zhendong Mao
86
0
0
13 Apr 2025
SD-ReID: View-aware Stable Diffusion for Aerial-Ground Person Re-Identification
SD-ReID: View-aware Stable Diffusion for Aerial-Ground Person Re-Identification
Xiang Hu
Pingping Zhang
Yuhao Wang
Bin Yan
Huchuan Lu
58
0
0
13 Apr 2025
Scalable Motion In-betweening via Diffusion and Physics-Based Character Adaptation
Scalable Motion In-betweening via Diffusion and Physics-Based Character Adaptation
Jia Qin
DiffMVGen
68
0
0
13 Apr 2025
UniFlowRestore: A General Video Restoration Framework via Flow Matching and Prompt Guidance
UniFlowRestore: A General Video Restoration Framework via Flow Matching and Prompt Guidance
Siyang Song
Yu Zhang
Chen Wu
Dianjie Lu
Dianjie Lu
Guijuan Zhan
Yang Weng
Zhuoran Zheng
DiffMVGen
61
0
0
12 Apr 2025
Towards Explainable Partial-AIGC Image Quality Assessment
Towards Explainable Partial-AIGC Image Quality Assessment
Jiaying Qian
Ziheng Jia
Zicheng Zhang
Zeyu Zhang
Guangtao Zhai
Xiongkuo Min
71
0
0
12 Apr 2025
Generating Fine Details of Entity Interactions
Generating Fine Details of Entity Interactions
Xinyi Gu
Jiayuan Mao
150
0
0
11 Apr 2025
AGENT: An Aerial Vehicle Generation and Design Tool Using Large Language Models
AGENT: An Aerial Vehicle Generation and Design Tool Using Large Language Models
Colin Samplawski
Adam Cobb
Susmit Jha
LLMAGAI4CE
110
0
0
11 Apr 2025
PixelFlow: Pixel-Space Generative Models with Flow
PixelFlow: Pixel-Space Generative Models with Flow
Shoufa Chen
Chongjian Ge
Shilong Zhang
Peize Sun
Ping Luo
VLMDRL
65
0
0
10 Apr 2025
Teaching Humans Subtle Differences with DIFFusion
Teaching Humans Subtle Differences with DIFFusion
Mia Chiquier
Orr Avrech
Yossi Gandelsman
Berthy Feng
Katherine Bouman
Carl Vondrick
DiffM
128
0
0
10 Apr 2025
POEM: Precise Object-level Editing via MLLM control
POEM: Precise Object-level Editing via MLLM control
Marco Schouten
Mehmet Onurcan Kaya
Serge Belongie
Dim P. Papadopoulos
DiffM
101
0
0
10 Apr 2025
Marmot: Multi-Agent Reasoning for Multi-Object Self-Correcting in Improving Image-Text Alignment
Marmot: Multi-Agent Reasoning for Multi-Object Self-Correcting in Improving Image-Text Alignment
Jiayang Sun
Hongru Wang
Jie Cao
Huaibo Huang
Ran He
DiffM
110
0
0
10 Apr 2025
Compass Control: Multi Object Orientation Control for Text-to-Image Generation
Compass Control: Multi Object Orientation Control for Text-to-Image Generation
Rishubh Parihar
Vaibhav Agrawal
Sachidanand VS
R. V. Babu
DiffM
118
0
0
09 Apr 2025
A Meaningful Perturbation Metric for Evaluating Explainability Methods
A Meaningful Perturbation Metric for Evaluating Explainability Methods
Danielle Cohen
Hila Chefer
Lior Wolf
AAML
59
0
0
09 Apr 2025
A Unified Agentic Framework for Evaluating Conditional Image Generation
A Unified Agentic Framework for Evaluating Conditional Image Generation
Jifang Wang
Xue Yang
Longyue Wang
Zhenran Xu
Yansen Wang
Yaowei Wang
Weihua Luo
Kaifu Zhang
Baotian Hu
Min Zhang
EGVMDiffM
129
2
0
09 Apr 2025
IGG: Image Generation Informed by Geodesic Dynamics in Deformation Spaces
IGG: Image Generation Informed by Geodesic Dynamics in Deformation Spaces
Nian Wu
Nivetha Jayakumar
Jiarui Xing
Miaomiao Zhang
100
0
0
09 Apr 2025
CDM-QTA: Quantized Training Acceleration for Efficient LoRA Fine-Tuning of Diffusion Model
CDM-QTA: Quantized Training Acceleration for Efficient LoRA Fine-Tuning of Diffusion Model
Jinming Lu
Minghao She
Wendong Mao
Zhongfeng Wang
MQ
47
0
0
08 Apr 2025
Tuning-Free Image Editing with Fidelity and Editability via Unified Latent Diffusion Model
Tuning-Free Image Editing with Fidelity and Editability via Unified Latent Diffusion Model
Qi Mao
Lawrence Yunliang Chen
Yuchao Gu
Mike Zheng Shou
Ming-Hsuan Yang
DiffM
80
0
0
08 Apr 2025
Reinforced Multi-teacher Knowledge Distillation for Efficient General Image Forgery Detection and Localization
Reinforced Multi-teacher Knowledge Distillation for Efficient General Image Forgery Detection and Localization
Zeqin Yu
Jiangqun Ni
Jian Zhang
Haoyi Deng
Yuzhen Lin
72
0
0
07 Apr 2025
SCAM: A Real-World Typographic Robustness Evaluation for Multimodal Foundation Models
SCAM: A Real-World Typographic Robustness Evaluation for Multimodal Foundation Models
Justus Westerhoff
Erblina Purellku
Jakob Hackstein
Jonas Loos
Leo Pinetzki
Lorenz Hufe
AAML
139
0
0
07 Apr 2025
Enhancing Compositional Reasoning in Vision-Language Models with Synthetic Preference Data
Enhancing Compositional Reasoning in Vision-Language Models with Synthetic Preference Data
Samarth Mishra
Kate Saenko
Venkatesh Saligrama
CoGeLRM
69
0
0
07 Apr 2025
Dimension-Free Convergence of Diffusion Models for Approximate Gaussian Mixtures
Dimension-Free Convergence of Diffusion Models for Approximate Gaussian Mixtures
Gen Li
Changxiao Cai
Yuting Wei
DiffM
75
1
0
07 Apr 2025
CADCrafter: Generating Computer-Aided Design Models from Unconstrained Images
CADCrafter: Generating Computer-Aided Design Models from Unconstrained Images
Cheng Chen
Jiacheng Wei
Tianrun Chen
Chi Zhang
Xiaofeng Yang
...
Bingchen Yang
Chuan-Sheng Foo
Guosheng Lin
Qixing Huang
Fayao Liu
88
4
0
07 Apr 2025
TestDG: Test-time Domain Generalization for Continual Test-time Adaptation
TestDG: Test-time Domain Generalization for Continual Test-time Adaptation
Sohyun Lee
N. Kim
Juwon Kang
Seong Joon Oh
Suha Kwak
OODTTA
191
0
0
07 Apr 2025
PartStickers: Generating Parts of Objects for Rapid Prototyping
PartStickers: Generating Parts of Objects for Rapid Prototyping
Mo Zhou
Josh Myers-Dean
Danna Gurari
97
0
0
07 Apr 2025
UniToken: Harmonizing Multimodal Understanding and Generation through Unified Visual Encoding
UniToken: Harmonizing Multimodal Understanding and Generation through Unified Visual Encoding
Yang Jiao
Haibo Qiu
Zequn Jie
Tian Jin
Jingjing Chen
Lin Ma
Yu Jiang
106
10
0
06 Apr 2025
Can You Count to Nine? A Human Evaluation Benchmark for Counting Limits in Modern Text-to-Video Models
Can You Count to Nine? A Human Evaluation Benchmark for Counting Limits in Modern Text-to-Video Models
Xuyang Guo
Zekai Huang
Jiayan Huo
Yingyu Liang
Zhenmei Shi
Zhao Song
Jiahao Zhang
ALMVGen
198
6
0
05 Apr 2025
Multi-identity Human Image Animation with Structural Video Diffusion
Multi-identity Human Image Animation with Structural Video Diffusion
Zhenzhi Wang
Yongqian Li
Yanhong Zeng
Yuwei Guo
Dahua Lin
Tianfan Xue
Bo Dai
VGen
75
2
0
05 Apr 2025
Structured Knowledge Accumulation: The Principle of Entropic Least Action in Forward-Only Neural Learning
Structured Knowledge Accumulation: The Principle of Entropic Least Action in Forward-Only Neural Learning
Bouarfa Mahi Quantiota
87
0
0
04 Apr 2025
MD-ProjTex: Texturing 3D Shapes with Multi-Diffusion Projection
MD-ProjTex: Texturing 3D Shapes with Multi-Diffusion Projection
Ahmet Burak Yildirim
Mustafa Utku Aydogdu
Duygu Ceylan
Aysegül Dündar
DiffM
133
1
0
03 Apr 2025
Fine-Tuning Visual Autoregressive Models for Subject-Driven Generation
Fine-Tuning Visual Autoregressive Models for Subject-Driven Generation
Jiwoo Chung
Sangeek Hyun
Hyunjun Kim
Eunseo Koh
MinKyu Lee
Jae-Pil Heo
74
0
0
03 Apr 2025
MultiNeRF: Multiple Watermark Embedding for Neural Radiance Fields
MultiNeRF: Multiple Watermark Embedding for Neural Radiance Fields
Yash Kulthe
Andrew Gilbert
John Collomosse
128
0
0
03 Apr 2025
VARGPT-v1.1: Improve Visual Autoregressive Large Unified Model via Iterative Instruction Tuning and Reinforcement Learning
VARGPT-v1.1: Improve Visual Autoregressive Large Unified Model via Iterative Instruction Tuning and Reinforcement Learning
Xianwei Zhuang
Yuxin Xie
Yufan Deng
Dongchao Yang
Liming Liang
Jinghan Ru
Yuguo Yin
Yuexian Zou
155
5
0
03 Apr 2025
Scene Splatter: Momentum 3D Scene Generation from Single Image with Video Diffusion Model
Scene Splatter: Momentum 3D Scene Generation from Single Image with Video Diffusion Model
Shengjun Zhang
Jinzhao Li
Xin Fei
Hao Liu
Yueqi Duan
DiffM3DGSVGen
110
1
0
03 Apr 2025
GPT-ImgEval: A Comprehensive Benchmark for Diagnosing GPT4o in Image Generation
GPT-ImgEval: A Comprehensive Benchmark for Diagnosing GPT4o in Image Generation
Zhiyuan Yan
Junyan Ye
Weijia Li
Zilong Huang
Shenghai Yuan
Xiangyang He
Kaiqing Lin
Jun-Jian He
Conghui He
Li Yuan
MLLMEGVM
191
24
0
03 Apr 2025
Pro-DG: Procedural Diffusion Guidance for Architectural Facade Generation
Pro-DG: Procedural Diffusion Guidance for Architectural Facade Generation
Aleksander Plocharski
Jan Swidzinski
Przemyslaw Musialski
DiffM
50
0
0
02 Apr 2025
Multi-party Collaborative Attention Control for Image Customization
Multi-party Collaborative Attention Control for Image Customization
Han Yang
Chuanguang Yang
Qiuli Wang
Zhulin An
Weilun Feng
Libo Huang
Yongjun Xu
DiffM
108
1
0
02 Apr 2025
FlowMotion: Target-Predictive Conditional Flow Matching for Jitter-Reduced Text-Driven Human Motion Generation
FlowMotion: Target-Predictive Conditional Flow Matching for Jitter-Reduced Text-Driven Human Motion Generation
Manolo Canales Cuba
Vinícius do Carmo Melício
João Paulo Gois
3DH
119
0
0
02 Apr 2025
Random Conditioning with Distillation for Data-Efficient Diffusion Model Compression
Random Conditioning with Distillation for Data-Efficient Diffusion Model Compression
Dohyun Kim
S. Park
Geonhee Han
Seung Wook Kim
Paul Hongsuck Seo
DiffM
106
0
0
02 Apr 2025
Less-to-More Generalization: Unlocking More Controllability by In-Context Generation
Less-to-More Generalization: Unlocking More Controllability by In-Context Generation
Shaojin Wu
Mengqi Huang
Wenxu Wu
Yufeng Cheng
Fei Ding
Qian He
DiffM
122
12
0
02 Apr 2025
Implicit Bias Injection Attacks against Text-to-Image Diffusion Models
Implicit Bias Injection Attacks against Text-to-Image Diffusion Models
Huayang Huang
Xiangye Jin
Jiaxu Miao
Yu Wu
88
0
0
02 Apr 2025
Prompting Medical Vision-Language Models to Mitigate Diagnosis Bias by Generating Realistic Dermoscopic Images
Prompting Medical Vision-Language Models to Mitigate Diagnosis Bias by Generating Realistic Dermoscopic Images
Nusrat Munia
Abdullah-Al-Zubaer Imran
LM&MAMedIm
62
0
0
02 Apr 2025
Previous
123...567...969798
Next