arXiv: 2204.06125
Hierarchical Text-Conditional Image Generation with CLIP Latents


13 April 2022
Aditya A. Ramesh
Prafulla Dhariwal
Alex Nichol
Casey Chu
Mark Chen
    VLM
    DiffM

Papers citing "Hierarchical Text-Conditional Image Generation with CLIP Latents"

50 / 4,757 papers shown
Avoiding Generative Model Writer's Block With Embedding Nudging
Ali Zand
Milad Nasr
26
0
0
28 Aug 2024
NeuralOOD: Improving Out-of-Distribution Generalization Performance with Brain-machine Fusion Learning Framework
Shuangchen Zhao
Changde Du
Hui Li
Huiguang He
47
0
0
27 Aug 2024
Alfie: Democratising RGBA Image Generation With No $$$
Fabio Quattrini
Vittorio Pippi
Silvia Cascianelli
Rita Cucchiara
DiffM
51
5
0
27 Aug 2024
Build-A-Scene: Interactive 3D Layout Control for Diffusion-Based Image Generation
Abdelrahman Eldesokey
Peter Wonka
DiffM
46
4
0
27 Aug 2024
CrossViewDiff: A Cross-View Diffusion Model for Satellite-to-Street View Synthesis
Weijia Li
Jun He
Junyan Ye
Huaping Zhong
Zhimeng Zheng
Zilong Huang
Dahua Lin
Conghui He
49
6
0
27 Aug 2024
Diffusion Models Are Real-Time Game Engines
Dani Valevski
Yaniv Leviathan
Moab Arar
Shlomi Fruchter
DiffM
VGen
AI4CE
38
62
0
27 Aug 2024
Social perception of faces in a vision-language model
C. I. Hausladen
Manuel Knott
Colin F. Camerer
Pietro Perona
CVBM
VLM
50
2
0
26 Aug 2024
MagicMan: Generative Novel View Synthesis of Humans with 3D-Aware Diffusion and Iterative Refinement
Xu He
Xiaoyu Li
Di Kang
Jiangnan Ye
Chaopeng Zhang
Liyang Chen
Xiangjun Gao
Han Zhang
Zhiyong Wu
Haolin Zhuang
DiffM
43
7
0
26 Aug 2024
I2EBench: A Comprehensive Benchmark for Instruction-based Image Editing
Yiwei Ma
Jiayi Ji
Ke Ye
Weihuang Lin
Zhibin Wang
Yonghan Zheng
Qiang-feng Zhou
Xiaoshuai Sun
Rongrong Ji
51
7
0
26 Aug 2024
SwiftBrush v2: Make Your One-step Diffusion Model Better Than Its Teacher
T. Dao
Thuan Hoang Nguyen
T. Le
D. Vu
Khoi Nguyen
Cuong Pham
Anh Tran
DiffM
49
15
0
26 Aug 2024
Foodfusion: A Novel Approach for Food Image Composition via Diffusion Models
Chaohua Shi
Xuan Wang
Si Shi
Xule Wang
Mingrui Zhu
Nannan Wang
X. Gao
CoGe
48
1
0
26 Aug 2024
Draw Like an Artist: Complex Scene Generation with Diffusion Model via Composition, Painting, and Retouching
Minghao Liu
Le Zhang
Yingjie Tian
Xiaochao Qu
Luoqi Liu
Ting Liu
DiffM
CoGe
42
2
0
25 Aug 2024
Localization of Synthetic Manipulations in Western Blot Images
Anmol Manjunath
Viola Negroni
S. Mandelli
Daniel Moreira
Paolo Bestagini
45
0
0
25 Aug 2024
GenCA: A Text-conditioned Generative Model for Realistic and Drivable Codec Avatars
Keqiang Sun
Amin Jourabloo
Riddhish Bhalodia
Moustafa Meshry
Yu Rong
...
Christian Haene
Jiu Xu
Sam Johnson
Hongsheng Li
Sofien Bouaziz
DiffM
50
0
0
24 Aug 2024
Prompt-Softbox-Prompt: A free-text Embedding Control for Image Editing
Yitong Yang
Yinglin Wang
Jing Wang
Tian Zhang
DiffM
40
1
0
24 Aug 2024
Task-Oriented Diffusion Inversion for High-Fidelity Text-based Editing
Yangyang Xu
Wenqi Shao
Yong Du
Haiming Zhu
Yang Zhou
Ping Luo
Shengfeng He
DiffM
51
2
0
23 Aug 2024
Latent Space Disentanglement in Diffusion Transformers Enables Zero-shot Fine-grained Semantic Editing
Zitao Shuai
Chenwei Wu
Zhengxu Tang
Bowen Song
Liyue Shen
40
0
0
23 Aug 2024
CustomCrafter: Customized Video Generation with Preserving Motion and Concept Composition Abilities
Tao Wu
Yong Zhang
Xintao Wang
Xianpan Zhou
Guangcong Zheng
Zhongang Qi
Ying Shan
Xi Li
VGen
DiffM
29
26
0
23 Aug 2024
On Class Separability Pitfalls In Audio-Text Contrastive Zero-Shot Learning
Tiago Tavares
Fabio Ayres
Zhepei Wang
Paris Smaragdis
VLM
36
2
0
23 Aug 2024
Abstract Art Interpretation Using ControlNet
Rishabh Srivastava
Addrish Roy
18
0
0
23 Aug 2024
Atlas Gaussians Diffusion for 3D Generation
Haitao Yang
Yuan Dong
Hanwen Jiang
Dejia Xu
Georgios Pavlakos
Qixing Huang
3DGS
81
3
0
23 Aug 2024
Visual Verity in AI-Generated Imagery: Computational Metrics and Human-Centric Analysis
Memoona Aziz
Umair Rehman
Syed Ali Safi
Amir Zaib Abbasi
EGVM
40
2
0
22 Aug 2024
CatFree3D: Category-agnostic 3D Object Detection with Diffusion
Wenjing Bian
Zirui Wang
Andrea Vedaldi
44
1
0
22 Aug 2024
Show-o: One Single Transformer to Unify Multimodal Understanding and Generation
Jinheng Xie
Weijia Mao
Zechen Bai
David Junhao Zhang
Weihao Wang
Kevin Qinghong Lin
Yuchao Gu
Zhijie Chen
Zhenheng Yang
Mike Zheng Shou
59
171
0
22 Aug 2024
FIDAVL: Fake Image Detection and Attribution using Vision-Language Model
Mamadou Keita
W. Hamidouche
Hessen Bougueffa Eutamene
Abdelmalik Taleb-Ahmed
Abdenour Hadid
VLM
90
1
0
22 Aug 2024
Diffusion-Based Visual Art Creation: A Survey and New Perspectives
Bingyuan Wang
Qifeng Chen
Zeyu Wang
59
7
0
22 Aug 2024
CaRDiff: Video Salient Object Ranking Chain of Thought Reasoning for Saliency Prediction with Diffusion
Yunlong Tang
Gen Zhan
Li Yang
Yiting Liao
Chenliang Xu
VGen
DiffM
LRM
58
8
0
21 Aug 2024
Approaching Deep Learning through the Spectral Dynamics of Weights
David Yunis
Kumar Kshitij Patel
Samuel Wheeler
Pedro H. P. Savarese
Gal Vardi
Karen Livescu
Michael Maire
Matthew R. Walter
59
3
0
21 Aug 2024
Iterative Object Count Optimization for Text-to-image Diffusion Models
Oz Zafar
Lior Wolf
Idan Schwartz
VLM
27
3
0
21 Aug 2024
MeTTA: Single-View to 3D Textured Mesh Reconstruction with Test-Time Adaptation
Kim Yu-Ji
Hyunwoo Ha
Kim Youwang
Jaeheung Surh
Hyowon Ha
Tae-Hyun Oh
62
0
0
21 Aug 2024
FRAP: Faithful and Realistic Text-to-Image Generation with Adaptive Prompt Weighting
Liyao Jiang
Negar Hassanpour
Mohammad Salameh
Mohan Sai Singamsetti
Fengyu Sun
Wei Lu
Di Niu
DiffM
85
1
0
21 Aug 2024
Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model
Chunting Zhou
Lili Yu
Arun Babu
Kushal Tirumala
Michihiro Yasunaga
Leonid Shamis
Jacob Kahn
Xuezhe Ma
Luke Zettlemoyer
Omer Levy
DiffM
44
154
0
20 Aug 2024
MegaFusion: Extend Diffusion Models towards Higher-resolution Image Generation without Further Tuning
Haoning Wu
Shaocheng Shen
Qiang Hu
Xiaoyun Zhang
Ya Zhang
Yanfeng Wang
42
10
0
20 Aug 2024
Generative AI in Industrial Machine Vision -- A Review
H. Zhou
Dominik Wolfschlager
Constantinos Florides
Jonas Werheid
Hannes Behnen
Jan-Henrick Woltersmann
Tiago C. Pinto
Marco Kemmerling
Anas Abdelrazeq
Robert H. Schmitt
42
3
0
20 Aug 2024
Prompt-Agnostic Adversarial Perturbation for Customized Diffusion Models
Cong Wan
Yuhang He
Xiang Song
Yihong Gong
DiffM
AAML
42
7
0
20 Aug 2024
Learning Multimodal Latent Space with EBM Prior and MCMC Inference
Shiyu Yuan
Carlo Lipizzi
Tian Han
38
1
0
20 Aug 2024
BrewCLIP: A Bifurcated Representation Learning Framework for Audio-Visual Retrieval
Zhenyu Lu
Lakshay Sethi
45
0
0
19 Aug 2024
Diversity and stylization of the contemporary user-generated visual arts in the complexity-entropy plane
Seunghwan Kim
Byunghwee Lee
Wonjae Lee
55
2
0
19 Aug 2024
Factorized-Dreamer: Training A High-Quality Video Generator with Limited and Low-Quality Data
Tao Yang
Yangming Shi
Yunwen Huang
Feng Chen
Yin Zheng
Lei Zhang
DiffM
VGen
70
0
0
19 Aug 2024
Latent Diffusion for Guided Document Table Generation
Syed Jawwad Haider Hamdani
S. Saifullah
S. Agne
Andreas Dengel
Sheraz Ahmed
26
0
0
19 Aug 2024
TraDiffusion: Trajectory-Based Training-Free Image Generation
Mingrui Wu
Oucheng Huang
Jiayi Ji
Jiale Li
Xinyue Cai
Huafeng Kuang
Jianzhuang Liu
Xiaoshuai Sun
Rongrong Ji
42
3
0
19 Aug 2024
Mask in the Mirror: Implicit Sparsification
Tom Jacobs
R. Burkholz
52
3
0
19 Aug 2024
Crossing New Frontiers: Knowledge-Augmented Large Language Model Prompting for Zero-Shot Text-Based De Novo Molecule Design
Sakhinana Sagar Srinivas
Venkataramana Runkana
49
1
0
18 Aug 2024
OVOSE: Open-Vocabulary Semantic Segmentation in Event-Based Cameras
Muhammad Rameez Ur Rahman
Jhony H. Giraldo
Indro Spinelli
Stéphane Lathuilière
Fabio Galasso
VLM
38
0
0
18 Aug 2024
Combo: Co-speech holistic 3D human motion generation and efficient customizable adaptation in harmony
Chao Xu
Mingze Sun
Zhi-Qi Cheng
Fei Wang
Yang Liu
Baigui Sun
Ruqi Huang
Alexander G. Hauptmann
VGen
50
3
0
18 Aug 2024
FD2Talk: Towards Generalized Talking Head Generation with Facial Decoupled Diffusion Model
Ziyu Yao
Xuxin Cheng
Zhiqi Huang
DiffM
39
3
0
18 Aug 2024
Quality Assessment in the Era of Large Models: A Survey
Zicheng Zhang
Yingjie Zhou
Chunyi Li
Baixuan Zhao
Xiaohong Liu
Guangtao Zhai
61
10
0
17 Aug 2024
Generative Dataset Distillation Based on Diffusion Model
Duo Su
Junjie Hou
Guang Li
Ren Togo
Rui Song
Takahiro Ogawa
Miki Haseyama
VGen
DD
43
4
0
16 Aug 2024
An End-to-End Model for Photo-Sharing Multi-modal Dialogue Generation
Peiming Guo
Sinuo Liu
Yanzhao Zhang
Dingkun Long
Pengjun Xie
Meishan Zhang
Hao Fei
DiffM
50
1
0
16 Aug 2024
METR: Image Watermarking with Large Number of Unique Messages
Alexander Varlamov
Daria Diatlova
Egor Spirin
WIGM
33
0
0
15 Aug 2024