Hierarchical Text-Conditional Image Generation with CLIP Latents


13 April 2022
Aditya A. Ramesh
Prafulla Dhariwal
Alex Nichol
Casey Chu
Mark Chen
    VLM
    DiffM
arXiv: 2204.06125

Papers citing "Hierarchical Text-Conditional Image Generation with CLIP Latents"

50 / 4,753 papers shown
Env-Aware Anomaly Detection: Ignore Style Changes, Stay True to Content!
Stefan Smeu
Elena Burceanu
Andrei Liviu Nicolicioiu
Emanuela Haller
35
4
0
06 Oct 2022
VIMA: General Robot Manipulation with Multimodal Prompts
Yunfan Jiang
Agrim Gupta
Zichen Zhang
Guanzhi Wang
Yongqiang Dou
Yanjun Chen
Li Fei-Fei
Anima Anandkumar
Yuke Zhu
Linxi Fan
LM&Ro
38
338
0
06 Oct 2022
Novel View Synthesis with Diffusion Models
Daniel Watson
William Chan
Ricardo Martín Brualla
Jonathan Ho
Andrea Tagliasacchi
Mohammad Norouzi
DiffM
62
268
0
06 Oct 2022
Flow Matching for Generative Modeling
Y. Lipman
Ricky T. Q. Chen
Heli Ben-Hamu
Maximilian Nickel
Matt Le
OOD
61
1,081
0
06 Oct 2022
DALL-E-Bot: Introducing Web-Scale Diffusion Models to Robotics
Ivan Kapelyukh
Vitalis Vosylius
Edward Johns
LM&Ro
DiffM
117
146
0
05 Oct 2022
Phenaki: Variable Length Video Generation From Open Domain Textual Description
Ruben Villegas
Mohammad Babaeizadeh
Pieter-Jan Kindermans
Hernan Moraldo
Han Zhang
M. Saffar
Santiago Castro
Julius Kunze
D. Erhan
DiffM
VGen
68
375
0
05 Oct 2022
clip2latent: Text driven sampling of a pre-trained StyleGAN using denoising diffusion and CLIP
Justin N. M. Pinkney
Chuan Li
CLIP
VLM
58
20
0
05 Oct 2022
Imagen Video: High Definition Video Generation with Diffusion Models
Jonathan Ho
William Chan
Chitwan Saharia
Jay Whang
Ruiqi Gao
...
Diederik P. Kingma
Ben Poole
Mohammad Norouzi
David J. Fleet
Tim Salimans
VGen
67
1,480
0
05 Oct 2022
Progressive Text-to-Image Generation
Zhengcong Fei
Mingyuan Fan
Li Zhu
Junshi Huang
89
4
0
05 Oct 2022
Vision+X: A Survey on Multimodal Learning in the Light of Data
Ye Zhu
Yuehua Wu
N. Sebe
Yan Yan
35
16
0
05 Oct 2022
When and why vision-language models behave like bags-of-words, and what to do about it?
Mert Yuksekgonul
Federico Bianchi
Pratyusha Kalluri
Dan Jurafsky
James Zou
VLM
CoGe
30
364
0
04 Oct 2022
Collecting The Puzzle Pieces: Disentangled Self-Driven Human Pose Transfer by Permuting Textures
Nannan Li
Kevin J. Shih
Bryan A. Plummer
31
7
0
04 Oct 2022
ASIF: Coupled Data Turns Unimodal Models to Multimodal Without Training
Antonio Norelli
Marco Fumero
Valentino Maiorca
Luca Moschella
Emanuele Rodolà
Francesco Locatello
VLM
89
34
0
04 Oct 2022
PLOT: Prompt Learning with Optimal Transport for Vision-Language Models
Guangyi Chen
Weiran Yao
Xiangchen Song
Xinyue Li
Yongming Rao
Kun Zhang
VPVLM
VLM
8
62
0
03 Oct 2022
Visual Prompt Tuning for Generative Transfer Learning
Kihyuk Sohn
Yuan Hao
José Lezama
Luisa F. Polanía
Huiwen Chang
Han Zhang
Irfan Essa
Lu Jiang
VPVLM
VLM
64
81
0
03 Oct 2022
Membership Inference Attacks Against Text-to-image Generation Models
Yixin Wu
Ning Yu
Zheng Li
Michael Backes
Yang Zhang
DiffM
27
65
0
03 Oct 2022
Red-Teaming the Stable Diffusion Safety Filter
Javier Rando
Daniel Paleka
David Lindner
Lennard Heim
Florian Tramèr
DiffM
132
184
0
03 Oct 2022
Improving Sample Quality of Diffusion Models Using Self-Attention Guidance
Susung Hong
Gyuseong Lee
Wooseok Jang
Seung Wook Kim
DiffM
35
97
0
03 Oct 2022
Generated Faces in the Wild: Quantitative Comparison of Stable Diffusion, Midjourney and DALL-E 2
Ali Borji
DiffM
91
121
0
02 Oct 2022
ManiCLIP: Multi-Attribute Face Manipulation from Text
Hao Wang
Guosheng Lin
A. Molino
Anran Wang
Jiashi Feng
Zehuan Yuan
CVBM
40
9
0
02 Oct 2022
Ten Years after ImageNet: A 360° Perspective on AI
Sanjay Chawla
Preslav Nakov
Ahmed Ali
Wendy Hall
Issa M. Khalil
Xiaosong Ma
Husrev Taha Sencar
Ingmar Weber
Michael Wooldridge
Tingyue Yu
21
0
0
01 Oct 2022
Equivariant Energy-Guided SDE for Inverse Molecular Design
Fan Bao
Min Zhao
Zhongkai Hao
Pei‐Yun Li
Chongxuan Li
Jun Zhu
DiffM
193
64
0
30 Sep 2022
AudioGen: Textually Guided Audio Generation
Felix Kreuk
Gabriel Synnaeve
Adam Polyak
Uriel Singer
Alexandre Défossez
Jade Copet
Devi Parikh
Yaniv Taigman
Yossi Adi
DiffM
27
290
0
30 Sep 2022
Data Poisoning Attacks Against Multimodal Encoders
Ziqing Yang
Xinlei He
Zheng Li
Michael Backes
Mathias Humbert
Pascal Berrang
Yang Zhang
AAML
121
46
0
30 Sep 2022
Diffusion-based Image Translation using Disentangled Style and Content Representation
Gihyun Kwon
Jong Chul Ye
DiffM
162
156
0
30 Sep 2022
Mind Reader: Reconstructing complex images from brain activities
Sikun Lin
Thomas C. Sprague
Ambuj K. Singh
DiffM
124
87
0
30 Sep 2022
Understanding Pure CLIP Guidance for Voxel Grid NeRF Models
Han-Hung Lee
Angel X. Chang
24
63
0
30 Sep 2022
State-specific protein-ligand complex structure prediction with a multi-scale deep generative model
Zhuoran Qiao
Weili Nie
Arash Vahdat
Thomas F. Miller
Anima Anandkumar
DiffM
39
84
0
30 Sep 2022
DreamFusion: Text-to-3D using 2D Diffusion
Ben Poole
Ajay Jain
Jonathan T. Barron
B. Mildenhall
85
2,323
0
29 Sep 2022
EDA: Explicit Text-Decoupling and Dense Alignment for 3D Visual Grounding
Yanmin Wu
Xinhua Cheng
Renrui Zhang
Zesen Cheng
Jian Zhang
63
63
0
29 Sep 2022
Spotlight: Mobile UI Understanding using Vision-Language Models with a Focus
Gang Li
Yang Li
30
67
0
29 Sep 2022
Human Motion Diffusion Model
Guy Tevet
Sigal Raab
Brian Gordon
Yonatan Shafir
Daniel Cohen-Or
Amit H. Bermano
DiffM
VGen
226
724
0
29 Sep 2022
Analyzing Diffusion as Serial Reproduction
Raja Marjieh
Ilia Sucholutsky
Thomas A. Langlois
Nori Jacoby
Thomas Griffiths
DiffM
35
4
0
29 Sep 2022
Make-A-Video: Text-to-Video Generation without Text-Video Data
Uriel Singer
Adam Polyak
Thomas Hayes
Xiaoyue Yin
Jie An
...
Oron Ashual
Oran Gafni
Devi Parikh
Sonal Gupta
Yaniv Taigman
DiffM
VGen
45
1,353
0
29 Sep 2022
Offline Reinforcement Learning via High-Fidelity Generative Behavior Modeling
Huayu Chen
Cheng Lu
Chengyang Ying
Hang Su
Jun Zhu
DiffM
OffRL
108
106
0
29 Sep 2022
Re-Imagen: Retrieval-Augmented Text-to-Image Generator
Wenhu Chen
Hexiang Hu
Chitwan Saharia
William W. Cohen
VLM
131
164
0
29 Sep 2022
Compositional Score Modeling for Simulation-based Inference
Tomas Geffner
George Papamakarios
A. Mnih
72
25
0
28 Sep 2022
What Does DALL-E 2 Know About Radiology?
Lisa Christine Adams
Felix Busch
Daniel Truhn
Marcus R. Makowski
Hugo J. W. L. Aerts
Keno K. Bressem
MedIm
42
58
0
27 Sep 2022
Draw Your Art Dream: Diverse Digital Art Synthesis with Multimodal Guided Diffusion
Nisha Huang
Fan Tang
Weiming Dong
Changsheng Xu
DiffM
93
40
0
27 Sep 2022
Learning to Learn with Generative Models of Neural Network Checkpoints
William S. Peebles
Ilija Radosavovic
Tim Brooks
Alexei A. Efros
Jitendra Malik
UQCV
75
65
0
26 Sep 2022
Can Large Language Models Truly Understand Prompts? A Case Study with Negated Prompts
Joel Jang
Seonghyeon Ye
Minjoon Seo
ELM
LRM
99
64
0
26 Sep 2022
A Collaborative, Interactive and Context-Aware Drawing Agent for Co-Creative Design
F. Ibarrola
Tomas Lawton
Kazjon Grace
48
13
0
26 Sep 2022
Convergence of score-based generative modeling for general data distributions
Holden Lee
Jianfeng Lu
Yixin Tan
DiffM
191
129
0
26 Sep 2022
All are Worth Words: A ViT Backbone for Diffusion Models
Fan Bao
Shen Nie
Kaiwen Xue
Yue Cao
Chongxuan Li
Hang Su
Jun Zhu
VLM
30
324
0
25 Sep 2022
Best Prompts for Text-to-Image Models and How to Find Them
Nikita Pavlichenko
Dmitry Ustalov
DiffM
25
58
0
23 Sep 2022
Sampling is as easy as learning the score: theory for diffusion models with minimal data assumptions
Sitan Chen
Sinho Chewi
Jungshian Li
Yuanzhi Li
Adil Salim
Anru R. Zhang
DiffM
135
249
0
22 Sep 2022
MIDMs: Matching Interleaved Diffusion Models for Exemplar-based Image Translation
Junyoung Seo
Gyuseong Lee
Seokju Cho
Jiyoung Lee
Seung Wook Kim
DiffM
40
27
0
22 Sep 2022
Implementing and Experimenting with Diffusion Models for Text-to-Image Generation
Robin Zbinden
33
3
0
22 Sep 2022
Text2Light: Zero-Shot Text-Driven HDR Panorama Generation
Zhaoxi Chen
Guangcong Wang
Ziwei Liu
97
30
0
20 Sep 2022
Extremely Simple Activation Shaping for Out-of-Distribution Detection
Andrija Djurisic
Nebojsa Bozanic
Arjun Ashok
Rosanne Liu
OODD
172
152
0
20 Sep 2022