2204.06125
Cited By
Hierarchical Text-Conditional Image Generation with CLIP Latents
13 April 2022
Aditya A. Ramesh
Prafulla Dhariwal
Alex Nichol
Casey Chu
Mark Chen
VLM
DiffM
Papers citing "Hierarchical Text-Conditional Image Generation with CLIP Latents"
50 / 4,753 papers shown
Title
Latent Space Disentanglement in Diffusion Transformers Enables Precise Zero-shot Semantic Editing
Zitao Shuai
Chenwei Wu
Zhengxu Tang
Bowen Song
Liyue Shen
DiffM
70
0
0
12 Nov 2024
Evaluating the Generation of Spatial Relations in Text and Image Generative Models
Shang Hong Sim
Clarence Lee
A. Tan
Cheston Tan
EGVM
41
2
0
12 Nov 2024
Add-it: Training-Free Object Insertion in Images With Pretrained Diffusion Models
Yoad Tewel
Rinon Gal
Dvir Samuel
Yuval Atzmon
Lior Wolf
Gal Chechik
VLM
59
6
0
11 Nov 2024
Token Merging for Training-Free Semantic Binding in Text-to-Image Synthesis
Taihang Hu
Linxuan Li
Joost van de Weijer
Hongcheng Gao
Fahad Shahbaz Khan
Jian Yang
Ming-Ming Cheng
Kai Wang
Yaxing Wang
DiffM
62
4
0
11 Nov 2024
Edify Image: High-Quality Image Generation with Pixel Space Laplacian Diffusion Models
Nvidia:
Yuval Atzmon
Maciej Bala
Yogesh Balaji
...
Ting-Chun Wang
Shuran Song
Fangyin Wei
Yu Zeng
Qinsheng Zhang
61
6
0
11 Nov 2024
Decoding Visual Experience and Mapping Semantics through Whole-Brain Analysis Using fMRI Foundation Models
Yanchen Wang
Adam Turnbull
Tiange Xiang
Yunlong Xu
Sa Zhou
Adnan Masoud
Shekoofeh Azizi
F. Lin
Ehsan Adeli
37
1
0
11 Nov 2024
ENAT: Rethinking Spatial-temporal Interactions in Token-based Image Synthesis
Zanlin Ni
Yulin Wang
Renping Zhou
Yizeng Han
Jiayi Guo
Zhiyuan Liu
Yuan Yao
Gao Huang
65
4
0
11 Nov 2024
OmniEdit: Building Image Editing Generalist Models Through Specialist Supervision
Cong Wei
Zheyang Xiong
Weiming Ren
Xinrun Du
Ge Zhang
Wenhu Chen
121
19
0
11 Nov 2024
Region-Aware Text-to-Image Generation via Hard Binding and Soft Refinement
Zhennan Chen
Yajie Li
Haofan Wang
Zheyu Chen
Zhengkai Jiang
Jun Yu Li
Qian Wang
Jian Yang
Ying Tai
DiffM
52
8
0
10 Nov 2024
Scalable, Tokenization-Free Diffusion Model Architectures with Efficient Initial Convolution and Fixed-Size Reusable Structures for On-Device Image Generation
Sanchar Palit
Sathya Veera Reddy Dendi
Mallikarjuna Talluri
Raj Narayana Gadde
41
0
0
09 Nov 2024
Autoregressive Models in Vision: A Survey
Jing Xiong
Gongye Liu
Lun Huang
Chengyue Wu
Taiqiang Wu
...
Hao Fei
Guillermo Sapiro
Jiebo Luo
Ping Luo
Ngai Wong
VGen
53
9
0
08 Nov 2024
Improving image synthesis with diffusion-negative sampling
Alakh Desai
Nuno Vasconcelos
DiffM
40
0
0
08 Nov 2024
Analyzing The Language of Visual Tokens
David M. Chan
Rodolfo Corona
J. S. Park
Cheol Jun Cho
Yutong Bai
Trevor Darrell
28
2
0
07 Nov 2024
Few-Shot Task Learning through Inverse Generative Modeling
Aviv Netanyahu
Yilun Du
Antonia Bronars
Jyothish Pari
J. Tenenbaum
Tianmin Shu
Pulkit Agrawal
53
1
0
07 Nov 2024
AsCAN: Asymmetric Convolution-Attention Networks for Efficient Recognition and Generation
Anil Kag
Huseyin Coskun
Jierun Chen
Junli Cao
Willi Menapace
Aliaksandr Siarohin
Sergey Tulyakov
Jian Ren
53
3
0
07 Nov 2024
DomainGallery: Few-shot Domain-driven Image Generation by Attribute-centric Finetuning
Yuxuan Duan
Y. Hong
Bo Zhang
Jun Lan
Huijia Zhu
Weiqiang Wang
Jianfu Zhang
Li Niu
Lefei Zhang
DiffM
52
0
0
07 Nov 2024
ReEdit: Multimodal Exemplar-Based Image Editing with Diffusion Models
Ashutosh Srivastava
Tarun Ram Menta
Abhinav Java
Avadhoot Jadhav
Silky Singh
Surgan Jandial
Balaji Krishnamurthy
DiffM
43
1
0
06 Nov 2024
ROBIN: Robust and Invisible Watermarks for Diffusion Models with Adversarial Optimization
Huayang Huang
Yu Wu
Qian Wang
DiffM
WIGM
51
5
0
06 Nov 2024
VQ-ACE: Efficient Policy Search for Dexterous Robotic Manipulation via Action Chunking Embedding
Chenyu Yang
Davide Liconti
Robert K. Katzschmann
47
1
0
05 Nov 2024
Efficient and Effective Adaptation of Multimodal Foundation Models in Sequential Recommendation
Junchen Fu
Xuri Ge
Xin Xin
Alexandros Karatzoglou
Ioannis Arapakis
Kaiwen Zheng
Yongxin Ni
J. Jose
23
2
0
05 Nov 2024
Explanations that reveal all through the definition of encoding
A. Puli
Nhi Nguyen
Rajesh Ranganath
FAtt
XAI
43
1
0
04 Nov 2024
INQUIRE: A Natural World Text-to-Image Retrieval Benchmark
Edward Vendrow
Omiros Pantazis
Alexander Shepard
Gabriel J. Brostow
Kate E. Jones
Oisin Mac Aodha
Sara Beery
Grant Van Horn
VLM
43
3
0
04 Nov 2024
Silver medal Solution for Image Matching Challenge 2024
Yian Wang
3DV
3DPC
44
0
0
04 Nov 2024
Trustworthy Federated Learning: Privacy, Security, and Beyond
Chunlu Chen
Ji Liu
Haowen Tan
Xingjian Li
Kevin I-Kai Wang
Peng Li
Kouichi Sakurai
Dejing Dou
FedML
57
4
0
03 Nov 2024
Denoising Fisher Training For Neural Implicit Samplers
Weijian Luo
Wei Deng
38
0
0
03 Nov 2024
Identifying Implicit Social Biases in Vision-Language Models
Kimia Hamidieh
Haoran Zhang
Walter Gerych
Thomas Hartvigsen
Marzyeh Ghassemi
VLM
36
11
0
01 Nov 2024
GameGen-X: Interactive Open-world Game Video Generation
Haoxuan Che
Xuanhua He
Quande Liu
Cheng Jin
Hao Chen
VGen
66
17
0
01 Nov 2024
Creativity in the Age of AI: Evaluating the Impact of Generative AI on Design Outputs and Designers' Creative Thinking
Yue Fu
Han Bin
Tony Zhou
Marx Wang
Yixin Chen
Zelia Gomes Da Costa Lai
Jacob O. Wobbrock
Alexis Hiniker
42
0
0
31 Oct 2024
Scaling Concept With Text-Guided Diffusion Models
Chao Huang
Susan Liang
Yunlong Tang
Yapeng Tian
Anurag Kumar
Chenliang Xu
DiffM
59
6
0
31 Oct 2024
EDT: An Efficient Diffusion Transformer Framework Inspired by Human-like Sketching
Xinwang Chen
Ning Liu
Bo Li
Feifei Feng
Jian Tang
42
2
0
31 Oct 2024
Language-guided Hierarchical Fine-grained Image Forgery Detection and Localization
Xiao Guo
Xiaohong Liu
I. Masi
Xiaoming Liu
95
9
0
31 Oct 2024
Redefining <Creative> in Dictionary: Towards an Enhanced Semantic Understanding of Creative Generation
Fu Feng
Yucheng Xie
Xu Yang
Jing Wang
Xin Geng
DiffM
38
0
0
31 Oct 2024
Understanding the Limits of Vision Language Models Through the Lens of the Binding Problem
Declan Campbell
Sunayana Rane
Tyler Giallanza
Nicolò De Sabbata
Kia Ghods
...
Alexander Ku
Steven M. Frankland
Thomas Griffiths
Jonathan D. Cohen
Taylor W. Webb
42
13
0
31 Oct 2024
MoLE: Enhancing Human-centric Text-to-image Diffusion via Mixture of Low-rank Experts
Jie Zhu
Y. Chen
Mingyu Ding
Ping Luo
Leye Wang
Jingdong Wang
DiffM
42
4
0
30 Oct 2024
Public Domain 12M: A Highly Aesthetic Image-Text Dataset with Novel Governance Mechanisms
Jordan Meyer
Nick Padgett
Cullen Miller
Laura Exline
31
4
0
30 Oct 2024
Latent Diffusion, Implicit Amplification: Efficient Continuous-Scale Super-Resolution for Remote Sensing Images
Hanlin Wu
Jiangwei Mo
Xiaohui Sun
Jie Ma
36
1
0
30 Oct 2024
FuseAnyPart: Diffusion-Driven Facial Parts Swapping via Multiple Reference Images
Zheng Yu
Yaohua Wang
Siying Cui
Aixi Zhang
Wei-Long Zheng
Senzhang Wang
36
0
0
30 Oct 2024
VerifyPrompt: How to Verify Text-to-Image Models Behind Black-Box API?
Ji Guo
Wenbo Jiang
Rui Zhang
Guoming Lu
Hongwei Li
AAML
42
0
0
30 Oct 2024
Diffusion Beats Autoregressive: An Evaluation of Compositional Generation in Text-to-Image Models
Arash Marioriyad
Parham Rezaei
M. Baghshah
M. Rohban
CoGe
219
0
0
30 Oct 2024
Embedding Watermarks in Diffusion Process for Model Intellectual Property Protection
Jijia Yang
Sen Peng
Xiaohua Jia
WIGM
39
0
0
29 Oct 2024
Class-Aware Contrastive Optimization for Imbalanced Text Classification
Grigorii Khvatskii
Nuno Moniz
Khoa D. Doan
Nitesh Chawla
33
0
0
29 Oct 2024
Capacity Control is an Effective Memorization Mitigation Mechanism in Text-Conditional Diffusion Models
Raman Dutt
Pedro Sanchez
Ondrej Bohdal
Sotirios A. Tsaftaris
Timothy M. Hospedales
38
1
0
29 Oct 2024
PACA: Perspective-Aware Cross-Attention Representation for Zero-Shot Scene Rearrangement
Shutong Jin
Ruiyu Wang
Kuangyi Chen
Florian T. Pokorny
32
0
0
29 Oct 2024
Preserving Pre-trained Representation Space: On Effectiveness of Prefix-tuning for Large Multi-modal Models
Donghoon Kim
Gusang Lee
Kyuhong Shim
B. Shim
64
1
0
29 Oct 2024
Volumetric Conditioning Module to Control Pretrained Diffusion Models for 3D Medical Images
Suhyun Ahn
Wonjung Park
Jihoon Cho
Seunghyuck Park
Jinah Park
MedIm
31
0
0
29 Oct 2024
MotionGPT-2: A General-Purpose Motion-Language Model for Motion Generation and Understanding
Yuan Wang
Di Huang
Yaqi Zhang
Wanli Ouyang
J. Jiao
Xuetao Feng
Yan Zhou
Pengfei Wan
Shixiang Tang
Dan Xu
VGen
36
13
0
29 Oct 2024
RDSinger: Reference-based Diffusion Network for Singing Voice Synthesis
Kehan Sui
Jinxu Xiang
Fang Jin
DiffM
26
0
0
29 Oct 2024
Adapting Diffusion Models for Improved Prompt Compliance and Controllable Image Synthesis
Deepak Sridhar
Abhishek Peri
Rohith Rachala
Nuno Vasconcelos
DiffM
40
0
0
29 Oct 2024
IntLoRA: Integral Low-rank Adaptation of Quantized Diffusion Models
Hang Guo
Yawei Li
Tao Dai
Shu-Tao Xia
Luca Benini
MQ
39
1
0
29 Oct 2024
Dual Conditional Diffusion Models for Sequential Recommendation
Hongtao Huang
Chengkai Huang
Xiaojun Chang
Wen Hu
Lina Yao
Julian McAuley
DiffM
53
2
0
29 Oct 2024