Hierarchical Text-Conditional Image Generation with CLIP Latents

13 April 2022

Papers citing "Hierarchical Text-Conditional Image Generation with CLIP Latents"

50 / 4,897 papers shown

Exploiting Cultural Biases via Homoglyphs in Text-to-Image Synthesis Lukas Struppek Dominik Hintersdorf Felix Friedrich Manuel Brack P. Schramowski Kristian Kersting 121 33 0 19 Sep 2022
Distribution Aware Metrics for Conditional Natural Language Generation David M. Chan Yiming Ni David A. Ross Sudheendra Vijayanarasimhan Austin Myers John F. Canny 77 4 0 15 Sep 2022
Test-Time Prompt Tuning for Zero-Shot Generalization in Vision-Language Models Manli Shu Weili Nie De-An Huang Zhiding Yu Tom Goldstein Anima Anandkumar Chaowei Xiao VLM VPVLM 274 309 0 15 Sep 2022
Does CLIP Know My Face? Dominik Hintersdorf Lukas Struppek Manuel Brack Felix Friedrich P. Schramowski Kristian Kersting VLM 60 11 0 15 Sep 2022
Brain Imaging Generation with Latent Diffusion Models W. H. Pinaya Petru-Daniel Tudosiu J. Dafflon P. F. D. Costa Virginia Fernandez P. Nachev Sebastien Ourselin M. Jorge Cardoso DiffM MedIm 154 305 0 15 Sep 2022
M^4I: Multi-modal Models Membership Inference Pingyi Hu Zihan Wang Ruoxi Sun Hu Wang Minhui Xue 97 27 0 15 Sep 2022
Lossy Image Compression with Conditional Diffusion Models Ruihan Yang Stephan Mandt DiffM 87 137 0 14 Sep 2022
Law Informs Code: A Legal Informatics Approach to Aligning Artificial Intelligence with Humans John J. Nay ELM AILaw 190 29 0 14 Sep 2022
Soft Diffusion: Score Matching for General Corruptions Giannis Daras M. Delbracio Hossein Talebi A. Dimakis P. Milanfar DiffM 144 111 0 12 Sep 2022
Diffusion Models in Vision: A Survey Florinel-Alin Croitoru Vlad Hondru Radu Tudor Ionescu M. Shah DiffM VLM MedIm 363 1,255 0 10 Sep 2022
ISS: Image as Stepping Stone for Text-Guided 3D Shape Generation Zhengzhe Liu Peng Dai Ruihui Li Xiaojuan Qi Chi-Wing Fu DiffM 245 25 0 09 Sep 2022
TEACH: Temporal Action Composition for 3D Humans Nikos Athanasiou Mathis Petrovich Michael J. Black Gül Varol 157 147 0 09 Sep 2022
Dr. Neurosymbolic, or: How I Learned to Stop Worrying and Accept Statistics Masataro Asai 85 0 0 08 Sep 2022
Text-Free Learning of a Natural Language Interface for Pretrained Face Generators Xiaodan Du Raymond A. Yeh Nicholas I. Kolkin Eli Shechtman Gregory Shakhnarovich CLIP 59 1 0 08 Sep 2022
Flow Straight and Fast: Learning to Generate and Transfer Data with Rectified Flow Xingchao Liu Chengyue Gong Qiang Liu OOD 265 1,056 0 07 Sep 2022
Statistical Foundation Behind Machine Learning and Its Impact on Computer Vision Lei Zhang H. Shum VLM SSL 65 2 0 06 Sep 2022
A Survey on Generative Diffusion Model Hanqun Cao Cheng Tan Zhangyang Gao Yilun Xu Guangyong Chen Pheng-Ann Heng Stan Z. Li MedIm 270 239 0 06 Sep 2022
Diffusion Models: A Comprehensive Survey of Methods and Applications Ling Yang Zhilong Zhang Yingxia Shao Shenda Hong Runsheng Xu Yue Zhao Wentao Zhang Tengjiao Wang Ming-Hsuan Yang DiffM MedIm 485 1,420 0 02 Sep 2022
Zero-Shot Multi-Modal Artist-Controlled Retrieval and Exploration of 3D Object Sets Kristofer Schlachter Benjamin Ahlbrand Zhu Wang V. Ortenzi Ken Perlin DiffM 3DV 53 7 0 01 Sep 2022
FLAME: Free-form Language-based Motion Synthesis & Editing Jihoon Kim Jiseob Kim Sungjoon Choi VGen 125 213 0 01 Sep 2022
Large-Scale Auto-Regressive Modeling Of Street Networks Michael Birsak Tom Kelly W. Para Peter Wonka GNN AI4TS 30 6 0 01 Sep 2022
MotionDiffuse: Text-Driven Human Motion Generation with Diffusion Model Mingyuan Zhang Zhongang Cai Liang Pan Fangzhou Hong Xinying Guo Lei Yang Ziwei Liu DiffM VGen 182 584 0 31 Aug 2022
Let us Build Bridges: Understanding and Extending Diffusion Generative Models Xingchao Liu Lemeng Wu Mao Ye Qiang Liu DiffM 89 85 0 31 Aug 2022
Deep Generative Modeling on Limited Data with Regularization by Nontransferable Pre-trained Models Yong Zhong Hongtao Liu Xiaodong Liu Fan Bao Weiran Shen Chongxuan Li AI4CE 99 4 0 30 Aug 2022
Frido: Feature Pyramid Diffusion for Complex Scene Image Synthesis Wanshu Fan Yen-Chun Chen Dongdong Chen Yu Cheng Lu Yuan Yu-Chiang Frank Wang DiffM 92 97 0 29 Aug 2022
Efficient Vision-Language Pretraining with Visual Concepts and Hierarchical Alignment Mustafa Shukor Guillaume Couairon Matthieu Cord VLM CLIP 100 27 0 29 Aug 2022
LogicRank: Logic Induced Reranking for Generative Text-to-Image Systems Bjorn Deiseroth P. Schramowski Hikaru Shindo Devendra Singh Dhami Kristian Kersting EGVM DiffM 52 2 0 29 Aug 2022
Grounded Affordance from Exocentric View Hongcheng Luo Wei Zhai Jing Zhang Yang Cao Dacheng Tao 62 21 0 28 Aug 2022
DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation Nataniel Ruiz Yuanzhen Li Varun Jampani Yael Pritch Michael Rubinstein Kfir Aberman 304 2,904 0 25 Aug 2022
Understanding Diffusion Models: A Unified Perspective Calvin Luo DiffM 102 347 0 25 Aug 2022
Comprehensive Dataset of Face Manipulations for Development and Evaluation of Forensic Tools Brian DeCann K. Trapeznikov CVBM 79 2 0 24 Aug 2022
PromptFL: Let Federated Participants Cooperatively Learn Prompts Instead of Models -- Federated Learning in Age of Foundation Model Tao Guo Song Guo Junxiao Wang Wenchao Xu FedML VLM LRM 71 127 0 24 Aug 2022
Bidirectional Contrastive Split Learning for Visual Question Answering Yuwei Sun H. Ochiai 37 2 0 24 Aug 2022
Red Teaming Language Models to Reduce Harms: Methods, Scaling Behaviors, and Lessons Learned Deep Ganguli Liane Lovitt John Kernion Amanda Askell Yuntao Bai ... Nicholas Joseph Sam McCandlish C. Olah Jared Kaplan Jack Clark 314 489 0 23 Aug 2022
How good are deep models in understanding the generated images? Ali Borji OOD 55 6 0 23 Aug 2022
Learning More May Not Be Better: Knowledge Transferability in Vision and Language Tasks Tianwei Chen Noa Garcia Mayu Otani Chenhui Chu Yuta Nakashima Hajime Nagahara VLM 56 0 0 23 Aug 2022
Cold Diffusion: Inverting Arbitrary Image Transforms Without Noise Arpit Bansal Eitan Borgnia Hong-Min Chu Jie S. Li Hamid Kazemi Furong Huang Micah Goldblum Jonas Geiping Tom Goldstein VLM DiffM 82 286 0 19 Aug 2022
Text to Image Generation: Leaving no Language Behind Pedro Reviriego Elena Merino-Gómez VLM 49 13 0 19 Aug 2022
Pathway to Future Symbiotic Creativity Yi-Ting Guo Qi-fei Liu Jie Chen Wei Xue Jie Fu ... Fernando Rosas Jeffrey Shaw Xing Wu Jiji Zhang Jianliang Xu 66 0 0 18 Aug 2022
Discovering Bugs in Vision Models using Off-the-shelf Image Generation and Captioning Olivia Wiles Isabela Albuquerque Sven Gowal VLM 72 47 0 18 Aug 2022
Enhancing Diffusion-Based Image Synthesis with Robust Classifier Guidance Bahjat Kawar Roy Ganz Michael Elad DiffM 91 39 0 18 Aug 2022
Multimodal foundation models are better simulators of the human brain Haoyu Lu Qiongyi Zhou Nanyi Fei Zhiwu Lu Mingyu Ding ... Changde Du Xin Zhao Haoran Sun Huiguang He J. Wen AI4CE 85 13 0 17 Aug 2022
ILLUME: Rationalizing Vision-Language Models through Human Interactions Manuel Brack P. Schramowski Bjorn Deiseroth Kristian Kersting VLM MLLM 52 3 0 17 Aug 2022
Applying Regularized Schrödinger-Bridge-Based Stochastic Process in Generative Modeling Ki-Ung Song DiffM 54 8 0 15 Aug 2022
Recognition of All Categories of Entities by AI Hiroshi Yamakawa Yutaka Matsuo 60 0 0 13 Aug 2022
Layout-Bridging Text-to-Image Synthesis Jiadong Liang Wenjie Pei Feng Lu EGVM 98 15 0 12 Aug 2022
Language-Guided Face Animation by Recurrent StyleGAN-based Generator Tiankai Hang Huan Yang Bei Liu Jianlong Fu Xin Geng B. Guo VGen 102 13 0 11 Aug 2022
Quality Not Quantity: On the Interaction between Dataset Design and Robustness of CLIP Thao Nguyen Gabriel Ilharco Mitchell Wortsman Sewoong Oh Ludwig Schmidt CLIP VLM 180 108 0 10 Aug 2022
Txt2Img-MHN: Remote Sensing Image Generation from Text Using Modern Hopfield Networks Yonghao Xu Weikang Yu Pedram Ghamisi Michael K Kopp Sepp Hochreiter 66 34 0 08 Aug 2022
CLIP-based Neural Neighbor Style Transfer for 3D Assets Shailesh Mishra Jonathan Granskog CLIP 3DH 3DPC 112 7 0 08 Aug 2022