ReGround: Improving Textual and Spatial Grounding at No Cost

20 March 2024

Papers citing "ReGround: Improving Textual and Spatial Grounding at No Cost"

Title
ORIGEN: Zero-Shot 3D Orientation Grounding in Text-to-Image Generation Yunhong Min Daehyeon Choi Kyeongmin Yeo Jihyun Lee Minhyuk Sung 94 0 0 28 Mar 2025
LoCo: Locally Constrained Training-Free Layout-to-Image Synthesis Peiang Zhao Han Li Ruiyang Jin S. Kevin Zhou DiffM 100 13 0 21 Nov 2023
LLaVA-Interactive: An All-in-One Demo for Image Chat, Segmentation, Generation and Editing Wei-Ge Chen Irina Spiridonova Jianwei Yang Jianfeng Gao Chun-yue Li MLLM VLM 55 36 0 01 Nov 2023
R&B: Region and Boundary Aware Zero-shot Grounded Text-to-image Generation Jiayu Xiao Henglei Lv Liang Li Shuhui Wang Qingming Huang DiffM 84 22 0 13 Oct 2023
BoxDiff: Text-to-Image Synthesis with Training-Free Box-Constrained Diffusion Jinheng Xie Yuexiang Li Yawen Huang Haozhe Liu Wentian Zhang Yefeng Zheng Mike Zheng Shou DiffM 116 201 0 20 Jul 2023
Zero-shot spatial layout conditioning for text-to-image diffusion models Guillaume Couairon Marlene Careil Matthieu Cord Stéphane Lathuilière Jakob Verbeek VLM 57 64 0 23 Jun 2023
LayoutGPT: Compositional Visual Planning and Generation with Large Language Models Weixi Feng Wanrong Zhu Tsu-Jui Fu Varun Jampani Arjun Reddy Akula Xuehai He Sugato Basu Xinze Wang William Yang Wang MLLM 84 173 0 24 May 2023
LayoutDiffusion: Controllable Diffusion Model for Layout-to-image Generation Guangcong Zheng Xianpan Zhou Xuewei Li Zhongang Qi Ying Shan Xi Li DiffM 73 188 0 30 Mar 2023
Directed Diffusion: Direct Control of Object Placement through Attention Guidance W. Ma J. P. Lewis Avisek Lahiri Thomas Leung W. Kleijn DiffM 60 67 0 25 Feb 2023
LayoutDiffuse: Adapting Foundational Diffusion Models for Layout-to-Image Generation Jiaxin Cheng Xiao Liang Xingjian Shi Tong He Tianjun Xiao Mu Li DiffM 65 68 0 16 Feb 2023
Adding Conditional Control to Text-to-Image Diffusion Models Lvmin Zhang Anyi Rao Maneesh Agrawala AI4CE 148 4,113 1 10 Feb 2023
ReCo: Region-Controlled Text-to-Image Generation Zhengyuan Yang Jianfeng Wang Zhe Gan Linjie Li Kevin Qinghong Lin ... Nan Duan Zicheng Liu Ce Liu Michael Zeng Lijuan Wang DiffM 93 149 0 23 Nov 2022
eDiff-I: Text-to-Image Diffusion Models with an Ensemble of Expert Denoisers Yogesh Balaji Seungjun Nah Xun Huang Arash Vahdat Jiaming Song ... Timo Aila S. Laine Bryan Catanzaro Tero Karras Xuan Li VLM MoE 168 827 0 02 Nov 2022
Improving Sample Quality of Diffusion Models Using Self-Attention Guidance Susung Hong Gyuseong Lee Wooseok Jang Seung Wook Kim DiffM 88 103 0 03 Oct 2022
Prompt-to-Prompt Image Editing with Cross Attention Control Amir Hertz Ron Mokady J. Tenenbaum Kfir Aberman Yael Pritch Daniel Cohen-Or DiffM 198 1,773 0 02 Aug 2022
YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors Chien-Yao Wang Alexey Bochkovskiy H. Liao ObjD 165 6,527 0 06 Jul 2022
Flamingo: a Visual Language Model for Few-Shot Learning Jean-Baptiste Alayrac Jeff Donahue Pauline Luc Antoine Miech Iain Barr ... Mikolaj Binkowski Ricardo Barreira Oriol Vinyals Andrew Zisserman Karen Simonyan MLLM VLM 385 3,542 0 29 Apr 2022
Hierarchical Text-Conditional Image Generation with CLIP Latents Aditya A. Ramesh Prafulla Dhariwal Alex Nichol Casey Chu Mark Chen VLM DiffM 404 6,866 0 13 Apr 2022
High-Resolution Image Synthesis with Latent Diffusion Models Robin Rombach A. Blattmann Dominik Lorenz Patrick Esser Bjorn Ommer 3DV 440 15,665 0 20 Dec 2021
Alias-Free Generative Adversarial Networks Tero Karras M. Aittala S. Laine Erik Härkönen Janne Hellsten J. Lehtinen Timo Aila GAN 177 1,596 0 23 Jun 2021
CLIPScore: A Reference-free Evaluation Metric for Image Captioning Jack Hessel Ari Holtzman Maxwell Forbes Ronan Le Bras Yejin Choi CLIP 139 1,561 0 18 Apr 2021
Zero-Shot Text-to-Image Generation Aditya A. Ramesh Mikhail Pavlov Gabriel Goh Scott Gray Chelsea Voss Alec Radford Mark Chen Ilya Sutskever VLM 415 4,953 0 24 Feb 2021
Score-Based Generative Modeling through Stochastic Differential Equations Yang Song Jascha Narain Sohl-Dickstein Diederik P. Kingma Abhishek Kumar Stefano Ermon Ben Poole DiffM SyDa 341 6,480 0 26 Nov 2020
Denoising Diffusion Implicit Models Jiaming Song Chenlin Meng Stefano Ermon VLM DiffM 283 7,384 0 06 Oct 2020
Fourier Features Let Networks Learn High Frequency Functions in Low Dimensional Domains Matthew Tancik Pratul P. Srinivasan B. Mildenhall Sara Fridovich-Keil N. Raghavan Utkarsh Singhal R. Ramamoorthi Jonathan T. Barron Ren Ng 124 2,421 0 18 Jun 2020
Object-Centric Image Generation from Layouts Tristan Sylvain Pengchuan Zhang Yoshua Bengio R. Devon Hjelm Shikhar Sharma EGVM OCL 108 102 0 16 Mar 2020
Learning Canonical Representations for Scene Graph to Image Generation Roei Herzig Amir Bar Huijuan Xu Gal Chechik Trevor Darrell Amir Globerson GNN OCL 66 108 0 16 Dec 2019
Analyzing and Improving the Image Quality of StyleGAN Tero Karras S. Laine M. Aittala Janne Hellsten J. Lehtinen Timo Aila GAN 293 5,815 0 03 Dec 2019
Object-driven Text-to-Image Synthesis via Adversarial Training Wenbo Li Pengchuan Zhang Lei Zhang Qiuyuan Huang Xiaodong He Siwei Lyu Jianfeng Gao GAN 71 302 0 27 Feb 2019
A Style-Based Generator Architecture for Generative Adversarial Networks Tero Karras S. Laine Timo Aila 583 10,561 0 12 Dec 2018
Image Generation from Scene Graphs Justin Johnson Agrim Gupta Li Fei-Fei GNN 300 820 0 04 Apr 2018
The Unreasonable Effectiveness of Deep Features as a Perceptual Metric Richard Y. Zhang Phillip Isola Alexei A. Efros Eli Shechtman Oliver Wang EGVM 377 11,795 0 11 Jan 2018
Microsoft COCO: Common Objects in Context Tsung-Yi Lin Michael Maire Serge J. Belongie Lubomir Bourdev Ross B. Girshick James Hays Pietro Perona Deva Ramanan C. L. Zitnick Piotr Dollár ObjD 413 43,667 0 01 May 2014