Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2308.09599
Cited By
Language-Guided Diffusion Model for Visual Grounding
18 August 2023
Sijia Chen
Baochun Li
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Language-Guided Diffusion Model for Visual Grounding"
50 / 52 papers shown
Title
Paint Outside the Box: Synthesizing and Selecting Training Data for Visual Grounding
Zilin Du
Haoxin Li
Jianfei Yu
Boyang Li
362
0
0
01 Dec 2024
Language-driven Grasp Detection
An Dinh Vuong
Minh Nhat Vu
Baoru Huang
Nghia Nguyen
Hieu Le
T. Vo
Anh Nguyen
VLM
59
19
0
13 Jun 2024
Zero-Shot Medical Phrase Grounding with Off-the-shelf Diffusion Models
Konstantinos Vilouras
Pedro Sanchez
Alison Q. OÑeil
Sotirios A. Tsaftaris
MedIm
96
2
0
19 Apr 2024
DiffusionDet: Diffusion Model for Object Detection
Shoufa Chen
Pei Sun
Yibing Song
Ping Luo
81
450
0
17 Nov 2022
Diffusion-LM Improves Controllable Text Generation
Xiang Lisa Li
John Thickstun
Ishaan Gulrajani
Percy Liang
Tatsunori B. Hashimoto
AI4CE
204
802
0
27 May 2022
Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding
Chitwan Saharia
William Chan
Saurabh Saxena
Lala Li
Jay Whang
...
Raphael Gontijo-Lopes
Tim Salimans
Jonathan Ho
David J Fleet
Mohammad Norouzi
VLM
263
5,904
0
23 May 2022
Improving Visual Grounding with Visual-Linguistic Verification and Iterative Reasoning
Li Yang
Yan Xu
Chunfen Yuan
Wei Liu
Bing Li
Weiming Hu
ObjD
58
115
0
30 Apr 2022
Hierarchical Text-Conditional Image Generation with CLIP Latents
Aditya A. Ramesh
Prafulla Dhariwal
Alex Nichol
Casey Chu
Mark Chen
VLM
DiffM
276
6,768
0
13 Apr 2022
Shifting More Attention to Visual Backbone: Query-modulated Refinement Networks for End-to-End Visual Grounding
Jiabo Ye
Junfeng Tian
Ming Yan
Xiaoshan Yang
Xuwu Wang
Ji Zhang
Liang He
Xin Lin
ObjD
32
64
0
29 Mar 2022
Deconfounded Visual Grounding
Jianqiang Huang
Yu Qin
Jiaxin Qi
Qianru Sun
Hanwang Zhang
CML
ObjD
47
31
0
31 Dec 2021
GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models
Alex Nichol
Prafulla Dhariwal
Aditya A. Ramesh
Pranav Shyam
Pamela Mishkin
Bob McGrew
Ilya Sutskever
Mark Chen
226
3,531
0
20 Dec 2021
Label-Efficient Semantic Segmentation with Diffusion Models
Dmitry Baranchuk
Ivan Rubachev
A. Voynov
Valentin Khrulkov
Artem Babenko
DiffM
VLM
231
526
0
06 Dec 2021
Phrase-BERT: Improved Phrase Embeddings from BERT with an Application to Corpus Exploration
Shufan Wang
Laure Thompson
Mohit Iyyer
199
67
0
13 Sep 2021
Structured Denoising Diffusion Models in Discrete State-Spaces
Jacob Austin
Daniel D. Johnson
Jonathan Ho
Daniel Tarlow
Rianne van den Berg
DiffM
100
900
0
07 Jul 2021
Segmenter: Transformer for Semantic Segmentation
Robin Strudel
Ricardo Garcia Pinel
Ivan Laptev
Cordelia Schmid
ViT
118
1,442
0
12 May 2021
Diffusion Models Beat GANs on Image Synthesis
Prafulla Dhariwal
Alex Nichol
130
7,639
0
11 May 2021
TransVG: End-to-End Visual Grounding with Transformers
Jiajun Deng
Zhengyuan Yang
Tianlang Chen
Wen-gang Zhou
Houqiang Li
ViT
54
338
0
17 Apr 2021
Look Before You Leap: Learning Landmark Features for One-Stage Visual Grounding
Binbin Huang
Dongze Lian
Weixin Luo
Shenghua Gao
ObjD
42
94
0
09 Apr 2021
Swin Transformer: Hierarchical Vision Transformer using Shifted Windows
Ze Liu
Yutong Lin
Yue Cao
Han Hu
Yixuan Wei
Zheng Zhang
Stephen Lin
B. Guo
ViT
286
21,051
0
25 Mar 2021
Iterative Shrinking for Referring Expression Grounding Using Deep Reinforcement Learning
Mingjie Sun
Jimin Xiao
Eng Gee Lim
ObjD
45
34
0
09 Mar 2021
Perceiver: General Perception with Iterative Attention
Andrew Jaegle
Felix Gimeno
Andrew Brock
Andrew Zisserman
Oriol Vinyals
João Carreira
VLM
ViT
MDE
129
999
0
04 Mar 2021
Improved Denoising Diffusion Probabilistic Models
Alex Nichol
Prafulla Dhariwal
DiffM
162
3,599
0
18 Feb 2021
Score-Based Generative Modeling through Stochastic Differential Equations
Yang Song
Jascha Narain Sohl-Dickstein
Diederik P. Kingma
Abhishek Kumar
Stefano Ermon
Ben Poole
DiffM
SyDa
260
6,293
0
26 Nov 2020
Denoising Diffusion Implicit Models
Jiaming Song
Chenlin Meng
Stefano Ermon
VLM
DiffM
137
7,166
0
06 Oct 2020
Ref-NMS: Breaking Proposal Bottlenecks in Two-Stage Referring Expression Grounding
Long Chen
Wenbo Ma
Jun Xiao
Hanwang Zhang
Shih-Fu Chang
ObjD
36
90
0
03 Sep 2020
Denoising Diffusion Probabilistic Models
Jonathan Ho
Ajay Jain
Pieter Abbeel
DiffM
271
17,550
0
19 Jun 2020
End-to-End Object Detection with Transformers
Nicolas Carion
Francisco Massa
Gabriel Synnaeve
Nicolas Usunier
Alexander Kirillov
Sergey Zagoruyko
ViT
3DV
PINN
275
12,847
0
26 May 2020
IMRAM: Iterative Matching with Recurrent Attention Memory for Cross-Modal Image-Text Retrieval
Hui Chen
Guiguang Ding
Xudong Liu
Zijia Lin
Ji Liu
Jungong Han
38
318
0
08 Mar 2020
Learning Cross-modal Context Graph for Visual Grounding
Yongfei Liu
Bo Wan
Xiao-Dan Zhu
Xuming He
33
90
0
20 Nov 2019
A Fast and Accurate One-Stage Approach to Visual Grounding
Zhengyuan Yang
Boqing Gong
Liwei Wang
Wenbing Huang
Dong Yu
Jiebo Luo
ObjD
43
361
0
18 Aug 2019
Neural Sequential Phrase Grounding (SeqGROUND)
Pelin Dogan
Leonid Sigal
Markus Gross
ObjD
45
52
0
18 Mar 2019
Generalized Intersection over Union: A Metric and A Loss for Bounding Box Regression
S. Hamid Rezatofighi
Deyuan Li
JunYoung Gwak
Amir Sadeghian
Ian Reid
Silvio Savarese
127
4,114
0
25 Feb 2019
Neighbourhood Watch: Referring Expression Comprehension via Language-guided Graph Attention Networks
Peng Wang
Qi Wu
Jiewei Cao
Chunhua Shen
Lianli Gao
Anton Van Den Hengel
ObjD
52
253
0
12 Dec 2018
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM
SSL
SSeg
870
93,936
0
11 Oct 2018
Rethinking Diversified and Discriminative Proposal Generation for Visual Grounding
Zhou Yu
Jun-chen Yu
Chenchao Xiang
Zhou Zhao
Q. Tian
Dacheng Tao
ObjD
37
139
0
09 May 2018
Iterative Visual Reasoning Beyond Convolutions
Xinlei Chen
Li Li
Li Fei-Fei
Abhinav Gupta
LRM
GNN
124
215
0
29 Mar 2018
MAttNet: Modular Attention Network for Referring Expression Comprehension
Licheng Yu
Zhe Lin
Xiaohui Shen
Jimei Yang
Xin Lu
Joey Tianyi Zhou
Tamara L. Berg
ObjD
85
822
0
24 Jan 2018
Conditional Image-Text Embedding Networks
Bryan A. Plummer
Paige Kordas
M. Kiapour
Shuai Zheng
Robinson Piramuthu
Svetlana Lazebnik
37
118
0
22 Nov 2017
Attention Is All You Need
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
3DV
422
129,831
0
12 Jun 2017
Mask R-CNN
Kaiming He
Georgia Gkioxari
Piotr Dollár
Ross B. Girshick
ObjD
293
27,018
0
20 Mar 2017
Feature Pyramid Networks for Object Detection
Nayeon Lee
Piotr Dollár
Ross B. Girshick
Kaiming He
Bharath Hariharan
Serge J. Belongie
ObjD
410
21,951
0
09 Dec 2016
Phrase Localization and Visual Relationship Detection with Comprehensive Image-Language Cues
Bryan A. Plummer
Arun Mallya
Christopher M. Cervantes
Julia Hockenmaier
Svetlana Lazebnik
49
189
0
21 Nov 2016
Modeling Context Between Objects for Referring Expression Understanding
Varun K. Nagaraja
Vlad I. Morariu
Larry S. Davis
49
147
0
01 Aug 2016
Modeling Context in Referring Expressions
Licheng Yu
Patrick Poirson
Shan Yang
Alexander C. Berg
Tamara L. Berg
95
1,250
0
31 Jul 2016
Deep Residual Learning for Image Recognition
Kaiming He
Xinming Zhang
Shaoqing Ren
Jian Sun
MedIm
1.3K
192,638
0
10 Dec 2015
Learning Deep Structure-Preserving Image-Text Embeddings
Liwei Wang
Yin Li
Svetlana Lazebnik
62
782
0
19 Nov 2015
Generation and Comprehension of Unambiguous Object Descriptions
Junhua Mao
Jonathan Huang
Alexander Toshev
Oana-Maria Camburu
Alan Yuille
Kevin Patrick Murphy
ObjD
82
1,331
0
07 Nov 2015
Flickr30k Entities: Collecting Region-to-Phrase Correspondences for Richer Image-to-Sentence Models
Bryan A. Plummer
Liwei Wang
Christopher M. Cervantes
Juan C. Caicedo
Julia Hockenmaier
Svetlana Lazebnik
167
2,033
0
19 May 2015
Fast R-CNN
Ross B. Girshick
ObjD
275
24,933
0
30 Apr 2015
Very Deep Convolutional Networks for Large-Scale Image Recognition
Karen Simonyan
Andrew Zisserman
FAtt
MDE
822
99,991
0
04 Sep 2014
1
2
Next