1903.11649
Align2Ground: Weakly Supervised Phrase Grounding Guided by Image-Caption Alignment
27 March 2019
Samyak Datta
Karan Sikka
Anirban Roy
Karuna Ahuja
Devi Parikh
Ajay Divakaran
Papers citing "Align2Ground: Weakly Supervised Phrase Grounding Guided by Image-Caption Alignment"
25 / 25 papers shown
Disambiguating Reference in Visually Grounded Dialogues through Joint Modeling of Textual and Multimodal Semantic Structures
Shun Inadumi
Nobuhiro Ueda
Koichiro Yoshino
ObjD
14
0
0
16 May 2025
3DWG: 3D Weakly Supervised Visual Grounding via Category and Instance-Level Alignment
Xianrui Li
Jing Liu
Nuowei Han
Liang Heng
Y. Guo
Hao Dong
Yang Liu
74
0
0
03 May 2025
CTRL-O: Language-Controllable Object-Centric Visual Representation Learning
Aniket Didolkar
Andrii Zadaianchuk
Rabiul Awal
Maximilian Seitzer
E. Gavves
Aishwarya Agrawal
OCL
VLM
94
2
0
27 Mar 2025
Referring to Any Person
Qing Jiang
Lin Wu
Zhaoyang Zeng
Tianhe Ren
Yuda Xiong
Yihao Chen
Qin Liu
Lei Zhang
235
0
0
11 Mar 2025
How to Understand "Support"? An Implicit-enhanced Causal Inference Approach for Weakly-supervised Phrase Grounding
Jiamin Luo
Jianing Zhao
Jingjing Wang
Guodong Zhou
46
0
0
29 Feb 2024
Weakly-Supervised 3D Visual Grounding based on Visual Linguistic Alignment
Xiaoxu Xu
Yitian Yuan
Qiudan Zhang
Wen-Bin Wu
Zequn Jie
Lin Ma
Xu Wang
64
4
0
15 Dec 2023
AircraftVerse: A Large-Scale Multimodal Dataset of Aerial Vehicle Designs
Adam D. Cobb
Anirban Roy
Daniel Elenius
F. M. Heim
Brian Swenson
...
Theodore Bapty
Joseph Hite
K. Ramani
Christopher McComb
Susmit Jha
20
7
0
08 Jun 2023
Focusing On Targets For Improving Weakly Supervised Visual Grounding
V. Pham
Nao Mishima
ObjD
26
1
0
22 Feb 2023
Who are you referring to? Coreference resolution in image narrations
A. Goel
Basura Fernando
Frank Keller
Hakan Bilen
27
3
0
26 Nov 2022
Detailed Annotations of Chest X-Rays via CT Projection for Report Understanding
C. Seibold
Simon Reiß
Saquib Sarfraz
M. Fink
Victoria L. Mayer
Jan Sellner
Moon S. Kim
Klaus H. Maier-Hein
Jens Kleesiek
Rainer Stiefelhagen
37
19
0
07 Oct 2022
2D Human Pose Estimation: A Survey
Haoming Chen
Runyang Feng
Sifan Wu
Hao Xu
F. Zhou
Zhenguang Liu
3DH
29
55
0
15 Apr 2022
Pseudo-Q: Generating Pseudo Language Queries for Visual Grounding
Haojun Jiang
Yuanze Lin
Dongchen Han
Shiji Song
Gao Huang
ObjD
48
51
0
16 Mar 2022
Repurposing Existing Deep Networks for Caption and Aesthetic-Guided Image Cropping
Nora Horanyi
Kedi Xia
K. M. Yi
Abhishake Kumar Bojja
A. Leonardis
H. Chang
31
12
0
07 Jan 2022
Weakly-Supervised Video Object Grounding via Causal Intervention
Wei Wang
Junyu Gao
Changsheng Xu
CML
30
20
0
01 Dec 2021
Relation-aware Instance Refinement for Weakly Supervised Visual Grounding
Yongfei Liu
Bo Wan
Lin Ma
Xuming He
ObjD
24
56
0
24 Mar 2021
Training image classifiers using Semi-Weak Label Data
Anxiang Zhang
Ankit Parag Shah
Bhiksha Raj
23
2
0
19 Mar 2021
Refer-it-in-RGBD: A Bottom-up Approach for 3D Visual Grounding in RGBD Images
Haolin Liu
Anran Lin
Xiaoguang Han
Lei Yang
Yizhou Yu
Shuguang Cui
27
40
0
14 Mar 2021
Open-Vocabulary Object Detection Using Captions
Alireza Zareian
Kevin Dela Rosa
Derek Hao Hu
Shih-Fu Chang
VLM
ObjD
44
418
0
20 Nov 2020
MAF: Multimodal Alignment Framework for Weakly-Supervised Phrase Grounding
Qinxin Wang
Hao Tan
Sheng Shen
Michael W. Mahoney
Z. Yao
ObjD
50
11
0
12 Oct 2020
Cosine meets Softmax: A tough-to-beat baseline for visual grounding
N. Rufus
U. R. Nair
K. M. Krishna
Vineet Gandhi
27
13
0
13 Sep 2020
PhraseCut: Language-based Image Segmentation in the Wild
Chenyun Wu
Zhe-nan Lin
Scott D. Cohen
Trung Bui
Subhransu Maji
VLM
13
111
0
03 Aug 2020
Contrastive Learning for Weakly Supervised Phrase Grounding
Tanmay Gupta
Arash Vahdat
Gal Chechik
Xiaodong Yang
Jan Kautz
Derek Hoiem
ObjD
SSL
42
141
0
17 Jun 2020
ScanRefer: 3D Object Localization in RGB-D Scans using Natural Language
Dave Zhenyu Chen
Angel X. Chang
Matthias Nießner
3DPC
47
350
0
18 Dec 2019
Unsupervised Discovery of Multimodal Links in Multi-image, Multi-sentence Documents
Jack Hessel
Lillian Lee
David M. Mimno
31
30
0
16 Apr 2019
Making the V in VQA Matter: Elevating the Role of Image Understanding in Visual Question Answering
Yash Goyal
Tejas Khot
D. Summers-Stay
Dhruv Batra
Devi Parikh
CoGe
149
3,136
0
02 Dec 2016