2303.03761
v1
v2 (latest)
Graph Neural Networks in Vision-Language Image Understanding: A Survey
7 March 2023
Henry Senior
Greg Slabaugh
Shanxin Yuan
Luca Rossi
GNN
Papers citing
"Graph Neural Networks in Vision-Language Image Understanding: A Survey"
50 / 80 papers shown
Title
GNN-LoFI: a Novel Graph Neural Network through Localized Feature-based Histogram Intersection
Alessandro Bicciato
Luca Cosmo
G. Minello
Luca Rossi
A. Torsello
55
2
0
17 Jan 2024
De-Diffusion Makes Text a Strong Cross-Modal Interface
Chen Wei
Chenxi Liu
Siyuan Qiao
Zhishuai Zhang
Alan Yuille
Jiahui Yu
VLM
DiffM
74
11
0
01 Nov 2023
Transforming Visual Scene Graphs to Image Captions
Xu Yang
Jiawei Peng
Zihua Wang
Haiyang Xu
Qinghao Ye
Chenliang Li
Mingshi Yan
Feisi Huang
Zhangzikang Li
Yu Zhang
78
20
0
03 May 2023
ETPNav: Evolving Topological Planning for Vision-Language Navigation in Continuous Environments
Dongyan An
Hongru Wang
Wenguan Wang
Zun Wang
Yan Huang
Keji He
Liang Wang
146
67
0
06 Apr 2023
Dynamic Graph Enhanced Contrastive Learning for Chest X-ray Report Generation
Mingjie Li
Bingqian Lin
Zicong Chen
Haokun Lin
Xiaodan Liang
Xiaojun Chang
MedIm
72
116
0
18 Mar 2023
Do DALL-E and Flamingo Understand Each Other?
Hang Li
Jindong Gu
Rajat Koner
Sahand Sharifzadeh
Volker Tresp
MLLM
49
12
0
23 Dec 2022
CLIP-Diffusion-LM: Apply Diffusion Model on Image Captioning
Shi-You Xu
VLM
DiffM
79
12
0
10 Oct 2022
Graph Neural Networks for Link Prediction with Subgraph Sketching
B. Chamberlain
S. Shirobokov
Emanuele Rossi
Fabrizio Frasca
Thomas Markovich
Nils Y. Hammerla
Michael M. Bronstein
Max Hansmire
118
84
0
30 Sep 2022
A Survey on Graph Neural Networks and Graph Transformers in Computer Vision: A Task-Oriented Perspective
Chaoqi Chen
Yushuang Wu
Qiyuan Dai
Hong-Yu Zhou
Mutian Xu
Sibei Yang
Xiaoguang Han
Yizhou Yu
ViT
MedIm
AI4CE
129
77
0
27 Sep 2022
PaLI: A Jointly-Scaled Multilingual Language-Image Model
Xi Chen
Tianlin Li
Soravit Changpinyo
A. Piergiovanni
Piotr Padlewski
...
Andreas Steiner
A. Angelova
Xiaohua Zhai
N. Houlsby
Radu Soricut
MLLM
VLM
116
732
0
14 Sep 2022
Testing Relational Understanding in Text-Guided Image Generation
C. Conwell
T. Ullman
EGVM
216
66
0
29 Jul 2022
Target-Driven Structured Transformer Planner for Vision-Language Navigation
Yusheng Zhao
Jinyu Chen
Chen Gao
Wenguan Wang
Lirong Yang
Haibing Ren
Huaxia Xia
Si Liu
LM&Ro
78
60
0
19 Jul 2022
Understanding and Extending Subgraph GNNs by Rethinking Their Symmetries
Fabrizio Frasca
Beatrice Bevilacqua
Michael M. Bronstein
Haggai Maron
73
132
0
22 Jun 2022
Sheaf Neural Networks with Connection Laplacians
Federico Barbero
Cristian Bodnar
Haitz Sáez de Ocáriz Borde
Michael M. Bronstein
Petar Veličković
Pietro Lio
61
43
0
17 Jun 2022
Cross-modal Clinical Graph Transformer for Ophthalmic Report Generation
Mingjie Li
Wenjia Cai
Karin Verspoor
Shirui Pan
Xiaodan Liang
Xiaojun Chang
MedIm
75
36
0
04 Jun 2022
mPLUG: Effective and Efficient Vision-Language Learning by Cross-modal Skip-connections
Chenliang Li
Haiyang Xu
Junfeng Tian
Wei Wang
Ming Yan
...
Ji Zhang
Songfang Huang
Feiran Huang
Jingren Zhou
Luo Si
VLM
MLLM
81
221
0
24 May 2022
VQA-GNN: Reasoning with Multimodal Knowledge via Graph Neural Networks for Visual Question Answering
Yanan Wang
Michihiro Yasunaga
Hongyu Ren
Shinya Wada
J. Leskovec
64
18
0
23 May 2022
Gender and Racial Bias in Visual Question Answering Datasets
Yusuke Hirota
Yuta Nakashima
Noa Garcia
FaML
176
54
0
17 May 2022
Hierarchical Text-Conditional Image Generation with CLIP Latents
Aditya A. Ramesh
Prafulla Dhariwal
Alex Nichol
Casey Chu
Mark Chen
VLM
DiffM
413
6,908
0
13 Apr 2022
Neural Sheaf Diffusion: A Topological Perspective on Heterophily and Oversmoothing in GNNs
Cristian Bodnar
Francesco Di Giovanni
B. Chamberlain
Pietro Lio
Michael M. Bronstein
88
183
0
09 Feb 2022
A Thousand Words Are Worth More Than a Picture: Natural Language-Centric Outside-Knowledge Visual Question Answering
Feng Gao
Q. Ping
Govind Thattai
Aishwarya N. Reganti
Yingting Wu
Premkumar Natarajan
50
17
0
14 Jan 2022
Two-level Graph Neural Network
Xing Ai
Chengyu Sun
Zhihong Zhang
Edwin R. Hancock
23
10
0
03 Jan 2022
High-Resolution Image Synthesis with Latent Diffusion Models
Robin Rombach
A. Blattmann
Dominik Lorenz
Patrick Esser
Bjorn Ommer
3DV
493
15,734
0
20 Dec 2021
SGEITL: Scene Graph Enhanced Image-Text Learning for Visual Commonsense Reasoning
Zhecan Wang
Haoxuan You
Liunian Harold Li
Alireza Zareian
Suji Park
Yiqing Liang
Kai-Wei Chang
Shih-Fu Chang
ReLM
LRM
59
33
0
16 Dec 2021
Vision-Language Navigation: A Survey and Taxonomy
Wansen Wu
Tao Chang
Xinmeng Li
LM&Ro
54
24
0
26 Aug 2021
Dual Graph Convolutional Networks with Transformer and Curriculum Learning for Image Captioning
Xinzhi Dong
Chengjiang Long
Wenju Xu
Chunxia Xiao
ViT
119
67
0
05 Aug 2021
Weisfeiler and Lehman Go Cellular: CW Networks
Cristian Bodnar
Fabrizio Frasca
N. Otter
Yu Guang Wang
Pietro Lio
Guido Montúfar
M. Bronstein
GNN
83
237
0
23 Jun 2021
GRAND: Graph Neural Diffusion
B. Chamberlain
J. Rowbottom
Maria I. Gorinova
Stefan Webb
Emanuele Rossi
M. Bronstein
GNN
123
270
0
21 Jun 2021
Understanding and Evaluating Racial Biases in Image Captioning
Dora Zhao
Angelina Wang
Olga Russakovsky
63
138
0
16 Jun 2021
Exploring Explicit and Implicit Visual Relationships for Image Captioning
Zeliang Song
Xiaofei Zhou
21
8
0
06 May 2021
A survey on VQA_Datasets and Approaches
Yeyun Zou
Qiyu Xie
72
18
0
02 May 2021
SOON: Scenario Oriented Object Navigation with Graph-based Exploration
Fengda Zhu
Xiwen Liang
Yi Zhu
Xiaojun Chang
Xiaodan Liang
57
126
0
31 Mar 2021
A Comprehensive Survey of Scene Graphs: Generation and Application
Xiaojun Chang
Pengzhen Ren
Pengfei Xu
Zhihui Li
Xiaojiang Chen
Alexander G. Hauptmann
3DV
98
234
0
17 Mar 2021
Structured Scene Memory for Vision-Language Navigation
Hanqing Wang
Wenguan Wang
Wei Liang
Caiming Xiong
Jianbing Shen
LM&Ro
80
114
0
05 Mar 2021
Image-to-Image Retrieval by Learning Similarity between Scene Graphs
Sangwoong Yoon
Woo-Young Kang
Sungwook Jeon
SeongEun Lee
C. Han
Jonghun Park
Eun-Sol Kim
3DH
81
44
0
29 Dec 2020
Understanding Image Retrieval Re-Ranking: A Graph Neural Network Perspective
Xuanmeng Zhang
Minyue Jiang
Zhedong Zheng
Xiaoping Tan
Errui Ding
Yi Yang
GNN
50
45
0
14 Dec 2020
NCGNN: Node-Level Capsule Graph Neural Network for Semisupervised Classification
Rui Yang
Wenrui Dai
Chenglin Li
Junni Zou
H. Xiong
109
21
0
07 Dec 2020
Advanced Graph and Sequence Neural Networks for Molecular Property Prediction and Drug Discovery
Zhengyang Wang
Meng Liu
Youzhi Luo
Zhao Xu
Yaochen Xie
...
Lei Cai
Q. Qi
Zhuoning Yuan
Tianbao Yang
Shuiwang Ji
72
101
0
02 Dec 2020
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
Alexey Dosovitskiy
Lucas Beyer
Alexander Kolesnikov
Dirk Weissenborn
Xiaohua Zhai
...
Matthias Minderer
G. Heigold
Sylvain Gelly
Jakob Uszkoreit
N. Houlsby
ViT
673
41,430
0
22 Oct 2020
Multi-Modal Retrieval using Graph Neural Networks
Aashish Kumar Misraa
Ajinkya Kale
Pranav Aggarwal
A. Aminian
GNN
36
12
0
04 Oct 2020
Cross-modal Knowledge Reasoning for Knowledge-based Visual Question Answering
Jiahao Yu
Zihao Zhu
Yujing Wang
Weifeng Zhang
Yue Hu
Jianlong Tan
59
99
0
31 Aug 2020
Comprehensive Image Captioning via Scene Graph Decomposition
Yiwu Zhong
Liwei Wang
Jianshu Chen
Dong Yu
Yin Li
119
127
0
23 Jul 2020
A new approach to descriptors generation for image retrieval by analyzing activations of deep neural network layers
P. Staszewski
Maciej Jaworski
Jinde Cao
Leszek Rutkowski
14
16
0
13 Jul 2020
Evolving Graphical Planner: Contextual Global Planning for Vision-and-Language Navigation
Zhiwei Deng
Karthik Narasimhan
Olga Russakovsky
72
88
0
11 Jul 2020
Mucko: Multi-Layer Cross-Modal Knowledge Reasoning for Fact-based Visual Question Answering
Zihao Zhu
Jiahao Yu
Yujing Wang
Yajing Sun
Yue Hu
Qi Wu
87
127
0
16 Jun 2020
Image Captioning through Image Transformer
Sen He
Wentong Liao
Hamed R. Tavakoli
M. Yang
Bodo Rosenhahn
N. Pugeault
ViT
83
94
0
29 Apr 2020
Multi-Modal Graph Neural Network for Joint Reasoning on Vision and Scene Text
Difei Gao
Ke Li
Ruiping Wang
Shiguang Shan
Xilin Chen
76
113
0
31 Mar 2020
Point-GNN: Graph Neural Network for 3D Object Detection in a Point Cloud
Weijing Shi
Ragunathan (Raj) Rajkumar
3DPC
226
745
0
02 Mar 2020
When Radiology Report Generation Meets Knowledge Graph
Yixiao Zhang
Xiaosong Wang
Ziyue Xu
Qihang Yu
Alan Yuille
Daguang Xu
MedIm
70
302
0
19 Feb 2020
Meshed-Memory Transformer for Image Captioning
Marcella Cornia
Matteo Stefanini
Lorenzo Baraldi
Rita Cucchiara
78
884
0
17 Dec 2019