1809.07041
Exploring Visual Relationship for Image Captioning
19 September 2018
Ting Yao
Yingwei Pan
Yehao Li
Tao Mei
Papers citing "Exploring Visual Relationship for Image Captioning"
50 / 133 papers shown
Title
Tri-FusionNet: Enhancing Image Description Generation with Transformer-based Fusion Network and Dual Attention Mechanism
Lakshita Agarwal
Bindu Verma
ViT
29
0
0
23 Apr 2025
Predicate Hierarchies Improve Few-Shot State Classification
Emily Jin
Joy Hsu
Jiajun Wu
OffRL
79
0
0
18 Feb 2025
Unleashing Text-to-Image Diffusion Prior for Zero-Shot Image Captioning
Jianjie Luo
Jingwen Chen
Yehao Li
Yingwei Pan
Jianlin Feng
Hongyang Chao
Ting Yao
DiffM
VLM
53
0
0
03 Jan 2025
Semi-Supervised Image Captioning Considering Wasserstein Graph Matching
Yang Yang
41
0
0
26 Mar 2024
A Comprehensive Survey of 3D Dense Captioning: Localizing and Describing Objects in 3D Scenes
Ting Yu
Xiaojun Lin
Shuhui Wang
Weiguo Sheng
Qingming Huang
Jun-chen Yu
3DV
54
10
0
12 Mar 2024
Video Relationship Detection Using Mixture of Experts
A. Shaabana
Zahra Gharaee
Paul Fieguth
34
1
0
06 Mar 2024
MeaCap: Memory-Augmented Zero-shot Image Captioning
Zequn Zeng
Yan Xie
Hao Zhang
Chiyu Chen
Zhengjue Wang
Boli Chen
VLM
39
14
0
06 Mar 2024
Predicate Classification Using Optimal Transport Loss in Scene Graph Generation
Sorachi Kurita
Satoshi Oyama
Itsuki Noda
OT
24
0
0
19 Sep 2023
With a Little Help from your own Past: Prototypical Memory Networks for Image Captioning
Manuele Barraco
Sara Sarto
Marcella Cornia
Lorenzo Baraldi
Rita Cucchiara
VLM
55
19
0
23 Aug 2023
A request for clarity over the End of Sequence token in the Self-Critical Sequence Training
J. Hu
Roberto Cavicchioli
Alessandro Capotondi
29
6
0
20 May 2023
Hierarchical Aligned Multimodal Learning for NER on Tweet Posts
Peipei Liu
Hong Li
Yimo Ren
Jie Liu
Shuaizong Si
Hongsong Zhu
Limin Sun
31
2
0
15 May 2023
Relational Context Learning for Human-Object Interaction Detection
Sanghyun Kim
Deunsol Jung
Minsu Cho
24
37
0
11 Apr 2023
DetCLIPv2: Scalable Open-Vocabulary Object Detection Pre-training via Word-Region Alignment
Lewei Yao
Jianhua Han
Xiaodan Liang
Danqian Xu
Wei Zhang
Zhenguo Li
Hang Xu
VLM
ObjD
CLIP
56
74
0
10 Apr 2023
GNNFormer: A Graph-based Framework for Cytopathology Report Generation
Yangqiaoyu Zhou
Kai-Lang Yao
Wusuo Li
MedIm
19
1
0
17 Mar 2023
Graph Neural Networks in Vision-Language Image Understanding: A Survey
Henry Senior
Greg Slabaugh
Shanxin Yuan
Luca Rossi
GNN
33
13
0
07 Mar 2023
ConZIC: Controllable Zero-shot Image Captioning by Sampling-Based Polishing
Zequn Zeng
Hao Zhang
Zhengjue Wang
Ruiying Lu
Dongsheng Wang
Bo Chen
BDL
DiffM
19
33
0
04 Mar 2023
Interpretable Medical Image Visual Question Answering via Multi-Modal Relationship Graph Learning
Xinyue Hu
Lin Gu
Kazuma Kobayashi
Qi A. An
Qingyu Chen
Zhiyong Lu
Chang Su
Tatsuya Harada
Yingying Zhu
GNN
34
9
0
19 Feb 2023
Retrieval-augmented Image Captioning
R. Ramos
Desmond Elliott
Bruno Martins
VLM
32
29
0
16 Feb 2023
Towards Local Visual Modeling for Image Captioning
Yiwei Ma
Jiayi Ji
Xiaoshuai Sun
Yiyi Zhou
Rongrong Ji
ViT
21
71
0
13 Feb 2023
Stacked Cross-modal Feature Consolidation Attention Networks for Image Captioning
Mozhgan Pourkeshavarz
Shahabedin Nabavi
Mohsen Moghaddam
M. Shamsfard
31
4
0
08 Feb 2023
What You Say Is What You Show: Visual Narration Detection in Instructional Videos
Kumar Ashutosh
Rohit Girdhar
Lorenzo Torresani
Kristen Grauman
24
4
0
05 Jan 2023
HGAN: Hierarchical Graph Alignment Network for Image-Text Retrieval
Jie Guo
Meiting Wang
Yan Zhou
Bin Song
Yuhao Chi
Wei-liang Fan
Jianglong Chang
42
15
0
16 Dec 2022
Semantic-Conditional Diffusion Networks for Image Captioning
Jianjie Luo
Yehao Li
Yingwei Pan
Ting Yao
Jianlin Feng
Hongyang Chao
Tao Mei
DiffM
27
62
0
06 Dec 2022
Uncertainty-Aware Image Captioning
Zhengcong Fei
Mingyuan Fan
Li Zhu
Junshi Huang
Xiaoming Wei
Xiaolin K. Wei
UQLM
18
10
0
30 Nov 2022
How to Describe Images in a More Funny Way? Towards a Modular Approach to Cross-Modal Sarcasm Generation
Jie Ruan
Yue Wu
Xiaojun Wan
Yuesheng Zhu
29
1
0
20 Nov 2022
Prophet Attention: Predicting Attention with Future Attention for Image Captioning
Fenglin Liu
Xuancheng Ren
Xian Wu
Wei Fan
Yuexian Zou
Xu Sun
24
46
0
19 Oct 2022
Graph Neural Network Surrogate for Seismic Reliability Analysis of Highway Bridge Systems
Tong Liu
Hadi Meidani
30
10
0
12 Oct 2022
Learning to Collocate Visual-Linguistic Neural Modules for Image Captioning
Xu Yang
Hanwang Zhang
Chongyang Gao
Jianfei Cai
MLLM
40
10
0
04 Oct 2022
Toward 3D Spatial Reasoning for Human-like Text-based Visual Question Answering
Hao Li
Jinfa Huang
Peng Jin
Guoli Song
Qi Wu
Jie Chen
39
21
0
21 Sep 2022
Towards Open-vocabulary Scene Graph Generation with Prompt-based Finetuning
Tao He
Lianli Gao
Jingkuan Song
Yuan-Fang Li
VLM
31
50
0
17 Aug 2022
GRIT: Faster and Better Image captioning Transformer Using Dual Visual Features
Van-Quang Nguyen
Masanori Suganuma
Takayuki Okatani
ViT
36
106
0
20 Jul 2022
Exploring the sequence length bottleneck in the Transformer for Image Captioning
Jiapeng Hu
Roberto Cavicchioli
Alessandro Capotondi
ViT
38
3
0
07 Jul 2022
Bypass Network for Semantics Driven Image Paragraph Captioning
Qinjie Zheng
Chaoyue Wang
Dadong Wang
21
1
0
21 Jun 2022
Comprehending and Ordering Semantics for Image Captioning
Yehao Li
Yingwei Pan
Ting Yao
Tao Mei
26
87
0
14 Jun 2022
Exploring Structure-aware Transformer over Interaction Proposals for Human-Object Interaction Detection
Y. Zhang
Yingwei Pan
Ting Yao
Rui Huang
Tao Mei
C. Chen
ViT
23
68
0
13 Jun 2022
Modeling Image Composition for Complex Scene Generation
Zuopeng Yang
Daqing Liu
Chaoyue Wang
J. Yang
Dacheng Tao
ViT
36
50
0
02 Jun 2022
Visual Transformer for Object Detection
M. Yang
ViT
25
6
0
01 Jun 2022
Prompt-based Learning for Unpaired Image Captioning
Peipei Zhu
Tianlin Li
Lin Zhu
Zhenglong Sun
Weishi Zheng
Yaowei Wang
Chia-Ju Chen
VLM
23
31
0
26 May 2022
Controllable Image Captioning
Luka Maxwell
33
0
0
28 Apr 2022
Guiding Attention using Partial-Order Relationships for Image Captioning
Murad Popattia
Muhammad Rafi
Rizwan Qureshi
Shah Nawaz
21
4
0
15 Apr 2022
End-to-End Transformer Based Model for Image Captioning
Yiyu Wang
Jungang Xu
Yingfei Sun
VLM
ViT
26
117
0
29 Mar 2022
CaMEL: Mean Teacher Learning for Image Captioning
Manuele Barraco
Matteo Stefanini
Marcella Cornia
S. Cascianelli
Lorenzo Baraldi
Rita Cucchiara
ViT
VLM
38
27
0
21 Feb 2022
Detecting Human-Object Interactions with Object-Guided Cross-Modal Calibrated Semantics
Hangjie Yuan
Mang Wang
Dong Ni
Liangpeng Xu
19
36
0
01 Feb 2022
Deep Learning Approaches on Image Captioning: A Review
Taraneh Ghandi
H. Pourreza
H. Mahyar
VLM
16
89
0
31 Jan 2022
Representing Videos as Discriminative Sub-graphs for Action Recognition
Dong Li
Zhaofan Qiu
Yingwei Pan
Ting Yao
Houqiang Li
Tao Mei
42
25
0
11 Jan 2022
Uni-EDEN: Universal Encoder-Decoder Network by Multi-Granular Vision-Language Pre-training
Yehao Li
Jiahao Fan
Yingwei Pan
Ting Yao
Weiyao Lin
Tao Mei
MLLM
ObjD
33
19
0
11 Jan 2022
Compact Bidirectional Transformer for Image Captioning
Yuanen Zhou
Zhenzhen Hu
Daqing Liu
Huixia Ben
Meng Wang
VLM
20
16
0
06 Jan 2022
Incremental Object Grounding Using Scene Graphs
J. Yi
Yoonwoo Kim
Sonia Chernova
LM&Ro
30
9
0
06 Jan 2022
Graph Neural Networks: a bibliometrics overview
Abdalsamad Keramatfar
Mohadeseh Rafiee
Hossein Amirkhani
GNN
AI4CE
42
24
0
03 Jan 2022
A Survey of Natural Language Generation
Chenhe Dong
Hai-Tao Zheng
Haifan Gong
M. Chen
Junxin Li
Ying Shen
Min Yang
3DV
27
43
0
22 Dec 2021