ResearchTrend.AI
Exploring Visual Relationship for Image Captioning

19 September 2018
Ting Yao
Yingwei Pan
Yehao Li
Tao Mei
arXiv: 1809.07041

Papers citing "Exploring Visual Relationship for Image Captioning"

50 / 136 papers shown
Title
MAGIC: Multimodal relAtional Graph adversarIal inferenCe for Diverse and Unpaired Text-based Image Captioning
Wenqiao Zhang
Haochen Shi
Jiannan Guo
Shengyu Zhang
Qingpeng Cai
Juncheng Li
Sihui Luo
Yueting Zhuang
DiffM
26
46
0
13 Dec 2021
Injecting Semantic Concepts into End-to-End Image Captioning
Zhiyuan Fang
Jianfeng Wang
Xiaowei Hu
Lin Liang
Zhe Gan
Lijuan Wang
Yezhou Yang
Zicheng Liu
ViT
VLM
24
86
0
09 Dec 2021
Consensus Graph Representation Learning for Better Grounded Image Captioning
Wenqiao Zhang
Haochen Shi
Siliang Tang
Jun Xiao
Qiang Yu
Yueting Zhuang
15
54
0
02 Dec 2021
Neural Attention for Image Captioning: Review of Outstanding Methods
Zanyar Zohourianshahzadi
Jugal Kalita
VLM
32
45
0
29 Nov 2021
ZeroCap: Zero-Shot Image-to-Text Generation for Visual-Semantic Arithmetic
Yoad Tewel
Yoav Shalev
Idan Schwartz
Lior Wolf
VLM
34
192
0
29 Nov 2021
Scaling Up Vision-Language Pre-training for Image Captioning
Xiaowei Hu
Zhe Gan
Jianfeng Wang
Zhengyuan Yang
Zicheng Liu
Yumao Lu
Lijuan Wang
MLLM
VLM
34
246
0
24 Nov 2021
CSI: Contrastive Data Stratification for Interaction Prediction and its Application to Compound-Protein Interaction Prediction
A. Kalia
Dilip Krishnan
Soha Hassoun (Tufts University)
21
2
0
18 Nov 2021
Unifying Multimodal Transformer for Bi-directional Image and Text Generation
Yupan Huang
Hongwei Xue
Bei Liu
Yutong Lu
19
57
0
19 Oct 2021
Topic Scene Graph Generation by Attention Distillation from Caption
Wenbin Wang
R. Wang
X. Chen
DiffM
25
14
0
12 Oct 2021
Semi-Autoregressive Image Captioning
Xu Yan
Zhengcong Fei
Zekang Li
Shuhui Wang
Qingming Huang
Qi Tian
29
23
0
11 Oct 2021
SDA-GAN: Unsupervised Image Translation Using Spectral Domain Attention-Guided Generative Adversarial Network
Qizhou Wang
M. Makarenko
19
0
0
06 Oct 2021
Scene Graph Generation for Better Image Captioning?
Maximilian Mozes
Martin Schmitt
Vladimir Golkov
Hinrich Schütze
Daniel Cremers
GNN
26
3
0
23 Sep 2021
GoG: Relation-aware Graph-over-Graph Network for Visual Dialog
Feilong Chen
Xiuyi Chen
Fandong Meng
Peng Li
Jie Zhou
76
34
0
17 Sep 2021
Cross Modification Attention Based Deliberation Model for Image Captioning
Zheng Lian
Yanan Zhang
Haichang Li
Rui Wang
Xiaohui Hu
24
4
0
17 Sep 2021
A Survey on Temporal Sentence Grounding in Videos
Xiaohan Lan
Yitian Yuan
Xin Wang
Zhi Wang
Wenwu Zhu
30
47
0
16 Sep 2021
Learning to Generate Scene Graph from Natural Language Supervision
Yiwu Zhong
Jing Shi
Jianwei Yang
Chenliang Xu
Yin Li
SSL
42
77
0
06 Sep 2021
SketchLattice: Latticed Representation for Sketch Manipulation
Yonggang Qi
Guoyao Su
Pinaki Nath Chowdhury
Mingkang Li
Yi-Zhe Song
40
23
0
26 Aug 2021
Auto-Parsing Network for Image Captioning and Visual Question Answering
Xu Yang
Chongyang Gao
Hanwang Zhang
Jianfei Cai
24
35
0
24 Aug 2021
Exploiting Multi-Object Relationships for Detecting Adversarial Attacks in Complex Scenes
Mingjun Yin
Shasha Li
Zikui Cai
Chengyu Song
M. Salman Asif
A. Roy-Chowdhury
S. Krishnamurthy
AAML
19
18
0
19 Aug 2021
X-modaler: A Versatile and High-performance Codebase for Cross-modal Analytics
Yehao Li
Yingwei Pan
Jingwen Chen
Ting Yao
Tao Mei
VLM
19
31
0
18 Aug 2021
ROSITA: Enhancing Vision-and-Language Semantic Alignments via Cross- and Intra-modal Knowledge Integration
Yuhao Cui
Zhou Yu
Chunqi Wang
Zhongzhou Zhao
Ji Zhang
Meng Wang
Jun-chen Yu
VLM
27
53
0
16 Aug 2021
Medical-VLBERT: Medical Visual Language BERT for COVID-19 CT Report Generation With Alternate Learning
Guangyi Liu
Yinghong Liao
Fuyu Wang
Bin Zhang
Lu Zhang
...
Xiang Wan
Shaolin Li
Zhen Li
Shuixing Zhang
Shuguang Cui
23
56
0
11 Aug 2021
Dual Graph Convolutional Networks with Transformer and Curriculum Learning for Image Captioning
Xinzhi Dong
Chengjiang Long
Wenju Xu
Chunxia Xiao
ViT
76
66
0
05 Aug 2021
From Show to Tell: A Survey on Deep Learning-based Image Captioning
Matteo Stefanini
Marcella Cornia
Lorenzo Baraldi
S. Cascianelli
G. Fiameni
Rita Cucchiara
3DV
VLM
MLLM
67
254
0
14 Jul 2021
Zero-Shot Scene Graph Relation Prediction through Commonsense Knowledge Integration
Xuan Kan
Hejie Cui
Carl Yang
76
40
0
11 Jul 2021
Instance-Level Relative Saliency Ranking with Graph Reasoning
Nian Liu
Long Li
Wangbo Zhao
Junwei Han
Ling Shao
30
27
0
08 Jul 2021
Recovering the Unbiased Scene Graphs from the Biased Ones
Meng-Jiun Chiou
Henghui Ding
Hanshu Yan
Changhu Wang
Roger Zimmermann
Jiashi Feng
49
113
0
05 Jul 2021
Structured Sparse R-CNN for Direct Scene Graph Generation
Yao Teng
Limin Wang
3DPC
GNN
26
53
0
21 Jun 2021
Giving Commands to a Self-Driving Car: How to Deal with Uncertain Situations?
Thierry Deruyttere
Victor Milewski
Marie-Francine Moens
30
15
0
08 Jun 2021
Multi-Modal Image Captioning for the Visually Impaired
Hiba Ahsan
Nikita Bhalla
Daivat Bhatt
Kaivankumar Shah
25
20
0
17 May 2021
T-EMDE: Sketching-based global similarity for cross-modal retrieval
Barbara Rychalska
Mikolaj Wieczorek
Jacek Dąbrowski
33
0
0
10 May 2021
Exploring Explicit and Implicit Visual Relationships for Image Captioning
Zeliang Song
Xiaofei Zhou
21
7
0
06 May 2021
Structured Co-reference Graph Attention for Video-grounded Dialogue
Junyeong Kim
Sunjae Yoon
Dahyun Kim
Chang D. Yoo
23
26
0
24 Mar 2021
Causal Attention for Vision-Language Tasks
Xu Yang
Hanwang Zhang
Guojun Qi
Jianfei Cai
CML
28
148
0
05 Mar 2021
CPTR: Full Transformer Network for Image Captioning
Wei Liu
Sihan Chen
Longteng Guo
Xinxin Zhu
Jing Liu
ViT
10
141
0
26 Jan 2021
Towards Overcoming False Positives in Visual Relationship Detection
Daisheng Jin
Xiao Ma
Chongzhi Zhang
Yizhuo Zhou
Jiashu Tao
...
Haiyu Zhao
Shuai Yi
Zhoujun Li
Xianglong Liu
Hongsheng Li
25
5
0
23 Dec 2020
AutoCaption: Image Captioning with Neural Architecture Search
Xinxin Zhu
Weining Wang
Longteng Guo
Jing Liu
26
9
0
16 Dec 2020
Improving Image Captioning by Leveraging Intra- and Inter-layer Global Representation in Transformer Network
Jiayi Ji
Yunpeng Luo
Xiaoshuai Sun
Fuhai Chen
Gen Luo
Yongjian Wu
Yue Gao
Rongrong Ji
ViT
51
170
0
13 Dec 2020
LayoutGMN: Neural Graph Matching for Structural Layout Similarity
A. Patil
Manyi Li
Matthew Fisher
Manolis Savva
Hao Zhang
33
32
0
11 Dec 2020
Image Captioning with Context-Aware Auxiliary Guidance
Zeliang Song
Xiaofei Zhou
Zhendong Mao
Jianlong Tan
33
31
0
10 Dec 2020
Scan2Cap: Context-aware Dense Captioning in RGB-D Scans
Dave Zhenyu Chen
A. Gholami
Matthias Nießner
Angel X. Chang
3DPC
23
159
0
03 Dec 2020
Leveraging Activity Recognition to Enable Protective Behavior Detection in Continuous Data
Chongyang Wang
Yuan Gao
Akhil Mathur
A. Williams
Nicholas D. Lane
N. Bianchi-Berthouze
32
34
0
03 Nov 2020
Dual Attention on Pyramid Feature Maps for Image Captioning
Litao Yu
Jian Zhang
Qiang Wu
21
47
0
02 Nov 2020
New Ideas and Trends in Deep Multimodal Content Understanding: A Review
Wei Chen
Weiping Wang
Li Liu
M. Lew
VLM
115
31
0
16 Oct 2020
Teacher-Critical Training Strategies for Image Captioning
Yiqing Huang
Jiansheng Chen
VLM
26
8
0
30 Sep 2020
SceneGen: Generative Contextual Scene Augmentation using Scene Graph Priors
Mohammad Keshavarzi
Aakash Parikh
Xiyu Zhai
Melody Mao
Luisa Caldas
An Yang
27
24
0
25 Sep 2020
Retargetable AR: Context-aware Augmented Reality in Indoor Scenes based on 3D Scene Graph
Tomu Tahara
Takashi Seno
Gaku Narita
T. Ishikawa
26
47
0
18 Aug 2020
Length-Controllable Image Captioning
Chaorui Deng
Ning Ding
Mingkui Tan
Qi Wu
VLM
33
56
0
19 Jul 2020
Sparse Graph to Sequence Learning for Vision Conditioned Long Textual Sequence Generation
Aditya Mogadala
Marius Mosbach
Dietrich Klakow
VLM
106
0
0
12 Jul 2020
Loss Function Search for Face Recognition
Xiaobo Wang
Shuo Wang
Cheng Chi
Shifeng Zhang
Tao Mei
CVBM
22
48
0
10 Jul 2020