1807.09986
Recurrent Fusion Network for Image Captioning
26 July 2018
Wenhao Jiang
Lin Ma
Yu-Gang Jiang
Wei Liu
Tong Zhang
ObjD
Papers citing "Recurrent Fusion Network for Image Captioning"
40 / 40 papers shown
Title
Tri-FusionNet: Enhancing Image Description Generation with Transformer-based Fusion Network and Dual Attention Mechanism
Lakshita Agarwal
Bindu Verma
ViT
29
0
0
23 Apr 2025
HAAV: Hierarchical Aggregation of Augmented Views for Image Captioning
Chia-Wen Kuo
Z. Kira
34
21
0
25 May 2023
Towards Local Visual Modeling for Image Captioning
Yiwei Ma
Jiayi Ji
Xiaoshuai Sun
Yiyi Zhou
Rongrong Ji
ViT
21
71
0
13 Feb 2023
Stacked Cross-modal Feature Consolidation Attention Networks for Image Captioning
Mozhgan Pourkeshavarz
Shahabedin Nabavi
Mohsen Moghaddam
M. Shamsfard
31
4
0
08 Feb 2023
Semantic-Conditional Diffusion Networks for Image Captioning
Jianjie Luo
Yehao Li
Yingwei Pan
Ting Yao
Jianlin Feng
Hongyang Chao
Tao Mei
DiffM
27
62
0
06 Dec 2022
Learning to Collocate Visual-Linguistic Neural Modules for Image Captioning
Xu Yang
Hanwang Zhang
Chongyang Gao
Jianfei Cai
MLLM
40
10
0
04 Oct 2022
Reducing the Vision and Language Bias for Temporal Sentence Grounding
Daizong Liu
Xiaoye Qu
Wei Hu
19
49
0
27 Jul 2022
Bypass Network for Semantics Driven Image Paragraph Captioning
Qinjie Zheng
Chaoyue Wang
Dadong Wang
24
1
0
21 Jun 2022
Guiding Attention using Partial-Order Relationships for Image Captioning
Murad Popattia
Muhammad Rafi
Rizwan Qureshi
Shah Nawaz
21
4
0
15 Apr 2022
On Distinctive Image Captioning via Comparing and Reweighting
Jiuniu Wang
Wenjia Xu
Qingzhong Wang
Antoni B. Chan
38
16
0
08 Apr 2022
End-to-End Transformer Based Model for Image Captioning
Yiyu Wang
Jungang Xu
Yingfei Sun
VLM
ViT
26
117
0
29 Mar 2022
Multi-Scale Self-Contrastive Learning with Hard Negative Mining for Weakly-Supervised Query-based Video Grounding
Shentong Mo
Daizong Liu
Wei Hu
SSL
21
6
0
08 Mar 2022
Exploring Optical-Flow-Guided Motion and Detection-Based Appearance for Temporal Sentence Grounding
Daizong Liu
Xiang Fang
Wei Hu
Pan Zhou
23
37
0
06 Mar 2022
Deep Learning Approaches on Image Captioning: A Review
Taraneh Ghandi
H. Pourreza
H. Mahyar
VLM
16
89
0
31 Jan 2022
Compact Bidirectional Transformer for Image Captioning
Yuanen Zhou
Zhenzhen Hu
Daqing Liu
Huixia Ben
Meng Wang
VLM
20
16
0
06 Jan 2022
Injecting Semantic Concepts into End-to-End Image Captioning
Zhiyuan Fang
Jianfeng Wang
Xiaowei Hu
Lin Liang
Zhe Gan
Lijuan Wang
Yezhou Yang
Zicheng Liu
ViT
VLM
24
86
0
09 Dec 2021
D3Net: A Unified Speaker-Listener Architecture for 3D Dense Captioning and Visual Grounding
Dave Zhenyu Chen
Qirui Wu
Matthias Nießner
Angel X. Chang
21
29
0
02 Dec 2021
Neural Attention for Image Captioning: Review of Outstanding Methods
Zanyar Zohourianshahzadi
Jugal Kalita
VLM
32
45
0
29 Nov 2021
Auto-Parsing Network for Image Captioning and Visual Question Answering
Xu Yang
Chongyang Gao
Hanwang Zhang
Jianfei Cai
24
35
0
24 Aug 2021
From Show to Tell: A Survey on Deep Learning-based Image Captioning
Matteo Stefanini
Marcella Cornia
Lorenzo Baraldi
S. Cascianelli
G. Fiameni
Rita Cucchiara
3DV
VLM
MLLM
67
254
0
14 Jul 2021
Human-like Controllable Image Captioning with Verb-specific Semantic Roles
Long Chen
Zhihong Jiang
Jun Xiao
Wei Liu
30
74
0
22 Mar 2021
Causal Attention for Vision-Language Tasks
Xu Yang
Hanwang Zhang
Guojun Qi
Jianfei Cai
CML
28
148
0
05 Mar 2021
Macroscopic Control of Text Generation for Image Captioning
Zhangzi Zhu
Tianlei Wang
Hong Qu
29
4
0
20 Jan 2021
AutoCaption: Image Captioning with Neural Architecture Search
Xinxin Zhu
Weining Wang
Longteng Guo
Jing Liu
26
9
0
16 Dec 2020
Scan2Cap: Context-aware Dense Captioning in RGB-D Scans
Dave Zhenyu Chen
A. Gholami
Matthias Nießner
Angel X. Chang
3DPC
23
159
0
03 Dec 2020
New Ideas and Trends in Deep Multimodal Content Understanding: A Review
Wei Chen
Weiping Wang
Li Liu
M. Lew
VLM
115
31
0
16 Oct 2020
Compare and Reweight: Distinctive Image Captioning Using Similar Images Sets
Jiuniu Wang
Wenjia Xu
Qingzhong Wang
Antoni B. Chan
37
44
0
14 Jul 2020
A Better Variant of Self-Critical Sequence Training
Ruotian Luo
BDL
30
37
0
22 Mar 2020
Show, Edit and Tell: A Framework for Editing Image Captions
Fawaz Sammani
Luke Melas-Kyriazi
KELM
DiffM
48
59
0
06 Mar 2020
Meshed-Memory Transformer for Image Captioning
Marcella Cornia
Matteo Stefanini
Lorenzo Baraldi
Rita Cucchiara
14
868
0
17 Dec 2019
Fast Image Caption Generation with Position Alignment
Z. Fei
25
37
0
13 Dec 2019
Unpaired Image-to-Speech Synthesis with Multimodal Information Bottleneck
Shuang Ma
Daniel J. McDuff
Yale Song
25
22
0
19 Aug 2019
Attention on Attention for Image Captioning
Lun Huang
Wenmin Wang
Jie Chen
Xiao-Yong Wei
24
823
0
19 Aug 2019
Reconstruct and Represent Video Contents for Captioning via Reinforcement Learning
Wei Zhang
Bairui Wang
Lin Ma
Wei Liu
20
67
0
03 Jun 2019
Hallucinating Optical Flow Features for Video Classification
Yongyi Tang
Lin Ma
Lianqiang Zhou
19
19
0
28 May 2019
Aligning Visual Regions and Textual Concepts for Semantic-Grounded Image Representations
Fenglin Liu
Yuanxin Liu
Xuancheng Ren
Xiaodong He
Xu Sun
VLM
31
81
0
15 May 2019
Learning to Collocate Neural Modules for Image Captioning
Xu Yang
Hanwang Zhang
Jianfei Cai
25
77
0
18 Apr 2019
Context and Attribute Grounded Dense Captioning
Guojun Yin
Lu Sheng
Bin Liu
Nenghai Yu
Xiaogang Wang
Jing Shao
16
75
0
02 Apr 2019
Improving Image Captioning with Conditional Generative Adversarial Nets
Chen Chen
Shuai Mu
Wanpeng Xiao
Zexiong Ye
Liesi Wu
Qi Ju
GAN
29
90
0
18 May 2018
Multi-Way, Multilingual Neural Machine Translation with a Shared Attention Mechanism
Orhan Firat
Kyunghyun Cho
Yoshua Bengio
LRM
AIMat
231
623
0
06 Jan 2016