1411.5726
Cited By
CIDEr: Consensus-based Image Description Evaluation
20 November 2014
Ramakrishna Vedantam
C. L. Zitnick
Devi Parikh
Papers citing
"CIDEr: Consensus-based Image Description Evaluation"
50 / 2,142 papers shown
Title
Normalized and Geometry-Aware Self-Attention Network for Image Captioning
Longteng Guo
Jing Liu
Xinxin Zhu
Peng Yao
Shichen Lu
Hanqing Lu
ViT
135
189
0
19 Mar 2020
Video Caption Dataset for Describing Human Actions in Japanese
Yutaro Shigeto
Yuya Yoshikawa
Jiaqing Lin
A. Takeuchi
20
3
0
10 Mar 2020
Pseudo-Convolutional Policy Gradient for Sequence-to-Sequence Lip-Reading
Mingshuang Luo
Shuang Yang
Shiguang Shan
Xilin Chen
27
41
0
09 Mar 2020
Deconfounded Image Captioning: A Causal Retrospect
Xu Yang
Hanwang Zhang
Jianfei Cai
CML
18
119
0
09 Mar 2020
Better Captioning with Sequence-Level Exploration
Jia Chen
Qin Jin
37
12
0
08 Mar 2020
OVC-Net: Object-Oriented Video Captioning with Temporal Graph and Detail Enhancement
Fangyi Zhu
Lei Li
Zhanyu Ma
Guang Chen
Jun Guo
19
1
0
08 Mar 2020
Captioning Images with Novel Objects via Online Vocabulary Expansion
Mikihiro Tanaka
Tatsuya Harada
3DV
33
2
0
06 Mar 2020
Show, Edit and Tell: A Framework for Editing Image Captions
Fawaz Sammani
Luke Melas-Kyriazi
KELM
DiffM
48
59
0
06 Mar 2020
Say As You Wish: Fine-grained Control of Image Caption Generation with Abstract Scene Graphs
Shizhe Chen
Qin Jin
Peng Wang
Qi Wu
DiffM
36
215
0
01 Mar 2020
Unblind Your Apps: Predicting Natural-Language Labels for Mobile GUI Components by Deep Learning
Jieshan Chen
Chunyang Chen
Zhenchang Xing
Xiwei Xu
Liming Zhu
Guoqiang Li
Jinshui Wang
19
138
0
01 Mar 2020
Grounded and Controllable Image Completion by Incorporating Lexical Semantics
Shengyu Zhang
Tan Jiang
Qinghao Huang
Ziqi Tan
Zhou Zhao
Siliang Tang
Jin Yu
Hongxia Yang
Yi Yang
Fei Wu
6
1
0
29 Feb 2020
Exploring and Distilling Cross-Modal Information for Image Captioning
Fenglin Liu
Xuancheng Ren
Yuanxin Liu
Kai Lei
Xu Sun
ViT
37
51
0
28 Feb 2020
Visual Commonsense R-CNN
Tan Wang
Jianqiang Huang
Hanwang Zhang
Qianru Sun
SSL
ObjD
CML
24
246
0
27 Feb 2020
Hierarchical Memory Decoding for Video Captioning
Aming Wu
Yahong Han
22
2
0
27 Feb 2020
CLARA: Clinical Report Auto-completion
Siddharth Biswal
Cao Xiao
Lucas Glass
M. P. M. Brandon Westover
Jimeng Sun
24
27
0
26 Feb 2020
Object Relational Graph with Teacher-Recommended Learning for Video Captioning
Ziqi Zhang
Yaya Shi
Chunfen Yuan
Bing Li
Peijin Wang
Weiming Hu
Zhengjun Zha
VLM
37
271
0
26 Feb 2020
What BERT Sees: Cross-Modal Transfer for Visual Question Generation
Thomas Scialom
Patrick Bordes
Paul-Alexis Dray
Jacopo Staiano
Patrick Gallinari
31
6
0
25 Feb 2020
Multimodal Transformer with Pointer Network for the DSTC8 AVSD Challenge
Hung Le
Nancy F. Chen
28
9
0
25 Feb 2020
Captioning Images Taken by People Who Are Blind
Danna Gurari
Yinan Zhao
Meng Zhang
Nilavra Bhattacharya
27
181
0
20 Feb 2020
When Radiology Report Generation Meets Knowledge Graph
Yixiao Zhang
Xiaosong Wang
Ziyue Xu
Qihang Yu
Alan Yuille
Daguang Xu
MedIm
31
295
0
19 Feb 2020
Latent Normalizing Flows for Many-to-Many Cross-Domain Mappings
Shweta Mahajan
Iryna Gurevych
Stefan Roth
DRL
21
36
0
16 Feb 2020
UniVL: A Unified Video and Language Pre-Training Model for Multimodal Understanding and Generation
Huaishao Luo
Lei Ji
Botian Shi
Haoyang Huang
Nan Duan
Tianrui Li
Jason Li
Xilin Chen
Ming Zhou
VLM
46
440
0
15 Feb 2020
CBAG: Conditional Biomedical Abstract Generation
Justin Sybrandt
Ilya Safro
MedIm
AI4CE
22
8
0
13 Feb 2020
Sparse and Structured Visual Attention
Pedro Henrique Martins
S. Becker
Zita Marinho
Michael Arens
35
8
0
13 Feb 2020
Hide-and-Tell: Learning to Bridge Photo Streams for Visual Storytelling
Yunjae Jung
Dahun Kim
Sanghyun Woo
Kyungsu Kim
Sungjin Kim
In So Kweon
DiffM
16
31
0
03 Feb 2020
UIT-ViIC: A Dataset for the First Evaluation on Vietnamese Image Captioning
Q. Lam
Q. Le
Kiet Van Nguyen
Ngan Luu-Thuy Nguyen
27
19
0
01 Feb 2020
Bridging Text and Video: A Universal Multimodal Transformer for Video-Audio Scene-Aware Dialog
Zekang Li
Zongjia Li
Jinchao Zhang
Yang Feng
Cheng Niu
Jie Zhou
24
37
0
01 Feb 2020
TVR: A Large-Scale Dataset for Video-Subtitle Moment Retrieval
Jie Lei
Licheng Yu
Tamara L. Berg
Joey Tianyi Zhou
119
277
0
24 Jan 2020
Deep Bayesian Network for Visual Question Generation
Badri N. Patro
V. Kurmi
Sandeep Kumar
Vinay P. Namboodiri
BDL
17
19
0
23 Jan 2020
Robust Explanations for Visual Question Answering
Badri N. Patro
Shivansh Pate
Vinay P. Namboodiri
OOD
AAML
25
20
0
23 Jan 2020
Nested-Wasserstein Self-Imitation Learning for Sequence Generation
Ruiyi Zhang
Changyou Chen
Zhe Gan
Zheng Wen
Wenlin Wang
Lawrence Carin
31
5
0
20 Jan 2020
Spatio-Temporal Ranked-Attention Networks for Video Captioning
A. Cherian
Jue Wang
Chiori Hori
Tim K. Marks
AI4TS
22
19
0
17 Jan 2020
Delving Deeper into the Decoder for Video Captioning
Haoran Chen
Jianmin Li
Xiaolin Hu
43
34
0
16 Jan 2020
Show, Recall, and Tell: Image Captioning with Recall Mechanism
Li Wang
Zechen Bai
Yonghua Zhang
Hongtao Lu
27
67
0
15 Jan 2020
In Defense of Grid Features for Visual Question Answering
Huaizu Jiang
Ishan Misra
Marcus Rohrbach
Erik Learned-Miller
Xinlei Chen
OOD
ObjD
23
318
0
10 Jan 2020
A Survey on Machine Reading Comprehension Systems
Razieh Baradaran
Razieh Ghiasi
Hossein Amirkhani
FaML
18
85
0
06 Jan 2020
Explain and Improve: LRP-Inference Fine-Tuning for Image Captioning Models
Jiamei Sun
Sebastian Lapuschkin
Wojciech Samek
Alexander Binder
FAtt
42
29
0
04 Jan 2020
Adaptive Correlated Monte Carlo for Contextual Categorical Sequence Generation
Xinjie Fan
Yizhe Zhang
Zhendong Wang
Mingyuan Zhou
BDL
9
4
0
31 Dec 2019
Explicit Sparse Transformer: Concentrated Attention Through Explicit Selection
Guangxiang Zhao
Junyang Lin
Zhiyuan Zhang
Xuancheng Ren
Qi Su
Xu Sun
22
108
0
25 Dec 2019
Deep Exemplar Networks for VQA and VQG
Badri N. Patro
Vinay P. Namboodiri
27
4
0
19 Dec 2019
Going Beneath the Surface: Evaluating Image Captioning for Grammaticality, Truthfulness and Diversity
Huiyuan Xie
Tom Sherborne
A. Kuhnle
Ann A. Copestake
DiffM
25
9
0
19 Dec 2019
Meshed-Memory Transformer for Image Captioning
Marcella Cornia
Matteo Stefanini
Lorenzo Baraldi
Rita Cucchiara
14
869
0
17 Dec 2019
Fast Image Caption Generation with Position Alignment
Z. Fei
28
37
0
13 Dec 2019
Connecting Vision and Language with Localized Narratives
Jordi Pont-Tuset
J. Uijlings
Soravit Changpinyo
Radu Soricut
V. Ferrari
ObjD
36
242
0
06 Dec 2019
Better Understanding Hierarchical Visual Relationship for Image Caption
Z. Fei
24
0
0
04 Dec 2019
Deep Bayesian Active Learning for Multiple Correct Outputs
Khaled Jedoui
Ranjay Krishna
Michael S. Bernstein
Li Fei-Fei
BDL
OOD
UQCV
24
14
0
02 Dec 2019
Assessing the Robustness of Visual Question Answering Models
Jia-Hong Huang
Modar Alfadly
Guohao Li
M. Worring
AAML
OOD
28
23
0
30 Nov 2019
Non-Autoregressive Coarse-to-Fine Video Captioning
Bang-ju Yang
Yuexian Zou
Fenglin Liu
Can Zhang
27
11
0
27 Nov 2019
Injecting Prior Knowledge into Image Caption Generation
A. Goel
Basura Fernando
Thanh-Son Nguyen
Hakan Bilen
23
0
0
22 Nov 2019
Characterizing the impact of using features extracted from pre-trained models on the quality of video captioning sequence-to-sequence models
Menatallh Hammad
May Hammad
Mohamed Elshenawy
24
2
0
22 Nov 2019