Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1411.5726
Cited By
CIDEr: Consensus-based Image Description Evaluation
20 November 2014
Ramakrishna Vedantam
C. L. Zitnick
Devi Parikh
Re-assign community
ArXiv
PDF
HTML
Papers citing
"CIDEr: Consensus-based Image Description Evaluation"
50 / 2,142 papers shown
Title
Iconographic Image Captioning for Artworks
E. Cetinic
32
24
0
07 Feb 2021
Commonsense Knowledge Aware Concept Selection For Diverse and Informative Visual Storytelling
Hong Chen
Yifei Huang
Hiroya Takamura
Hideki Nakayama
DiffM
17
42
0
05 Feb 2021
Unifying Vision-and-Language Tasks via Text Generation
Jaemin Cho
Jie Lei
Hao Tan
Joey Tianyi Zhou
MLLM
277
525
0
04 Feb 2021
L2C: Describing Visual Differences Needs Semantic Understanding of Individuals
An Yan
Xinze Wang
Tsu-Jui Fu
William Yang Wang
VLM
24
11
0
03 Feb 2021
Semantic Grouping Network for Video Captioning
Hobin Ryu
Sunghun Kang
Haeyong Kang
Chang D. Yoo
41
135
0
01 Feb 2021
VX2TEXT: End-to-End Learning of Video-Based Text Generation From Multimodal Inputs
Xudong Lin
Gedas Bertasius
Jue Wang
Shih-Fu Chang
Devi Parikh
Lorenzo Torresani
VGen
33
66
0
28 Jan 2021
The Role of Syntactic Planning in Compositional Image Captioning
Emanuele Bugliarello
Desmond Elliott
CoGe
36
13
0
28 Jan 2021
VisualMRC: Machine Reading Comprehension on Document Images
Ryota Tanaka
Kyosuke Nishida
Sen Yoshida
26
138
0
27 Jan 2021
On the Evaluation of Vision-and-Language Navigation Instructions
Mingde Zhao
Peter Anderson
Vihan Jain
Su Wang
Alexander Ku
Jason Baldridge
Eugene Ie
238
51
0
26 Jan 2021
Adversarial Text-to-Image Synthesis: A Review
Stanislav Frolov
Tobias Hinz
Federico Raue
Jörn Hees
Andreas Dengel
EGVM
27
175
0
25 Jan 2021
ECOL-R: Encouraging Copying in Novel Object Captioning with Reinforcement Learning
Yufei Wang
Ian D. Wood
Stephen Wan
Mark Johnson
28
7
0
25 Jan 2021
Fast Sequence Generation with Multi-Agent Reinforcement Learning
Longteng Guo
Jing Liu
Xinxin Zhu
Hanqing Lu
LRM
56
6
0
24 Jan 2021
Data-to-text Generation by Splicing Together Nearest Neighbors
Sam Wiseman
A. Backurs
K. Stratos
27
9
0
20 Jan 2021
Macroscopic Control of Text Generation for Image Captioning
Zhangzi Zhu
Tianlei Wang
Hong Qu
31
4
0
20 Jan 2021
ArtEmis: Affective Language for Visual Art
Panos Achlioptas
M. Ovsjanikov
Kilichbek Haydarov
Mohamed Elhoseiny
Leonidas J. Guibas
31
115
0
19 Jan 2021
Diagnostic Captioning: A Survey
John Pavlopoulos
Vasiliki Kougia
Ion Androutsopoulos
D. Papamichail
3DV
MedIm
91
26
0
18 Jan 2021
GENIE: Toward Reproducible and Standardized Human Evaluation for Text Generation
Daniel Khashabi
Gabriel Stanovsky
Jonathan Bragg
Nicholas Lourie
Jungo Kasai
Yejin Choi
Noah A. Smith
Daniel S. Weld
29
20
0
17 Jan 2021
Dual-Level Collaborative Transformer for Image Captioning
Yunpeng Luo
Jiayi Ji
Xiaoshuai Sun
Liujuan Cao
Yongjian Wu
Feiyue Huang
Chia-Wen Lin
Rongrong Ji
ViT
19
274
0
16 Jan 2021
Enabling Robots to Draw and Tell: Towards Visually Grounded Multimodal Description Generation
Ting Han
Sina Zarrieß
36
0
0
14 Jan 2021
Exploration of Visual Features and their weighted-additive fusion for Video Captioning
V. PraveenS.
Akhilesh Bharadwaj
Harsh Raj
Janhavi Dadhania
Ganesh Samarth C.A
Nikhil Pareek
S. M. I. S. R. Mahadeva Prasanna
35
1
0
14 Jan 2021
Explainability of deep vision-based autonomous driving systems: Review and challenges
Éloi Zablocki
H. Ben-younes
P. Pérez
Matthieu Cord
XAI
53
170
0
13 Jan 2021
Transforming Multi-Conditioned Generation from Meaning Representation
Joosung Lee
19
3
0
12 Jan 2021
Unifying Relational Sentence Generation and Retrieval for Medical Image Report Composition
Fuyu Wang
Xiaodan Liang
Lin Xu
Liang Lin
MedIm
34
25
0
09 Jan 2021
TextBox: A Unified, Modularized, and Extensible Framework for Text Generation
Junyi Li
Tianyi Tang
Gaole He
Jinhao Jiang
Xiaoxuan Hu
Puzhao Xie
Zhipeng Chen
Zhuohao Yu
Wayne Xin Zhao
Ji-Rong Wen
37
25
0
06 Jan 2021
End-to-End Video Question-Answer Generation with Generator-Pretester Network
Hung-Ting Su
Chen-Hsi Chang
Po-Wei Shen
Yu-Siang Wang
Ya-Liang Chang
Yu-Cheng Chang
Pu-Jen Cheng
Winston H. Hsu
40
31
0
05 Jan 2021
How to Train Your Agent to Read and Write
Li Liu
Mengge He
Guanghui Xu
Mingkui Tan
Qi Wu
DiffM
31
3
0
04 Jan 2021
KM-BART: Knowledge Enhanced Multimodal BART for Visual Commonsense Generation
Yiran Xing
Z. Shi
Zhao Meng
Gerhard Lakemeyer
Yunpu Ma
Roger Wattenhofer
VLM
72
40
0
02 Jan 2021
Improving Sequence-to-Sequence Pre-training via Sequence Span Rewriting
Wangchunshu Zhou
Tao Ge
Canwen Xu
Ke Xu
Furu Wei
LRM
16
15
0
02 Jan 2021
On-the-Fly Attention Modulation for Neural Generation
Yue Dong
Chandra Bhagavatula
Ximing Lu
Jena D. Hwang
Antoine Bosselut
Jackie C.K. Cheung
Yejin Choi
59
13
0
02 Jan 2021
Video Captioning in Compressed Video
Mingjian Zhu
Chenrui Duan
Changbin (Brad) Yu
25
4
0
02 Jan 2021
Analyzing Commonsense Emergence in Few-shot Knowledge Models
Jeff Da
Ronan Le Bras
Ximing Lu
Yejin Choi
Antoine Bosselut
AI4MH
KELM
77
40
0
01 Jan 2021
Prefix-Tuning: Optimizing Continuous Prompts for Generation
Xiang Lisa Li
Percy Liang
61
4,103
0
01 Jan 2021
Text-Free Image-to-Speech Synthesis Using Learned Segmental Units
Wei-Ning Hsu
David Harwath
Christopher Song
James R. Glass
CLIP
37
66
0
31 Dec 2020
Neural Text Generation with Artificial Negative Examples
Keisuke Shirai
Kazuma Hashimoto
Akiko Eriguchi
Takashi Ninomiya
Shinsuke Mori
13
7
0
28 Dec 2020
WEmbSim: A Simple yet Effective Metric for Image Captioning
Naeha Sharif
Lyndon White
Bennamoun
Wei Liu
Syed Afaq Ali Shah
25
1
0
24 Dec 2020
LCEval: Learned Composite Metric for Caption Evaluation
Naeha Sharif
Lyndon White
Bennamoun
Wei Liu
Syed Afaq Ali Shah
26
8
0
24 Dec 2020
SubICap: Towards Subword-informed Image Captioning
Naeha Sharif
Bennamoun
Wei Liu
Syed Afaq Ali Shah
30
2
0
24 Dec 2020
Guidance Module Network for Video Captioning
Xiao Zhang
Chunsheng Liu
F. Chang
21
4
0
20 Dec 2020
Lexically-constrained Text Generation through Commonsense Knowledge Extraction and Injection
Yikang Li
P. Goel
Varsha Kuppur Rajendra
H. Singh
Jonathan M Francis
Kaixin Ma
Eric Nyberg
A. Oltramari
13
7
0
19 Dec 2020
AutoCaption: Image Captioning with Neural Architecture Search
Xinxin Zhu
Weining Wang
Longteng Guo
Jing Liu
32
9
0
16 Dec 2020
Intrinsic Image Captioning Evaluation
Chao Zeng
Sam Kwong
21
0
0
14 Dec 2020
MSVD-Turkish: A Comprehensive Multimodal Dataset for Integrated Vision and Language Research in Turkish
Begum Citamak
Ozan Caglayan
Menekse Kuyu
Erkut Erdem
Aykut Erdem
Pranava Madhyastha
Lucia Specia
31
8
0
13 Dec 2020
Improving Image Captioning by Leveraging Intra- and Inter-layer Global Representation in Transformer Network
Jiayi Ji
Yunpeng Luo
Xiaoshuai Sun
Fuhai Chen
Gen Luo
Yongjian Wu
Yue Gao
Rongrong Ji
ViT
54
170
0
13 Dec 2020
MiniVLM: A Smaller and Faster Vision-Language Model
Jianfeng Wang
Xiaowei Hu
Pengchuan Zhang
Xiujun Li
Lijuan Wang
Lefei Zhang
Jianfeng Gao
Zicheng Liu
VLM
MLLM
35
59
0
13 Dec 2020
Image Captioning with Context-Aware Auxiliary Guidance
Zeliang Song
Xiaofei Zhou
Zhendong Mao
Jianlong Tan
41
31
0
10 Dec 2020
Simple is not Easy: A Simple Strong Baseline for TextVQA and TextCaps
Qi Zhu
Chenyu Gao
Peng Wang
Qi Wu
33
54
0
09 Dec 2020
Driving Behavior Explanation with Multi-level Fusion
H. Ben-younes
Éloi Zablocki
Patrick Pérez
Matthieu Cord
27
30
0
09 Dec 2020
Towards Annotation-Free Evaluation of Cross-Lingual Image Captioning
Aozhu Chen
Xinyi Huang
Hailan Lin
Xirong Li
16
5
0
09 Dec 2020
TAP: Text-Aware Pre-training for Text-VQA and Text-Caption
Zhengyuan Yang
Yijuan Lu
Jianfeng Wang
Xi Yin
D. Florêncio
Lijuan Wang
Cha Zhang
Lei Zhang
Jiebo Luo
VLM
28
141
0
08 Dec 2020
Confidence-aware Non-repetitive Multimodal Transformers for TextCaps
Zhaokai Wang
Renda Bao
Qi Wu
Si Liu
21
26
0
07 Dec 2020
Previous
1
2
3
...
29
30
31
...
41
42
43
Next