1411.5726
Cited By
CIDEr: Consensus-based Image Description Evaluation
20 November 2014
Ramakrishna Vedantam
C. L. Zitnick
Devi Parikh
Papers citing
"CIDEr: Consensus-based Image Description Evaluation"
50 / 2,142 papers shown
Exploring Explicit and Implicit Visual Relationships for Image Captioning
Zeliang Song
Xiaofei Zhou
21
7
0
06 May 2021
A First Look: Towards Explainable TextVQA Models via Visual and Textual Explanations
Varun Nagaraj Rao
Xingjian Zhen
K. Hovsepian
Mingwei Shen
37
18
0
29 Apr 2021
Removing Word-Level Spurious Alignment between Images and Pseudo-Captions in Unsupervised Image Captioning
Ukyo Honda
Yoshitaka Ushiku
Atsushi Hashimoto
Taro Watanabe
Yuji Matsumoto
33
23
0
28 Apr 2021
TRECVID 2020: A comprehensive campaign for evaluating video retrieval tasks across multiple application domains
G. Awad
A. Butt
Keith Curtis
Jonathan G. Fiscus
A. Godil
...
Alan F. Smeaton
Yvette Graham
Gareth J. F. Jones
Wessel Kraaij
Georges Quénot
14
64
0
27 Apr 2021
Contextualized Keyword Representations for Multi-modal Retinal Image Captioning
Jia-Hong Huang
Ting-Wei Wu
M. Worring
MedIm
68
26
0
26 Apr 2021
MusCaps: Generating Captions for Music Audio
Ilaria Manco
Emmanouil Benetos
Elio Quinton
Gyorgy Fazekas
35
36
0
24 Apr 2021
Towards Accurate Text-based Image Captioning with Content Diversity Exploration
Guanghui Xu
Shuaicheng Niu
Mingkui Tan
Yucheng Luo
Qing Du
Qi Wu
DiffM
27
56
0
23 Apr 2021
CLIPScore: A Reference-free Evaluation Metric for Image Captioning
Jack Hessel
Ari Holtzman
Maxwell Forbes
Ronan Le Bras
Yejin Choi
CLIP
22
1,454
0
18 Apr 2021
Improving Question Answering Model Robustness with Synthetic Adversarial Data Generation
Max Bartolo
Tristan Thrush
Robin Jia
Sebastian Riedel
Pontus Stenetorp
Douwe Kiela
AAML
28
103
0
18 Apr 2021
Concadia: Towards Image-Based Text Generation with a Purpose
Elisa Kreiss
Fei Fang
Noah D. Goodman
Christopher Potts
24
23
0
16 Apr 2021
VGNMN: Video-grounded Neural Module Network to Video-Grounded Language Tasks
Hung Le
Nancy F. Chen
Guosheng Lin
MLLM
30
19
0
16 Apr 2021
Sentence-Permuted Paragraph Generation
Wenhao Yu
Chenguang Zhu
Tong Zhao
Zhichun Guo
Meng Jiang
8
11
0
15 Apr 2021
Video Question Answering with Phrases via Semantic Roles
Arka Sadhu
Kan Chen
Ram Nevatia
16
15
0
08 Apr 2021
Automatic Generation of Descriptive Titles for Video Clips Using Deep Learning
Soheyla Amirian
Khaled Rasheed
T. Taha
H. Arabnia
VLM
VGen
19
23
0
07 Apr 2021
Compressing Visual-linguistic Model via Knowledge Distillation
Zhiyuan Fang
Jianfeng Wang
Xiaowei Hu
Lijuan Wang
Yezhou Yang
Zicheng Liu
VLM
39
97
0
05 Apr 2021
FixMyPose: Pose Correctional Captioning and Retrieval
Hyounghun Kim
Abhaysinh Zala
Graham Burri
Joey Tianyi Zhou
36
16
0
04 Apr 2021
Towards General Purpose Vision Systems
Tanmay Gupta
Amita Kamath
Aniruddha Kembhavi
Derek Hoiem
13
50
0
01 Apr 2021
Learning Domain Adaptation with Model Calibration for Surgical Report Generation in Robotic Surgery
Mengya Xu
Mobarakol Islam
C. Lim
Hongliang Ren
OOD
MedIm
37
29
0
31 Mar 2021
Embedding API Dependency Graph for Neural Code Generation
Chen Lyu
Ruyun Wang
Hongyu Zhang
Hanwen Zhang
Songlin Hu
GNN
31
20
0
29 Mar 2021
On Hallucination and Predictive Uncertainty in Conditional Language Generation
Yijun Xiao
Wenjie Wang
HILM
19
182
0
28 Mar 2021
A Comprehensive Review of the Video-to-Text Problem
Jesus Perez-Martin
B. Bustos
S. Guimarães
I. Sipiran
Jorge A. Pérez
Grethel Coello Said
13
17
0
27 Mar 2021
Describing and Localizing Multiple Changes with Transformers
Yue Qiu
Shintaro Yamamoto
Kodai Nakashima
Ryota Suzuki
K. Iwata
Hirokatsu Kataoka
Y. Satoh
30
55
0
25 Mar 2021
Structured Co-reference Graph Attention for Video-grounded Dialogue
Junyeong Kim
Sunjae Yoon
Dahyun Kim
Chang D. Yoo
26
26
0
24 Mar 2021
QuestEval: Summarization Asks for Fact-based Evaluation
Thomas Scialom
Paul-Alexis Dray
Patrick Gallinari
Sylvain Lamprier
Benjamin Piwowarski
Jacopo Staiano
Alex Jinpeng Wang
HILM
33
268
0
23 Mar 2021
Human-like Controllable Image Captioning with Verb-specific Semantic Roles
Long Chen
Zhihong Jiang
Jun Xiao
Wei Liu
32
74
0
22 Mar 2021
BERT: A Review of Applications in Natural Language Processing and Understanding
M. V. Koroteev
VLM
25
197
0
22 Mar 2021
BlonDe: An Automatic Evaluation Metric for Document-level Machine Translation
Yu Jiang
Tianyu Liu
Shuming Ma
Dongdong Zhang
Jian Yang
Haoyang Huang
Rico Sennrich
Ryan Cotterell
Mrinmaya Sachan
M. Zhou
24
58
0
22 Mar 2021
#PraCegoVer: A Large Dataset for Image Captioning in Portuguese
G. O. D. Santos
Esther Luna Colombini
Sandra Avila
39
10
0
21 Mar 2021
3M: Multi-style image caption generation using Multi-modality features under Multi-UPDOWN model
Chengxi Li
Brent Harrison
41
6
0
20 Mar 2021
Local Interpretations for Explainable Natural Language Processing: A Survey
Siwen Luo
Hamish Ivison
S. Han
Josiah Poon
MILM
48
48
0
20 Mar 2021
Play the Shannon Game With Language Models: A Human-Free Approach to Summary Evaluation
Nicholas Egan
Oleg V. Vasilyev
John Bohannon
HILM
13
18
0
19 Mar 2021
Quinductor: a multilingual data-driven method for generating reading-comprehension questions using Universal Dependencies
Dmytro Kalpakchi
Johan Boye
30
6
0
18 Mar 2021
On Semantic Similarity in Video Retrieval
Michael Wray
Hazel Doughty
Dima Damen
33
66
0
18 Mar 2021
Constrained Text Generation with Global Guidance -- Case Study on CommonGen
Yixian Liu
Liwen Zhang
Wenjuan Han
Yue Zhang
Kewei Tu
41
9
0
12 Mar 2021
Iterative Shrinking for Referring Expression Grounding Using Deep Reinforcement Learning
Mingjie Sun
Jimin Xiao
Eng Gee Lim
ObjD
22
33
0
09 Mar 2021
Multiple Instance Captioning: Learning Representations from Histopathology Textbooks and Articles
Jevgenij Gamper
Nasir M. Rajpoot
27
63
0
08 Mar 2021
Relationship-based Neural Baby Talk
Fan Fu
Tingting Xie
Ioannis Patras
Sepehr Jalali
12
0
0
08 Mar 2021
Perspectives and Prospects on Transformer Architecture for Cross-Modal Tasks with Language and Vision
Andrew Shin
Masato Ishii
T. Narihira
35
37
0
06 Mar 2021
Causal Attention for Vision-Language Tasks
Xu Yang
Hanwang Zhang
Guojun Qi
Jianfei Cai
CML
36
149
0
05 Mar 2021
CrossMap Transformer: A Crossmodal Masked Path Transformer Using Double Back-Translation for Vision-and-Language Navigation
A. Magassouba
K. Sugiura
Hisashi Kawai
53
10
0
01 Mar 2021
Learning Reasoning Paths over Semantic Graphs for Video-grounded Dialogues
Hung Le
Nancy F. Chen
Guosheng Lin
36
14
0
01 Mar 2021
Unbiased Sentence Encoder For Large-Scale Multi-lingual Search Engines
Mahdi Hajiaghayi
Monir Hajiaghayi
Mark R. Bolin
26
0
0
01 Mar 2021
Enhanced Modality Transition for Image Captioning
Ziwei Wang
Yadan Luo
Zi Huang
13
0
0
23 Feb 2021
Investigating Local and Global Information for Automated Audio Captioning with Transfer Learning
Xuenan Xu
Heinrich Dinkel
Mengyue Wu
Zeyu Xie
Kai Yu
18
60
0
23 Feb 2021
VisualGPT: Data-efficient Adaptation of Pretrained Language Models for Image Captioning
Jun Chen
Han Guo
Kai Yi
Boyang Albert Li
Mohamed Elhoseiny
VLM
31
219
0
20 Feb 2021
Conceptual 12M: Pushing Web-Scale Image-Text Pre-Training To Recognize Long-Tail Visual Concepts
Soravit Changpinyo
P. Sharma
Nan Ding
Radu Soricut
VLM
320
1,086
0
17 Feb 2021
Improved Bengali Image Captioning via deep convolutional neural network based encoder-decoder model
Mohammad Faiyaz Khan
S. M. S. Shifath
Md. Saiful Islam
VLM
35
18
0
14 Feb 2021
Generating Diversified Comments via Reader-Aware Topic Modeling and Saliency Detection
Wei Wang
Piji Li
Haitao Zheng
27
13
0
13 Feb 2021
The MSR-Video to Text Dataset with Clean Annotations
Haoran Chen
Jianmin Li
Simone Frintrop
Xiaolin Hu
24
18
0
12 Feb 2021
The Role of the Input in Natural Language Video Description
S. Cascianelli
G. Costante
Alessandro Devo
Thomas Alessandro Ciarfuglia
P. Valigi
M. L. Fravolini
21
5
0
09 Feb 2021