1411.5726
CIDEr: Consensus-based Image Description Evaluation
20 November 2014
Ramakrishna Vedantam
C. L. Zitnick
Devi Parikh
Papers citing
"CIDEr: Consensus-based Image Description Evaluation"
50 / 2,184 papers shown
Title
RedCaps: web-curated image-text data created by the people, for the people
Karan Desai
Gaurav Kaul
Zubin Aysola
Justin Johnson
137
169
0
22 Nov 2021
L-Verse: Bidirectional Generation Between Image and Text
Taehoon Kim
Gwangmo Song
Sihaeng Lee
Sangyun Kim
Yewon Seo
Soonyoung Lee
S. Kim
Honglak Lee
Kyunghoon Bae
158
26
0
22 Nov 2021
DVCFlow: Modeling Information Flow Towards Human-like Video Captioning
Xu Yan
Zhengcong Fei
Shuhui Wang
Qingming Huang
Qi Tian
VGen
105
4
0
19 Nov 2021
UFO: A UniFied TransfOrmer for Vision-Language Representation Learning
Jianfeng Wang
Xiaowei Hu
Zhe Gan
Zhengyuan Yang
Xiyang Dai
Zicheng Liu
Yumao Lu
Lijuan Wang
ViT
78
57
0
19 Nov 2021
ClipCap: CLIP Prefix for Image Captioning
Ron Mokady
Amir Hertz
Amit H. Bermano
CLIP
VLM
81
684
0
18 Nov 2021
Transparent Human Evaluation for Image Captioning
Jungo Kasai
Keisuke Sakaguchi
Lavinia Dunagan
Jacob Morrison
Ronan Le Bras
Yejin Choi
Noah A. Smith
82
49
0
17 Nov 2021
EMScore: Evaluating Video Captioning via Coarse-Grained and Fine-Grained Embedding Matching
Yaya Shi
Xu Yang
Haiyang Xu
Chunfen Yuan
Bing Li
Weiming Hu
Zhengjun Zha
80
33
0
17 Nov 2021
Co-segmentation Inspired Attention Module for Video-based Computer Vision Tasks
Arulkumar Subramaniam
Jayesh Vaidya
Muhammed Ameen
Athira M. Nambiar
Anurag Mittal
71
7
0
14 Nov 2021
Visual Intelligence through Human Interaction
Ranjay Krishna
Mitchell L. Gordon
Fei-Fei Li
Michael S. Bernstein
71
8
0
12 Nov 2021
The Curious Layperson: Fine-Grained Image Recognition without Expert Labels
Subhabrata Choudhury
Iro Laina
Christian Rupprecht
Andrea Vedaldi
VLM
75
10
0
05 Nov 2021
Transparency of Deep Neural Networks for Medical Image Analysis: A Review of Interpretability Methods
Zohaib Salahuddin
Henry C. Woodruff
A. Chatterjee
Philippe Lambin
90
321
0
01 Nov 2021
EventNarrative: A large-scale Event-centric Dataset for Knowledge Graph-to-Text Generation
Anthony Colas
A. Sadeghian
Yue Wang
D. Wang
83
22
0
30 Oct 2021
Automatic Knowledge Augmentation for Generative Commonsense Reasoning
Jaehyung Seo
Chanjun Park
Sugyeong Eo
Hyeonseok Moon
Heuiseok Lim
ReLM
LRM
43
3
0
30 Oct 2021
Discovering Non-monotonic Autoregressive Orderings with Variational Inference
Xuanlin Li
Brandon Trabucco
Dongmin Park
Michael Luo
S. Shen
Trevor Darrell
Yang Gao
90
13
0
27 Oct 2021
Bangla Image Caption Generation through CNN-Transformer based Encoder-Decoder Network
Yuansan Liu
MD Abdullah Al Nasim
Sourav Saha
Faria Afrin
Raisa Mallik
Sathishkumar Samiappan
ViT
41
14
0
24 Oct 2021
Exploiting Cross-Modal Prediction and Relation Consistency for Semi-Supervised Image Captioning
Yang Yang
Haoran Wei
Hengshu Zhu
Dianhai Yu
Hui Xiong
Jian Yang
SSL
34
33
0
22 Oct 2021
Cortico-cerebellar networks as decoupling neural interfaces
J. Pemberton
E. Boven
Richard Apps
Rui Ponte Costa
70
6
0
21 Oct 2021
Better than Average: Paired Evaluation of NLP Systems
Maxime Peyrard
Wei Zhao
Steffen Eger
Robert West
ELM
101
26
0
20 Oct 2021
A Self-Explainable Stylish Image Captioning Framework via Multi-References
Chengxi Li
Brent Harrison
126
0
0
20 Oct 2021
R^3Net: Relation-embedded Representation Reconstruction Network for Change Captioning
Yunbin Tu
Liang Li
C. Yan
Shengxiang Gao
Zhengtao Yu
82
25
0
20 Oct 2021
A Picture is Worth a Thousand Words: A Unified System for Diverse Captions and Rich Images Generation
Yupan Huang
Bei Liu
Jianlong Fu
Yutong Lu
DiffM
65
6
0
19 Oct 2021
Unifying Multimodal Transformer for Bi-directional Image and Text Generation
Yupan Huang
Hongwei Xue
Bei Liu
Yutong Lu
79
59
0
19 Oct 2021
BEAMetrics: A Benchmark for Language Generation Evaluation Evaluation
Thomas Scialom
Felix Hill
60
7
0
18 Oct 2021
Think Before You Speak: Explicitly Generating Implicit Commonsense Knowledge for Response Generation
Pei Zhou
Karthik Gopalakrishnan
Behnam Hedayatnia
Seokhwan Kim
Jay Pujara
Xiang Ren
Yang Liu
Dilek Z. Hakkani-Tür
111
41
0
16 Oct 2021
A Good Prompt Is Worth Millions of Parameters: Low-resource Prompt-based Learning for Vision-Language Models
Woojeong Jin
Yu Cheng
Yelong Shen
Weizhu Chen
Xiang Ren
VLM
VPVLM
MLLM
119
138
0
16 Oct 2021
Self-Annotated Training for Controllable Image Captioning
Zhangzi Zhu
Tianlei Wang
Hong Qu
76
2
0
16 Oct 2021
Guiding Visual Question Generation
Nihir Vedd
Zixu Wang
Marek Rei
Yishu Miao
Lucia Specia
138
22
0
15 Oct 2021
Diverse Audio Captioning via Adversarial Training
Xinhao Mei
Xubo Liu
Jianyuan Sun
Mark D. Plumbley
Wenwu Wang
DiffM
GAN
110
28
0
13 Oct 2021
CLIP4Caption: CLIP for Video Caption
Mingkang Tang
Zhanyu Wang
Zhenhua Liu
Fengyun Rao
Dian Li
Xiu Li
CLIP
VLM
89
155
0
13 Oct 2021
Improving the Performance of Automated Audio Captioning via Integrating the Acoustic and Semantic Information
Zhongjie Ye
Helin Wang
Dongchao Yang
Yuexian Zou
101
28
0
12 Oct 2021
Semi-Autoregressive Image Captioning
Xu Yan
Zhengcong Fei
Zekang Li
Shuhui Wang
Qingming Huang
Qi Tian
91
25
0
11 Oct 2021
CLIP4Caption ++: Multi-CLIP for Video Caption
Mingkang Tang
Zhanyu Wang
Zhaoyang Zeng
Feng Rao
Dian Li
VLM
CLIP
42
7
0
11 Oct 2021
Can Audio Captions Be Evaluated with Image Caption Metrics?
Zelin Zhou
Zhiling Zhang
Xuenan Xu
Zeyu Xie
Mengyue Wu
Kenny Q. Zhu
68
46
0
10 Oct 2021
Toward a Human-Level Video Understanding Intelligence
Y. Heo
Minsu Lee
Seongho Choi
Woo Suk Choi
Minjung Shin
Minjoon Jung
Jeh-Kwang Ryu
Byoung-Tak Zhang
32
0
0
08 Oct 2021
End-to-End Supermask Pruning: Learning to Prune Image Captioning Models
J. Tan
C. Chan
Joon Huang Chuah
VLM
132
16
0
07 Oct 2021
Is An Image Worth Five Sentences? A New Look into Semantics for Image-Text Matching
Ali Furkan Biten
Andrés Mafla
Lluís Gómez
Dimosthenis Karatzas
251
18
0
06 Oct 2021
Let there be a clock on the beach: Reducing Object Hallucination in Image Captioning
Ali Furkan Biten
L. G. I. Bigorda
Dimosthenis Karatzas
159
63
0
04 Oct 2021
Audio Captioning Using Sound Event Detection
Ayşegül Özkaya Eren
M. Sert
78
8
0
04 Oct 2021
Geometry Attention Transformer with Position-aware LSTMs for Image Captioning
Chi-Yin Wang
Yulin Shen
Luping Ji
ViT
106
53
0
01 Oct 2021
CrossCLR: Cross-modal Contrastive Learning For Multi-modal Video Representations
Mohammadreza Zolfaghari
Yi Zhu
Peter V. Gehler
Thomas Brox
194
130
0
30 Sep 2021
Geometry-Entangled Visual Semantic Transformer for Image Captioning
Ling Cheng
Wei Wei
Feida Zhu
Yong Liu
Chunyan Miao
ViT
47
3
0
29 Sep 2021
CIDEr-R: Robust Consensus-based Image Description Evaluation
G. O. D. Santos
Esther Luna Colombini
Sandra Avila
81
30
0
28 Sep 2021
Learning Natural Language Generation from Scratch
Alice Martin Donati
Guillaume Quispe
Charles Ollion
Sylvain Le Corff
Florian Strub
Olivier Pietquin
LRM
55
4
0
20 Sep 2021
Cross Modification Attention Based Deliberation Model for Image Captioning
Zheng Lian
Yanan Zhang
Haichang Li
Rui Wang
Xiaohui Hu
66
5
0
17 Sep 2021
Label-Attention Transformer with Geometrically Coherent Objects for Image Captioning
Shikha Dubey
Farrukh Olimov
M. Rafique
Joonmo Kim
M. Jeon
ViT
84
43
0
16 Sep 2021
Improving Text Auto-Completion with Next Phrase Prediction
Dong-Ho Lee
Zhiqiang Hu
Roy Ka-wei Lee
LRM
52
4
0
15 Sep 2021
Attention Is Indeed All You Need: Semantically Attention-Guided Decoding for Data-to-Text NLG
Juraj Juraska
M. Walker
56
17
0
15 Sep 2021
SafeAccess+: An Intelligent System to make Smart Home Safer and Americans with Disability Act Compliant
Shahinur Alam
40
2
0
14 Sep 2021
KFCNet: Knowledge Filtering and Contrastive Learning Network for Generative Commonsense Reasoning
Haonan Li
Yeyun Gong
Jian Jiao
Ruofei Zhang
Timothy Baldwin
Nan Duan
OffRL
93
6
0
14 Sep 2021
Perturbation CheckLists for Evaluating NLG Evaluation Metrics
Ananya B. Sai
Tanay Dixit
D. Y. Sheth
S. Mohan
Mitesh M. Khapra
AAML
161
58
0
13 Sep 2021