CIDEr: Consensus-based Image Description Evaluation

20 November 2014
Ramakrishna Vedantam
C. L. Zitnick
Devi Parikh

Papers citing "CIDEr: Consensus-based Image Description Evaluation"

50 / 2,183 papers shown
Unblind Your Apps: Predicting Natural-Language Labels for Mobile GUI Components by Deep Learning
Jieshan Chen
Chunyang Chen
Zhenchang Xing
Xiwei Xu
Liming Zhu
Guoqiang Li
Jinshui Wang
78
139
0
01 Mar 2020
Grounded and Controllable Image Completion by Incorporating Lexical Semantics
Shengyu Zhang
Tan Jiang
Qinghao Huang
Ziqi Tan
Zhou Zhao
Siliang Tang
Jin Yu
Hongxia Yang
Yi Yang
Leilei Gan
25
1
0
29 Feb 2020
Exploring and Distilling Cross-Modal Information for Image Captioning
Fenglin Liu
Xuancheng Ren
Yuanxin Liu
Kai Lei
Xu Sun
ViT
82
52
0
28 Feb 2020
Visual Commonsense R-CNN
Tan Wang
Jianqiang Huang
Hanwang Zhang
Qianru Sun
SSL ObjD CML
86
252
0
27 Feb 2020
Hierarchical Memory Decoding for Video Captioning
Aming Wu
Yahong Han
44
2
0
27 Feb 2020
CLARA: Clinical Report Auto-completion
Siddharth Biswal
Cao Xiao
Lucas Glass
M. P. M. Brandon Westover
Jimeng Sun
79
28
0
26 Feb 2020
Object Relational Graph with Teacher-Recommended Learning for Video Captioning
Ziqi Zhang
Yaya Shi
Chunfen Yuan
Bing Li
Peijin Wang
Weiming Hu
Zhengjun Zha
VLM
93
275
0
26 Feb 2020
What BERT Sees: Cross-Modal Transfer for Visual Question Generation
Thomas Scialom
Patrick Bordes
Paul-Alexis Dray
Jacopo Staiano
Patrick Gallinari
59
6
0
25 Feb 2020
Multimodal Transformer with Pointer Network for the DSTC8 AVSD Challenge
Hung Le
Nancy F. Chen
57
9
0
25 Feb 2020
Captioning Images Taken by People Who Are Blind
Danna Gurari
Yinan Zhao
Meng Zhang
Nilavra Bhattacharya
105
184
0
20 Feb 2020
When Radiology Report Generation Meets Knowledge Graph
Yixiao Zhang
Xiaosong Wang
Ziyue Xu
Qihang Yu
Alan Yuille
Daguang Xu
MedIm
90
305
0
19 Feb 2020
Latent Normalizing Flows for Many-to-Many Cross-Domain Mappings
Shweta Mahajan
Iryna Gurevych
Stefan Roth
DRL
77
36
0
16 Feb 2020
UniVL: A Unified Video and Language Pre-Training Model for Multimodal Understanding and Generation
Huaishao Luo
Lei Ji
Botian Shi
Haoyang Huang
Nan Duan
Tianrui Li
Jason Li
Xilin Chen
Ming Zhou
VLM
130
438
0
15 Feb 2020
CBAG: Conditional Biomedical Abstract Generation
Justin Sybrandt
Ilya Safro
MedIm AI4CE
53
8
0
13 Feb 2020
Sparse and Structured Visual Attention
Pedro Henrique Martins
S. Becker
Zita Marinho
Michael Arens
78
8
0
13 Feb 2020
Hide-and-Tell: Learning to Bridge Photo Streams for Visual Storytelling
Yunjae Jung
Dahun Kim
Sanghyun Woo
Kyungsu Kim
Sungjin Kim
In So Kweon
DiffM
60
32
0
03 Feb 2020
UIT-ViIC: A Dataset for the First Evaluation on Vietnamese Image Captioning
Q. Lam
Q. Le
Kiet Van Nguyen
Ngan Luu-Thuy Nguyen
77
19
0
01 Feb 2020
Bridging Text and Video: A Universal Multimodal Transformer for Video-Audio Scene-Aware Dialog
Zekang Li
Zongjia Li
Jinchao Zhang
Yang Feng
Cheng Niu
Jie Zhou
143
37
0
01 Feb 2020
TVR: A Large-Scale Dataset for Video-Subtitle Moment Retrieval
Jie Lei
Licheng Yu
Tamara L. Berg
Joey Tianyi Zhou
224
288
0
24 Jan 2020
Deep Bayesian Network for Visual Question Generation
Badri N. Patro
V. Kurmi
Sandeep Kumar
Vinay P. Namboodiri
BDL
36
18
0
23 Jan 2020
Robust Explanations for Visual Question Answering
Badri N. Patro
Shivansh Pate
Vinay P. Namboodiri
OOD AAML
71
19
0
23 Jan 2020
Nested-Wasserstein Self-Imitation Learning for Sequence Generation
Ruiyi Zhang
Changyou Chen
Zhe Gan
Zheng Wen
Wenlin Wang
Lawrence Carin
76
7
0
20 Jan 2020
Spatio-Temporal Ranked-Attention Networks for Video Captioning
A. Cherian
Jue Wang
Chiori Hori
Tim K. Marks
AI4TS
49
19
0
17 Jan 2020
Delving Deeper into the Decoder for Video Captioning
Haoran Chen
Jianmin Li
Xiaolin Hu
73
35
0
16 Jan 2020
Show, Recall, and Tell: Image Captioning with Recall Mechanism
Li Wang
Zechen Bai
Yonghua Zhang
Hongtao Lu
75
67
0
15 Jan 2020
In Defense of Grid Features for Visual Question Answering
Huaizu Jiang
Ishan Misra
Marcus Rohrbach
Erik Learned-Miller
Xinlei Chen
OOD ObjD
85
320
0
10 Jan 2020
A Survey on Machine Reading Comprehension Systems
Razieh Baradaran
Razieh Ghiasi
Hossein Amirkhani
FaML
131
86
0
06 Jan 2020
Explain and Improve: LRP-Inference Fine-Tuning for Image Captioning Models
Jiamei Sun
Sebastian Lapuschkin
Wojciech Samek
Alexander Binder
FAtt
98
30
0
04 Jan 2020
Adaptive Correlated Monte Carlo for Contextual Categorical Sequence Generation
Xinjie Fan
Yizhe Zhang
Zhendong Wang
Mingyuan Zhou
BDL
72
4
0
31 Dec 2019
Explicit Sparse Transformer: Concentrated Attention Through Explicit Selection
Guangxiang Zhao
Junyang Lin
Zhiyuan Zhang
Xuancheng Ren
Qi Su
Xu Sun
79
113
0
25 Dec 2019
Deep Exemplar Networks for VQA and VQG
Badri N. Patro
Vinay P. Namboodiri
31
4
0
19 Dec 2019
Going Beneath the Surface: Evaluating Image Captioning for Grammaticality, Truthfulness and Diversity
Huiyuan Xie
Tom Sherborne
A. Kuhnle
Ann A. Copestake
DiffM
38
9
0
19 Dec 2019
Meshed-Memory Transformer for Image Captioning
Marcella Cornia
Matteo Stefanini
Lorenzo Baraldi
Rita Cucchiara
110
889
0
17 Dec 2019
Fast Image Caption Generation with Position Alignment
Z. Fei
77
38
0
13 Dec 2019
Connecting Vision and Language with Localized Narratives
Jordi Pont-Tuset
J. Uijlings
Soravit Changpinyo
Radu Soricut
V. Ferrari
ObjD
143
252
0
06 Dec 2019
Better Understanding Hierarchical Visual Relationship for Image Caption
Z. Fei
38
0
0
04 Dec 2019
Deep Bayesian Active Learning for Multiple Correct Outputs
Khaled Jedoui
Ranjay Krishna
Michael S. Bernstein
Li Fei-Fei
BDL OOD UQCV
93
14
0
02 Dec 2019
Assessing the Robustness of Visual Question Answering Models
Jia-Hong Huang
Modar Alfadly
Guohao Li
Marcel Worring
AAML OOD
100
24
0
30 Nov 2019
Non-Autoregressive Coarse-to-Fine Video Captioning
Bang-ju Yang
Yuexian Zou
Fenglin Liu
Can Zhang
96
11
0
27 Nov 2019
Injecting Prior Knowledge into Image Caption Generation
A. Goel
Basura Fernando
Thanh-Son Nguyen
Hakan Bilen
33
0
0
22 Nov 2019
Characterizing the impact of using features extracted from pre-trained models on the quality of video captioning sequence-to-sequence models
Menatallh Hammad
May Hammad
Mohamed Elshenawy
33
2
0
22 Nov 2019
Reinforcing an Image Caption Generator Using Off-Line Human Feedback
Paul Hongsuck Seo
Piyush Sharma
Tomer Levinboim
Bohyung Han
Radu Soricut
OffRL
72
22
0
21 Nov 2019
Empirical Autopsy of Deep Video Captioning Frameworks
Nayyer Aafaq
Naveed Akhtar
Wei Liu
Ajmal Mian
49
6
0
21 Nov 2019
Conditionally Learn to Pay Attention for Sequential Visual Task
Jun He
Quan-Jie Cao
Lei Zhang
44
0
0
11 Nov 2019
Keep it Consistent: Topic-Aware Storytelling from an Image Stream via Iterative Multi-agent Communication
Ruize Wang
Zhongyu Wei
Ying Cheng
Piji Li
Haijun Shan
Ji Zhang
Qi Zhang
Xuanjing Huang
VGen DiffM
86
13
0
11 Nov 2019
Semantic Noise Matters for Neural Natural Language Generation
Ondrej Dusek
David M. Howcroft
Verena Rieser
103
118
0
10 Nov 2019
On Architectures for Including Visual Information in Neural Language Models for Image Description
Marc Tanti
Albert Gatt
K. Camilleri
VLM
48
2
0
09 Nov 2019
CommonGen: A Constrained Text Generation Challenge for Generative Commonsense Reasoning
Bill Yuchen Lin
Wangchunshu Zhou
Minghan Shen
Pei Zhou
Chandra Bhagavatula
Yu Xing
Xiang Ren
LRM
91
16
0
09 Nov 2019
Contrastive Multi-document Question Generation
W. Cho
Yizhe Zhang
Sudha Rao
Asli Celikyilmaz
Chenyan Xiong
Jianfeng Gao
Mengdi Wang
Bill Dolan
SyDa
121
28
0
08 Nov 2019
Video Captioning with Text-based Dynamic Attention and Step-by-Step Learning
Huanhou Xiao
Jinglun Shi
34
25
0
05 Nov 2019