CIDEr: Consensus-based Image Description Evaluation

20 November 2014
Ramakrishna Vedantam
C. L. Zitnick
Devi Parikh

Papers citing "CIDEr: Consensus-based Image Description Evaluation"

50 / 2,142 papers shown
Thinking Hallucination for Video Captioning
Nasib Ullah
Partha Pratim Mohanta
VLM
38
4
0
28 Sep 2022
Improving Radiology Report Generation Systems by Removing Hallucinated References to Non-existent Priors
Vignav Ramesh
Nathan Chi
Pranav Rajpurkar
MedIm
36
49
0
27 Sep 2022
Word to Sentence Visual Semantic Similarity for Caption Generation: Lessons Learned
Ahmed Sabir
25
0
0
26 Sep 2022
DRAMA: Joint Risk Localization and Captioning in Driving
Srikanth Malla
Chiho Choi
Isht Dwivedi
Joonhyang Choi
Jiachen Li
107
88
0
22 Sep 2022
INFINITY: A Simple Yet Effective Unsupervised Framework for Graph-Text Mutual Conversion
Yi Xu
Luoyi Fu
Zhouhan Lin
Jiexing Qi
Xinbing Wang
48
3
0
22 Sep 2022
Show, Interpret and Tell: Entity-aware Contextualised Image Captioning in Wikipedia
K. Nguyen
Ali Furkan Biten
Andrés Mafla
Lluís Gómez
Dimosthenis Karatzas
38
10
0
21 Sep 2022
Toward 3D Spatial Reasoning for Human-like Text-based Visual Question Answering
Hao Li
Jinfa Huang
Peng Jin
Guoli Song
Qi Wu
Jie Chen
44
21
0
21 Sep 2022
Recipe Generation from Unsegmented Cooking Videos
Taichi Nishimura
Atsushi Hashimoto
Yoshitaka Ushiku
Hirotaka Kameko
Shinsuke Mori
25
3
0
21 Sep 2022
Learning Distinct and Representative Styles for Image Captioning
Qi Chen
Chaorui Deng
Qi Wu
VLM
45
23
0
17 Sep 2022
Belief Revision based Caption Re-ranker with Visual Semantic Information
Ahmed Sabir
Francesc Moreno-Noguer
Pranava Madhyastha
Lluís Padró
BDL
37
2
0
16 Sep 2022
Distribution Aware Metrics for Conditional Natural Language Generation
David M. Chan
Yiming Ni
David A. Ross
Sudheendra Vijayanarasimhan
Austin Myers
John F. Canny
53
4
0
15 Sep 2022
PaLI: A Jointly-Scaled Multilingual Language-Image Model
Xi Chen
Tianlin Li
Soravit Changpinyo
A. Piergiovanni
Piotr Padlewski
...
Andreas Steiner
A. Angelova
Xiaohua Zhai
N. Houlsby
Radu Soricut
MLLM
VLM
37
691
0
14 Sep 2022
Automatic Comment Generation via Multi-Pass Deliberation
Fangwen Mu
Xiao Chen
Lin Shi
Song Wang
Qing Wang
47
12
0
14 Sep 2022
PreSTU: Pre-Training for Scene-Text Understanding
Jihyung Kil
Soravit Changpinyo
Xi Chen
Hexiang Hu
Sebastian Goodman
Wei-Lun Chao
Radu Soricut
VLM
145
29
0
12 Sep 2022
MaXM: Towards Multilingual Visual Question Answering
Soravit Changpinyo
Linting Xue
Michal Yarom
Ashish V. Thapliyal
Idan Szpektor
J. Amelot
Xi Chen
Radu Soricut
36
8
0
12 Sep 2022
Evaluation of Question Answering Systems: Complexity of judging a natural language
Amer Farea
Zhen Yang
Kien Duong
Nadeesha Perera
F. Emmert-Streib
ELM
42
3
0
10 Sep 2022
Bridging Music and Text with Crowdsourced Music Comments: A Sequence-to-Sequence Framework for Thematic Music Comments Generation
Peining Zhang
Junliang Guo
Linli Xu
Mu You
Junming Yin
27
0
0
05 Sep 2022
On Grounded Planning for Embodied Tasks with Language Models
Bill Yuchen Lin
Chengsong Huang
Qian Liu
Wenda Gu
Sam Sommerer
Xiang Ren
LM&Ro
36
39
0
29 Aug 2022
Of Human Criteria and Automatic Metrics: A Benchmark of the Evaluation of Story Generation
Cyril Chhun
Pierre Colombo
Chloé Clavel
Fabian M. Suchanek
60
51
0
24 Aug 2022
Improving Personality Consistency in Conversation by Persona Extending
Yifan Liu
Wei Wei
Jiayi Liu
Xian-Ling Mao
Rui Fang
Dangyang Chen
35
24
0
23 Aug 2022
A Medical Semantic-Assisted Transformer for Radiographic Report Generation
Zhanyu Wang
Mingkang Tang
Lei Wang
Xiu Li
Luping Zhou
ViT
MedIm
29
57
0
22 Aug 2022
Diverse Video Captioning by Adaptive Spatio-temporal Attention
Zohreh Ghaderi
Leonard Salewski
Hendrik P. A. Lensch
18
8
0
19 Aug 2022
An investigation on selecting audio pre-trained models for audio captioning
Peiran Yan
Sheng-Wei Li
26
0
0
12 Aug 2022
A Comprehensive Survey of Natural Language Generation Advances from the Perspective of Digital Deception
Keenan I. Jones
Enes ALTUNCU
V. N. Franqueira
Yi-Chia Wang
Shujun Li
DeLMO
47
3
0
11 Aug 2022
Sports Video Analysis on Large-Scale Data
Dekun Wu
Henghui Zhao
Xingce Bao
Richard P. Wildes
29
13
0
09 Aug 2022
Distinctive Image Captioning via CLIP Guided Group Optimization
Youyuan Zhang
Jiuniu Wang
Hao Wu
Wenjia Xu
VLM
40
8
0
08 Aug 2022
Prompt Tuning for Generative Multimodal Pretrained Models
Han Yang
Junyang Lin
An Yang
Peng Wang
Chang Zhou
Hongxia Yang
VLM
LRM
VPVLM
37
30
0
04 Aug 2022
SMART: Sentences as Basic Units for Text Evaluation
Reinald Kim Amplayo
Peter J. Liu
Yao-Min Zhao
Shashi Narayan
38
21
0
01 Aug 2022
MAFW: A Large-scale, Multi-modal, Compound Affective Database for Dynamic Facial Expression Recognition in the Wild
Y. Liu
Wei Dai
Chuanxu Feng
Wenbin Wang
Guanghao Yin
Jiabei Zeng
Shiguang Shan
CVBM
32
62
0
01 Aug 2022
Uncertainty-based Visual Question Answering: Estimating Semantic Inconsistency between Image and Knowledge Base
Jinyeong Chae
Jihie Kim
27
2
0
27 Jul 2022
Retrieval-Augmented Transformer for Image Captioning
Sara Sarto
Marcella Cornia
Lorenzo Baraldi
Rita Cucchiara
29
57
0
26 Jul 2022
Is GPT-3 all you need for Visual Question Answering in Cultural Heritage?
P. Bongini
Federico Becattini
A. Bimbo
12
13
0
25 Jul 2022
Chunk-aware Alignment and Lexical Constraint for Visual Entailment with Natural Language Explanations
Qian Yang
Yunxin Li
Baotian Hu
Lin Ma
Yuxin Ding
Min Zhang
47
10
0
23 Jul 2022
Rethinking the Reference-based Distinctive Image Captioning
Yangjun Mao
Long Chen
Zhihong Jiang
Dong Zhang
Zhimeng Zhang
Jian Shao
Jun Xiao
DiffM
35
22
0
22 Jul 2022
Zero-Shot Video Captioning with Evolving Pseudo-Tokens
Yoad Tewel
Yoav Shalev
Roy Nadler
Idan Schwartz
Lior Wolf
37
27
0
22 Jul 2022
Efficient Modeling of Future Context for Image Captioning
Zhengcong Fei
Junshi Huang
Xiaoming Wei
Xiaolin K. Wei
50
14
0
22 Jul 2022
Grounding Visual Representations with Texts for Domain Generalization
Seonwoo Min
Nokyung Park
Siwon Kim
Seunghyun Park
Jinkyu Kim
OOD
19
34
0
21 Jul 2022
Diffsound: Discrete Diffusion Model for Text-to-sound Generation
Dongchao Yang
Jianwei Yu
Helin Wang
Wen Wang
Chao Weng
Yuexian Zou
Dong Yu
DiffM
36
297
0
20 Jul 2022
GRIT: Faster and Better Image captioning Transformer Using Dual Visual Features
Van-Quang Nguyen
Masanori Suganuma
Takayuki Okatani
ViT
41
107
0
20 Jul 2022
Explicit Image Caption Editing
Zhen Wang
Long Chen
Wenbo Ma
G. Han
Yulei Niu
Jian Shao
Jun Xiao
25
12
0
20 Jul 2022
Relational Future Captioning Model for Explaining Likely Collisions in Daily Tasks
Motonari Kambara
K. Sugiura
32
6
0
19 Jul 2022
Unifying Event Detection and Captioning as Sequence Generation via Pre-Training
Qi Zhang
Yuqing Song
Qin Jin
30
24
0
18 Jul 2022
Towards the Human Global Context: Does the Vision-Language Model Really Judge Like a Human Being?
Sangmyeong Woh
Jaemin Lee
Hoki Kim
Jinsuk Lee
21
0
0
18 Jul 2022
Dual-branch Hybrid Learning Network for Unbiased Scene Graph Generation
Chao Zheng
Lianli Gao
Xinyu Lyu
Pengpeng Zeng
Abdulmotaleb El Saddik
Hengtao Shen
37
14
0
16 Jul 2022
LineCap: Line Charts for Data Visualization Captioning Models
Anita Mahinpei
Zona Kostic
Christy Tanner
VLM
34
17
0
15 Jul 2022
A Baseline for Detecting Out-of-Distribution Examples in Image Captioning
Gabi Shalev
Gal-Lev Shalev
Joseph Keshet
OODD
29
7
0
12 Jul 2022
Cross-modal Prototype Driven Network for Radiology Report Generation
Jun Wang
A. Bhalerao
Yulan He
MedIm
93
73
0
11 Jul 2022
Adaptive Fine-Grained Predicates Learning for Scene Graph Generation
Xinyu Lyu
Lianli Gao
Pengpeng Zeng
Hengtao Shen
Jingkuan Song
44
18
0
11 Jul 2022
Predicting Word Learning in Children from the Performance of Computer Vision Systems
Sunayana Rane
Mira L. Nencheva
Zeyu Wang
C. Lew‐Williams
Olga Russakovsky
Thomas Griffiths
21
3
0
07 Jul 2022
Exploring the sequence length bottleneck in the Transformer for Image Captioning
Jiapeng Hu
Roberto Cavicchioli
Alessandro Capotondi
ViT
43
3
0
07 Jul 2022