1411.5726
Cited By
CIDEr: Consensus-based Image Description Evaluation
20 November 2014
Ramakrishna Vedantam
C. L. Zitnick
Devi Parikh
Papers citing
"CIDEr: Consensus-based Image Description Evaluation"
50 / 2,150 papers shown
Implicit and Explicit Commonsense for Multi-sentence Video Captioning
Shih-Han Chou
James J. Little
Leonid Sigal
31
2
0
14 Mar 2023
Breaking Common Sense: WHOOPS! A Vision-and-Language Benchmark of Synthetic and Compositional Images
Nitzan Bitton-Guetta
Yonatan Bitton
Jack Hessel
Ludwig Schmidt
Yuval Elovici
Gabriel Stanovsky
Roy Schwartz
VLM
123
67
0
13 Mar 2023
ChatGPT Asks, BLIP-2 Answers: Automatic Questioning Towards Enriched Visual Descriptions
Deyao Zhu
Jun Chen
Kilichbek Haydarov
Xiaoqian Shen
Wenxuan Zhang
Mohamed Elhoseiny
MLLM
50
100
0
12 Mar 2023
ZeroNLG: Aligning and Autoencoding Domains for Zero-Shot Multimodal and Multilingual Natural Language Generation
Bang-ju Yang
Fenglin Liu
Yuexian Zou
Xian Wu
Yaowei Wang
David Clifton
43
9
0
11 Mar 2023
Learning Combinatorial Prompts for Universal Controllable Image Captioning
Zhen Wang
Jun Xiao
Yueting Zhuang
Fei Gao
Jian Shao
Long Chen
60
5
0
11 Mar 2023
Interpretable Visual Question Answering Referring to Outside Knowledge
He Zhu
Ren Togo
Takahiro Ogawa
Miki Haseyama
32
0
0
08 Mar 2023
Describe me an Aucklet: Generating Grounded Perceptual Category Descriptions
Bill Noble
N. Ilinykh
34
0
0
07 Mar 2023
Graph Neural Networks in Vision-Language Image Understanding: A Survey
Henry Senior
Greg Slabaugh
Shanxin Yuan
Luca Rossi
GNN
38
14
0
07 Mar 2023
Neighborhood Contrastive Transformer for Change Captioning
Yunbin Tu
Liang Li
Li Su
Kelvin Lu
Qin Huang
ViT
24
14
0
06 Mar 2023
DeCap: Decoding CLIP Latents for Zero-Shot Captioning via Text-Only Training
Wei Li
Linchao Zhu
Longyin Wen
Yi Yang
VLM
56
86
0
06 Mar 2023
Models See Hallucinations: Evaluating the Factuality in Video Captioning
Hui Liu
Xiaojun Wan
HILM
40
11
0
06 Mar 2023
Comparative study of Transformer and LSTM Network with attention mechanism on Image Captioning
Pranav Dandwate
Chaitanya Shahane
V. Jagtap
Shridevi C. Karande
64
8
0
05 Mar 2023
Prismer: A Vision-Language Model with Multi-Task Experts
Shikun Liu
Linxi Fan
Edward Johns
Zhiding Yu
Chaowei Xiao
Anima Anandkumar
VLM
MLLM
55
22
0
04 Mar 2023
FAME-ViL: Multi-Tasking Vision-Language Model for Heterogeneous Fashion Tasks
Xiaoping Han
Xiatian Zhu
Licheng Yu
Li Zhang
Yi-Zhe Song
Tao Xiang
VLM
29
39
0
04 Mar 2023
ConZIC: Controllable Zero-shot Image Captioning by Sampling-Based Polishing
Zequn Zeng
Hao Zhang
Zhengjue Wang
Ruiying Lu
Dongsheng Wang
Bo Chen
BDL
DiffM
29
33
0
04 Mar 2023
Vid2Seq: Large-Scale Pretraining of a Visual Language Model for Dense Video Captioning
Antoine Yang
Arsha Nagrani
Paul Hongsuck Seo
Antoine Miech
Jordi Pont-Tuset
Ivan Laptev
Josef Sivic
Cordelia Schmid
AI4TS
VLM
73
225
0
27 Feb 2023
Language Is Not All You Need: Aligning Perception with Language Models
Shaohan Huang
Li Dong
Wenhui Wang
Y. Hao
Saksham Singhal
...
Johan Bjorck
Vishrav Chaudhary
Subhojit Som
Xia Song
Furu Wei
VLM
LRM
MLLM
38
544
0
27 Feb 2023
Improving Medical Speech-to-Text Accuracy with Vision-Language Pre-training Model
Jaeyoung Huh
Sangjoon Park
Jeonghyeon Lee
Jong Chul Ye
LM&MA
25
10
0
27 Feb 2023
Learning Visual Representations via Language-Guided Sampling
Mohamed El Banani
Karan Desai
Justin Johnson
SSL
VLM
57
28
0
23 Feb 2023
HL Dataset: Visually-grounded Description of Scenes, Actions and Rationales
Michele Cafagna
Kees van Deemter
Albert Gatt
3DV
26
4
0
23 Feb 2023
Test-Time Distribution Normalization for Contrastively Learned Vision-language Models
Yi Zhou
Juntao Ren
Fengyu Li
Ramin Zabih
Ser-Nam Lim
VLM
47
14
0
22 Feb 2023
STOA-VLP: Spatial-Temporal Modeling of Object and Action for Video-Language Pre-training
Weihong Zhong
Mao Zheng
Duyu Tang
Xuan Luo
Heng Gong
Xiaocheng Feng
Bing Qin
52
8
0
20 Feb 2023
Towards Unifying Medical Vision-and-Language Pre-training via Soft Prompts
Zhihong Chen
Shizhe Diao
Benyou Wang
Guanbin Li
Xiang Wan
MedIm
63
30
0
17 Feb 2023
Retrieval-augmented Image Captioning
R. Ramos
Desmond Elliott
Bruno Martins
VLM
43
29
0
16 Feb 2023
Tuning computer vision models with task rewards
André Susano Pinto
Alexander Kolesnikov
Yuge Shi
Lucas Beyer
Xiaohua Zhai
VLM
32
40
0
16 Feb 2023
Towards Local Visual Modeling for Image Captioning
Yiwei Ma
Jiayi Ji
Xiaoshuai Sun
Yiyi Zhou
Rongrong Ji
ViT
29
72
0
13 Feb 2023
See Your Heart: Psychological states Interpretation through Visual Creations
Likun Yang
Xiaokun Feng
Xiaotang Chen
Shiyu Zhang
Kaiqi Huang
16
0
0
11 Feb 2023
Stacked Cross-modal Feature Consolidation Attention Networks for Image Captioning
Mozhgan Pourkeshavarz
Shahabedin Nabavi
Mohsen Moghaddam
M. Shamsfard
48
4
0
08 Feb 2023
KENGIC: KEyword-driven and N-Gram Graph based Image Captioning
Brandon Birmingham
A. Muscat
32
1
0
07 Feb 2023
Transform, Contrast and Tell: Coherent Entity-Aware Multi-Image Captioning
Jingqiang Chen
30
4
0
04 Feb 2023
DEVICE: Depth and Visual Concepts Aware Transformer for OCR-based Image Captioning
Dongsheng Xu
Qingbao Huang
Shuang Feng
Yiru Cai
Feng Shuang
Yi Cai
ViT
VLM
47
0
0
03 Feb 2023
IC3: Image Captioning by Committee Consensus
David M. Chan
Austin Myers
Sudheendra Vijayanarasimhan
David A. Ross
John F. Canny
34
17
0
02 Feb 2023
ADAPT: Action-aware Driving Caption Transformer
Bu Jin
Xinyi Liu
Yupeng Zheng
Pengfei Li
Hao Zhao
Tong Zhang
Yuhang Zheng
Guyue Zhou
Jingjing Liu
50
70
0
01 Feb 2023
Semi-Parametric Video-Grounded Text Generation
Sungdong Kim
Jin-Hwa Kim
Jiyoung Lee
Minjoon Seo
VGen
37
14
0
27 Jan 2023
Style-Aware Contrastive Learning for Multi-Style Image Captioning
Yucheng Zhou
Guodong Long
27
22
0
26 Jan 2023
Multimodal Event Transformer for Image-guided Story Ending Generation
Yucheng Zhou
Guodong Long
38
19
0
26 Jan 2023
Semi-Supervised Image Captioning by Adversarially Propagating Labeled Data
Dong-Jin Kim
Tae-Hyun Oh
Jinsoo Choi
In So Kweon
SSL
VLM
31
4
0
26 Jan 2023
Towards a Unified Model for Generating Answers and Explanations in Visual Question Answering
Chenxi Whitehouse
Tillman Weyde
Pranava Madhyastha
LRM
63
3
0
25 Jan 2023
Visual Semantic Relatedness Dataset for Image Captioning
Ahmed Sabir
Francesc Moreno-Noguer
Lluís Padró
CoGe
VLM
49
3
0
20 Jan 2023
Visual Writing Prompts: Character-Grounded Story Generation with Curated Image Sequences
Xudong Hong
A. Sayeed
K. Mehra
Vera Demberg
Bernt Schiele
VGen
29
35
0
20 Jan 2023
Embodied Agents for Efficient Exploration and Smart Scene Description
Roberto Bigazzi
Marcella Cornia
S. Cascianelli
Lorenzo Baraldi
Rita Cucchiara
LM&Ro
27
7
0
17 Jan 2023
TikTalk: A Video-Based Dialogue Dataset for Multi-Modal Chitchat in Real World
Hongpeng Lin
Ludan Ruan
Wenke Xia
Peiyu Liu
Jing Wen
...
Di Hu
Ruihua Song
Wayne Xin Zhao
Qin Jin
Zhiwu Lu
VGen
44
10
0
14 Jan 2023
End-to-End 3D Dense Captioning with Vote2Cap-DETR
Sijin Chen
Erik Cambria
Xin Chen
Yinjie Lei
Tao Chen
YU Gang
ViT
33
53
0
06 Jan 2023
An Image captioning algorithm based on the Hybrid Deep Learning Technique (CNN+GRU)
Rana Adnan Ahmad
Muhammad Azhar
Hina Sattar
31
10
0
06 Jan 2023
Adaptively Clustering Neighbor Elements for Image-Text Generation
Zihua Wang
Xu Yang
Hanwang Zhang
Haiyang Xu
Mingshi Yan
Feisi Huang
Yu Zhang
VLM
116
0
0
05 Jan 2023
SPRING: Situated Conversation Agent Pretrained with Multimodal Questions from Incremental Layout Graph
Yuxing Long
Binyuan Hui
Fulong Ye
Yanyang Li
Zhuoxin Han
Caixia Yuan
Yongbin Li
Xiaojie Wang
LLMAG
43
7
0
05 Jan 2023
HiTeA: Hierarchical Temporal-Aware Video-Language Pre-training
Qinghao Ye
Guohai Xu
Ming Yan
Haiyang Xu
Qi Qian
Ji Zhang
Fei Huang
VLM
AI4TS
188
71
0
30 Dec 2022
Neural Shape Compiler: A Unified Framework for Transforming between Text, Point Cloud, and Program
Tiange Luo
Honglak Lee
Justin Johnson
58
5
0
25 Dec 2022
Do DALL-E and Flamingo Understand Each Other?
Hang Li
Jindong Gu
Rajat Koner
Sahand Sharifzadeh
Volker Tresp
MLLM
31
12
0
23 Dec 2022
Confidence-Aware Paced-Curriculum Learning by Label Smoothing for Surgical Scene Understanding
Mengya Xu
Mobarakol Islam
Ben Glocker
Hongliang Ren
38
2
0
22 Dec 2022