1511.07571
DenseCap: Fully Convolutional Localization Networks for Dense Captioning
24 November 2015
Justin Johnson
A. Karpathy
Li Fei-Fei
VLM
Papers citing
"DenseCap: Fully Convolutional Localization Networks for Dense Captioning"
50 / 452 papers shown
Understanding Guided Image Captioning Performance across Domains
Edwin G. Ng
Bo Pang
P. Sharma
Radu Soricut
27
24
0
04 Dec 2020
Scan2Cap: Context-aware Dense Captioning in RGB-D Scans
Dave Zhenyu Chen
A. Gholami
Matthias Nießner
Angel X. Chang
3DPC
23
159
0
03 Dec 2020
SuperOCR: A Conversion from Optical Character Recognition to Image Captioning
Baohua Sun
Michael Lin
Hao Sha
Lin Yang
19
5
0
21 Nov 2020
Watch and Learn: Mapping Language and Noisy Real-world Videos with Self-supervision
Yujie Zhong
Linhai Xie
Sen Wang
Lucia Specia
Yishu Miao
SSL
11
0
0
19 Nov 2020
iPerceive: Applying Common-Sense Reasoning to Multi-Modal Dense Video Captioning and Video Question Answering
Aman Chadha
Gurneet Arora
Navpreet Kaloty
19
35
0
16 Nov 2020
MAGNeto: An Efficient Deep Learning Method for the Extractive Tags Summarization Problem
H. Phung
A. Vu
Tung D. Nguyen
Lam Thanh Do
Giang Nam Ngo
Trung Thanh Tran
Hà Nội
ViT
17
0
0
09 Nov 2020
Diverse Image Captioning with Context-Object Split Latent Spaces
Shweta Mahajan
Stefan Roth
19
41
0
02 Nov 2020
Boost Image Captioning with Knowledge Reasoning
Feicheng Huang
Zhixin Li
Haiyang Wei
Canlong Zhang
Huifang Ma
9
25
0
02 Nov 2020
TextMage: The Automated Bangla Caption Generator Based On Deep Learning
Abrar Hasin Kamal
Md Asifuzzaman Jishan
N. Mansoor
VLM
8
17
0
15 Oct 2020
Diagnosing and Preventing Instabilities in Recurrent Video Processing
T. Tanay
Aivar Sootla
Matteo Maggioni
P. Dokania
Philip Torr
A. Leonardis
Greg Slabaugh
19
7
0
10 Oct 2020
Dense Relational Image Captioning via Multi-task Triple-Stream Networks
Dong-Jin Kim
Tae-Hyun Oh
Jinsoo Choi
In So Kweon
29
27
0
08 Oct 2020
ALFWorld: Aligning Text and Embodied Environments for Interactive Learning
Mohit Shridhar
Xingdi Yuan
Marc-Alexandre Côté
Yonatan Bisk
Adam Trischler
Matthew J. Hausknecht
LM&Ro
LLMAG
32
397
0
08 Oct 2020
Rescribe: Authoring and Automatically Editing Audio Descriptions
Amy Pavel
G. Reyes
Jeffrey P. Bigham
12
58
0
07 Oct 2020
Spatial Attention as an Interface for Image Captioning Models
P. Sadler
20
0
0
29 Sep 2020
VIVO: Visual Vocabulary Pre-Training for Novel Object Captioning
Xiaowei Hu
Xi Yin
Kevin Qinghong Lin
Lijuan Wang
Lefei Zhang
Jianfeng Gao
Zicheng Liu
VLM
14
56
0
28 Sep 2020
Towards Unique and Informative Captioning of Images
Zeyu Wang
Berthy Feng
Karthik Narasimhan
Olga Russakovsky
25
37
0
08 Sep 2020
CoNCRA: A Convolutional Neural Network Code Retrieval Approach
Marcelo de Rezende Martins
M. Gerosa
19
11
0
03 Sep 2020
Cross-modal Knowledge Reasoning for Knowledge-based Visual Question Answering
Jiahao Yu
Zihao Zhu
Yujing Wang
Weifeng Zhang
Yue Hu
Jianlong Tan
8
98
0
31 Aug 2020
Decoupled Variational Embedding for Signed Directed Networks
Xu Chen
Jiangchao Yao
Maosen Li
Ya Zhang
Yanfeng Wang
14
4
0
28 Aug 2020
Matching Guided Distillation
Kaiyu Yue
Jiangfan Deng
Feng Zhou
14
49
0
23 Aug 2020
Weakly supervised cross-domain alignment with optimal transport
Siyang Yuan
Ke Bai
Liqun Chen
Yizhe Zhang
Chenyang Tao
Chunyuan Li
Guoyin Wang
Ricardo Henao
Lawrence Carin
OT
24
7
0
14 Aug 2020
KBGN: Knowledge-Bridge Graph Network for Adaptive Vision-Text Reasoning in Visual Dialogue
X. Jiang
Siyi Du
Zengchang Qin
Yajing Sun
Jiahao Yu
29
37
0
11 Aug 2020
Textual Description for Mathematical Equations
Ajoy Mondal
C. V. Jawahar
14
2
0
07 Aug 2020
Fashion Captioning: Towards Generating Accurate Descriptions with Semantic Rewards
Xuewen Yang
Heming Zhang
Di Jin
Yingru Liu
Chi-Hao Wu
Jianchao Tan
Dongliang Xie
Jue Wang
Xin Wang
19
68
0
06 Aug 2020
Eigen-CAM: Class Activation Map using Principal Components
Mohammed Bany Muhammad
M. Yeasin
17
334
0
01 Aug 2020
Weakly supervised one-stage vision and language disease detection using large scale pneumonia and pneumothorax studies
Leo K. Tam
Xiaosong Wang
E. Turkbey
Kevin Lu
Yuhong Wen
Daguang Xu
31
13
0
31 Jul 2020
Comprehensive Image Captioning via Scene Graph Decomposition
Yiwu Zhong
Liwei Wang
Jianshu Chen
Dong Yu
Yin Li
87
124
0
23 Jul 2020
Diverse and Styled Image Captioning Using SVD-Based Mixture of Recurrent Experts
Marzi Heidari
M. Ghatee
A. Nickabadi
Arash Pourhasan Nezhad
DiffM
MoE
35
1
0
07 Jul 2020
Mucko: Multi-Layer Cross-Modal Knowledge Reasoning for Fact-based Visual Question Answering
Zihao Zhu
Jiahao Yu
Yujing Wang
Yajing Sun
Yue Hu
Qi Wu
30
125
0
16 Jun 2020
iSeeBetter: Spatio-temporal video super-resolution using recurrent generative back-projection networks
Aman Chadha
John Britto
M. Mani Roja
SupR
14
25
0
13 Jun 2020
MetricUNet: Synergistic Image- and Voxel-Level Learning for Precise CT Prostate Segmentation via Online Sampling
Kelei He
C. Lian
Ehsan Adeli
Jing Huo
Yang Gao
Bing-Bin Zhang
Junfeng Zhang
D. Shen
14
0
0
15 May 2020
Dense-Caption Matching and Frame-Selection Gating for Temporal Localization in VideoQA
Hyounghun Kim
Zineng Tang
Joey Tianyi Zhou
33
31
0
13 May 2020
Towards Embodied Scene Description
Sinan Tan
Huaping Liu
Di Guo
Xinyu Zhang
F. Sun
LM&Ro
10
9
0
30 Apr 2020
Show, Describe and Conclude: On Exploiting the Structure Information of Chest X-Ray Reports
Baoyu Jing
Zeya Wang
Eric P. Xing
14
139
0
26 Apr 2020
Visual Question Answering Using Semantic Information from Image Descriptions
Tasmia Tasrin
Md Sultan al Nahian
Brent Harrison
18
0
0
23 Apr 2020
VisualCOMET: Reasoning about the Dynamic Context of a Still Image
J. S. Park
Chandra Bhagavatula
Roozbeh Mottaghi
Ali Farhadi
Yejin Choi
ReLM
LRM
27
6
0
22 Apr 2020
ParaCNN: Visual Paragraph Generation via Adversarial Twin Contextual CNNs
Shiyang Yan
Yang Hua
N. Robertson
6
7
0
21 Apr 2020
Context-Aware Group Captioning via Self-Attention and Contrastive Features
Zhuowan Li
Quan Hung Tran
Long Mai
Zhe-nan Lin
Alan Yuille
VLM
14
44
0
07 Apr 2020
Semantic Image Manipulation Using Scene Graphs
Helisa Dhamo
Azade Farshad
Iro Laina
Nassir Navab
Gregory Hager
Federico Tombari
Christian Rupprecht
22
119
0
07 Apr 2020
Consistent Multiple Sequence Decoding
Bicheng Xu
Leonid Sigal
31
0
0
02 Apr 2020
Detection and Description of Change in Visual Streams
Davis Gilton
Ruotian Luo
Rebecca Willett
Gregory Shakhnarovich
AI4TS
10
4
0
27 Mar 2020
Exploring Long Tail Visual Relationship Recognition with Large Vocabulary
Sherif Abdelkarim
Aniket Agarwal
Panos Achlioptas
Jun Chen
Jiaji Huang
Boyang Albert Li
Kenneth Ward Church
Mohamed Elhoseiny
VLM
32
18
0
25 Mar 2020
Bootstrapping Weakly Supervised Segmentation-free Word Spotting through HMM-based Alignment
T. Wilkinson
Carl Nettelblad
6
1
0
24 Mar 2020
Multi-modal Dense Video Captioning
Vladimir E. Iashin
Esa Rahtu
22
164
0
17 Mar 2020
PointINS: Point-based Instance Segmentation
Lu Qi
Xinming Zhang
Yukang Chen
Ying-Cong Chen
Xiangyu Zhang
Jian Sun
Jiaya Jia
ISeg
3DPC
32
31
0
13 Mar 2020
OVC-Net: Object-Oriented Video Captioning with Temporal Graph and Detail Enhancement
Fangyi Zhu
Lei Li
Zhanyu Ma
Guang Chen
Jun Guo
14
1
0
08 Mar 2020
Say As You Wish: Fine-grained Control of Image Caption Generation with Abstract Scene Graphs
Shizhe Chen
Qin Jin
Peng Wang
Qi Wu
DiffM
36
215
0
01 Mar 2020
A Convolutional Baseline for Person Re-Identification Using Vision and Language Descriptions
Ammarah Farooq
Muhammad Awais
F. Yan
J. Kittler
A. Akbari
S. S. Khalid
18
8
0
20 Feb 2020
Weakly Supervised Attention Pyramid Convolutional Neural Network for Fine-Grained Visual Classification
Yifeng Ding
Shaoguo Wen
Jiyang Xie
Dongliang Chang
Zhanyu Ma
Zhongwei Si
Haibin Ling
43
56
0
09 Feb 2020
Visual Concept-Metaconcept Learning
Chi Han
Jiayuan Mao
Chuang Gan
J. Tenenbaum
Jiajun Wu
NAI
LRM
6
63
0
04 Feb 2020