DenseCap: Fully Convolutional Localization Networks for Dense Captioning


24 November 2015
Justin Johnson
A. Karpathy
Li Fei-Fei
    VLM

Papers citing "DenseCap: Fully Convolutional Localization Networks for Dense Captioning"

50 / 452 papers shown
Improving Deep Visual Representation for Person Re-identification by Global and Local Image-language Association
Dapeng Chen
Hongsheng Li
Xihui Liu
Yantao Shen
Zejian Yuan
Xiaogang Wang
22
134
0
05 Aug 2018
Equal But Not The Same: Understanding the Implicit Relationship Between Persuasive Images and Text
Ruotong Wang
R. Hwa
Adriana Kovashka
18
54
0
21 Jul 2018
Presentation Attack Detection for Cadaver Iris
Mateusz Trokielewicz
A. Czajka
P. Maciejewicz
CVBM
17
24
0
11 Jul 2018
Dynamic Multimodal Instance Segmentation guided by natural language queries
Edgar Margffoy-Tuay
Juan C. Pérez
Emilio Botero
Pablo Arbelaez
27
170
0
06 Jul 2018
Face-Cap: Image Captioning using Facial Expression Analysis
Omid Mohamad Nezami
Mark Dras
Peter Anderson
Len Hamey
CVBM
27
27
0
06 Jul 2018
Factorizable Net: An Efficient Subgraph-based Framework for Scene Graph Generation
Yikang Li
Wanli Ouyang
Bolei Zhou
Jianping Shi
Yawen Cui
Xiaogang Wang
GNN
17
273
0
29 Jun 2018
Learning Multimodal Representations for Unseen Activities
A. Piergiovanni
Michael S. Ryoo
SSL
16
4
0
21 Jun 2018
Part-Aware Fine-grained Object Categorization using Weakly Supervised Part Detection Network
Yabin Zhang
Kui Jia
Zhixin Wang
11
23
0
16 Jun 2018
Interactive Visual Grounding of Referring Expressions for Human-Robot Interaction
Mohit Shridhar
David Hsu
22
142
0
11 Jun 2018
Video Description: A Survey of Methods, Datasets and Evaluation Metrics
Nayyer Aafaq
Ajmal Mian
Wei Liu
Syed Zulqarnain Gilani
Mubarak Shah
6
91
0
01 Jun 2018
GLAC Net: GLocal Attention Cascading Networks for Multi-image Cued Story Generation
Taehyeong Kim
Min-Oh Heo
Seonil Son
Kyoung-Wha Park
Byoung-Tak Zhang
23
75
0
28 May 2018
Identifying Object States in Cooking-Related Images
Ahmad Babaeian Jelodar
Md Sirajus Salekin
Yu Sun
22
37
0
17 May 2018
Deep Perm-Set Net: Learn to predict sets with unknown permutation and cardinality using deep neural networks
S. Hamid Rezatofighi
Roman Kaskman
F. Motlagh
Javen Qinfeng Shi
Daniel Cremers
Laura Leal-Taixé
Ian Reid
SSL
23
23
0
02 May 2018
Large-Scale Visual Relationship Understanding
Ji Zhang
Yannis Kalantidis
Marcus Rohrbach
Manohar Paluri
Ahmed Elgammal
Mohamed Elhoseiny
14
167
0
27 Apr 2018
Customized Image Narrative Generation via Interactive Visual Question Generation and Answering
Andrew Shin
Yoshitaka Ushiku
Tatsuya Harada
42
7
0
27 Apr 2018
Entity-aware Image Caption Generation
Di Lu
Spencer Whitehead
Lifu Huang
Heng Ji
Shih-Fu Chang
VLM
25
82
0
21 Apr 2018
Multilevel Language and Vision Integration for Text-to-Clip Retrieval
Huijuan Xu
Kun He
Bryan A. Plummer
Leonid Sigal
Stan Sclaroff
Kate Saenko
CLIP
19
319
0
13 Apr 2018
Hybrid Binary Networks: Optimizing for Accuracy, Efficiency and Memory
Ameya Prabhu
Vishal Batchu
Rohit Gajawada
Sri Aurobindo Munagala
A. Namboodiri
MQ
33
18
0
11 Apr 2018
Decoupled Novel Object Captioner
Yuehua Wu
Linchao Zhu
Lu Jiang
Yi Yang
10
62
0
11 Apr 2018
Learning a Text-Video Embedding from Incomplete and Heterogeneous Data
Antoine Miech
Ivan Laptev
Josef Sivic
22
233
0
07 Apr 2018
Guess Where? Actor-Supervision for Spatiotemporal Action Localization
Victor Escorcia
Cuong Duc Dao
Mihir Jain
Guohao Li
Cees G. M. Snoek
24
29
0
05 Apr 2018
Jointly Discovering Visual Objects and Spoken Words from Raw Sensory Input
David Harwath
Adrià Recasens
Dídac Surís
Galen Chuang
Antonio Torralba
James R. Glass
32
200
0
04 Apr 2018
Guide Me: Interacting with Deep Networks
Christian Rupprecht
Iro Laina
Nassir Navab
Gregory Hager
Federico Tombari
HAI
35
38
0
30 Mar 2018
A New Target-specific Object Proposal Generation Method for Visual Tracking
Guanjun Guo
Hanzi Wang
Yan Yan
H. Liao
Bo-wen Li
18
4
0
27 Mar 2018
Neural Baby Talk
Jiasen Lu
Jianwei Yang
Dhruv Batra
Devi Parikh
VLM
200
434
0
27 Mar 2018
Explicit Reasoning over End-to-End Neural Architectures for Visual Question Answering
Somak Aditya
Yezhou Yang
Chitta Baral
LRM
NAI
ReLM
20
53
0
23 Mar 2018
EVA$^2$: Exploiting Temporal Redundancy in Live Computer Vision
Mark Buckler
Philip Bedoukian
Suren Jayasuriya
Adrian Sampson
39
75
0
16 Mar 2018
Object Captioning and Retrieval with Natural Language
A. Nguyen
Thanh-Toan Do
Ian Reid
D. Caldwell
Nikos G. Tsagarakis
3DV
22
18
0
16 Mar 2018
Inverse Visual Question Answering: A New Benchmark and VQA Diagnosis Tool
Feng Liu
Tao Xiang
Timothy M. Hospedales
Wankou Yang
Changyin Sun
30
29
0
16 Mar 2018
Approximate Query Matching for Image Retrieval
Abhijit Suprem
Polo Chau
19
1
0
14 Mar 2018
Less Is More: Picking Informative Frames for Video Captioning
Yangyu Chen
Shuhui Wang
Feiyu Xiong
Qingming Huang
12
200
0
05 Mar 2018
Joint Event Detection and Description in Continuous Video Streams
Huijuan Xu
Boyang Albert Li
Vasili Ramanishka
Leonid Sigal
Kate Saenko
8
51
0
28 Feb 2018
Neural Aesthetic Image Reviewer
Wenshan Wang
Su Yang
Weishan Zhang
Jiulong Zhang
22
38
0
28 Feb 2018
Teaching Machines to Code: Neural Markup Generation with Visual Attention
Sumeet S. Singh
14
7
0
15 Feb 2018
FlipDial: A Generative Model for Two-Way Visual Dialogue
Daniela Massiceti
N. Siddharth
P. Dokania
Philip Torr
MLLM
27
41
0
11 Feb 2018
Generating Triples with Adversarial Networks for Scene Graph Construction
Matthew Klawonn
Eric Heim
GAN
GNN
32
22
0
07 Feb 2018
E2E-MLT - an Unconstrained End-to-End Method for Multi-Language Scene Text
M. Busta
Yash J. Patel
Jiří Matas
33
91
0
30 Jan 2018
Image denoising and restoration with CNN-LSTM Encoder Decoder with Direct Attention
Kazi Nazmul Haque
M. Yousuf
R. Rana
3DV
19
21
0
16 Jan 2018
TieNet: Text-Image Embedding Network for Common Thorax Disease Classification and Reporting in Chest X-rays
Xiaosong Wang
Yifan Peng
Le Lu
Zhiyong Lu
Ronald M. Summers
MedIm
38
462
0
12 Jan 2018
Visual Text Correction
Amir Mazaheri
M. Shah
44
11
0
06 Jan 2018
Object Referring in Videos with Language and Human Gaze
A. Vasudevan
Dengxin Dai
Luc Van Gool
VOS
27
73
0
04 Jan 2018
Exploring Models and Data for Remote Sensing Image Caption Generation
Xiaoqiang Lu
Binqiang Wang
Xiangtao Zheng
Xuelong Li
24
461
0
21 Dec 2017
Learning to Act Properly: Predicting and Explaining Affordances from Images
Ching-Yao Chuang
Jiaman Li
Antonio Torralba
Sanja Fidler
16
101
0
20 Dec 2017
Attribute CNNs for Word Spotting in Handwritten Documents
Sebastian Sudholt
G. Fink
30
55
0
20 Dec 2017
Beyond the Pixel-Wise Loss for Topology-Aware Delineation
Agata Mosinska
Pablo Márquez-Neila
Mateusz Koziński
Pascal Fua
3DV
37
231
0
06 Dec 2017
Sequence Mining and Pattern Analysis in Drilling Reports with Deep Natural Language Processing
J. Hoffimann
Youli Mao
A. Wesley
Aimee Taylor
13
15
0
05 Dec 2017
Examining Cooperation in Visual Dialog Models
Mircea Mironenco
D. Kianfar
Ke M. Tran
Evangelos Kanoulas
E. Gavves
20
4
0
04 Dec 2017
Discriminative Learning of Open-Vocabulary Object Retrieval and Localization by Negative Phrase Augmentation
Ryota Hinami
Shin'ichi Satoh
ObjD
14
22
0
27 Nov 2017
Conditional Image-Text Embedding Networks
Bryan A. Plummer
Paige Kordas
M. Kiapour
Shuai Zheng
Robinson Piramuthu
Svetlana Lazebnik
26
118
0
22 Nov 2017
On the Automatic Generation of Medical Imaging Reports
Baoyu Jing
P. Xie
Eric P. Xing
MedIm
35
503
0
22 Nov 2017