ResearchTrend.AI
From Captions to Visual Concepts and Back

arXiv: 1411.4952

18 November 2014
Hao Fang
Saurabh Gupta
F. Iandola
R. Srivastava
Li Deng
Piotr Dollár
Jianfeng Gao
Xiaodong He
Margaret Mitchell
John C. Platt
C. L. Zitnick
Geoffrey Zweig
    VLM

Papers citing "From Captions to Visual Concepts and Back"

50 / 213 papers shown
Learning Language-Visual Embedding for Movie Understanding with Natural-Language
Atousa Torabi
Niket Tandon
Leonid Sigal
22
97
0
26 Sep 2016
Show and Tell: Lessons learned from the 2015 MSCOCO Image Captioning Challenge
Oriol Vinyals
Alexander Toshev
Samy Bengio
D. Erhan
27
850
0
21 Sep 2016
Measuring Machine Intelligence Through Visual Question Answering
C. L. Zitnick
Aishwarya Agrawal
Stanislaw Antol
Margaret Mitchell
Dhruv Batra
Devi Parikh
24
37
0
31 Aug 2016
Learning to generalize to new compositions in image understanding
Y. Atzmon
Jonathan Berant
Vahid Kazemi
Amir Globerson
Gal Chechik
26
67
0
27 Aug 2016
Seeing with Humans: Gaze-Assisted Neural Image Captioning
Yusuke Sugano
Andreas Bulling
24
68
0
18 Aug 2016
Modeling Context in Referring Expressions
Licheng Yu
Patrick Poirson
Shan Yang
Alexander C. Berg
Tamara L. Berg
30
1,227
0
31 Jul 2016
Visual Relationship Detection with Language Priors
Cewu Lu
Ranjay Krishna
Michael S. Bernstein
Li Fei-Fei
VLM
16
1,134
0
31 Jul 2016
SPICE: Semantic Propositional Image Caption Evaluation
Peter Anderson
Basura Fernando
Mark Johnson
Stephen Gould
EGVM
36
1,884
0
29 Jul 2016
Captioning Images with Diverse Objects
Subhashini Venugopalan
Lisa Anne Hendricks
Marcus Rohrbach
Raymond J. Mooney
Trevor Darrell
Kate Saenko
VLM
27
178
0
24 Jun 2016
Question Relevance in VQA: Identifying Non-Visual And False-Premise Questions
Arijit Ray
Gordon A. Christie
Joey Tianyi Zhou
Dhruv Batra
Devi Parikh
27
56
0
21 Jun 2016
Unsupervised Learning of Predictors from Unpaired Input-Output Samples
Jianshu Chen
Po-Sen Huang
Xiaodong He
Jianfeng Gao
Li Deng
OOD
SSL
26
8
0
15 Jun 2016
cvpaper.challenge in 2015 - A review of CVPR2015 and DeepSurvey
Hirokatsu Kataoka
Yudai Miyashita
Tomoaki K. Yamabe
Soma Shirakabe
Shin-ichi Sato
...
Kaori Abe
Takaaki Imanari
Naomichi Kobayashi
Shinichiro Morita
Akio Nakamura
24
2
0
26 May 2016
Review Networks for Caption Generation
Zhilin Yang
Ye Yuan
Yuexin Wu
Ruslan Salakhutdinov
William W. Cohen
3DV
32
85
0
25 May 2016
Movie Description
Anna Rohrbach
Atousa Torabi
Marcus Rohrbach
Niket Tandon
C. Pal
Hugo Larochelle
Aaron Courville
Bernt Schiele
3DV
VGen
32
353
0
12 May 2016
Word2VisualVec: Image and Video to Sentence Matching by Visual Feature Prediction
Jianfeng Dong
Xirong Li
Cees G. M. Snoek
3DV
24
35
0
23 Apr 2016
Subjects and Their Objects: Localizing Interactees for a Person-Centric View of Importance
Chao-Yeh Chen
Kristen Grauman
32
9
0
17 Apr 2016
Visual Storytelling
Ting-Hao 'Kenneth' Huang
Francis Ferraro
N. Mostafazadeh
Ishan Misra
...
C. L. Zitnick
Devi Parikh
Lucy Vanderwende
Michel Galley
Margaret Mitchell
VGen
22
464
0
13 Apr 2016
TGIF: A New Dataset and Benchmark on Animated GIF Description
Yuncheng Li
Yale Song
Liangliang Cao
Joel R. Tetreault
Larry Goldberg
A. Jaimes
Jiebo Luo
25
270
0
10 Apr 2016
Unsupervised Visual Sense Disambiguation for Verbs using Multimodal Embeddings
Spandana Gella
Mirella Lapata
Frank Keller
CoGe
27
52
0
30 Mar 2016
Rich Image Captioning in the Wild
Kenneth Tran
Xiaodong He
Lei Zhang
Jian Sun
Cornelia Carapcea
Chris Thrasher
Chris Buehler
Chris Sienkiewicz
VLM
19
123
0
30 Mar 2016
Generating Visual Explanations
Lisa Anne Hendricks
Zeynep Akata
Marcus Rohrbach
Jeff Donahue
Bernt Schiele
Trevor Darrell
VLM
FAtt
44
618
0
28 Mar 2016
Learning to Read Chest X-Rays: Recurrent Neural Cascade Model for Automated Image Annotation
Hoo-Chang Shin
Kirk Roberts
Le Lu
Dina Demner-Fushman
Jianhua Yao
Ronald M. Summers
18
347
0
28 Mar 2016
Neural Text Generation from Structured Data with Application to the Biography Domain
R. Lebret
David Grangier
Michael Auli
21
45
0
24 Mar 2016
BreakingNews: Article Annotation by Image and Text Processing
Arnau Ramisa
F. Yan
Francesc Moreno-Noguer
K. Mikolajczyk
29
105
0
23 Mar 2016
Image Captioning with Semantic Attention
Quanzeng You
Hailin Jin
Zhaowen Wang
Chen Fang
Jiebo Luo
VLM
64
1,652
0
12 Mar 2016
Image Captioning and Visual Question Answering Based on Attributes and External Knowledge
Qi Wu
Chunhua Shen
Anton Van Den Hengel
Peng Wang
A. Dick
27
360
0
09 Mar 2016
Dynamic Memory Networks for Visual and Textual Question Answering
Caiming Xiong
Stephen Merity
R. Socher
20
753
0
04 Mar 2016
Multimodal Pivots for Image Caption Translation
Julian Hitschler
Shigehiko Schamoni
Stefan Riezler
27
97
0
15 Jan 2016
Automatic Description Generation from Images: A Survey of Models, Datasets, and Evaluation Measures
Raffaella Bernardi
Ruken Cakici
Desmond Elliott
Aykut Erdem
Erkut Erdem
Nazli Ikizler-Cinbis
Frank Keller
A. Muscat
Barbara Plank
EGVM
VLM
27
363
0
15 Jan 2016
DenseCap: Fully Convolutional Localization Networks for Dense Captioning
Justin Johnson
A. Karpathy
Li Fei-Fei
VLM
71
1,159
0
24 Nov 2015
Where To Look: Focus Regions for Visual Question Answering
Kevin J. Shih
Saurabh Singh
Derek Hoiem
34
456
0
23 Nov 2015
Deep Compositional Captioning: Describing Novel Object Categories without Paired Training Data
Lisa Anne Hendricks
Subhashini Venugopalan
Marcus Rohrbach
Raymond J. Mooney
Kate Saenko
Trevor Darrell
CoGe
16
284
0
17 Nov 2015
Recurrent Neural Networks Hardware Implementation on FPGA
Andre Xian Ming Chang
B. Martini
Eugenio Culurciello
18
126
0
17 Nov 2015
Ask, Attend and Answer: Exploring Question-Guided Spatial Attention for Visual Question Answering
Huijuan Xu
Kate Saenko
27
760
0
17 Nov 2015
Yin and Yang: Balancing and Answering Binary Visual Questions
Peng Zhang
Yash Goyal
D. Summers-Stay
Dhruv Batra
Devi Parikh
CoGe
22
349
0
16 Nov 2015
FireCaffe: near-linear acceleration of deep neural network training on compute clusters
F. Iandola
Khalid Ashraf
Matthew W. Moskewicz
Kurt Keutzer
21
302
0
31 Oct 2015
Multilingual Image Description with Neural Sequence Models
Desmond Elliott
Stella Frank
Eva Hasler
VLM
22
75
0
15 Oct 2015
SentiCap: Generating Image Descriptions with Sentiments
A. Mathews
Lexing Xie
Xuming He
26
221
0
06 Oct 2015
Guiding Long-Short Term Memory for Image Caption Generation
Xu Jia
E. Gavves
Basura Fernando
Tinne Tuytelaars
VLM
22
101
0
16 Sep 2015
Describing Multimedia Content using Attention-based Encoder-Decoder Networks
Kyunghyun Cho
Aaron Courville
Yoshua Bengio
32
411
0
04 Jul 2015
Learning language through pictures
Grzegorz Chrupała
Ákos Kádár
A. Alishahi
VLM
SSL
35
65
0
11 Jun 2015
Scheduled Sampling for Sequence Prediction with Recurrent Neural Networks
Samy Bengio
Oriol Vinyals
Navdeep Jaitly
Noam M. Shazeer
72
2,018
0
09 Jun 2015
Learning to Answer Questions From Image Using Convolutional Neural Network
Lin Ma
Zhengdong Lu
Hang Li
27
261
0
01 Jun 2015
Visual Madlibs: Fill in the blank Image Generation and Question Answering
Licheng Yu
Eunbyung Park
Alexander C. Berg
Tamara L. Berg
VLM
MLLM
32
97
0
31 May 2015
A Multi-scale Multiple Instance Video Description Network
Huijuan Xu
Subhashini Venugopalan
Vasili Ramanishka
Marcus Rohrbach
Kate Saenko
40
64
0
21 May 2015
Are You Talking to a Machine? Dataset and Methods for Multilingual Image Question Answering
Haoyuan Gao
Junhua Mao
Jie Zhou
Zhiheng Huang
Lei Wang
Wenyuan Xu
32
496
0
21 May 2015
Jointly Modeling Embedding and Translation to Bridge Video and Language
Yingwei Pan
Tao Mei
Ting Yao
Houqiang Li
Y. Rui
41
535
0
07 May 2015
Language Models for Image Captioning: The Quirks and What Works
Jacob Devlin
Hao Cheng
Hao Fang
Saurabh Gupta
Li Deng
Xiaodong He
Geoffrey Zweig
Margaret Mitchell
32
281
0
07 May 2015
Contextual Action Recognition with R*CNN
Georgia Gkioxari
Ross B. Girshick
Jitendra Malik
HAI
20
401
0
05 May 2015
VQA: Visual Question Answering
Aishwarya Agrawal
Jiasen Lu
Stanislaw Antol
Margaret Mitchell
C. L. Zitnick
Dhruv Batra
Devi Parikh
CoGe
66
5,369
0
03 May 2015