ResearchTrend.AI
Deep Fragment Embeddings for Bidirectional Image Sentence Mapping

22 June 2014
A. Karpathy
Armand Joulin
Li Fei-Fei
    VLM

Papers citing "Deep Fragment Embeddings for Bidirectional Image Sentence Mapping"

50 / 153 papers shown
Deep Reinforcement Learning-based Image Captioning with Embedding Reward
Zhou Ren
Xiaoyu Wang
Ning Zhang
Xutao Lv
Li-Jia Li
34
324
0
12 Apr 2017
Learning Two-Branch Neural Networks for Image-Text Matching Tasks
Liwei Wang
Yin Li
Jing-ling Huang
Svetlana Lazebnik
VLM
27
494
0
11 Apr 2017
Detecting Visual Relationships with Deep Relational Networks
Bo Dai
Yuqi Zhang
Dahua Lin
GNN
59
500
0
11 Apr 2017
Learning Cross-Modal Deep Representations for Robust Pedestrian Detection
Dan Xu
Wanli Ouyang
Elisa Ricci
Xiaogang Wang
N. Sebe
30
191
0
08 Apr 2017
Survey of the State of the Art in Natural Language Generation: Core tasks, applications and evaluation
Albert Gatt
E. Krahmer
LM&MA
ELM
27
810
0
29 Mar 2017
Perception Driven Texture Generation
Yanhai Gan
Huifang Chi
Ying Gao
Jun Liu
G. Zhong
Junyu Dong
OOD
31
12
0
24 Mar 2017
Recurrent Topic-Transition GAN for Visual Paragraph Generation
Xiaodan Liang
Zhiting Hu
Huatian Zhang
Chuang Gan
Eric Xing
GAN
27
200
0
21 Mar 2017
I2T2I: Learning Text to Image Synthesis with Textual Data Augmentation
Hao Dong
Jingqing Zhang
Douglas McIlwraith
Yike Guo
35
58
0
20 Mar 2017
Learning Visual N-Grams from Web Data
Ang Li
Allan Jabri
Armand Joulin
Laurens van der Maaten
VLM
20
136
0
29 Dec 2016
Dense Captioning with Joint Inference and Visual Context
L. Yang
K. Tang
Jianchao Yang
Li-Jia Li
VLM
30
169
0
21 Nov 2016
A Hierarchical Approach for Generating Descriptive Image Paragraphs
J. Krause
Justin Johnson
Ranjay Krishna
Li Fei-Fei
VLM
36
373
0
20 Nov 2016
Instance-aware Image and Sentence Matching with Selective Multimodal LSTM
Yan Huang
Wei Wang
Liang Wang
26
222
0
17 Nov 2016
Dual Attention Networks for Multimodal Reasoning and Matching
Hyeonseob Nam
Jung-Woo Ha
Jeonghee Kim
45
664
0
02 Nov 2016
Show and Tell: Lessons learned from the 2015 MSCOCO Image Captioning Challenge
Oriol Vinyals
Alexander Toshev
Samy Bengio
D. Erhan
30
851
0
21 Sep 2016
phi-LSTM: A Phrase-based Hierarchical LSTM Model for Image Captioning
Y. Tan
Chee Seng Chan
VLM
22
29
0
20 Aug 2016
Mean Box Pooling: A Rich Image Representation and Output Embedding for the Visual Madlibs Task
Ashkan Mokarian
Mateusz Malinowski
Mario Fritz
27
5
0
09 Aug 2016
A Comprehensive Survey on Cross-modal Retrieval
Kun Wang
Qiyue Yin
Wei Wang
Shu Wu
Liang Wang
42
294
0
21 Jul 2016
Multilingual Visual Sentiment Concept Matching
Nikolaos Pappas
Miriam Redi
Mercan Topkara
Brendan Jou
Hongyi Liu
Tao Chen
Shih-Fu Chang
CVBM
29
14
0
07 Jun 2016
Deep Image Retrieval: Learning global representations for image search
Albert Gordo
Jon Almazán
Jérôme Revaud
Diane Larlus
23
802
0
05 Apr 2016
Multi-Cue Zero-Shot Learning with Strong Supervision
Zeynep Akata
Mateusz Malinowski
Mario Fritz
Bernt Schiele
40
148
0
29 Mar 2016
Image Captioning and Visual Question Answering Based on Attributes and External Knowledge
Qi Wu
Chunhua Shen
Anton Van Den Hengel
Peng Wang
A. Dick
27
360
0
09 Mar 2016
Automatic Description Generation from Images: A Survey of Models, Datasets, and Evaluation Measures
Raffaella Bernardi
Ruken Cakici
Desmond Elliott
Aykut Erdem
Erkut Erdem
Nazli Ikizler-Cinbis
Frank Keller
A. Muscat
Barbara Plank
EGVM
VLM
27
363
0
15 Jan 2016
Multi-Instance Visual-Semantic Embedding
Zhou Ren
Hailin Jin
Zhe-nan Lin
Chen Fang
Alan Yuille
VLM
27
38
0
22 Dec 2015
Analyzing Classifiers: Fisher Vectors and Deep Neural Networks
Sebastian Bach
Alexander Binder
G. Montavon
K. Müller
Wojciech Samek
35
198
0
01 Dec 2015
Images Don't Lie: Transferring Deep Visual Semantic Features to Large-Scale Multimodal Learning to Rank
Corey Lynch
Kamelia Aryafar
Josh Attenberg
31
45
0
20 Nov 2015
Ask, Attend and Answer: Exploring Question-Guided Spatial Attention for Visual Question Answering
Huijuan Xu
Kate Saenko
36
761
0
17 Nov 2015
Sherlock: Scalable Fact Learning in Images
Mohamed Elhoseiny
Scott D. Cohen
W. Chang
Brian L. Price
Ahmed Elgammal
19
26
0
16 Nov 2015
Natural Language Object Retrieval
Ronghang Hu
Huazhe Xu
Marcus Rohrbach
Jiashi Feng
Kate Saenko
Trevor Darrell
ObjD
46
551
0
13 Nov 2015
Deep Multimodal Semantic Embeddings for Speech and Images
David Harwath
James R. Glass
18
155
0
11 Nov 2015
Visual7W: Grounded Question Answering in Images
Yuke Zhu
Oliver Groth
Michael S. Bernstein
Li Fei-Fei
44
875
0
11 Nov 2015
Neural Module Networks
Jacob Andreas
Marcus Rohrbach
Trevor Darrell
Dan Klein
CoGe
48
1,062
0
09 Nov 2015
Learning Visual Features from Large Weakly Supervised Data
Armand Joulin
Laurens van der Maaten
Allan Jabri
Nicolas Vasilache
SSL
24
406
0
06 Nov 2015
Guiding Long-Short Term Memory for Image Caption Generation
Xu Jia
E. Gavves
Basura Fernando
Tinne Tuytelaars
VLM
22
101
0
16 Sep 2015
What value do explicit high level concepts have in vision to language problems?
Qi Wu
Chunhua Shen
Lingqiao Liu
A. Dick
Anton Van Den Hengel
33
443
0
03 Jun 2015
Learning to Answer Questions From Image Using Convolutional Neural Network
Lin Ma
Zhengdong Lu
Hang Li
27
261
0
01 Jun 2015
Text to 3D Scene Generation with Rich Lexical Grounding
Angel X. Chang
Will Monroe
Manolis Savva
Christopher Potts
Christopher D. Manning
3DV
15
106
0
23 May 2015
Weakly-Supervised Alignment of Video With Text
Piotr Bojanowski
Rémi Lajugie
Edouard Grave
Francis R. Bach
Ivan Laptev
Jean Ponce
Cordelia Schmid
41
134
0
22 May 2015
Exploring Nearest Neighbor Approaches for Image Captioning
Jacob Devlin
Saurabh Gupta
Ross B. Girshick
Margaret Mitchell
C. L. Zitnick
27
195
0
17 May 2015
Ask Your Neurons: A Neural-based Approach to Answering Questions about Images
Mateusz Malinowski
Marcus Rohrbach
Mario Fritz
41
596
0
05 May 2015
Learning Temporal Embeddings for Complex Video Analysis
Vignesh Ramanathan
K. Tang
Greg Mori
Li Fei-Fei
34
71
0
02 May 2015
Multimodal Convolutional Neural Networks for Matching Image and Sentence
Lin Ma
Zhengdong Lu
Lifeng Shang
Hang Li
38
337
0
23 Apr 2015
Microsoft COCO Captions: Data Collection and Evaluation Server
Xinlei Chen
Hao Fang
Nayeon Lee
Ramakrishna Vedantam
Saurabh Gupta
Piotr Dollar
C. L. Zitnick
97
2,434
0
01 Apr 2015
Text Understanding from Scratch
Xiang Zhang
Yann LeCun
46
557
0
05 Feb 2015
Deep Captioning with Multimodal Recurrent Neural Networks (m-RNN)
Junhua Mao
Wenyuan Xu
Yi Yang
Jiang Wang
Zhiheng Huang
Alan Yuille
VLM
86
1,235
0
20 Dec 2014
Translating Videos to Natural Language Using Deep Recurrent Neural Networks
Subhashini Venugopalan
Huijuan Xu
Jeff Donahue
Marcus Rohrbach
Raymond J. Mooney
Kate Saenko
47
951
0
15 Dec 2014
Deep Visual-Semantic Alignments for Generating Image Descriptions
A. Karpathy
Li Fei-Fei
24
5,559
0
07 Dec 2014
Cross-Modal Learning via Pairwise Constraints
Ran He
Man Zhang
Liang Wang
Ye Ji
Qiyue Yin
24
60
0
28 Nov 2014
CIDEr: Consensus-based Image Description Evaluation
Ramakrishna Vedantam
C. L. Zitnick
Devi Parikh
103
4,412
0
20 Nov 2014
Learning a Recurrent Visual Representation for Image Caption Generation
Xinlei Chen
C. L. Zitnick
SSL
GAN
35
195
0
20 Nov 2014
Show and Tell: A Neural Image Caption Generator
Oriol Vinyals
Alexander Toshev
Samy Bengio
D. Erhan
3DV
92
5,996
0
17 Nov 2014