1605.03705
Cited By
Movie Description
12 May 2016
Anna Rohrbach
Atousa Torabi
Marcus Rohrbach
Niket Tandon
C. Pal
Hugo Larochelle
Aaron Courville
Bernt Schiele
3DV
VGen
Papers citing "Movie Description"
47 / 47 papers shown
Title
Generative Modeling of Class Probability for Multi-Modal Representation Learning
Jungkyoo Shin
Bumsoo Kim
Eunwoo Kim
88
1
0
21 Mar 2025
Audio-Language Datasets of Scenes and Events: A Survey
Gijs Wijngaard
Elia Formisano
Michele Esposito
M. Dumontier
116
2
0
10 Jan 2025
Do Language Models Understand Time?
Xi Ding
Lei Wang
242
1
0
18 Dec 2024
Mono-InternVL: Pushing the Boundaries of Monolithic Multimodal Large Language Models with Endogenous Visual Pre-training
Gen Luo
Xue Yang
Wenhan Dou
Zhaokai Wang
Jifeng Dai
Yu Qiao
Xizhou Zhu
VLM
MLLM
119
28
0
10 Oct 2024
Deep Variational Multivariate Information Bottleneck -- A Framework for Variational Losses
Eslam Abdelaleem
I. Nemenman
K. M. Martini
58
6
0
05 Oct 2023
End-to-end Concept Word Detection for Video Captioning, Retrieval, and Question Answering
Youngjae Yu
Hyungjin Ko
Jongwook Choi
Gunhee Kim
118
231
0
10 Oct 2016
Learning Language-Visual Embedding for Movie Understanding with Natural-Language
Atousa Torabi
Niket Tandon
Leonid Sigal
65
97
0
26 Sep 2016
Title Generation for User Generated Videos
Kuo-Hao Zeng
Tseng-Hung Chen
Juan Carlos Niebles
Min Sun
67
69
0
25 Aug 2016
Frame- and Segment-Level Features and Candidate Pool Evaluation for Video Caption Generation
Rakshith Shetty
Jorma T. Laaksonen
61
94
0
17 Aug 2016
SPICE: Semantic Propositional Image Caption Evaluation
Peter Anderson
Basura Fernando
Mark Johnson
Stephen Gould
EGVM
100
1,914
0
29 Jul 2016
TGIF: A New Dataset and Benchmark on Animated GIF Description
Yuncheng Li
Yale Song
Liangliang Cao
Joel R. Tetreault
Larry Goldberg
A. Jaimes
Jiebo Luo
79
271
0
10 Apr 2016
Improving LSTM-based Video Description with Linguistic Knowledge Mined from Text
Subhashini Venugopalan
Lisa Anne Hendricks
Raymond J. Mooney
Kate Saenko
VLM
42
117
0
06 Apr 2016
RNN Fisher Vectors for Action Recognition and Image Annotation
Guy Lev
Gil Sadeh
Benjamin Klein
Lior Wolf
51
164
0
12 Dec 2015
Video captioning with recurrent networks based on frame- and video-level features and visual content classification
Rakshith Shetty
Jorma T. Laaksonen
42
31
0
09 Dec 2015
MovieQA: Understanding Stories in Movies through Question-Answering
Makarand Tapaswi
Yukun Zhu
Rainer Stiefelhagen
Antonio Torralba
R. Urtasun
Sanja Fidler
112
746
0
09 Dec 2015
Delving Deeper into Convolutional Networks for Learning Video Representations
Nicolas Ballas
L. Yao
C. Pal
Aaron Courville
MDE
85
701
0
19 Nov 2015
Deep Compositional Captioning: Describing Novel Object Categories without Paired Training Data
Lisa Anne Hendricks
Subhashini Venugopalan
Marcus Rohrbach
Raymond J. Mooney
Kate Saenko
Trevor Darrell
CoGe
61
284
0
17 Nov 2015
Uncovering Temporal Context for Video Question and Answering
Linchao Zhu
Zhongwen Xu
Yi Yang
Alexander G. Hauptmann
BDL
65
45
0
15 Nov 2015
Hierarchical Recurrent Neural Encoder for Video Representation with Application to Captioning
Pingbo Pan
Zhongwen Xu
Yi Yang
Leilei Gan
Yueting Zhuang
43
385
0
11 Nov 2015
Video Paragraph Captioning Using Hierarchical Recurrent Neural Networks
Haonan Yu
Jiang Wang
Zhiheng Huang
Yi Yang
Wenyuan Xu
88
560
0
26 Oct 2015
Aligning Books and Movies: Towards Story-like Visual Explanations by Watching Movies and Reading Books
Yukun Zhu
Ryan Kiros
R. Zemel
Ruslan Salakhutdinov
R. Urtasun
Antonio Torralba
Sanja Fidler
120
2,548
0
22 Jun 2015
The Long-Short Story of Movie Description
Anna Rohrbach
Marcus Rohrbach
Bernt Schiele
VLM
64
111
0
04 Jun 2015
Jointly Modeling Embedding and Translation to Bridge Video and Language
Yingwei Pan
Tao Mei
Ting Yao
Houqiang Li
Y. Rui
77
534
0
07 May 2015
Language Models for Image Captioning: The Quirks and What Works
Jacob Devlin
Hao Cheng
Hao Fang
Saurabh Gupta
Li Deng
Xiaodong He
Geoffrey Zweig
Margaret Mitchell
83
281
0
07 May 2015
Sequence to Sequence -- Video to Text
Subhashini Venugopalan
Marcus Rohrbach
Jeff Donahue
Raymond J. Mooney
Trevor Darrell
Kate Saenko
140
1,418
0
03 May 2015
Microsoft COCO Captions: Data Collection and Evaluation Server
Xinlei Chen
Hao Fang
Tsung-Yi Lin
Ramakrishna Vedantam
Saurabh Gupta
Piotr Dollár
C. L. Zitnick
211
2,475
0
01 Apr 2015
Using Descriptive Video Services to Create a Large Data Source for Video Annotation Research
Atousa Torabi
C. Pal
Hugo Larochelle
Aaron Courville
VGen
83
205
0
03 Mar 2015
Describing Videos by Exploiting Temporal Structure
L. Yao
Atousa Torabi
Kyunghyun Cho
Nicolas Ballas
C. Pal
Hugo Larochelle
Aaron Courville
141
1,064
0
27 Feb 2015
Show, Attend and Tell: Neural Image Caption Generation with Visual Attention
Ke Xu
Jimmy Ba
Ryan Kiros
Kyunghyun Cho
Aaron Courville
Ruslan Salakhutdinov
R. Zemel
Yoshua Bengio
DiffM
334
10,069
0
10 Feb 2015
A Dataset for Movie Description
Anna Rohrbach
Marcus Rohrbach
Niket Tandon
Bernt Schiele
VGen
108
500
0
12 Jan 2015
Deep Captioning with Multimodal Recurrent Neural Networks (m-RNN)
Junhua Mao
Wenyuan Xu
Yi Yang
Jiang Wang
Zhiheng Huang
Alan Yuille
VLM
168
1,240
0
20 Dec 2014
Translating Videos to Natural Language Using Deep Recurrent Neural Networks
Subhashini Venugopalan
Huijuan Xu
Jeff Donahue
Marcus Rohrbach
Raymond J. Mooney
Kate Saenko
135
952
0
15 Dec 2014
Deep Visual-Semantic Alignments for Generating Image Descriptions
A. Karpathy
Li Fei-Fei
124
5,584
0
07 Dec 2014
CIDEr: Consensus-based Image Description Evaluation
Ramakrishna Vedantam
C. L. Zitnick
Devi Parikh
283
4,484
0
20 Nov 2014
From Captions to Visual Concepts and Back
Hao Fang
Saurabh Gupta
F. Iandola
R. Srivastava
Li Deng
...
Xiaodong He
Margaret Mitchell
John C. Platt
C. L. Zitnick
Geoffrey Zweig
VLM
103
1,311
0
18 Nov 2014
Show and Tell: A Neural Image Caption Generator
Oriol Vinyals
Alexander Toshev
Samy Bengio
D. Erhan
3DV
237
6,028
0
17 Nov 2014
Long-term Recurrent Convolutional Networks for Visual Recognition and Description
Jeff Donahue
Lisa Anne Hendricks
Marcus Rohrbach
Subhashini Venugopalan
S. Guadarrama
Kate Saenko
Trevor Darrell
VLM
162
6,051
0
17 Nov 2014
Unifying Visual-Semantic Embeddings with Multimodal Neural Language Models
Ryan Kiros
Ruslan Salakhutdinov
R. Zemel
VLM
125
1,399
0
10 Nov 2014
Going Deeper with Convolutions
Christian Szegedy
Wei Liu
Yangqing Jia
P. Sermanet
Scott E. Reed
Dragomir Anguelov
D. Erhan
Vincent Vanhoucke
Andrew Rabinovich
457
43,649
0
17 Sep 2014
Very Deep Convolutional Networks for Large-Scale Image Recognition
Karen Simonyan
Andrew Zisserman
FAtt
MDE
1.6K
100,348
0
04 Sep 2014
ImageNet Large Scale Visual Recognition Challenge
Olga Russakovsky
Jia Deng
Hao Su
J. Krause
S. Satheesh
...
A. Karpathy
A. Khosla
Michael S. Bernstein
Alexander C. Berg
Li Fei-Fei
VLM
ObjD
1.7K
39,525
0
01 Sep 2014
Video In Sentences Out
Andrei Barbu
Alexander Bridge
Zachary Burchill
D. Coroian
Sven J. Dickinson
...
Jarrell W. Waggoner
Song Wang
Jinlian Wei
Yifan Yin
Zhiqi Zhang
64
156
0
09 Aug 2014
LSDA: Large Scale Detection Through Adaptation
Judy Hoffman
S. Guadarrama
Eric Tzeng
Ronghang Hu
Jeff Donahue
Ross B. Girshick
Trevor Darrell
Kate Saenko
ObjD
95
334
0
18 Jul 2014
Weakly Supervised Action Labeling in Videos Under Ordering Constraints
Piotr Bojanowski
Rémi Lajugie
Francis R. Bach
Ivan Laptev
Jean Ponce
Cordelia Schmid
Josef Sivic
63
237
0
04 Jul 2014
Microsoft COCO: Common Objects in Context
Tsung-Yi Lin
Michael Maire
Serge J. Belongie
Lubomir Bourdev
Ross B. Girshick
James Hays
Pietro Perona
Deva Ramanan
C. L. Zitnick
Piotr Dollár
ObjD
413
43,638
0
01 May 2014
Coherent Multi-Sentence Video Description with Variable Level of Detail
Anna Rohrbach
Marcus Rohrbach
Weijian Qiu
Annemarie Friedrich
Sikandar Amin
Mykhaylo Andriluka
Manfred Pinkal
Bernt Schiele
73
218
0
24 Mar 2014
Improving neural networks by preventing co-adaptation of feature detectors
Geoffrey E. Hinton
Nitish Srivastava
A. Krizhevsky
Ilya Sutskever
Ruslan Salakhutdinov
VLM
450
7,661
0
03 Jul 2012