Long-term Recurrent Convolutional Networks for Visual Recognition and Description

17 November 2014

Jeff Donahue

Lisa Anne Hendricks

Marcus Rohrbach

Subhashini Venugopalan

Papers citing "Long-term Recurrent Convolutional Networks for Visual Recognition and Description"

38 / 688 papers shown

Aligning where to see and what to tell: image caption with region-based attention and scene factorization Junqi Jin Kun Fu Runpeng Cui Fei Sha Changshui Zhang 28 117 0 20 Jun 2015
Reading Scene Text in Deep Convolutional Sequences Pan He Weilin Huang Yu Qiao Chen Change Loy Xiaoou Tang 21 307 0 14 Jun 2015
Convolutional LSTM Network: A Machine Learning Approach for Precipitation Nowcasting Xingjian Shi Zhourong Chen Hao Wang Dit-Yan Yeung W. Wong W. Woo 236 7,906 0 13 Jun 2015
Listen, Attend, and Walk: Neural Mapping of Navigational Instructions to Action Sequences Hongyuan Mei Mohit Bansal Matthew R. Walter LM&Ro 26 242 0 12 Jun 2015
Learning language through pictures Grzegorz Chrupała Ákos Kádár A. Alishahi VLM SSL 32 65 0 11 Jun 2015
P-CNN: Pose-based CNN Features for Action Recognition Guilhem Chéron Ivan Laptev Cordelia Schmid 22 607 0 11 Jun 2015
Pointer Networks Oriol Vinyals Meire Fortunato Navdeep Jaitly 48 3,016 0 09 Jun 2015
Scheduled Sampling for Sequence Prediction with Recurrent Neural Networks Samy Bengio Oriol Vinyals Navdeep Jaitly Noam M. Shazeer 51 2,018 0 09 Jun 2015
Visualizing and Understanding Recurrent Networks A. Karpathy Justin Johnson Li Fei-Fei HAI 23 1,096 0 05 Jun 2015
Learning to track for spatio-temporal action localization Philippe Weinzaepfel Zaïd Harchaoui Cordelia Schmid 36 338 0 05 Jun 2015
Beyond Temporal Pooling: Recurrence and Temporal Convolutions for Gesture Recognition in Video Lionel Pigou Aaron van den Oord Sander Dieleman Mieke Van Herreweghe J. Dambre 27 254 0 05 Jun 2015
Learning to Answer Questions From Image Using Convolutional Neural Network Lin Ma Zhengdong Lu Hang Li 27 261 0 01 Jun 2015
Visual Madlibs: Fill in the blank Image Generation and Question Answering Licheng Yu Eunbyung Park Alexander C. Berg Tamara L. Berg VLM MLLM 32 97 0 31 May 2015
Weakly-Supervised Alignment of Video With Text Piotr Bojanowski Rémi Lajugie Edouard Grave Francis R. Bach Ivan Laptev Jean Ponce Cordelia Schmid 41 134 0 22 May 2015
Are You Talking to a Machine? Dataset and Methods for Multilingual Image Question Answering Haoyuan Gao Junhua Mao Jie Zhou Zhiheng Huang Lei Wang Wei Xu 32 496 0 21 May 2015
Jointly Modeling Embedding and Translation to Bridge Video and Language Yingwei Pan Tao Mei Ting Yao Houqiang Li Y. Rui 41 535 0 07 May 2015
Language Models for Image Captioning: The Quirks and What Works Jacob Devlin Hao Cheng Hao Fang Saurabh Gupta Li Deng Xiaodong He Geoffrey Zweig Margaret Mitchell 32 281 0 07 May 2015
Ask Your Neurons: A Neural-based Approach to Answering Questions about Images Mateusz Malinowski Marcus Rohrbach Mario Fritz 41 595 0 05 May 2015
VQA: Visual Question Answering Aishwarya Agrawal Jiasen Lu Stanislaw Antol Margaret Mitchell C. L. Zitnick Dhruv Batra Devi Parikh CoGe 66 5,369 0 03 May 2015
Dense Optical Flow Prediction from a Static Image Jacob Walker Abhinav Gupta M. Hebert 37 210 0 02 May 2015
Anticipating Visual Representations from Unlabeled Video Carl Vondrick Hamed Pirsiavash Antonio Torralba 24 145 0 29 Apr 2015
Learning like a Child: Fast Novel Visual Concept Learning from Sentence Descriptions of Images Junhua Mao Wei Xu Yi Yang Jiang Wang Zhiheng Huang Alan Yuille 25 154 0 25 Apr 2015
Differential Recurrent Neural Networks for Action Recognition Vivek Veeriah Naifan Zhuang Guo-Jun Qi 44 462 0 25 Apr 2015
Multimodal Convolutional Neural Networks for Matching Image and Sentence Lin Ma Zhengdong Lu Lifeng Shang Hang Li 38 337 0 23 Apr 2015
Temporal Localization of Fine-Grained Actions in Videos by Domain Transfer from Web Images Chen Sun Sanketh Shetty Rahul Sukthankar Ram Nevatia 20 136 0 04 Apr 2015
Microsoft COCO Captions: Data Collection and Evaluation Server Xinlei Chen Hao Fang Tsung-Yi Lin Ramakrishna Vedantam Saurabh Gupta Piotr Dollár C. L. Zitnick 67 2,433 0 01 Apr 2015
Fully Connected Deep Structured Networks A. Schwing R. Urtasun SSeg 59 308 0 09 Mar 2015
Using Descriptive Video Services to Create a Large Data Source for Video Annotation Research Atousa Torabi C. Pal Hugo Larochelle Aaron Courville VGen 36 204 0 03 Mar 2015
Describing Videos by Exploiting Temporal Structure L. Yao Atousa Torabi Kyunghyun Cho Nicolas Ballas C. Pal Hugo Larochelle Aaron Courville 37 1,062 0 27 Feb 2015
Image Specificity M. Jas Devi Parikh 29 40 0 16 Feb 2015
Phrase-based Image Captioning R. Lebret Pedro H. O. Pinheiro R. Collobert VLM 31 120 0 12 Feb 2015
Show, Attend and Tell: Neural Image Caption Generation with Visual Attention Ke Xu Jimmy Ba Ryan Kiros Kyunghyun Cho Aaron Courville Ruslan Salakhutdinov R. Zemel Yoshua Bengio DiffM 109 10,006 0 10 Feb 2015
A Dataset for Movie Description Anna Rohrbach Marcus Rohrbach Niket Tandon Bernt Schiele VGen 33 498 0 12 Jan 2015
Deep Captioning with Multimodal Recurrent Neural Networks (m-RNN) Junhua Mao Wei Xu Yi Yang Jiang Wang Zhiheng Huang Alan Yuille VLM 65 1,235 0 20 Dec 2014
Translating Videos to Natural Language Using Deep Recurrent Neural Networks Subhashini Venugopalan Huijuan Xu Jeff Donahue Marcus Rohrbach Raymond J. Mooney Kate Saenko 47 950 0 15 Dec 2014
Deep Visual-Semantic Alignments for Generating Image Descriptions A. Karpathy Li Fei-Fei 21 5,556 0 07 Dec 2014
CIDEr: Consensus-based Image Description Evaluation Ramakrishna Vedantam C. L. Zitnick Devi Parikh 67 4,401 0 20 Nov 2014
Transfer Learning for Video Recognition with Scarce Training Data for Deep Convolutional Neural Network Yu-Chuan Su Tzu-Hsuan Chiu Chun-Yen Yeh Hsinfu Huang Winston H. Hsu 27 27 0 15 Sep 2014