Video Paragraph Captioning Using Hierarchical Recurrent Neural Networks

26 October 2015

Yi Yang

Papers citing "Video Paragraph Captioning Using Hierarchical Recurrent Neural Networks"

Progress-Aware Video Frame Captioning Zihui Xue Joungbin An Xitong Yang Kristen Grauman 149 1 0 03 Dec 2024
Selective Query-guided Debiasing for Video Corpus Moment Retrieval Sunjae Yoon Jiajing Hong Eunseop Yoon Dahyun Kim Junyeong Kim Hee Suk Yoon Changdong Yoo 82 21 0 17 Oct 2022
Textual Description for Mathematical Equations Ajoy Mondal C. V. Jawahar 47 2 0 07 Aug 2020
Attention-Based Multimodal Fusion for Video Description Chiori Hori Takaaki Hori Teng-Yok Lee Kazuhiro Sumi J. Hershey Tim K. Marks 54 359 0 11 Jan 2017
Spatio-Temporal Attention Models for Grounded Video Captioning M. Zanfir Elisabeta Marinoiu C. Sminchisescu 48 50 0 17 Oct 2016
End-to-end Concept Word Detection for Video Captioning, Retrieval, and Question Answering Youngjae Yu Hyungjin Ko Jongwook Choi Gunhee Kim 70 230 0 10 Oct 2016
Sequence Level Training with Recurrent Neural Networks Marc'Aurelio Ranzato S. Chopra Michael Auli Wojciech Zaremba 65 1,610 0 20 Nov 2015
Semantically Conditioned LSTM-based Natural Language Generation for Spoken Dialogue Systems Tsung-Hsien Wen Milica Gasic N. Mrksic Pei-hao Su David Vandyke S. Young 72 948 0 07 Aug 2015
A Neural Conversational Model Oriol Vinyals Quoc V. Le BDL 79 1,768 0 19 Jun 2015
Scheduled Sampling for Sequence Prediction with Recurrent Neural Networks Samy Bengio Oriol Vinyals Navdeep Jaitly Noam M. Shazeer 103 2,024 0 09 Jun 2015
A Hierarchical Neural Autoencoder for Paragraphs and Documents Jiwei Li Minh-Thang Luong Dan Jurafsky BDL 55 602 0 02 Jun 2015
A Multi-scale Multiple Instance Video Description Network Huijuan Xu Subhashini Venugopalan Vasili Ramanishka Marcus Rohrbach Kate Saenko 42 64 0 21 May 2015
Jointly Modeling Embedding and Translation to Bridge Video and Language Yingwei Pan Tao Mei Ting Yao Houqiang Li Y. Rui 57 534 0 07 May 2015
Sequence to Sequence -- Video to Text Subhashini Venugopalan Marcus Rohrbach Jeff Donahue Raymond J. Mooney Trevor Darrell Kate Saenko 86 1,417 0 03 May 2015
Learning like a Child: Fast Novel Visual Concept Learning from Sentence Descriptions of Images Junhua Mao Xu Wei Yi Yang Jiang Wang Zhiheng Huang Alan Yuille 58 154 0 25 Apr 2015
Microsoft COCO Captions: Data Collection and Evaluation Server Xinlei Chen Hao Fang Tsung-Yi Lin Ramakrishna Vedantam Saurabh Gupta Piotr Dollár C. L. Zitnick 144 2,461 0 01 Apr 2015
Describing Videos by Exploiting Temporal Structure L. Yao Atousa Torabi Kyunghyun Cho Nicolas Ballas C. Pal Hugo Larochelle Aaron Courville 109 1,063 0 27 Feb 2015
Deep Captioning with Multimodal Recurrent Neural Networks (m-RNN) Junhua Mao Wei Xu Yi Yang Jiang Wang Zhiheng Huang Alan Yuille VLM 107 1,237 0 20 Dec 2014
Translating Videos to Natural Language Using Deep Recurrent Neural Networks Subhashini Venugopalan Huijuan Xu Jeff Donahue Marcus Rohrbach Raymond J. Mooney Kate Saenko 80 951 0 15 Dec 2014
Learning Spatiotemporal Features with 3D Convolutional Networks Du Tran Lubomir D. Bourdev Rob Fergus Lorenzo Torresani Manohar Paluri 3DPC 42 410 0 02 Dec 2014
CIDEr: Consensus-based Image Description Evaluation Ramakrishna Vedantam C. L. Zitnick Devi Parikh 211 4,451 0 20 Nov 2014
Learning a Recurrent Visual Representation for Image Caption Generation Xinlei Chen C. L. Zitnick SSL GAN 48 195 0 20 Nov 2014
Show and Tell: A Neural Image Caption Generator Oriol Vinyals Alexander Toshev Samy Bengio D. Erhan 3DV 183 6,009 0 17 Nov 2014
Long-term Recurrent Convolutional Networks for Visual Recognition and Description Jeff Donahue Lisa Anne Hendricks Marcus Rohrbach Subhashini Venugopalan S. Guadarrama Kate Saenko Trevor Darrell VLM 114 6,037 0 17 Nov 2014
Unifying Visual-Semantic Embeddings with Multimodal Neural Language Models Ryan Kiros Ruslan Salakhutdinov R. Zemel VLM 70 1,395 0 10 Nov 2014
Sequence to Sequence Learning with Neural Networks Ilya Sutskever Oriol Vinyals Quoc V. Le AIMat 259 20,467 0 10 Sep 2014
Very Deep Convolutional Networks for Large-Scale Image Recognition Karen Simonyan Andrew Zisserman FAtt MDE 761 99,991 0 04 Sep 2014
ImageNet Large Scale Visual Recognition Challenge Olga Russakovsky Jia Deng Hao Su J. Krause S. Satheesh ... A. Karpathy A. Khosla Michael S. Bernstein Alexander C. Berg Li Fei-Fei VLM ObjD 952 39,383 0 01 Sep 2014
Neural Machine Translation by Jointly Learning to Align and Translate Dzmitry Bahdanau Kyunghyun Cho Yoshua Bengio AIMat 356 27,205 0 01 Sep 2014
Video In Sentences Out Andrei Barbu Alexander Bridge Zachary Burchill D. Coroian Sven J. Dickinson ... Jarrell W. Waggoner Song Wang Jinlian Wei Yifan Yin Zhiqi Zhang 25 155 0 09 Aug 2014
Question Answering with Subgraph Embeddings Antoine Bordes S. Chopra Jason Weston RALM BDL GNN 57 713 0 14 Jun 2014
Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation Kyunghyun Cho B. V. Merrienboer Çağlar Gülçehre Dzmitry Bahdanau Fethi Bougares Holger Schwenk Yoshua Bengio AIMat 574 23,235 0 03 Jun 2014
Microsoft COCO: Common Objects in Context Tsung-Yi Lin Michael Maire Serge J. Belongie Lubomir Bourdev Ross B. Girshick James Hays Pietro Perona Deva Ramanan C. L. Zitnick Piotr Dollár ObjD 215 43,290 0 01 May 2014
Coherent Multi-Sentence Video Description with Variable Level of Detail Anna Rohrbach Marcus Rohrbach Weijian Qiu Annemarie Friedrich Sikandar Amin Mykhaylo Andriluka Manfred Pinkal Bernt Schiele 44 217 0 24 Mar 2014