MovieQA: Understanding Stories in Movies through Question-Answering

9 December 2015

Antonio Torralba

Sanja Fidler

Papers citing "MovieQA: Understanding Stories in Movies through Question-Answering"

50 / 202 papers shown

Constructing Hierarchical Q&A Datasets for Video Story Understanding Y. Heo Kyoung-Woon On Seong-Ho Choi Jaeseo Lim Jinah Kim Jeh-Kwang Ryu Byung-Chull Bae Byoung-Tak Zhang 23 5 0 01 Apr 2019
Episodic Memory Reader: Learning What to Remember for Question Answering from Streaming Data Moonsu Han Minki Kang Hyunwoo Jung Sung Ju Hwang RALM 27 19 0 14 Mar 2019
Self-Supervised Learning of Face Representations for Video Face Clustering Vivek Sharma Makarand Tapaswi M. Sarfraz Rainer Stiefelhagen SSL CVBM 19 49 0 03 Mar 2019
From Visual to Acoustic Question Answering Jerome Abdelnour G. Salvi Jean Rouat 27 3 0 28 Feb 2019
Audio-Visual Scene-Aware Dialog Huda AlAmri Vincent Cartillier Abhishek Das Jue Wang A. Cherian ... Tim K. Marks Chiori Hori Peter Anderson Stefan Lee Devi Parikh VGen 27 189 0 25 Jan 2019
Adversarial Attacks on Deep Learning Models in Natural Language Processing: A Survey W. Zhang Quan Z. Sheng A. Alhazmi Chenliang Li AAML 24 57 0 21 Jan 2019
Visual Entailment: A Novel Task for Fine-Grained Image Understanding Ning Xie Farley Lai Derek Doran Asim Kadav CoGe 56 322 0 20 Jan 2019
Self-Monitoring Navigation Agent via Auxiliary Progress Estimation Chih-Yao Ma Jiasen Lu Zuxuan Wu G. Al-Regib Z. Kira R. Socher Caiming Xiong LM&Ro 8 274 0 10 Jan 2019
Supervised Transfer Learning for Product Information Question Answering T. Lai Trung Bui Nedim Lipka Sheng Li 30 19 0 08 Jan 2019
From FiLM to Video: Multi-turn Question Answering with Multi-modal Context T. Nguyen Shikhar Sharma Hannes Schulz Layla El Asri 15 33 0 17 Dec 2018
Recursive Visual Attention in Visual Dialog Yulei Niu Hanwang Zhang Manli Zhang Jianhong Zhang Zhiwu Lu Ji-Rong Wen 28 118 0 06 Dec 2018
Traversing the Continuous Spectrum of Image Retrieval with Deep Dynamic Models Ziad Al-Halah Andreas M. Lehrmann Leonid Sigal 24 0 0 01 Dec 2018
From Recognition to Cognition: Visual Commonsense Reasoning Rowan Zellers Yonatan Bisk Ali Farhadi Yejin Choi LRM BDL OCL ReLM 61 868 0 27 Nov 2018
Holistic Multi-modal Memory Network for Movie Question Answering Anran Wang Anh Tuan Luu Chuan-Sheng Foo Erik Cambria Yi Tay V. Chandrasekhar 36 20 0 12 Nov 2018
Improving Machine Reading Comprehension with General Reading Strategies Kai Sun Dian Yu Dong Yu Claire Cardie AI4CE 24 116 0 31 Oct 2018
A Knowledge-Grounded Multimodal Search-Based Conversational Agent Shubham Agarwal Ondrej Dusek Ioannis Konstas Verena Rieser 31 22 0 20 Oct 2018
A Qualitative Comparison of CoQA, SQuAD 2.0 and QuAC Mark Yatskar 17 96 0 27 Sep 2018
TVQA: Localized, Compositional Video Question Answering Jie Lei Licheng Yu Mohit Bansal Tamara L. Berg 36 617 0 05 Sep 2018
Comparing Attention-based Convolutional and Recurrent Neural Networks: Success and Limitations in Machine Reading Comprehension Matthias Blohm Glorianna Jagfeld Ekta Sood Xiang Yu Ngoc Thang Vu 27 54 0 27 Aug 2018
ODSQA: Open-domain Spoken Question Answering Dataset Chia-Hsuan Lee Shang-Ming Wang Huan-Cheng Chang Hung-yi Lee RALM 30 52 0 07 Aug 2018
End-to-End Audio Visual Scene-Aware Dialog using Multimodal Attention-Based Video Features Chiori Hori Huda AlAmri Jue Wang Gordon Wichern Takaaki Hori ... Raphael Gontijo-Lopes Abhishek Das Irfan Essa Dhruv Batra Devi Parikh VGen 18 125 0 21 Jun 2018
From Trailers to Storylines: An Efficient Way to Learn from Movies Qingqiu Huang Yuanjun Xiong Yu Xiong Yuqi Zhang Dahua Lin 31 26 0 14 Jun 2018
GLAC Net: GLocal Attention Cascading Networks for Multi-image Cued Story Generation Taehyeong Kim Min-Oh Heo Seonil Son Kyoung-Wha Park Byoung-Tak Zhang 31 75 0 28 May 2018
DuoRC: Towards Complex Language Understanding with Paraphrased Reading Comprehension Amrita Saha Rahul Aralikatte Mitesh M. Khapra Karthik Sankaranarayanan 34 194 0 21 Apr 2018
Scaling Egocentric Vision: The EPIC-KITCHENS Dataset Dima Damen Hazel Doughty G. Farinella Sanja Fidler Antonino Furnari ... Davide Moltisanti Jonathan Munro Toby Perrett Will Price Michael Wray EgoV 25 1,000 0 08 Apr 2018
Learning a Text-Video Embedding from Incomplete and Heterogeneous Data Antoine Miech Ivan Laptev Josef Sivic 22 233 0 07 Apr 2018
Motion-Appearance Co-Memory Networks for Video Question Answering J. Gao Runzhou Ge Kan Chen Ram Nevatia 41 240 0 29 Mar 2018
Weakly-Supervised Action Segmentation with Iterative Soft Boundary Assignment Li Ding Chenliang Xu 30 180 0 28 Mar 2018
Audio-Visual Event Localization in Unconstrained Videos Yapeng Tian Jing Shi Bochen Li Zhiyao Duan Chenliang Xu 53 426 0 23 Mar 2018
MovieGraphs: Towards Understanding Human-Centric Situations from Videos Paul Vicol Makarand Tapaswi Lluis Castrejon Sanja Fidler 42 136 0 19 Dec 2017
CoDraw: Collaborative Drawing as a Testbed for Grounded Goal-driven Communication Jin-Hwa Kim Nikita Kitaev Xinlei Chen Marcus Rohrbach Byoung-Tak Zhang Yuandong Tian Dhruv Batra Devi Parikh DiffM VGen 38 25 0 15 Dec 2017
Audio-Visual Sentiment Analysis for Learning Emotional Arcs in Movies Eric Chu D. Roy 21 41 0 08 Dec 2017
A Read-Write Memory Network for Movie Story Understanding Seil Na Sangho Lee Jisung Kim Gunhee Kim AIMat 24 98 0 27 Sep 2017
Localizing Moments in Video with Natural Language Lisa Anne Hendricks Oliver Wang Eli Shechtman Josef Sivic Trevor Darrell Bryan C. Russell 57 929 0 04 Aug 2017
DeepStory: Video Story QA by Deep Embedded Memory Networks Kyung-Min Kim Min-Oh Heo Seongho Choi Byoung-Tak Zhang 26 174 0 04 Jul 2017
Inferring and Executing Programs for Visual Reasoning Justin Johnson B. Hariharan Laurens van der Maaten Judy Hoffman Li Fei-Fei C. L. Zitnick Ross B. Girshick NAI 35 541 0 10 May 2017
ChestX-ray8: Hospital-scale Chest X-ray Database and Benchmarks on Weakly-Supervised Classification and Localization of Common Thorax Diseases Xiaosong Wang Yifan Peng Le Lu Zhiyong Lu M. Bagheri Ronald M. Summers LM&MA 75 2,474 0 05 May 2017
The Forgettable-Watcher Model for Video Question Answering Hongyang Xue Zhou Zhao Deng Cai 21 9 0 03 May 2017
TGIF-QA: Toward Spatio-Temporal Reasoning in Visual Question Answering Y. Jang Yale Song Youngjae Yu Youngjin Kim Gunhee Kim 34 547 0 14 Apr 2017
Learning Cooperative Visual Dialog Agents with Deep Reinforcement Learning Abhishek Das Satwik Kottur J. M. F. Moura Stefan Lee Dhruv Batra OffRL 31 424 0 20 Mar 2017
CLEVR: A Diagnostic Dataset for Compositional Language and Elementary Visual Reasoning Justin Johnson B. Hariharan Laurens van der Maaten Li Fei-Fei C. L. Zitnick Ross B. Girshick CoGe 116 2,324 0 20 Dec 2016
Condensed Memory Networks for Clinical Diagnostic Inferencing Aaditya (Adi) Prakash Siyuan Zhao Sadid A. Hasan Vivek Datla Kathy Lee Ashequl Qadir Joey Liu Oladimeji Farri 22 102 0 06 Dec 2016
MarioQA: Answering Questions by Watching Gameplay Videos Jonghwan Mun Paul Hongsuck Seo Ilchae Jung Bohyung Han 50 108 0 06 Dec 2016
Making the V in VQA Matter: Elevating the Role of Image Understanding in Visual Question Answering Yash Goyal Tejas Khot D. Summers-Stay Dhruv Batra Devi Parikh CoGe 161 3,136 0 02 Dec 2016
Visual Dialog Abhishek Das Satwik Kottur Khushi Gupta Avi Singh Deshraj Yadav José M. F. Moura Devi Parikh Dhruv Batra 71 990 0 26 Nov 2016
A dataset and exploration of models for understanding video data through fill-in-the-blank question-answering Tegan Maharaj Nicolas Ballas Anna Rohrbach Aaron Courville C. Pal VGen 15 107 0 23 Nov 2016
Leveraging Video Descriptions to Learn Video Question Answering Kuo-Hao Zeng Tseng-Hung Chen Ching-Yao Chuang Yuan-Hong Liao Juan Carlos Niebles Min Sun 32 175 0 12 Nov 2016
End-to-end Concept Word Detection for Video Captioning, Retrieval, and Question Answering Youngjae Yu Hyungjin Ko Jongwook Choi Gunhee Kim 16 230 0 10 Oct 2016
Learning Language-Visual Embedding for Movie Understanding with Natural-Language Atousa Torabi Niket Tandon Leonid Sigal 22 97 0 26 Sep 2016
Machine Comprehension Using Match-LSTM and Answer Pointer Shuohang Wang Jing Jiang 17 594 0 29 Aug 2016