Learning to Reason: End-to-End Module Networks for Visual Question Answering

18 April 2017

Papers citing "Learning to Reason: End-to-End Module Networks for Visual Question Answering"

50 / 128 papers shown

Title | Authors | Tags | Counts | Date
Neuro Symbolic Knowledge Reasoning for Procedural Video Question Answering | Thanh-Son Nguyen Hong Yang Tzeh Yuan Neoh Hao Zhang Ee Yeo Keat Basura Fernando | NAI | 64 0 0 | 19 Mar 2025
Make Haste Slowly: A Theory of Emergent Structured Mixed Selectivity in Feature Learning ReLU Networks | Devon Jarvis Richard Klein Benjamin Rosman Andrew M. Saxe | MLT | 69 1 0 | 08 Mar 2025
Learning to Reason Iteratively and Parallelly for Complex Visual Reasoning Scenarios | Shantanu Jaiswal Debaditya Roy Basura Fernando Cheston Tan | ReLM LRM | 79 2 0 | 20 Nov 2024
Discovering Object Attributes by Prompting Large Language Models with Perception-Action APIs | A. Mavrogiannis Dehao Yuan Yiannis Aloimonos | LM&Ro | 45 0 0 | 23 Sep 2024
What Makes a Maze Look Like a Maze? | Joy Hsu Jiayuan Mao J. Tenenbaum Noah D. Goodman Jiajun Wu | OCL | 70 6 0 | 12 Sep 2024
Breaking Neural Network Scaling Laws with Modularity | Akhilan Boopathy Sunshine Jiang William Yue Jaedong Hwang Abhiram Iyer Ila Fiete | OOD | 59 2 0 | 09 Sep 2024
On Behalf of the Stakeholders: Trends in NLP Model Interpretability in the Era of LLMs | Nitay Calderon Roi Reichart | | 47 13 0 | 27 Jul 2024
3VL: Using Trees to Improve Vision-Language Models' Interpretability | Nir Yellinek Leonid Karlinsky Raja Giryes | CoGe VLM | 54 4 0 | 28 Dec 2023
ProtoArgNet: Interpretable Image Classification with Super-Prototypes and Argumentation [Technical Report] | Hamed Ayoobi Nico Potyka Francesca Toni | | 46 3 0 | 26 Nov 2023
Multimodal Representations for Teacher-Guided Compositional Visual Reasoning | Wafa Aissa Marin Ferecatu M. Crucianu | LRM | 26 0 0 | 24 Oct 2023
Modularized Zero-shot VQA with Pre-trained Models | Rui Cao Jing Jiang | LRM | 35 2 0 | 27 May 2023
Curriculum Learning for Compositional Visual Reasoning | Wafa Aissa Marin Ferecatu M. Crucianu | LRM | 36 3 0 | 27 Mar 2023
ViperGPT: Visual Inference via Python Execution for Reasoning | Dídac Surís Sachit Menon Carl Vondrick | MLLM LRM ReLM | 49 435 0 | 14 Mar 2023
Prophet: Prompting Large Language Models with Complementary Answer Heuristics for Knowledge-based Visual Question Answering | Zhou Yu Xuecheng Ouyang Zhenwei Shao Mei Wang Jun Yu | MLLM | 94 11 0 | 03 Mar 2023
Modular Deep Learning | Jonas Pfeiffer Sebastian Ruder Ivan Vulić Edoardo Ponti | MoMe OOD | 34 73 0 | 22 Feb 2023
Decomposing a Recurrent Neural Network into Modules for Enabling Reusability and Replacement | S. Imtiaz Fraol Batole Astha Singh Rangeet Pan Breno Dantas Cruz Hridesh Rajan | | 18 7 0 | 09 Dec 2022
A Short Survey of Systematic Generalization | Yuanpeng Li | AI4CE | 45 1 0 | 22 Nov 2022
Visual Programming: Compositional visual reasoning without training | Tanmay Gupta Aniruddha Kembhavi | ReLM VLM LRM | 94 406 0 | 18 Nov 2022
Neural Attentive Circuits | Nasim Rahaman M. Weiß Francesco Locatello C. Pal Yoshua Bengio Bernhard Schölkopf Erran L. Li Nicolas Ballas | | 37 6 0 | 14 Oct 2022
On the Explainability of Natural Language Processing Deep Models | Julia El Zini M. Awad | | 39 82 0 | 13 Oct 2022
Binding Language Models in Symbolic Languages | Zhoujun Cheng Tianbao Xie Peng Shi Chengzu Li Rahul Nadkarni ... Dragomir R. Radev Mari Ostendorf Luke Zettlemoyer Noah A. Smith Tao Yu | LMTD | 134 200 0 | 06 Oct 2022
Learning to Collocate Visual-Linguistic Neural Modules for Image Captioning | Xu Yang Hanwang Zhang Chongyang Gao Jianfei Cai | MLLM | 45 10 0 | 04 Oct 2022
Visual Perturbation-aware Collaborative Learning for Overcoming the Language Prior Problem | Yudong Han Liqiang Nie Jianhua Yin Jianlong Wu Yan Yan | | 26 13 0 | 24 Jul 2022
How to Reuse and Compose Knowledge for a Lifetime of Tasks: A Survey on Continual Learning and Functional Composition | Jorge Armando Mendez Mendez Eric Eaton | KELM CLL | 37 27 0 | 15 Jul 2022
Is a Modular Architecture Enough? | Sarthak Mittal Yoshua Bengio Guillaume Lajoie | | 34 47 0 | 06 Jun 2022
Multimodal Conversational AI: A Survey of Datasets and Approaches | Anirudh S. Sundar Larry Heck | | 48 29 0 | 13 May 2022
What is Right for Me is Not Yet Right for You: A Dataset for Grounding Relative Directions via Multi-Task Learning | Jae Hee Lee Matthias Kerzel Kyra Ahrens C. Weber S. Wermter | | 40 9 0 | 05 May 2022
METGEN: A Module-Based Entailment Tree Generation Framework for Answer Explanation | Ruixin Hong Hongming Zhang Xintong Yu Changshui Zhang | ReLM LRM | 32 33 0 | 05 May 2022
Measuring Compositional Consistency for Video Question Answering | Mona Gandhi Mustafa Omer Gul Eva Prakash Madeleine Grunde-McLaughlin Ranjay Krishna Maneesh Agrawala | CoGe | 40 15 0 | 14 Apr 2022
NEWSKVQA: Knowledge-Aware News Video Question Answering | Pranay Gupta Manish Gupta | | 30 7 0 | 08 Feb 2022
Discrete and continuous representations and processing in deep learning: Looking forward | Ruben Cartuyvels Graham Spinks Marie-Francine Moens | OCL | 38 20 0 | 04 Jan 2022
Decomposing Convolutional Neural Networks into Reusable and Replaceable Modules | Rangeet Pan Hridesh Rajan | MoMe | 16 30 0 | 11 Oct 2021
Calibrating Concepts and Operations: Towards Symbolic Reasoning on Real Images | Zhuowan Li Elias Stengel-Eskin Yixiao Zhang Cihang Xie Q. Tran Benjamin Van Durme Alan Yuille | VLM | 26 15 0 | 01 Oct 2021
Systematic Generalization on gSCAN: What is Nearly Solved and What is Next? | Linlu Qiu Hexiang Hu Bowen Zhang Peter Shaw Fei Sha | | 33 21 0 | 25 Sep 2021
Auto-Parsing Network for Image Captioning and Visual Question Answering | Xu Yang Chongyang Gao Hanwang Zhang Jianfei Cai | | 24 35 0 | 24 Aug 2021
DualVGR: A Dual-Visual Graph Reasoning Unit for Video Question Answering | Jianyu Wang Bingkun Bao Changsheng Xu | | 19 75 0 | 10 Jul 2021
Adventurer's Treasure Hunt: A Transparent System for Visually Grounded Compositional Visual Question Answering based on Scene Graphs | Daniel Reich F. Putze Tanja Schultz | | 30 2 0 | 28 Jun 2021
Supervising the Transfer of Reasoning Patterns in VQA | Corentin Kervadec Christian Wolf G. Antipov M. Baccouche Madiha Nadri Wolf | | 35 10 0 | 10 Jun 2021
A Review on Explainability in Multimodal Deep Neural Nets | Gargi Joshi Rahee Walambe K. Kotecha | | 34 140 0 | 17 May 2021
Show Why the Answer is Correct! Towards Explainable AI using Compositional Temporal Attention | Nihar Bendre K. Desai Peyman Najafirad | CoGe | 31 6 0 | 15 May 2021
Designing Multimodal Datasets for NLP Challenges | James Pustejovsky E. Holderness Jingxuan Tu Parker Glenn Kyeongmin Rim Kelley Lynch R. Brutti | | 31 5 0 | 12 May 2021
Neuro-Symbolic Artificial Intelligence: Current Trends | Md Kamruzzaman Sarker Lu Zhou Aaron Eberhart Pascal Hitzler | NAI | 27 87 0 | 11 May 2021
MDETR -- Modulated Detection for End-to-End Multi-Modal Understanding | Aishwarya Kamath Mannat Singh Yann LeCun Gabriel Synnaeve Ishan Misra Nicolas Carion | ObjD VLM | 93 864 0 | 26 Apr 2021
VGNMN: Video-grounded Neural Module Network to Video-Grounded Language Tasks | Hung Le Nancy F. Chen Guosheng Lin | MLLM | 28 19 0 | 16 Apr 2021
Object-Centric Representation Learning for Video Question Answering | Long Hoang Dang T. Le Vuong Le T. Tran | | 27 7 0 | 12 Apr 2021
Beyond Question-Based Biases: Assessing Multimodal Shortcut Learning in Visual Question Answering | Corentin Dancette Rémi Cadène Damien Teney Matthieu Cord | CML | 33 76 0 | 07 Apr 2021
KANDINSKYPatterns -- An experimental exploration environment for Pattern Analysis and Machine Intelligence | Andreas Holzinger Anna Saranti Heimo Mueller | | 46 10 0 | 28 Feb 2021
Explainability of deep vision-based autonomous driving systems: Review and challenges | Éloi Zablocki H. Ben-younes P. Pérez Matthieu Cord | XAI | 53 170 0 | 13 Jan 2021
Knowledge-Routed Visual Question Reasoning: Challenges for Deep Representation Embedding | Qingxing Cao Bailin Li Xiaodan Liang Keze Wang Liang Lin | | 46 36 0 | 14 Dec 2020
Quantifying Learnability and Describability of Visual Concepts Emerging in Representation Learning | Iro Laina Ruth C. Fong Andrea Vedaldi | OCL | 33 13 0 | 27 Oct 2020