1704.05526
Learning to Reason: End-to-End Module Networks for Visual Question Answering
18 April 2017
Ronghang Hu
Jacob Andreas
Marcus Rohrbach
Trevor Darrell
Kate Saenko
KELM
GNN
ReLM
LRM
Papers citing
"Learning to Reason: End-to-End Module Networks for Visual Question Answering"
50 / 128 papers shown
Title
Neuro Symbolic Knowledge Reasoning for Procedural Video Question Answering
Thanh-Son Nguyen
Hong Yang
Tzeh Yuan Neoh
Hao Zhang
Ee Yeo Keat
Basura Fernando
NAI
64
0
0
19 Mar 2025
Make Haste Slowly: A Theory of Emergent Structured Mixed Selectivity in Feature Learning ReLU Networks
Devon Jarvis
Richard Klein
Benjamin Rosman
Andrew M. Saxe
MLT
69
1
0
08 Mar 2025
Learning to Reason Iteratively and Parallelly for Complex Visual Reasoning Scenarios
Shantanu Jaiswal
Debaditya Roy
Basura Fernando
Cheston Tan
ReLM
LRM
79
2
0
20 Nov 2024
Discovering Object Attributes by Prompting Large Language Models with Perception-Action APIs
A. Mavrogiannis
Dehao Yuan
Yiannis Aloimonos
LM&Ro
45
0
0
23 Sep 2024
What Makes a Maze Look Like a Maze?
Joy Hsu
Jiayuan Mao
J. Tenenbaum
Noah D. Goodman
Jiajun Wu
OCL
70
6
0
12 Sep 2024
Breaking Neural Network Scaling Laws with Modularity
Akhilan Boopathy
Sunshine Jiang
William Yue
Jaedong Hwang
Abhiram Iyer
Ila Fiete
OOD
59
2
0
09 Sep 2024
On Behalf of the Stakeholders: Trends in NLP Model Interpretability in the Era of LLMs
Nitay Calderon
Roi Reichart
47
13
0
27 Jul 2024
3VL: Using Trees to Improve Vision-Language Models' Interpretability
Nir Yellinek
Leonid Karlinsky
Raja Giryes
CoGe
VLM
54
4
0
28 Dec 2023
ProtoArgNet: Interpretable Image Classification with Super-Prototypes and Argumentation [Technical Report]
Hamed Ayoobi
Nico Potyka
Francesca Toni
46
3
0
26 Nov 2023
Multimodal Representations for Teacher-Guided Compositional Visual Reasoning
Wafa Aissa
Marin Ferecatu
M. Crucianu
LRM
26
0
0
24 Oct 2023
Modularized Zero-shot VQA with Pre-trained Models
Rui Cao
Jing Jiang
LRM
35
2
0
27 May 2023
Curriculum Learning for Compositional Visual Reasoning
Wafa Aissa
Marin Ferecatu
M. Crucianu
LRM
36
3
0
27 Mar 2023
ViperGPT: Visual Inference via Python Execution for Reasoning
Dídac Surís
Sachit Menon
Carl Vondrick
MLLM
LRM
ReLM
49
435
0
14 Mar 2023
Prophet: Prompting Large Language Models with Complementary Answer Heuristics for Knowledge-based Visual Question Answering
Zhou Yu
Xuecheng Ouyang
Zhenwei Shao
Mei Wang
Jun Yu
MLLM
94
11
0
03 Mar 2023
Modular Deep Learning
Jonas Pfeiffer
Sebastian Ruder
Ivan Vulić
Edoardo Ponti
MoMe
OOD
34
73
0
22 Feb 2023
Decomposing a Recurrent Neural Network into Modules for Enabling Reusability and Replacement
S. Imtiaz
Fraol Batole
Astha Singh
Rangeet Pan
Breno Dantas Cruz
Hridesh Rajan
18
7
0
09 Dec 2022
A Short Survey of Systematic Generalization
Yuanpeng Li
AI4CE
45
1
0
22 Nov 2022
Visual Programming: Compositional visual reasoning without training
Tanmay Gupta
Aniruddha Kembhavi
ReLM
VLM
LRM
94
406
0
18 Nov 2022
Neural Attentive Circuits
Nasim Rahaman
M. Weiß
Francesco Locatello
C. Pal
Yoshua Bengio
Bernhard Schölkopf
Erran L. Li
Nicolas Ballas
37
6
0
14 Oct 2022
On the Explainability of Natural Language Processing Deep Models
Julia El Zini
M. Awad
39
82
0
13 Oct 2022
Binding Language Models in Symbolic Languages
Zhoujun Cheng
Tianbao Xie
Peng Shi
Chengzu Li
Rahul Nadkarni
...
Dragomir R. Radev
Mari Ostendorf
Luke Zettlemoyer
Noah A. Smith
Tao Yu
LMTD
134
200
0
06 Oct 2022
Learning to Collocate Visual-Linguistic Neural Modules for Image Captioning
Xu Yang
Hanwang Zhang
Chongyang Gao
Jianfei Cai
MLLM
45
10
0
04 Oct 2022
Visual Perturbation-aware Collaborative Learning for Overcoming the Language Prior Problem
Yudong Han
Liqiang Nie
Jianhua Yin
Jianlong Wu
Yan Yan
26
13
0
24 Jul 2022
How to Reuse and Compose Knowledge for a Lifetime of Tasks: A Survey on Continual Learning and Functional Composition
Jorge Armando Mendez Mendez
Eric Eaton
KELM
CLL
37
27
0
15 Jul 2022
Is a Modular Architecture Enough?
Sarthak Mittal
Yoshua Bengio
Guillaume Lajoie
34
47
0
06 Jun 2022
Multimodal Conversational AI: A Survey of Datasets and Approaches
Anirudh S. Sundar
Larry Heck
48
29
0
13 May 2022
What is Right for Me is Not Yet Right for You: A Dataset for Grounding Relative Directions via Multi-Task Learning
Jae Hee Lee
Matthias Kerzel
Kyra Ahrens
C. Weber
S. Wermter
40
9
0
05 May 2022
METGEN: A Module-Based Entailment Tree Generation Framework for Answer Explanation
Ruixin Hong
Hongming Zhang
Xintong Yu
Changshui Zhang
ReLM
LRM
32
33
0
05 May 2022
Measuring Compositional Consistency for Video Question Answering
Mona Gandhi
Mustafa Omer Gul
Eva Prakash
Madeleine Grunde-McLaughlin
Ranjay Krishna
Maneesh Agrawala
CoGe
40
15
0
14 Apr 2022
NEWSKVQA: Knowledge-Aware News Video Question Answering
Pranay Gupta
Manish Gupta
30
7
0
08 Feb 2022
Discrete and continuous representations and processing in deep learning: Looking forward
Ruben Cartuyvels
Graham Spinks
Marie-Francine Moens
OCL
38
20
0
04 Jan 2022
Decomposing Convolutional Neural Networks into Reusable and Replaceable Modules
Rangeet Pan
Hridesh Rajan
MoMe
16
30
0
11 Oct 2021
Calibrating Concepts and Operations: Towards Symbolic Reasoning on Real Images
Zhuowan Li
Elias Stengel-Eskin
Yixiao Zhang
Cihang Xie
Q. Tran
Benjamin Van Durme
Alan Yuille
VLM
26
15
0
01 Oct 2021
Systematic Generalization on gSCAN: What is Nearly Solved and What is Next?
Linlu Qiu
Hexiang Hu
Bowen Zhang
Peter Shaw
Fei Sha
33
21
0
25 Sep 2021
Auto-Parsing Network for Image Captioning and Visual Question Answering
Xu Yang
Chongyang Gao
Hanwang Zhang
Jianfei Cai
24
35
0
24 Aug 2021
DualVGR: A Dual-Visual Graph Reasoning Unit for Video Question Answering
Jianyu Wang
Bingkun Bao
Changsheng Xu
19
75
0
10 Jul 2021
Adventurer's Treasure Hunt: A Transparent System for Visually Grounded Compositional Visual Question Answering based on Scene Graphs
Daniel Reich
F. Putze
Tanja Schultz
30
2
0
28 Jun 2021
Supervising the Transfer of Reasoning Patterns in VQA
Corentin Kervadec
Christian Wolf
G. Antipov
M. Baccouche
Madiha Nadri Wolf
35
10
0
10 Jun 2021
A Review on Explainability in Multimodal Deep Neural Nets
Gargi Joshi
Rahee Walambe
K. Kotecha
34
140
0
17 May 2021
Show Why the Answer is Correct! Towards Explainable AI using Compositional Temporal Attention
Nihar Bendre
K. Desai
Peyman Najafirad
CoGe
31
6
0
15 May 2021
Designing Multimodal Datasets for NLP Challenges
James Pustejovsky
E. Holderness
Jingxuan Tu
Parker Glenn
Kyeongmin Rim
Kelley Lynch
R. Brutti
31
5
0
12 May 2021
Neuro-Symbolic Artificial Intelligence: Current Trends
Md Kamruzzaman Sarker
Lu Zhou
Aaron Eberhart
Pascal Hitzler
NAI
27
87
0
11 May 2021
MDETR -- Modulated Detection for End-to-End Multi-Modal Understanding
Aishwarya Kamath
Mannat Singh
Yann LeCun
Gabriel Synnaeve
Ishan Misra
Nicolas Carion
ObjD
VLM
93
864
0
26 Apr 2021
VGNMN: Video-grounded Neural Module Network to Video-Grounded Language Tasks
Hung Le
Nancy F. Chen
Guosheng Lin
MLLM
28
19
0
16 Apr 2021
Object-Centric Representation Learning for Video Question Answering
Long Hoang Dang
T. Le
Vuong Le
T. Tran
27
7
0
12 Apr 2021
Beyond Question-Based Biases: Assessing Multimodal Shortcut Learning in Visual Question Answering
Corentin Dancette
Rémi Cadène
Damien Teney
Matthieu Cord
CML
33
76
0
07 Apr 2021
KANDINSKYPatterns -- An experimental exploration environment for Pattern Analysis and Machine Intelligence
Andreas Holzinger
Anna Saranti
Heimo Mueller
46
10
0
28 Feb 2021
Explainability of deep vision-based autonomous driving systems: Review and challenges
Éloi Zablocki
H. Ben-younes
P. Pérez
Matthieu Cord
XAI
53
170
0
13 Jan 2021
Knowledge-Routed Visual Question Reasoning: Challenges for Deep Representation Embedding
Qingxing Cao
Bailin Li
Xiaodan Liang
Keze Wang
Liang Lin
46
36
0
14 Dec 2020
Quantifying Learnability and Describability of Visual Concepts Emerging in Representation Learning
Iro Laina
Ruth C. Fong
Andrea Vedaldi
OCL
33
13
0
27 Oct 2020