ResearchTrend.AI
Learning to Reason: End-to-End Module Networks for Visual Question Answering

18 April 2017
Ronghang Hu
Jacob Andreas
Marcus Rohrbach
Trevor Darrell
Kate Saenko
    KELM
    GNN
    ReLM
    LRM

Papers citing "Learning to Reason: End-to-End Module Networks for Visual Question Answering"

50 / 128 papers shown
Deep Reinforcement Learning with Stacked Hierarchical Attention for Text-based Games
Yunqiu Xu
Meng Fang
Ling Chen
Yali Du
Joey Tianyi Zhou
Chengqi Zhang
OffRL
27
44
0
22 Oct 2020
Graph-based Heuristic Search for Module Selection Procedure in Neural Module Network
Yuxuan Wu
Hideki Nakayama
GNN
25
3
0
30 Sep 2020
Object-and-Action Aware Model for Visual Language Navigation
Yuankai Qi
Zizheng Pan
Shengping Zhang
Anton Van Den Hengel
Qi Wu
LM&Ro
23
111
0
29 Jul 2020
AiR: Attention with Reasoning Capability
Shi Chen
Ming Jiang
Jinhui Yang
Qi Zhao
LRM
13
36
0
28 Jul 2020
Referring Expression Comprehension: A Survey of Methods and Datasets
Yanyuan Qiao
Chaorui Deng
Qi Wu
ObjD
50
93
0
19 Jul 2020
Learning to Discretely Compose Reasoning Module Networks for Video Captioning
Ganchao Tan
Daqing Liu
Meng Wang
Zhengjun Zha
LRM
30
73
0
17 Jul 2020
Large-Scale Adversarial Training for Vision-and-Language Representation Learning
Zhe Gan
Yen-Chun Chen
Linjie Li
Chen Zhu
Yu Cheng
Jingjing Liu
ObjD
VLM
35
489
0
11 Jun 2020
Cross-Modality Relevance for Reasoning on Language and Vision
Chen Zheng
Quan Guo
Parisa Kordjamshidi
LRM
46
36
0
12 May 2020
Teaching Machine Comprehension with Compositional Explanations
Qinyuan Ye
Xiao Huang
Elizabeth Boschee
Xiang Ren
LRM
ReLM
29
34
0
02 May 2020
Dynamic Language Binding in Relational Visual Reasoning
T. Le
Vuong Le
Svetha Venkatesh
T. Tran
NAI
26
19
0
30 Apr 2020
VD-BERT: A Unified Vision and Dialog Transformer with BERT
Yue Wang
Shafiq Joty
Michael R. Lyu
Irwin King
Caiming Xiong
Steven C.H. Hoi
24
102
0
28 Apr 2020
MoVie: Revisiting Modulated Convolutions for Visual Counting and Beyond
Duy-Kien Nguyen
Vedanuj Goswami
Xinlei Chen
39
23
0
24 Apr 2020
Multi-Modal Graph Neural Network for Joint Reasoning on Vision and Scene Text
Difei Gao
Ke Li
Ruiping Wang
Shiguang Shan
Xilin Chen
16
111
0
31 Mar 2020
A Review on Intelligent Object Perception Methods Combining Knowledge-based Reasoning and Machine Learning
Filippos Gouidis
Alexandros Vassiliades
T. Patkos
Antonis Argyros
Nick Bassiliades
Dimitris Plexousakis
OCL
29
12
0
26 Dec 2019
Towards Causal VQA: Revealing and Reducing Spurious Correlations by Invariant and Covariant Semantic Editing
Vedika Agarwal
Rakshith Shetty
Mario Fritz
CML
AAML
32
155
0
16 Dec 2019
Factorized Multimodal Transformer for Multimodal Sequential Learning
Amir Zadeh
Chengfeng Mao
Kelly Shi
Yiwei Zhang
Paul Pu Liang
Soujanya Poria
Louis-Philippe Morency
25
44
0
22 Nov 2019
Kinematic State Abstraction and Provably Efficient Rich-Observation Reinforcement Learning
Dipendra Kumar Misra
Mikael Henaff
A. Krishnamurthy
John Langford
29
151
0
13 Nov 2019
Multi-modal Deep Analysis for Multimedia
Wenwu Zhu
Xin Eric Wang
Hongzhi Li
29
38
0
11 Oct 2019
CLEVRER: CoLlision Events for Video REpresentation and Reasoning
Kexin Yi
Chuang Gan
Yunzhu Li
Pushmeet Kohli
Jiajun Wu
Antonio Torralba
J. Tenenbaum
NAI
43
457
0
03 Oct 2019
Synthetic Data for Deep Learning
Sergey I. Nikolenko
51
349
0
25 Sep 2019
Dynamic Graph Attention for Referring Expression Comprehension
Sibei Yang
Guanbin Li
Yizhou Yu
OCL
25
215
0
18 Sep 2019
LXMERT: Learning Cross-Modality Encoder Representations from Transformers
Hao Tan
Mohit Bansal
VLM
MLLM
111
2,456
0
20 Aug 2019
What is needed for simple spatial language capabilities in VQA?
A. Kuhnle
Ann A. Copestake
CoGe
23
1
0
17 Aug 2019
A Multi-Type Multi-Span Network for Reading Comprehension that Requires Discrete Reasoning
Minghao Hu
Yuxing Peng
Zhen Huang
Dongsheng Li
AIMat
LRM
32
90
0
15 Aug 2019
VideoNavQA: Bridging the Gap between Visual and Embodied Question Answering
Cătălina Cangea
Eugene Belilovsky
Pietro Lio
Aaron Courville
16
17
0
14 Aug 2019
Multimodal Unified Attention Networks for Vision-and-Language Interactions
Zhou Yu
Yuhao Cui
Jun Yu
Dacheng Tao
Q. Tian
27
38
0
12 Aug 2019
Trends in Integration of Vision and Language Research: A Survey of Tasks, Datasets, and Methods
Aditya Mogadala
M. Kalimuthu
Dietrich Klakow
VLM
25
133
0
22 Jul 2019
Variational Context: Exploiting Visual and Textual Context for Grounding Referring Expressions
Yulei Niu
Hanwang Zhang
Zhiwu Lu
Shih-Fu Chang
ObjD
BDL
36
24
0
08 Jul 2019
Context-Aware Visual Policy Network for Fine-Grained Image Captioning
Zhengjun Zha
Daqing Liu
Hanwang Zhang
Yongdong Zhang
Feng Wu
25
120
0
06 Jun 2019
Recursive Sketches for Modular Deep Learning
Badih Ghazi
Rina Panigrahy
Joshua R. Wang
15
20
0
29 May 2019
Language-Conditioned Graph Networks for Relational Reasoning
Ronghang Hu
Anna Rohrbach
Trevor Darrell
Kate Saenko
31
171
0
10 May 2019
Compositional generalization in a deep seq2seq model by separating syntax and semantics
Jacob Russin
Jason Jo
R. C. O'Reilly
Yoshua Bengio
30
102
0
22 Apr 2019
Learning to Collocate Neural Modules for Image Captioning
Xu Yang
Hanwang Zhang
Jianfei Cai
27
77
0
18 Apr 2019
Question Guided Modular Routing Networks for Visual Question Answering
Yanze Wu
Qiang Sun
Jianqi Ma
Bin Li
Yanwei Fu
Yao Peng
Xiangyang Xue
23
1
0
17 Apr 2019
Factor Graph Attention
Idan Schwartz
Seunghak Yu
Tamir Hazan
Alex Schwing
30
110
0
11 Apr 2019
CLEVR-Dialog: A Diagnostic Dataset for Multi-Round Reasoning in Visual Dialog
Satwik Kottur
José M. F. Moura
Devi Parikh
Dhruv Batra
Marcus Rohrbach
31
86
0
07 Mar 2019
RAVEN: A Dataset for Relational and Analogical Visual rEasoNing
Chi Zhang
Feng Gao
Baoxiong Jia
Yixin Zhu
Song-Chun Zhu
AIMat
32
304
0
07 Mar 2019
Answer Them All! Toward Universal Visual Question Answering Models
Robik Shrestha
Kushal Kafle
Christopher Kanan
25
82
0
01 Mar 2019
MUREL: Multimodal Relational Reasoning for Visual Question Answering
Rémi Cadène
H. Ben-younes
Matthieu Cord
Nicolas Thome
LRM
19
272
0
25 Feb 2019
Visual Entailment: A Novel Task for Fine-Grained Image Understanding
Ning Xie
Farley Lai
Derek Doran
Asim Kadav
CoGe
56
322
0
20 Jan 2019
CLEVR-Ref+: Diagnosing Visual Reasoning with Referring Expressions
Runtao Liu
Chenxi Liu
Yutong Bai
Alan Yuille
NAI
ObjD
24
123
0
03 Jan 2019
Neighbourhood Watch: Referring Expression Comprehension via Language-guided Graph Attention Networks
Peng Wang
Qi Wu
Jiewei Cao
Chunhua Shen
Lianli Gao
Anton Van Den Hengel
ObjD
22
252
0
12 Dec 2018
Counterfactual Critic Multi-Agent Training for Scene Graph Generation
Long Chen
Hanwang Zhang
Jun Xiao
Xiangnan He
Shiliang Pu
Shih-Fu Chang
39
159
0
06 Dec 2018
Explainable and Explicit Visual Reasoning over Scene Graphs
Jiaxin Shi
Hanwang Zhang
Juan-Zi Li
OCL
169
230
0
05 Dec 2018
Generating Diverse Programs with Instruction Conditioned Reinforced Adversarial Learning
Aishwarya Agrawal
Mateusz Malinowski
Felix Hill
S. M. Ali Eslami
Oriol Vinyals
Tejas D. Kulkarni
21
4
0
03 Dec 2018
On transfer learning using a MAC model variant
Vincent Marois
T. S. Jayram
V. Albouy
Tomasz Kornuta
Younes Bouhadjar
A. Ozcan
DRL
26
9
0
15 Nov 2018
Toward Driving Scene Understanding: A Dataset for Learning Driver Behavior and Causal Reasoning
Vasili Ramanishka
Yi-Ting Chen
Teruhisa Misu
Kate Saenko
30
277
0
06 Nov 2018
Zero-Shot Transfer VQA Dataset
Yuanpeng Li
Yi Yang
Jianyu Wang
Wei Xu
27
8
0
02 Nov 2018
Machine Common Sense Concept Paper
David Gunning
VLM
LRM
21
39
0
17 Oct 2018
Discovering General-Purpose Active Learning Strategies
Ksenia Konyushkova
Raphael Sznitman
Pascal Fua
30
33
0
09 Oct 2018