1612.06890
CLEVR: A Diagnostic Dataset for Compositional Language and Elementary Visual Reasoning
20 December 2016
Justin Johnson
B. Hariharan
Laurens van der Maaten
Li Fei-Fei
C. L. Zitnick
Ross B. Girshick
CoGe
Papers citing
"CLEVR: A Diagnostic Dataset for Compositional Language and Elementary Visual Reasoning"
50 / 1,475 papers shown
Title
Video2Commonsense: Generating Commonsense Descriptions to Enrich Video Captioning
Zhiyuan Fang
Tejas Gokhale
Pratyay Banerjee
Chitta Baral
Yezhou Yang
25
60
0
11 Mar 2020
A Benchmark for Systematic Generalization in Grounded Language Understanding
Laura Ruis
Jacob Andreas
Marco Baroni
Diane Bouchacourt
Brenden M. Lake
19
138
0
11 Mar 2020
MQA: Answering the Question via Robotic Manipulation
Yuhong Deng
Di Guo
F. Sun
Naifu Zhang
Huaping Liu
Chen Pang
10
19
0
10 Mar 2020
Better Set Representations For Relational Reasoning
Qian Huang
Horace He
Ashutosh Kumar Singh
Yan Zhang
Ser-Nam Lim
Austin R. Benson
NAI
OCL
GNN
36
1
0
09 Mar 2020
Deconfounded Image Captioning: A Causal Retrospect
Xu Yang
Hanwang Zhang
Jianfei Cai
CML
18
119
0
09 Mar 2020
PathVQA: 30000+ Questions for Medical Visual Question Answering
Xuehai He
Yichen Zhang
Luntian Mou
Eric Xing
P. Xie
LM&MA
25
215
0
07 Mar 2020
Cops-Ref: A new Dataset and Task on Compositional Referring Expression Comprehension
Zhenfang Chen
Peng Wang
Lin Ma
Kwan-Yee K. Wong
Qi Wu
ObjD
34
68
0
01 Mar 2020
Unshuffling Data for Improved Generalization
Damien Teney
Ehsan Abbasnejad
Anton Van Den Hengel
OOD
31
76
0
27 Feb 2020
On the General Value of Evidence, and Bilingual Scene-Text Visual Question Answering
Xinyu Wang
Yuliang Liu
Chunhua Shen
Chun Chet Ng
Canjie Luo
Lianwen Jin
C. Chan
Anton Van Den Hengel
Liangwei Wang
31
91
0
24 Feb 2020
VQA-LOL: Visual Question Answering under the Lens of Logic
Tejas Gokhale
Pratyay Banerjee
Chitta Baral
Yezhou Yang
CoGe
28
73
0
19 Feb 2020
Bayes-TrEx: a Bayesian Sampling Approach to Model Transparency by Example
Serena Booth
Yilun Zhou
Ankit J. Shah
J. Shah
BDL
20
2
0
19 Feb 2020
Stratified Rule-Aware Network for Abstract Visual Reasoning
Sheng Hu
Yuqing Ma
Xianglong Liu
Yanlu Wei
Shihao Bai
6
101
0
17 Feb 2020
3D Dynamic Scene Graphs: Actionable Spatial Perception with Places, Objects, and Humans
Antoni Rosinol
Arjun Gupta
Marcus Abate
Jingang Shi
Luca Carlone
36
189
0
15 Feb 2020
Parameterizing Branch-and-Bound Search Trees to Learn Branching Policies
Giulia Zarpellon
Jason Jo
Andrea Lodi
Yoshua Bengio
24
96
0
12 Feb 2020
Multi-Task Learning by a Top-Down Control Network
Hila Levi
S. Ullman
15
7
0
09 Feb 2020
Visual Concept-Metaconcept Learning
Chi Han
Jiayuan Mao
Chuang Gan
J. Tenenbaum
Jiajun Wu
NAI
LRM
11
63
0
04 Feb 2020
Break It Down: A Question Understanding Benchmark
Tomer Wolfson
Mor Geva
Ankit Gupta
Matt Gardner
Yoav Goldberg
Daniel Deutch
Jonathan Berant
25
185
0
31 Jan 2020
Evaluating the Progress of Deep Learning for Visual Relational Concepts
Sebastian Stabinger
Peer David
J. Piater
A. Rodríguez-Sánchez
20
19
0
29 Jan 2020
ImVoteNet: Boosting 3D Object Detection in Point Clouds with Image Votes
C. Qi
Xinlei Chen
Or Litany
Leonidas J. Guibas
3DPC
197
249
0
29 Jan 2020
ManyModalQA: Modality Disambiguation and QA over Diverse Inputs
Darryl Hannan
Akshay Jain
Joey Tianyi Zhou
AAML
38
57
0
22 Jan 2020
In Defense of Grid Features for Visual Question Answering
Huaizu Jiang
Ishan Misra
Marcus Rohrbach
Erik Learned-Miller
Xinlei Chen
OOD
ObjD
23
318
0
10 Jan 2020
SVIRO: Synthetic Vehicle Interior Rear Seat Occupancy Dataset and Benchmark
S. Cruz
Oliver Wasenmüller
H. Beise
Thomas Stifter
D. Stricker
19
43
0
10 Jan 2020
Visual Question Answering on 360° Images
Shih-Han Chou
Wei-Lun Chao
Wei-Sheng Lai
Min Sun
Ming-Hsuan Yang
24
21
0
10 Jan 2020
Measuring Compositional Generalization: A Comprehensive Method on Realistic Data
Daniel Keysers
Nathanael Scharli
Nathan Scales
Hylke Buisman
Daniel Furrer
...
Tibor Tihon
Dmitry Tsarkov
Tianlin Li
Marc van Zee
Olivier Bousquet
CoGe
21
348
0
20 Dec 2019
Smart Home Appliances: Chat with Your Fridge
Denis A. Gudovskiy
Gyuri Han
Takuya Yamaguchi
Sotaro Tsukizawa
LRM
19
3
0
19 Dec 2019
Going Beneath the Surface: Evaluating Image Captioning for Grammaticality, Truthfulness and Diversity
Huiyuan Xie
Tom Sherborne
A. Kuhnle
Ann A. Copestake
DiffM
25
9
0
19 Dec 2019
Learning Canonical Representations for Scene Graph to Image Generation
Roei Herzig
Amir Bar
Huijuan Xu
Gal Chechik
Trevor Darrell
Amir Globerson
GNN
OCL
22
107
0
16 Dec 2019
CLOSURE: Assessing Systematic Generalization of CLEVR Models
Dzmitry Bahdanau
H. D. Vries
Timothy J. O'Donnell
Shikhar Murty
Philippe Beaudoin
Yoshua Bengio
Aaron Courville
NAI
15
90
0
12 Dec 2019
Neural Module Networks for Reasoning over Text
Nitish Gupta
Kevin Lin
Dan Roth
Sameer Singh
Matt Gardner
NAI
ReLM
LRM
13
130
0
10 Dec 2019
Deep Bayesian Active Learning for Multiple Correct Outputs
Khaled Jedoui
Ranjay Krishna
Michael S. Bernstein
Li Fei-Fei
BDL
OOD
UQCV
29
14
0
02 Dec 2019
Learning Perceptual Inference by Contrasting
Chi Zhang
Baoxiong Jia
Feng Gao
Yixin Zhu
Hongjing Lu
Song-Chun Zhu
LRM
26
108
0
29 Nov 2019
Transfer Learning in Visual and Relational Reasoning
T. S. Jayram
Vincent Marois
Tomasz Kornuta
V. Albouy
Emre Sevgen
A. Ozcan
NAI
OOD
LRM
19
2
0
27 Nov 2019
Biology and Compositionality: Empirical Considerations for Emergent-Communication Protocols
Travis LaCroix
22
3
0
26 Nov 2019
Learning to Learn Words from Visual Scenes
Dídac Surís
Dave Epstein
Heng Ji
Shih-Fu Chang
Carl Vondrick
VLM
CLIP
SSL
OffRL
30
4
0
25 Nov 2019
Identifying Model Weakness with Adversarial Examiner
Michelle Shu
Chenxi Liu
Weichao Qiu
Alan Yuille
AAML
ELM
27
19
0
25 Nov 2019
Temporal Reasoning via Audio Question Answering
Haytham M. Fayek
Justin Johnson
30
51
0
21 Nov 2019
ChartNet: Visual Reasoning over Statistical Charts using MAC-Networks
Monika Sharma
Shikha Gupta
Arindam Chowdhury
L. Vig
25
9
0
21 Nov 2019
Iterative Answer Prediction with Pointer-Augmented Multimodal Transformers for TextVQA
Ronghang Hu
Amanpreet Singh
Trevor Darrell
Marcus Rohrbach
32
195
0
14 Nov 2019
Attention on Abstract Visual Reasoning
Lukas Hahne
Timo Lüddecke
Florentin Wörgötter
David Kappel
GNN
25
23
0
14 Nov 2019
Multimodal Intelligence: Representation Learning, Information Fusion, and Applications
Chao Zhang
Zichao Yang
Xiaodong He
Li Deng
HAI
AI4TS
35
326
0
10 Nov 2019
A Spoken Dialogue System for Spatial Question Answering in a Physical Blocks World
Georgiy Platonov
Benjamin Kane
A. Gindi
Lenhart K. Schubert
21
15
0
06 Nov 2019
Scene Graph based Image Retrieval -- A case study on the CLEVR Dataset
Sahana Ramnath
Amrita Saha
Soumen Chakrabarti
Mitesh M. Khapra
3DV
12
14
0
03 Nov 2019
TAB-VCR: Tags and Attributes based Visual Commonsense Reasoning Baselines
Jingxiang Lin
Unnat Jain
Alex Schwing
LRM
ReLM
37
9
0
31 Oct 2019
Relation Module for Non-answerable Prediction on Question Answering
Kevin Huang
Yun Tang
Jing-ling Huang
Xiaodong He
Bowen Zhou
20
6
0
23 Oct 2019
KnowIT VQA: Answering Knowledge-Based Questions about Videos
Noa Garcia
Mayu Otani
Chenhui Chu
Yuta Nakashima
30
77
0
23 Oct 2019
Good, Better, Best: Textual Distractors Generation for Multiple-Choice Visual Question Answering via Reinforcement Learning
Jiaying Lu
Xin Ye
Yi Ren
Yezhou Yang
21
10
0
21 Oct 2019
A Theory of Relation Learning and Cross-domain Generalization
L. Doumas
Guillermo Puebla
Andrea E. Martin
J. Hummel
NAI
28
30
0
11 Oct 2019
CATER: A diagnostic dataset for Compositional Actions and TEmporal Reasoning
Rohit Girdhar
Deva Ramanan
22
176
0
10 Oct 2019
Meta Module Network for Compositional Visual Reasoning
Wenhu Chen
Zhe Gan
Linjie Li
Yu Cheng
Wenjie Wang
Jingjing Liu
LRM
25
68
0
08 Oct 2019
3D Scene Graph: A Structure for Unified Semantics, 3D Space, and Camera
Iro Armeni
Zhi-Yang He
JunYoung Gwak
Amir Zamir
Martin Fischer
Jitendra Malik
Silvio Savarese
3DV
3DPC
48
338
0
06 Oct 2019