CLEVR: A Diagnostic Dataset for Compositional Language and Elementary Visual Reasoning

20 December 2016
Justin Johnson
B. Hariharan
Laurens van der Maaten
Li Fei-Fei
C. L. Zitnick
Ross B. Girshick
    CoGe

Papers citing "CLEVR: A Diagnostic Dataset for Compositional Language and Elementary Visual Reasoning"

50 / 1,475 papers shown
Video2Commonsense: Generating Commonsense Descriptions to Enrich Video Captioning
Zhiyuan Fang
Tejas Gokhale
Pratyay Banerjee
Chitta Baral
Yezhou Yang
25
60
0
11 Mar 2020
A Benchmark for Systematic Generalization in Grounded Language Understanding
Laura Ruis
Jacob Andreas
Marco Baroni
Diane Bouchacourt
Brenden M. Lake
19
138
0
11 Mar 2020
MQA: Answering the Question via Robotic Manipulation
Yuhong Deng
Di Guo
F. Sun
Naifu Zhang
Huaping Liu
Chen Pang
10
19
0
10 Mar 2020
Better Set Representations For Relational Reasoning
Qian Huang
Horace He
Ashutosh Kumar Singh
Yan Zhang
Ser-Nam Lim
Austin R. Benson
NAI
OCL
GNN
36
1
0
09 Mar 2020
Deconfounded Image Captioning: A Causal Retrospect
Xu Yang
Hanwang Zhang
Jianfei Cai
CML
18
119
0
09 Mar 2020
PathVQA: 30000+ Questions for Medical Visual Question Answering
Xuehai He
Yichen Zhang
Luntian Mou
Eric Xing
P. Xie
LM&MA
25
215
0
07 Mar 2020
Cops-Ref: A new Dataset and Task on Compositional Referring Expression Comprehension
Zhenfang Chen
Peng Wang
Lin Ma
Kwan-Yee K. Wong
Qi Wu
ObjD
34
68
0
01 Mar 2020
Unshuffling Data for Improved Generalization
Damien Teney
Ehsan Abbasnejad
Anton Van Den Hengel
OOD
31
76
0
27 Feb 2020
On the General Value of Evidence, and Bilingual Scene-Text Visual Question Answering
Xinyu Wang
Yuliang Liu
Chunhua Shen
Chun Chet Ng
Canjie Luo
Lianwen Jin
C. Chan
Anton Van Den Hengel
Liangwei Wang
31
91
0
24 Feb 2020
VQA-LOL: Visual Question Answering under the Lens of Logic
Tejas Gokhale
Pratyay Banerjee
Chitta Baral
Yezhou Yang
CoGe
28
73
0
19 Feb 2020
Bayes-TrEx: a Bayesian Sampling Approach to Model Transparency by Example
Serena Booth
Yilun Zhou
Ankit J. Shah
J. Shah
BDL
20
2
0
19 Feb 2020
Stratified Rule-Aware Network for Abstract Visual Reasoning
Sheng Hu
Yuqing Ma
Xianglong Liu
Yanlu Wei
Shihao Bai
6
101
0
17 Feb 2020
3D Dynamic Scene Graphs: Actionable Spatial Perception with Places, Objects, and Humans
Antoni Rosinol
Arjun Gupta
Marcus Abate
Jingang Shi
Luca Carlone
36
189
0
15 Feb 2020
Parameterizing Branch-and-Bound Search Trees to Learn Branching Policies
Giulia Zarpellon
Jason Jo
Andrea Lodi
Yoshua Bengio
24
96
0
12 Feb 2020
Multi-Task Learning by a Top-Down Control Network
Hila Levi
S. Ullman
15
7
0
09 Feb 2020
Visual Concept-Metaconcept Learning
Chi Han
Jiayuan Mao
Chuang Gan
J. Tenenbaum
Jiajun Wu
NAI
LRM
11
63
0
04 Feb 2020
Break It Down: A Question Understanding Benchmark
Tomer Wolfson
Mor Geva
Ankit Gupta
Matt Gardner
Yoav Goldberg
Daniel Deutch
Jonathan Berant
25
185
0
31 Jan 2020
Evaluating the Progress of Deep Learning for Visual Relational Concepts
Sebastian Stabinger
Peer David
J. Piater
A. Rodríguez-Sánchez
20
19
0
29 Jan 2020
ImVoteNet: Boosting 3D Object Detection in Point Clouds with Image Votes
C. Qi
Xinlei Chen
Or Litany
Leonidas J. Guibas
3DPC
197
249
0
29 Jan 2020
ManyModalQA: Modality Disambiguation and QA over Diverse Inputs
Darryl Hannan
Akshay Jain
Joey Tianyi Zhou
AAML
38
57
0
22 Jan 2020
In Defense of Grid Features for Visual Question Answering
Huaizu Jiang
Ishan Misra
Marcus Rohrbach
Erik Learned-Miller
Xinlei Chen
OOD
ObjD
23
318
0
10 Jan 2020
SVIRO: Synthetic Vehicle Interior Rear Seat Occupancy Dataset and Benchmark
S. Cruz
Oliver Wasenmüller
H. Beise
Thomas Stifter
D. Stricker
19
43
0
10 Jan 2020
Visual Question Answering on 360° Images
Shih-Han Chou
Wei-Lun Chao
Wei-Sheng Lai
Min Sun
Ming-Hsuan Yang
24
21
0
10 Jan 2020
Measuring Compositional Generalization: A Comprehensive Method on Realistic Data
Daniel Keysers
Nathanael Scharli
Nathan Scales
Hylke Buisman
Daniel Furrer
...
Tibor Tihon
Dmitry Tsarkov
Tianlin Li
Marc van Zee
Olivier Bousquet
CoGe
21
348
0
20 Dec 2019
Smart Home Appliances: Chat with Your Fridge
Denis A. Gudovskiy
Gyuri Han
Takuya Yamaguchi
Sotaro Tsukizawa
LRM
19
3
0
19 Dec 2019
Going Beneath the Surface: Evaluating Image Captioning for Grammaticality, Truthfulness and Diversity
Huiyuan Xie
Tom Sherborne
A. Kuhnle
Ann A. Copestake
DiffM
25
9
0
19 Dec 2019
Learning Canonical Representations for Scene Graph to Image Generation
Roei Herzig
Amir Bar
Huijuan Xu
Gal Chechik
Trevor Darrell
Amir Globerson
GNN
OCL
22
107
0
16 Dec 2019
CLOSURE: Assessing Systematic Generalization of CLEVR Models
Dzmitry Bahdanau
H. D. Vries
Timothy J. O'Donnell
Shikhar Murty
Philippe Beaudoin
Yoshua Bengio
Aaron Courville
NAI
15
90
0
12 Dec 2019
Neural Module Networks for Reasoning over Text
Nitish Gupta
Kevin Lin
Dan Roth
Sameer Singh
Matt Gardner
NAI
ReLM
LRM
13
130
0
10 Dec 2019
Deep Bayesian Active Learning for Multiple Correct Outputs
Khaled Jedoui
Ranjay Krishna
Michael S. Bernstein
Li Fei-Fei
BDL
OOD
UQCV
29
14
0
02 Dec 2019
Learning Perceptual Inference by Contrasting
Chi Zhang
Baoxiong Jia
Feng Gao
Yixin Zhu
Hongjing Lu
Song-Chun Zhu
LRM
26
108
0
29 Nov 2019
Transfer Learning in Visual and Relational Reasoning
T. S. Jayram
Vincent Marois
Tomasz Kornuta
V. Albouy
Emre Sevgen
A. Ozcan
NAI
OOD
LRM
19
2
0
27 Nov 2019
Biology and Compositionality: Empirical Considerations for Emergent-Communication Protocols
Travis LaCroix
22
3
0
26 Nov 2019
Learning to Learn Words from Visual Scenes
Dídac Surís
Dave Epstein
Heng Ji
Shih-Fu Chang
Carl Vondrick
VLM
CLIP
SSL
OffRL
30
4
0
25 Nov 2019
Identifying Model Weakness with Adversarial Examiner
Michelle Shu
Chenxi Liu
Weichao Qiu
Alan Yuille
AAML
ELM
27
19
0
25 Nov 2019
Temporal Reasoning via Audio Question Answering
Haytham M. Fayek
Justin Johnson
30
51
0
21 Nov 2019
ChartNet: Visual Reasoning over Statistical Charts using MAC-Networks
Monika Sharma
Shikha Gupta
Arindam Chowdhury
L. Vig
25
9
0
21 Nov 2019
Iterative Answer Prediction with Pointer-Augmented Multimodal Transformers for TextVQA
Ronghang Hu
Amanpreet Singh
Trevor Darrell
Marcus Rohrbach
32
195
0
14 Nov 2019
Attention on Abstract Visual Reasoning
Lukas Hahne
Timo Lüddecke
Florentin Wörgötter
David Kappel
GNN
25
23
0
14 Nov 2019
Multimodal Intelligence: Representation Learning, Information Fusion, and Applications
Chao Zhang
Zichao Yang
Xiaodong He
Li Deng
HAI
AI4TS
35
326
0
10 Nov 2019
A Spoken Dialogue System for Spatial Question Answering in a Physical Blocks World
Georgiy Platonov
Benjamin Kane
A. Gindi
Lenhart K. Schubert
21
15
0
06 Nov 2019
Scene Graph based Image Retrieval -- A case study on the CLEVR Dataset
Sahana Ramnath
Amrita Saha
Soumen Chakrabarti
Mitesh M. Khapra
3DV
12
14
0
03 Nov 2019
TAB-VCR: Tags and Attributes based Visual Commonsense Reasoning Baselines
Jingxiang Lin
Unnat Jain
Alex Schwing
LRM
ReLM
37
9
0
31 Oct 2019
Relation Module for Non-answerable Prediction on Question Answering
Kevin Huang
Yun Tang
Jing-ling Huang
Xiaodong He
Bowen Zhou
20
6
0
23 Oct 2019
KnowIT VQA: Answering Knowledge-Based Questions about Videos
Noa Garcia
Mayu Otani
Chenhui Chu
Yuta Nakashima
30
77
0
23 Oct 2019
Good, Better, Best: Textual Distractors Generation for Multiple-Choice Visual Question Answering via Reinforcement Learning
Jiaying Lu
Xin Ye
Yi Ren
Yezhou Yang
21
10
0
21 Oct 2019
A Theory of Relation Learning and Cross-domain Generalization
L. Doumas
Guillermo Puebla
Andrea E. Martin
J. Hummel
NAI
28
30
0
11 Oct 2019
CATER: A diagnostic dataset for Compositional Actions and TEmporal Reasoning
Rohit Girdhar
Deva Ramanan
22
176
0
10 Oct 2019
Meta Module Network for Compositional Visual Reasoning
Wenhu Chen
Zhe Gan
Linjie Li
Yu Cheng
Wenjie Wang
Jingjing Liu
LRM
25
68
0
08 Oct 2019
3D Scene Graph: A Structure for Unified Semantics, 3D Space, and Camera
Iro Armeni
Zhi-Yang He
JunYoung Gwak
Amir Zamir
Martin Fischer
Jitendra Malik
Silvio Savarese
3DV
3DPC
48
338
0
06 Oct 2019