Making the V in VQA Matter: Elevating the Role of Image Understanding in Visual Question Answering

2 December 2016
Yash Goyal
Tejas Khot
D. Summers-Stay
Dhruv Batra
Devi Parikh
    CoGe

Papers citing "Making the V in VQA Matter: Elevating the Role of Image Understanding in Visual Question Answering"

37 / 2,037 papers shown
Embodied Question Answering
Abhishek Das
Samyak Datta
Georgia Gkioxari
Stefan Lee
Devi Parikh
Dhruv Batra
LM&Ro
122
652
0
30 Nov 2017
Visual Question Answering as a Meta Learning Task
Damien Teney
Anton Van Den Hengel
OOD
83
42
0
22 Nov 2017
Vision-and-Language Navigation: Interpreting visually-grounded navigation instructions in real environments
Peter Anderson
Qi Wu
Damien Teney
Jake Bruce
Mark Johnson
Niko Sünderhauf
Ian Reid
Stephen Gould
Anton Van Den Hengel
LM&Ro
224
1,325
0
20 Nov 2017
A Novel Framework for Robustness Analysis of Visual QA Models
Jia-Hong Huang
Cuong Duc Dao
Modar Alfadly
Guohao Li
AAML, OOD
82
34
0
16 Nov 2017
Active Learning for Visual Question Answering: An Empirical Study
Xiaoyu Lin
Devi Parikh
102
32
0
06 Nov 2017
Whodunnit? Crime Drama as a Case for Natural Language Understanding
Lea Frermann
Shay B. Cohen
Mirella Lapata
67
26
0
31 Oct 2017
FigureQA: An Annotated Figure Dataset for Visual Reasoning
Samira Ebrahimi Kahou
Vincent Michalski
Adam Atkinson
Ákos Kádár
Adam Trischler
Yoshua Bengio
ReLM, AIMat
96
332
0
19 Oct 2017
iVQA: Inverse Visual Question Answering
Feng Liu
Tao Xiang
Timothy M. Hospedales
Wankou Yang
Changyin Sun
73
47
0
10 Oct 2017
Fooling Vision and Language Models Despite Localization and Attention Mechanism
Xiaojun Xu
Xinyun Chen
Chang-rui Liu
Anna Rohrbach
Trevor Darrell
Basel Alomair
AAML
106
41
0
25 Sep 2017
Survey of Recent Advances in Visual Question Answering
Supriya Pandhre
Shagun Sodhani
30
14
0
24 Sep 2017
Visual Reference Resolution using Attention Memory for Visual Dialog
Paul Hongsuck Seo
Andreas M. Lehrmann
Bohyung Han
Leonid Sigal
109
123
0
23 Sep 2017
FiLM: Visual Reasoning with a General Conditioning Layer
Ethan Perez
Florian Strub
H. D. Vries
Vincent Dumoulin
Aaron Courville
FAtt, AIMat, OffRL, AI4CE
516
2,250
0
22 Sep 2017
Visual Question Generation as Dual Task of Visual Question Answering
Yikang Li
Nan Duan
Bolei Zhou
Xiao Chu
Wanli Ouyang
Xiaogang Wang
108
166
0
21 Sep 2017
Beyond Bilinear: Generalized Multimodal Factorized High-order Pooling for Visual Question Answering
Zhou Yu
Jun-chen Yu
Chenchao Xiang
Jianping Fan
Dacheng Tao
106
462
0
10 Aug 2017
Learning to Disambiguate by Asking Discriminative Questions
Yining Li
Chen Huang
Xiaoou Tang
Chen Change Loy
67
22
0
09 Aug 2017
Tips and Tricks for Visual Question Answering: Learnings from the 2017 Challenge
Damien Teney
Peter Anderson
Xiaodong He
Anton Van Den Hengel
155
383
0
09 Aug 2017
A Simple Loss Function for Improving the Convergence and Accuracy of Visual Question Answering Models
Ilija Ilievski
Jiashi Feng
53
11
0
02 Aug 2017
Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering
Peter Anderson
Xiaodong He
Chris Buehler
Damien Teney
Mark Johnson
Stephen Gould
Lei Zhang
AIMat
291
4,232
0
25 Jul 2017
Visual Question Answering with Memory-Augmented Networks
Chao Ma
Chunhua Shen
A. Dick
Qi Wu
Peng Wang
Anton Van Den Hengel
Ian Reid
97
100
0
17 Jul 2017
Learning Visual Reasoning Without Strong Priors
Ethan Perez
H. D. Vries
Florian Strub
Vincent Dumoulin
Aaron Courville
OOD, NAI
117
62
0
10 Jul 2017
Modulating early visual processing by language
H. D. Vries
Florian Strub
Jérémie Mary
Hugo Larochelle
Olivier Pietquin
Aaron Courville
246
490
0
02 Jul 2017
Deep learning evaluation using deep linguistic processing
A. Kuhnle
Ann A. Copestake
ELM
66
11
0
05 Jun 2017
Attention-based Natural Language Person Retrieval
Tao Zhou
Muhao Chen
Jie Yu
Demetri Terzopoulos
39
14
0
24 May 2017
Inferring and Executing Programs for Visual Reasoning
Justin Johnson
B. Hariharan
Laurens van der Maaten
Judy Hoffman
Li Fei-Fei
C. L. Zitnick
Ross B. Girshick
NAI
130
545
0
10 May 2017
FOIL it! Find One mismatch between Image and Language caption
Ravi Shekhar
Sandro Pezzelle
Yauhen Klimovich
Aurélie Herbelot
Moin Nabi
E. Sangineto
Raffaella Bernardi
68
141
0
03 May 2017
Speech-Based Visual Question Answering
Ted Zhang
Dengxin Dai
Tinne Tuytelaars
Marie-Francine Moens
Luc Van Gool
93
25
0
01 May 2017
C-VQA: A Compositional Split of the Visual Question Answering (VQA) v1.0 Dataset
Aishwarya Agrawal
Aniruddha Kembhavi
Dhruv Batra
Devi Parikh
CoGe
75
80
0
26 Apr 2017
Being Negative but Constructively: Lessons Learnt from Creating Better Visual Question Answering Datasets
Wei-Lun Chao
Hexiang Hu
Fei Sha
95
37
0
24 Apr 2017
Learning to Reason: End-to-End Module Networks for Visual Question Answering
Ronghang Hu
Jacob Andreas
Marcus Rohrbach
Trevor Darrell
Kate Saenko
KELM, GNN, ReLM, LRM
160
581
0
18 Apr 2017
ShapeWorld - A new test methodology for multimodal language understanding
A. Kuhnle
Ann A. Copestake
67
69
0
14 Apr 2017
TGIF-QA: Toward Spatio-Temporal Reasoning in Visual Question Answering
Y. Jang
Yale Song
Youngjae Yu
Youngjin Kim
Gunhee Kim
102
562
0
14 Apr 2017
Show, Ask, Attend, and Answer: A Strong Baseline For Visual Question Answering
V. Kazemi
Ali Elqursh
OOD
91
185
0
11 Apr 2017
Pay Attention to Those Sets! Learning Quantification from Images
Ionut-Teodor Sorodoc
Sandro Pezzelle
Aurélie Herbelot
Mariella Dimiccoli
Raffaella Bernardi
46
0
0
10 Apr 2017
It Takes Two to Tango: Towards Theory of AI's Mind
Arjun Chandrasekaran
Deshraj Yadav
Prithvijit Chattopadhyay
Viraj Prabhu
Devi Parikh
115
55
0
03 Apr 2017
Aligned Image-Word Representations Improve Inductive Transfer Across Vision-Language Tasks
Tanmay Gupta
Kevin J. Shih
Saurabh Singh
Derek Hoiem
117
26
0
02 Apr 2017
An Analysis of Visual Question Answering Algorithms
Kushal Kafle
Christopher Kanan
102
234
0
28 Mar 2017
Visual Dialog
Abhishek Das
Satwik Kottur
Khushi Gupta
Avi Singh
Deshraj Yadav
José M. F. Moura
Devi Parikh
Dhruv Batra
180
1,005
0
26 Nov 2016