ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 1512.02167
  4. Cited By
Simple Baseline for Visual Question Answering

Simple Baseline for Visual Question Answering

7 December 2015
Bolei Zhou
Yuandong Tian
Sainbayar Sukhbaatar
Arthur Szlam
Rob Fergus
    FAtt
ArXivPDFHTML

Papers citing "Simple Baseline for Visual Question Answering"

20 / 20 papers shown
Title
LOVA3: Learning to Visual Question Answering, Asking and Assessment
LOVA3: Learning to Visual Question Answering, Asking and Assessment
Henry Hengyuan Zhao
Pan Zhou
Difei Gao
Zechen Bai
Mike Zheng Shou
121
9
0
21 Feb 2025
On human motion prediction using recurrent neural networks
On human motion prediction using recurrent neural networks
Julieta Martinez
Michael J. Black
Javier Romero
3DH
93
936
0
06 May 2017
Learning to Compose Neural Networks for Question Answering
Learning to Compose Neural Networks for Question Answering
Jacob Andreas
Marcus Rohrbach
Trevor Darrell
Dan Klein
NAI
KELM
BDL
CoGe
99
566
0
07 Jan 2016
Learning Deep Features for Discriminative Localization
Learning Deep Features for Discriminative Localization
Bolei Zhou
A. Khosla
Àgata Lapedriza
A. Oliva
Antonio Torralba
SSL
SSeg
FAtt
250
9,319
0
14 Dec 2015
Where To Look: Focus Regions for Visual Question Answering
Where To Look: Focus Regions for Visual Question Answering
Kevin J. Shih
Saurabh Singh
Derek Hoiem
71
460
0
23 Nov 2015
Ask Me Anything: Free-form Visual Question Answering Based on Knowledge
  from External Sources
Ask Me Anything: Free-form Visual Question Answering Based on Knowledge from External Sources
Qi Wu
Peng Wang
Chunhua Shen
A. Dick
Anton Van Den Hengel
75
372
0
22 Nov 2015
ABC-CNN: An Attention Based Convolutional Neural Network for Visual
  Question Answering
ABC-CNN: An Attention Based Convolutional Neural Network for Visual Question Answering
Kan Chen
Jiang Wang
Liang-Chieh Chen
Haoyuan Gao
Wenyuan Xu
Ram Nevatia
68
288
0
18 Nov 2015
Image Question Answering using Convolutional Neural Network with Dynamic
  Parameter Prediction
Image Question Answering using Convolutional Neural Network with Dynamic Parameter Prediction
Hyeonwoo Noh
Paul Hongsuck Seo
Bohyung Han
OOD
72
327
0
18 Nov 2015
Compositional Memory for Visual Question Answering
Compositional Memory for Visual Question Answering
Aiwen Jiang
Fang Wang
Fatih Porikli
Yi Li
CoGe
46
42
0
18 Nov 2015
Ask, Attend and Answer: Exploring Question-Guided Spatial Attention for
  Visual Question Answering
Ask, Attend and Answer: Exploring Question-Guided Spatial Attention for Visual Question Answering
Huijuan Xu
Kate Saenko
71
763
0
17 Nov 2015
Neural Module Networks
Neural Module Networks
Jacob Andreas
Marcus Rohrbach
Trevor Darrell
Dan Klein
CoGe
134
1,072
0
09 Nov 2015
Stacked Attention Networks for Image Question Answering
Stacked Attention Networks for Image Question Answering
Zichao Yang
Xiaodong He
Jianfeng Gao
Li Deng
Alex Smola
BDL
107
1,882
0
07 Nov 2015
Are You Talking to a Machine? Dataset and Methods for Multilingual Image
  Question Answering
Are You Talking to a Machine? Dataset and Methods for Multilingual Image Question Answering
Haoyuan Gao
Junhua Mao
Jie Zhou
Zhiheng Huang
Lei Wang
Wenyuan Xu
78
501
0
21 May 2015
Exploring Nearest Neighbor Approaches for Image Captioning
Exploring Nearest Neighbor Approaches for Image Captioning
Jacob Devlin
Saurabh Gupta
Ross B. Girshick
Margaret Mitchell
C. L. Zitnick
177
195
0
17 May 2015
Exploring Models and Data for Image Question Answering
Exploring Models and Data for Image Question Answering
Mengye Ren
Ryan Kiros
R. Zemel
80
715
0
08 May 2015
VQA: Visual Question Answering
VQA: Visual Question Answering
Aishwarya Agrawal
Jiasen Lu
Stanislaw Antol
Margaret Mitchell
C. L. Zitnick
Dhruv Batra
Devi Parikh
CoGe
199
5,470
0
03 May 2015
Deep Captioning with Multimodal Recurrent Neural Networks (m-RNN)
Deep Captioning with Multimodal Recurrent Neural Networks (m-RNN)
Junhua Mao
Wenyuan Xu
Yi Yang
Jiang Wang
Zhiheng Huang
Alan Yuille
VLM
171
1,240
0
20 Dec 2014
Show and Tell: A Neural Image Caption Generator
Show and Tell: A Neural Image Caption Generator
Oriol Vinyals
Alexander Toshev
Samy Bengio
D. Erhan
3DV
237
6,028
0
17 Nov 2014
Going Deeper with Convolutions
Going Deeper with Convolutions
Christian Szegedy
Wei Liu
Yangqing Jia
P. Sermanet
Scott E. Reed
Dragomir Anguelov
D. Erhan
Vincent Vanhoucke
Andrew Rabinovich
460
43,649
0
17 Sep 2014
Microsoft COCO: Common Objects in Context
Microsoft COCO: Common Objects in Context
Nayeon Lee
Michael Maire
Serge J. Belongie
Lubomir Bourdev
Ross B. Girshick
James Hays
Pietro Perona
Deva Ramanan
C. L. Zitnick
Piotr Dollár
ObjD
413
43,638
0
01 May 2014
1