1707.07998
v3 (latest)
Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering
25 July 2017
Peter Anderson
Xiaodong He
Chris Buehler
Damien Teney
Mark Johnson
Stephen Gould
Lei Zhang
AIMat
Papers citing
"Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering"
50 / 1,868 papers shown
Title
Language-Conditioned Graph Networks for Relational Reasoning
Ronghang Hu
Anna Rohrbach
Trevor Darrell
Kate Saenko
85
175
0
10 May 2019
PR Product: A Substitute for Inner Product in Neural Networks
Zhennan Wang
Wenbin Zou
Chen Xu
36
6
0
30 Apr 2019
Knowing When to Stop: Evaluation and Verification of Conformity to Output-size Specifications
Chenglong Wang
Rudy Bunel
Krishnamurthy Dvijotham
Po-Sen Huang
Edward Grefenstette
Pushmeet Kohli
58
5
0
26 Apr 2019
The Neuro-Symbolic Concept Learner: Interpreting Scenes, Words, and Sentences From Natural Supervision
Jiayuan Mao
Chuang Gan
Pushmeet Kohli
J. Tenenbaum
Jiajun Wu
NAI
165
706
0
26 Apr 2019
Pointing Novel Objects in Image Captioning
Yehao Li
Ting Yao
Yingwei Pan
Hongyang Chao
Tao Mei
88
70
0
25 Apr 2019
HAR-Net: Joint Learning of Hybrid Attention for Single-stage Object Detection
Yali Li
Shengjin Wang
59
34
0
25 Apr 2019
Deep Metric Learning Beyond Binary Supervision
Sungyeon Kim
Minkyo Seo
Ivan Laptev
Minsu Cho
Suha Kwak
SSL
74
96
0
21 Apr 2019
Challenges and Prospects in Vision and Language Research
Kushal Kafle
Robik Shrestha
Christopher Kanan
69
41
0
19 Apr 2019
Towards VQA Models That Can Read
Amanpreet Singh
Vivek Natarajan
Meet Shah
Yu Jiang
Xinlei Chen
Dhruv Batra
Devi Parikh
Marcus Rohrbach
EgoV
166
1,257
0
18 Apr 2019
Attentive Single-Tasking of Multiple Tasks
Kevis-Kokitsi Maninis
Ilija Radosavovic
Iasonas Kokkinos
202
251
0
18 Apr 2019
Learning to Collocate Neural Modules for Image Captioning
Xu Yang
Hanwang Zhang
Jianfei Cai
61
78
0
18 Apr 2019
Progressive Attention Memory Network for Movie Story Question Answering
Junyeong Kim
Minuk Ma
Kyungsu Kim
Sungjin Kim
Chang D. Yoo
114
76
0
18 Apr 2019
Question Guided Modular Routing Networks for Visual Question Answering
Yanze Wu
Qiang Sun
Jianqi Ma
Bin Li
Yanwei Fu
Yao Peng
Xiangyang Xue
60
1
0
17 Apr 2019
Interpreting Adversarial Examples with Attributes
Sadaf Gulshad
J. H. Metzen
A. Smeulders
Zeynep Akata
FAtt
AAML
93
6
0
17 Apr 2019
What I See Is What You See: Joint Attention Learning for First and Third Person Video Co-analysis
Huangyue Yu
Minjie Cai
Yunfei Liu
Feng Lu
EgoV
59
22
0
16 Apr 2019
Self-critical n-step Training for Image Captioning
Junlong Gao
Shiqi Wang
Shanshe Wang
Siwei Ma
Wen Gao
92
55
0
15 Apr 2019
Factor Graph Attention
Idan Schwartz
Seunghak Yu
Tamir Hazan
Alex Schwing
116
110
0
11 Apr 2019
Reasoning Visual Dialogs with Structural and Partial Observations
Zilong Zheng
Wenguan Wang
Siyuan Qi
Song-Chun Zhu
121
117
0
11 Apr 2019
Heterogeneous Memory Enhanced Multimodal Attention Model for Video Question Answering
Chenyou Fan
Xiaofan Zhang
Shu Zhang
Wensheng Wang
Chi Zhang
Heng-Chiao Huang
75
279
0
08 Apr 2019
Revisiting EmbodiedQA: A Simple Baseline and Beyond
Yuehua Wu
Lu Jiang
Yi Yang
LM&Ro
79
30
0
08 Apr 2019
The Steep Road to Happily Ever After: An Analysis of Current Visual Storytelling Models
Yatri Modi
Natalie Parde
61
16
0
06 Apr 2019
What Object Should I Use? - Task Driven Object Detection
Johann Sawatzky
Yaser Souri
C. Grund
Juergen Gall
ObjD
77
27
0
05 Apr 2019
An Attentive Survey of Attention Models
S. Chaudhari
Varun Mithal
Gungor Polatkan
R. Ramanath
192
666
0
05 Apr 2019
Actively Seeking and Learning from Live Data
Damien Teney
Anton Van Den Hengel
OOD
69
21
0
05 Apr 2019
Good News, Everyone! Context driven entity-aware captioning for news images
Ali Furkan Biten
Lluís Gómez
Marçal Rusiñol
Dimosthenis Karatzas
89
141
0
02 Apr 2019
Context and Attribute Grounded Dense Captioning
Guojun Yin
Lu Sheng
Bin Liu
Nenghai Yu
Xiaogang Wang
Jing Shao
63
76
0
02 Apr 2019
Aiding Intra-Text Representations with Visual Context for Multimodal Named Entity Recognition
Omer Arshad
I. Gallo
Shah Nawaz
Alessandro Calefati
44
43
0
02 Apr 2019
EE-AE: An Exclusivity Enhanced Unsupervised Feature Learning Approach
Jingcai Guo
Song Guo
42
15
0
30 Mar 2019
Relation-Aware Graph Attention Network for Visual Question Answering
Linjie Li
Zhe Gan
Yu Cheng
Jingjing Liu
GNN
190
347
0
29 Mar 2019
Align2Ground: Weakly Supervised Phrase Grounding Guided by Image-Caption Alignment
Samyak Datta
Karan Sikka
Anirban Roy
Karuna Ahuja
Devi Parikh
Ajay Divakaran
97
104
0
27 Mar 2019
Attention Based Glaucoma Detection: A Large-scale Database and CNN Model
Liu Li
Mai Xu
Xiaofei Wang
Lai Jiang
Hanruo Liu
92
205
0
26 Mar 2019
Unpaired Image Captioning via Scene Graph Alignments
Jiuxiang Gu
Shafiq Joty
Jianfei Cai
Handong Zhao
Xu Yang
G. Wang
GNN
83
176
0
26 Mar 2019
Dense Relational Captioning: Triple-Stream Networks for Relationship-Based Captioning
Dong-Jin Kim
Jinsoo Choi
Tae-Hyun Oh
In So Kweon
98
84
0
14 Mar 2019
MirrorGAN: Learning Text-to-image Generation by Redescription
Tingting Qiao
Jing Zhang
Duanqing Xu
Dacheng Tao
VLM
GAN
67
544
0
14 Mar 2019
Spatial-Aware Non-Local Attention for Fashion Landmark Detection
Yixin Li
Shengqin Tang
Yun Ye
Jinwen Ma
42
23
0
11 Mar 2019
Image captioning with weakly-supervised attention penalty
Jiayun Li
M. K. Ebrahimpour
Azadeh Moghtaderi
Yen-Yun Yu
30
5
0
06 Mar 2019
Improving Referring Expression Grounding with Cross-modal Attention-guided Erasing
Xihui Liu
Zihao Wang
Jing Shao
Xiaogang Wang
Hongsheng Li
ObjD
98
186
0
03 Mar 2019
Answer Them All! Toward Universal Visual Question Answering Models
Robik Shrestha
Kushal Kafle
Christopher Kanan
88
83
0
01 Mar 2019
Generative Visual Dialogue System via Adaptive Reasoning and Weighted Likelihood Estimation
Heming Zhang
Shalini Ghosh
Larry Heck
Stephen Walsh
Junting Zhang
Jie Zhang
C.-C. Jay Kuo
126
7
0
26 Feb 2019
Image-Question-Answer Synergistic Network for Visual Dialog
Dalu Guo
Chang Xu
Dacheng Tao
63
74
0
26 Feb 2019
MUREL: Multimodal Relational Reasoning for Visual Question Answering
Rémi Cadène
H. Ben-younes
Matthieu Cord
Nicolas Thome
LRM
84
277
0
25 Feb 2019
Dual Attention Networks for Visual Reference Resolution in Visual Dialog
Gi-Cheon Kang
Jaeseo Lim
Byoung-Tak Zhang
56
73
0
25 Feb 2019
Cycle-Consistency for Robust Visual Question Answering
Meet Shah
Xinlei Chen
Marcus Rohrbach
Devi Parikh
OOD
85
190
0
15 Feb 2019
Taking a HINT: Leveraging Explanations to Make Vision and Language Models More Grounded
Ramprasaath R. Selvaraju
Stefan Lee
Yilin Shen
Hongxia Jin
Shalini Ghosh
Larry Heck
Dhruv Batra
Devi Parikh
FAtt
VLM
76
255
0
11 Feb 2019
CHIP: Channel-wise Disentangled Interpretation of Deep Convolutional Neural Networks
Xinrui Cui
Dan Wang
F. I. Z. Jane Wang
FAtt
BDL
36
12
0
07 Feb 2019
Multi-step Reasoning via Recurrent Dual Attention for Visual Dialog
Zhe Gan
Yu Cheng
Ahmed El Kholy
Linjie Li
Jingjing Liu
Jianfeng Gao
104
105
0
01 Feb 2019
VrR-VG: Refocusing Visually-Relevant Relationships
Yuanzhi Liang
Yalong Bai
Wei Zhang
Xueming Qian
Li Zhu
Tao Mei
3DH
136
8
0
01 Feb 2019
Visual Entailment: A Novel Task for Fine-Grained Image Understanding
Ning Xie
Farley Lai
Derek Doran
Asim Kadav
CoGe
127
327
0
20 Jan 2019
Evaluating Text-to-Image Matching using Binary Image Selection (BISON)
Hexiang Hu
Ishan Misra
Laurens van der Maaten
82
22
0
19 Jan 2019
Improving Sequence-to-Sequence Learning via Optimal Transport
Liqun Chen
Yizhe Zhang
Ruiyi Zhang
Chenyang Tao
Zhe Gan
Haichao Zhang
Bai Li
Dinghan Shen
Changyou Chen
Lawrence Carin
OT
71
94
0
18 Jan 2019