1707.07998
Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering
25 July 2017
Peter Anderson
Xiaodong He
Chris Buehler
Damien Teney
Mark Johnson
Stephen Gould
Lei Zhang
AIMat
Papers citing
"Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering"
50 / 1,868 papers shown
Title
Manipulation-skill Assessment from Videos with Spatial Attention Network
Zhenqiang Li
Yifei Huang
Minjie Cai
Yoichi Sato
76
60
0
09 Jan 2019
Robust Change Captioning
Dong Huk Park
Trevor Darrell
Anna Rohrbach
43
5
0
08 Jan 2019
Action2Vec: A Crossmodal Embedding Approach to Action Learning
Meera Hahn
Andrew Silva
James M. Rehg
80
58
0
02 Jan 2019
Hierarchical LSTMs with Adaptive Attention for Visual Captioning
Jingkuan Song
Xiangpeng Li
Lianli Gao
Heng Tao Shen
104
223
0
26 Dec 2018
Scene Graph Reasoning with Prior Visual Relationship for Visual Question Answering
Zhuoqian Yang
Zengchang Qin
Jing Yu
Yue Hu
GNN
80
16
0
23 Dec 2018
A Multi-task Neural Approach for Emotion Attribution, Classification and Summarization
Guoyun Tu
Yanwei Fu
Boyang Albert Li
Jiarui Gao
Yu-Gang Jiang
Xiangyang Xue
36
29
0
21 Dec 2018
The Design and Implementation of XiaoIce, an Empathetic Social Chatbot
Li Zhou
Jianfeng Gao
Di Li
Harry Shum
82
608
0
21 Dec 2018
nocaps: novel object captioning at scale
Harsh Agrawal
Karan Desai
Yufei Wang
Xinlei Chen
Rishabh Jain
Mark Johnson
Dhruv Batra
Devi Parikh
Stefan Lee
Peter Anderson
VLM
148
488
0
20 Dec 2018
Generating Diverse and Meaningful Captions
Annika Lindh
R. Ross
Abhijit Mahalunkar
Giancarlo D. Salton
John D. Kelleher
VLM
38
10
0
19 Dec 2018
Feature Fusion Effects of Tensor Product Representation on (De)Compositional Network for Caption Generation for Images
C. Sur
39
5
0
17 Dec 2018
Grounded Video Description
Luowei Zhou
Yannis Kalantidis
Xinlei Chen
Jason J. Corso
Marcus Rohrbach
85
193
0
17 Dec 2018
Adversarial Inference for Multi-Sentence Video Description
J. S. Park
Marcus Rohrbach
Trevor Darrell
Anna Rohrbach
81
80
0
13 Dec 2018
Dynamic Fusion with Intra- and Inter- Modality Attention Flow for Visual Question Answering
Peng Gao
Zhengkai Jiang
Haoxuan You
Pan Lu
Steven C. H. Hoi
Xiaogang Wang
Hongsheng Li
AIMat
100
367
0
13 Dec 2018
Long-Term Feature Banks for Detailed Video Understanding
Chao-Yuan Wu
Christoph Feichtenhofer
Haoqi Fan
Kaiming He
Philipp Krahenbuhl
Ross B. Girshick
209
480
0
12 Dec 2018
Neighbourhood Watch: Referring Expression Comprehension via Language-guided Graph Attention Networks
Peng Wang
Qi Wu
Jiewei Cao
Chunhua Shen
Lianli Gao
Anton Van Den Hengel
ObjD
95
256
0
12 Dec 2018
Learning Representations of Sets through Optimized Permutations
Yan Zhang
Jonathon S. Hare
Adam Prugel-Bennett
SSL
81
25
0
10 Dec 2018
Semantically-Aware Attentive Neural Embeddings for Image-based Visual Localization
Zachary Seymour
Karan Sikka
Han-Pang Chiu
S. Samarasekera
Rakesh Kumar
60
10
0
08 Dec 2018
An Attempt towards Interpretable Audio-Visual Video Captioning
Yapeng Tian
Chenxiao Guan
Justin Goodman
Marc Moore
Chenliang Xu
91
20
0
07 Dec 2018
Recursive Visual Attention in Visual Dialog
Yulei Niu
Hanwang Zhang
Manli Zhang
Jianhong Zhang
Zhiwu Lu
Ji-Rong Wen
103
119
0
06 Dec 2018
Auto-Encoding Scene Graphs for Image Captioning
Xu Yang
Kaihua Tang
Hanwang Zhang
Jianfei Cai
172
703
0
06 Dec 2018
Learning to Compose Dynamic Tree Structures for Visual Contexts
Kaihua Tang
Hanwang Zhang
Baoyuan Wu
Wenhan Luo
Wen Liu
85
505
0
05 Dec 2018
Explainable and Explicit Visual Reasoning over Scene Graphs
Jiaxin Shi
Hanwang Zhang
Juan-Zi Li
OCL
207
235
0
05 Dec 2018
Attention-based Adaptive Selection of Operations for Image Restoration in the Presence of Unknown Combined Distortions
Masanori Suganuma
Xing Liu
Takayuki Okatani
116
84
0
03 Dec 2018
Multi-task Learning of Hierarchical Vision-Language Representation
Duy-Kien Nguyen
Takayuki Okatani
105
52
0
03 Dec 2018
Plan-Recognition-Driven Attention Modeling for Visual Recognition
Yantian Zha
Yikang Li
Tianshu Yu
Subbarao Kambhampati
Baoxin Li
31
0
0
02 Dec 2018
From Known to the Unknown: Transferring Knowledge to Answer Questions about Novel Visual and Semantic Concepts
M. Farazi
Salman H Khan
Nick Barnes
58
13
0
30 Nov 2018
Generating Easy-to-Understand Referring Expressions for Target Identifications
Mikihiro Tanaka
Takayuki Itamochi
Kenichi Narioka
Ikuro Sato
Yoshitaka Ushiku
Tatsuya Harada
45
1
0
29 Nov 2018
Multi-level Multimodal Common Semantic Space for Image-Phrase Grounding
Hassan Akbari
Svebor Karaman
Surabhi Bhargava
Brian Chen
Carl Vondrick
Shih-Fu Chang
62
83
0
28 Nov 2018
From Recognition to Cognition: Visual Commonsense Reasoning
Rowan Zellers
Yonatan Bisk
Ali Farhadi
Yejin Choi
LRM
BDL
OCL
ReLM
215
885
0
27 Nov 2018
LSTA: Long Short-Term Attention for Egocentric Action Recognition
Swathikiran Sudhakaran
Sergio Escalera
Oswald Lanz
EgoV
80
143
0
26 Nov 2018
Art2Real: Unfolding the Reality of Artworks via Semantically-Aware Image-to-Image Translation
Matteo Tomei
Marcella Cornia
Lorenzo Baraldi
Rita Cucchiara
DiffM
155
77
0
26 Nov 2018
Show, Control and Tell: A Framework for Generating Controllable and Grounded Captions
Marcella Cornia
Lorenzo Baraldi
Rita Cucchiara
DiffM
109
176
0
26 Nov 2018
Visual Entailment Task for Visually-Grounded Language Learning
Ning Xie
Farley Lai
Derek Doran
Asim Kadav
60
53
0
26 Nov 2018
Reinforced Cross-Modal Matching and Self-Supervised Imitation Learning for Vision-Language Navigation
Xin Eric Wang
Qiuyuan Huang
Asli Celikyilmaz
Jianfeng Gao
Dinghan Shen
Yuan-fang Wang
William Yang Wang
Lei Zhang
LM&Ro
SSL
138
541
0
25 Nov 2018
Learning to discover and localize visual objects with open vocabulary
Keren Ye
Ruotong Wang
Wei Li
Danfeng Qin
Adriana Kovashka
Jesse Berent
ObjD
48
4
0
25 Nov 2018
Senti-Attend: Image Captioning using Sentiment and Attention
Omid Mohamad Nezami
Mark Dras
Stephen Wan
Cécile Paris
VLM
56
16
0
24 Nov 2018
What and Where: A Context-based Recommendation System for Object Insertion
Song-Hai Zhang
Zhengping Zhou
Bin Liu
Xin Dong
Dun Liang
P. Hall
Shimin Hu
VLM
79
23
0
24 Nov 2018
VQA with no questions-answers training
B. Vatashsky
S. Ullman
108
13
0
20 Nov 2018
Scene Graph Generation via Conditional Random Fields
Weilin Cong
Wenjie Wang
Wang-Chien Lee
GNN
79
22
0
20 Nov 2018
Intention Oriented Image Captions with Guiding Objects
Yue Zheng
Yali Li
Shengjin Wang
62
55
0
19 Nov 2018
Revisiting Image-Language Networks for Open-ended Phrase Detection
Bryan A. Plummer
Kevin J. Shih
Yichen Li
Ke Xu
Svetlana Lazebnik
Stan Sclaroff
Kate Saenko
ObjD
SSeg
55
4
0
17 Nov 2018
Gated Hierarchical Attention for Image Captioning
Qingzhong Wang
Antoni B. Chan
80
18
0
30 Oct 2018
TallyQA: Answering Complex Counting Questions
Manoj Acharya
Kushal Kafle
Christopher Kanan
69
125
0
29 Oct 2018
A Neural Compositional Paradigm for Image Captioning
Bo Dai
Sanja Fidler
Dahua Lin
CoGe
56
41
0
23 Oct 2018
Semantic Aware Attention Based Deep Object Co-segmentation
Hong Chen
Yifei Huang
Hideki Nakayama
SSeg
67
73
0
16 Oct 2018
Bringing back simplicity and lightliness into neural image captioning
Jean-Benoit Delbrouck
Stéphane Dupont
36
5
0
15 Oct 2018
Image Captioning as Neural Machine Translation Task in SOCKEYE
Loris Bazzani
Tobias Domhan
Felix Hieber
VLM
54
2
0
09 Oct 2018
Overcoming Language Priors in Visual Question Answering with Adversarial Regularization
S. Ramakrishnan
Aishwarya Agrawal
Stefan Lee
AAML
65
239
0
08 Oct 2018
A Comprehensive Survey of Deep Learning for Image Captioning
Md Zakir Hossain
Ferdous Sohel
M. Shiratuddin
Hamid Laga
VLM
3DV
141
779
0
06 Oct 2018
Transfer Learning via Unsupervised Task Discovery for Visual Question Answering
Hyeonwoo Noh
Taehoon Kim
Jonghwan Mun
Bohyung Han
86
17
0
03 Oct 2018