ResearchTrend.AI
Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering

25 July 2017
Peter Anderson
Xiaodong He
Chris Buehler
Damien Teney
Mark Johnson
Stephen Gould
Lei Zhang
    AIMat

Papers citing "Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering"

50 / 1,868 papers shown
Manipulation-skill Assessment from Videos with Spatial Attention Network
Zhenqiang Li
Yifei Huang
Minjie Cai
Yoichi Sato
76
60
0
09 Jan 2019
Robust Change Captioning
Dong Huk Park
Trevor Darrell
Anna Rohrbach
43
5
0
08 Jan 2019
Action2Vec: A Crossmodal Embedding Approach to Action Learning
Meera Hahn
Andrew Silva
James M. Rehg
80
58
0
02 Jan 2019
Hierarchical LSTMs with Adaptive Attention for Visual Captioning
Jingkuan Song
Xiangpeng Li
Lianli Gao
Heng Tao Shen
104
223
0
26 Dec 2018
Scene Graph Reasoning with Prior Visual Relationship for Visual Question Answering
Zhuoqian Yang
Zengchang Qin
Jing Yu
Yue Hu
GNN
80
16
0
23 Dec 2018
A Multi-task Neural Approach for Emotion Attribution, Classification and Summarization
Guoyun Tu
Yanwei Fu
Boyang Albert Li
Jiarui Gao
Yu-Gang Jiang
Xiangyang Xue
36
29
0
21 Dec 2018
The Design and Implementation of XiaoIce, an Empathetic Social Chatbot
Li Zhou
Jianfeng Gao
Di Li
Harry Shum
82
608
0
21 Dec 2018
nocaps: novel object captioning at scale
Harsh Agrawal
Karan Desai
Yufei Wang
Xinlei Chen
Rishabh Jain
Mark Johnson
Dhruv Batra
Devi Parikh
Stefan Lee
Peter Anderson
VLM
148
488
0
20 Dec 2018
Generating Diverse and Meaningful Captions
Annika Lindh
R. Ross
Abhijit Mahalunkar
Giancarlo D. Salton
John D. Kelleher
VLM
38
10
0
19 Dec 2018
Feature Fusion Effects of Tensor Product Representation on (De)Compositional Network for Caption Generation for Images
C. Sur
39
5
0
17 Dec 2018
Grounded Video Description
Luowei Zhou
Yannis Kalantidis
Xinlei Chen
Jason J. Corso
Marcus Rohrbach
85
193
0
17 Dec 2018
Adversarial Inference for Multi-Sentence Video Description
J. S. Park
Marcus Rohrbach
Trevor Darrell
Anna Rohrbach
81
80
0
13 Dec 2018
Dynamic Fusion with Intra- and Inter- Modality Attention Flow for Visual Question Answering
Peng Gao
Zhengkai Jiang
Haoxuan You
Pan Lu
Steven C. H. Hoi
Xiaogang Wang
Hongsheng Li
AIMat
100
367
0
13 Dec 2018
Long-Term Feature Banks for Detailed Video Understanding
Chao-Yuan Wu
Christoph Feichtenhofer
Haoqi Fan
Kaiming He
Philipp Krahenbuhl
Ross B. Girshick
209
480
0
12 Dec 2018
Neighbourhood Watch: Referring Expression Comprehension via Language-guided Graph Attention Networks
Peng Wang
Qi Wu
Jiewei Cao
Chunhua Shen
Lianli Gao
Anton Van Den Hengel
ObjD
95
256
0
12 Dec 2018
Learning Representations of Sets through Optimized Permutations
Yan Zhang
Jonathon S. Hare
Adam Prugel-Bennett
SSL
81
25
0
10 Dec 2018
Semantically-Aware Attentive Neural Embeddings for Image-based Visual Localization
Zachary Seymour
Karan Sikka
Han-Pang Chiu
S. Samarasekera
Rakesh Kumar
60
10
0
08 Dec 2018
An Attempt towards Interpretable Audio-Visual Video Captioning
Yapeng Tian
Chenxiao Guan
Justin Goodman
Marc Moore
Chenliang Xu
91
20
0
07 Dec 2018
Recursive Visual Attention in Visual Dialog
Yulei Niu
Hanwang Zhang
Manli Zhang
Jianhong Zhang
Zhiwu Lu
Ji-Rong Wen
103
119
0
06 Dec 2018
Auto-Encoding Scene Graphs for Image Captioning
Xu Yang
Kaihua Tang
Hanwang Zhang
Jianfei Cai
172
703
0
06 Dec 2018
Learning to Compose Dynamic Tree Structures for Visual Contexts
Kaihua Tang
Hanwang Zhang
Baoyuan Wu
Wenhan Luo
Wen Liu
85
505
0
05 Dec 2018
Explainable and Explicit Visual Reasoning over Scene Graphs
Jiaxin Shi
Hanwang Zhang
Juan-Zi Li
OCL
207
235
0
05 Dec 2018
Attention-based Adaptive Selection of Operations for Image Restoration in the Presence of Unknown Combined Distortions
Masanori Suganuma
Xing Liu
Takayuki Okatani
116
84
0
03 Dec 2018
Multi-task Learning of Hierarchical Vision-Language Representation
Duy-Kien Nguyen
Takayuki Okatani
105
52
0
03 Dec 2018
Plan-Recognition-Driven Attention Modeling for Visual Recognition
Yantian Zha
Yikang Li
Tianshu Yu
Subbarao Kambhampati
Baoxin Li
31
0
0
02 Dec 2018
From Known to the Unknown: Transferring Knowledge to Answer Questions about Novel Visual and Semantic Concepts
M. Farazi
Salman H Khan
Nick Barnes
58
13
0
30 Nov 2018
Generating Easy-to-Understand Referring Expressions for Target Identifications
Mikihiro Tanaka
Takayuki Itamochi
Kenichi Narioka
Ikuro Sato
Yoshitaka Ushiku
Tatsuya Harada
45
1
0
29 Nov 2018
Multi-level Multimodal Common Semantic Space for Image-Phrase Grounding
Hassan Akbari
Svebor Karaman
Surabhi Bhargava
Brian Chen
Carl Vondrick
Shih-Fu Chang
62
83
0
28 Nov 2018
From Recognition to Cognition: Visual Commonsense Reasoning
Rowan Zellers
Yonatan Bisk
Ali Farhadi
Yejin Choi
LRM, BDL, OCL, ReLM
215
885
0
27 Nov 2018
LSTA: Long Short-Term Attention for Egocentric Action Recognition
Swathikiran Sudhakaran
Sergio Escalera
Oswald Lanz
EgoV
80
143
0
26 Nov 2018
Art2Real: Unfolding the Reality of Artworks via Semantically-Aware Image-to-Image Translation
Matteo Tomei
Marcella Cornia
Lorenzo Baraldi
Rita Cucchiara
DiffM
155
77
0
26 Nov 2018
Show, Control and Tell: A Framework for Generating Controllable and Grounded Captions
Marcella Cornia
Lorenzo Baraldi
Rita Cucchiara
DiffM
109
176
0
26 Nov 2018
Visual Entailment Task for Visually-Grounded Language Learning
Ning Xie
Farley Lai
Derek Doran
Asim Kadav
60
53
0
26 Nov 2018
Reinforced Cross-Modal Matching and Self-Supervised Imitation Learning for Vision-Language Navigation
Xin Eric Wang
Qiuyuan Huang
Asli Celikyilmaz
Jianfeng Gao
Dinghan Shen
Yuan-fang Wang
William Yang Wang
Lei Zhang
LM&Ro, SSL
138
541
0
25 Nov 2018
Learning to discover and localize visual objects with open vocabulary
Keren Ye
Ruotong Wang
Wei Li
Danfeng Qin
Adriana Kovashka
Jesse Berent
ObjD
48
4
0
25 Nov 2018
Senti-Attend: Image Captioning using Sentiment and Attention
Omid Mohamad Nezami
Mark Dras
Stephen Wan
Cécile Paris
VLM
56
16
0
24 Nov 2018
What and Where: A Context-based Recommendation System for Object Insertion
Song-Hai Zhang
Zhengping Zhou
Bin Liu
Xin Dong
Dun Liang
P. Hall
Shimin Hu
VLM
79
23
0
24 Nov 2018
VQA with no questions-answers training
B. Vatashsky
S. Ullman
108
13
0
20 Nov 2018
Scene Graph Generation via Conditional Random Fields
Weilin Cong
Wenjie Wang
Wang-Chien Lee
GNN
79
22
0
20 Nov 2018
Intention Oriented Image Captions with Guiding Objects
Yue Zheng
Yali Li
Shengjin Wang
62
55
0
19 Nov 2018
Revisiting Image-Language Networks for Open-ended Phrase Detection
Bryan A. Plummer
Kevin J. Shih
Yichen Li
Ke Xu
Svetlana Lazebnik
Stan Sclaroff
Kate Saenko
ObjD, SSeg
55
4
0
17 Nov 2018
Gated Hierarchical Attention for Image Captioning
Qingzhong Wang
Antoni B. Chan
80
18
0
30 Oct 2018
TallyQA: Answering Complex Counting Questions
Manoj Acharya
Kushal Kafle
Christopher Kanan
69
125
0
29 Oct 2018
A Neural Compositional Paradigm for Image Captioning
Bo Dai
Sanja Fidler
Dahua Lin
CoGe
56
41
0
23 Oct 2018
Semantic Aware Attention Based Deep Object Co-segmentation
Hong Chen
Yifei Huang
Hideki Nakayama
SSeg
67
73
0
16 Oct 2018
Bringing back simplicity and lightliness into neural image captioning
Jean-Benoit Delbrouck
Stéphane Dupont
36
5
0
15 Oct 2018
Image Captioning as Neural Machine Translation Task in SOCKEYE
Loris Bazzani
Tobias Domhan
Felix Hieber
VLM
54
2
0
09 Oct 2018
Overcoming Language Priors in Visual Question Answering with Adversarial Regularization
S. Ramakrishnan
Aishwarya Agrawal
Stefan Lee
AAML
65
239
0
08 Oct 2018
A Comprehensive Survey of Deep Learning for Image Captioning
Md Zakir Hossain
Ferdous Sohel
M. Shiratuddin
Hamid Laga
VLM, 3DV
141
779
0
06 Oct 2018
Transfer Learning via Unsupervised Task Discovery for Visual Question Answering
Hyeonwoo Noh
Taehoon Kim
Jonghwan Mun
Bohyung Han
86
17
0
03 Oct 2018