ResearchTrend.AI
Bilinear Attention Networks


21 May 2018
Jin-Hwa Kim
Jaehyun Jun
Byoung-Tak Zhang
    AIMat

Papers citing "Bilinear Attention Networks"

50 / 164 papers shown
DTGAN: Dual Attention Generative Adversarial Networks for Text-to-Image Generation
Zhenxing Zhang
Lambert Schomaker
GAN
31
34
0
05 Nov 2020
An Improved Attention for Visual Question Answering
Tanzila Rahman
Shih-Han Chou
Leonid Sigal
Giuseppe Carenini
13
42
0
04 Nov 2020
Cross-Media Keyphrase Prediction: A Unified Framework with Multi-Modality Multi-Head Attention and Image Wordings
Yue Wang
Jing Li
M. Lyu
Irwin King
19
16
0
03 Nov 2020
Deep Reinforcement Learning with Stacked Hierarchical Attention for Text-based Games
Yunqiu Xu
Meng Fang
Ling-Hao Chen
Yali Du
Qiufeng Wang
Chengqi Zhang
OffRL
25
44
0
22 Oct 2020
Removing Bias in Multi-modal Classifiers: Regularization by Maximizing Functional Entropies
Itai Gat
Idan Schwartz
Alex Schwing
Tamir Hazan
60
90
0
21 Oct 2020
Hierarchical Conditional Relation Networks for Multimodal Video Question Answering
T. Le
Vuong Le
Svetha Venkatesh
T. Tran
BDL
24
22
0
18 Oct 2020
MAF: Multimodal Alignment Framework for Weakly-Supervised Phrase Grounding
Qinxin Wang
Hao Tan
Sheng Shen
Michael W. Mahoney
Z. Yao
ObjD
50
11
0
12 Oct 2020
AiR: Attention with Reasoning Capability
Shi Chen
Ming Jiang
Jinhui Yang
Qi Zhao
LRM
13
36
0
28 Jul 2020
Learning Modality Interaction for Temporal Sentence Localization and Event Captioning in Videos
Shaoxiang Chen
Wenhao Jiang
Wei Liu
Yu-Gang Jiang
25
101
0
28 Jul 2020
A Novel Graph-based Multi-modal Fusion Encoder for Neural Machine Translation
Yongjing Yin
Fandong Meng
Jinsong Su
Chulun Zhou
Zhengyuan Yang
Jie Zhou
Jiebo Luo
35
139
0
17 Jul 2020
Reducing Language Biases in Visual Question Answering with Visually-Grounded Question Encoder
K. Gouthaman
Anurag Mittal
50
78
0
13 Jul 2020
IQ-VQA: Intelligent Visual Question Answering
Vatsal Goel
Mohit Chandak
A. Anand
Prithwijit Guha
28
5
0
08 Jul 2020
Graph Optimal Transport for Cross-Domain Alignment
Liqun Chen
Zhe Gan
Yu Cheng
Linjie Li
Lawrence Carin
Jingjing Liu
OT
25
148
0
26 Jun 2020
Mucko: Multi-Layer Cross-Modal Knowledge Reasoning for Fact-based Visual Question Answering
Zihao Zhu
Jiahao Yu
Yujing Wang
Yajing Sun
Yue Hu
Qi Wu
30
125
0
16 Jun 2020
Large-Scale Adversarial Training for Vision-and-Language Representation Learning
Zhe Gan
Yen-Chun Chen
Linjie Li
Chen Zhu
Yu Cheng
Jingjing Liu
ObjD
VLM
35
489
0
11 Jun 2020
Estimating semantic structure for the VQA answer space
Corentin Kervadec
G. Antipov
M. Baccouche
Christian Wolf
26
4
0
10 Jun 2020
Roses Are Red, Violets Are Blue... but Should VQA Expect Them To?
Corentin Kervadec
G. Antipov
M. Baccouche
Christian Wolf
OOD
21
88
0
09 Jun 2020
Cross-Modality Relevance for Reasoning on Language and Vision
Chen Zheng
Quan Guo
Parisa Kordjamshidi
LRM
46
36
0
12 May 2020
History for Visual Dialog: Do we really need it?
Shubham Agarwal
Trung Bui
Joon-Young Lee
Ioannis Konstas
Verena Rieser
VLM
19
69
0
08 May 2020
Pixel-BERT: Aligning Image Pixels with Text by Deep Multi-Modal Transformers
Zhicheng Huang
Zhaoyang Zeng
Bei Liu
Dongmei Fu
Jianlong Fu
ViT
50
436
0
02 Apr 2020
Counterfactual Samples Synthesizing for Robust Visual Question Answering
Long Chen
Xin Yan
Jun Xiao
Hanwang Zhang
Shiliang Pu
Yueting Zhuang
OOD
AAML
154
290
0
14 Mar 2020
Unbiased Scene Graph Generation from Biased Training
Kaihua Tang
Yulei Niu
Jianqiang Huang
Jiaxin Shi
Hanwang Zhang
CML
22
682
0
27 Feb 2020
CQ-VQA: Visual Question Answering on Categorized Questions
Aakansha Mishra
A. Anand
Prithwijit Guha
33
6
0
17 Feb 2020
Self-Attentive Associative Memory
Hung Le
T. Tran
Svetha Venkatesh
22
56
0
10 Feb 2020
Filter Sketch for Network Pruning
Mingbao Lin
Liujuan Cao
Shaojie Li
QiXiang Ye
Yonghong Tian
Jianzhuang Liu
Q. Tian
Rongrong Ji
CLIP
3DPC
31
82
0
23 Jan 2020
Accuracy vs. Complexity: A Trade-off in Visual Question Answering Models
M. Farazi
Salman H. Khan
Nick Barnes
23
17
0
20 Jan 2020
In Defense of Grid Features for Visual Question Answering
Huaizu Jiang
Ishan Misra
Marcus Rohrbach
Erik Learned-Miller
Xinlei Chen
OOD
ObjD
23
318
0
10 Jan 2020
Locality and compositionality in zero-shot learning
Tristan Sylvain
Linda Petrini
R. Devon Hjelm
24
56
0
20 Dec 2019
Explanation vs Attention: A Two-Player Game to Obtain Attention for VQA
Badri N. Patro
Anupriy
Vinay P. Namboodiri
AAML
FAtt
48
26
0
19 Nov 2019
SOGNet: Scene Overlap Graph Network for Panoptic Segmentation
Yibo Yang
Hongyang Li
Xia Li
Qijie Zhao
Jianlong Wu
Zhouchen Lin
ISeg
21
62
0
18 Nov 2019
Modulated Self-attention Convolutional Network for VQA
Jean-Benoit Delbrouck
Antoine Maiorca
Nathan Hubens
Stéphane Dupont
23
1
0
08 Oct 2019
REMIND Your Neural Network to Prevent Catastrophic Forgetting
Tyler L. Hayes
Kushal Kafle
Robik Shrestha
Manoj Acharya
Christopher Kanan
CLL
31
295
0
06 Oct 2019
Automatic Fact-guided Sentence Modification
Darsh J. Shah
Tal Schuster
Regina Barzilay
KELM
21
40
0
30 Sep 2019
LoGAN: Latent Graph Co-Attention Network for Weakly-Supervised Video Moment Retrieval
Reuben Tan
Huijuan Xu
Kate Saenko
Bryan A. Plummer
28
67
0
27 Sep 2019
Compact Trilinear Interaction for Visual Question Answering
Tuong Khanh Long Do
Thanh-Toan Do
Huy Tran
Erman Tjiputra
Quang-Dieu Tran
36
59
0
26 Sep 2019
Overcoming Data Limitation in Medical Visual Question Answering
Binh Duc Nguyen
Thanh-Toan Do
Binh X. Nguyen
Tuong Khanh Long Do
Erman Tjiputra
Quang-Dieu Tran
MedIm
26
145
0
26 Sep 2019
Unified Vision-Language Pre-Training for Image Captioning and VQA
Luowei Zhou
Hamid Palangi
Lei Zhang
Houdong Hu
Jason J. Corso
Jianfeng Gao
MLLM
VLM
255
928
0
24 Sep 2019
CAMP: Cross-Modal Adaptive Message Passing for Text-Image Retrieval
Zihao Wang
Xihui Liu
Hongsheng Li
Lu Sheng
Junjie Yan
Xiaogang Wang
Jing Shao
VLM
25
299
0
12 Sep 2019
Probabilistic framework for solving Visual Dialog
Badri N. Patro
Anupriy
Vinay P. Namboodiri
BDL
30
13
0
11 Sep 2019
PlotQA: Reasoning over Scientific Plots
Nitesh Methani
Pritha Ganguly
Mitesh M. Khapra
Pratyush Kumar
49
7
0
03 Sep 2019
LXMERT: Learning Cross-Modality Encoder Representations from Transformers
Hao Tan
Mohit Bansal
VLM
MLLM
105
2,456
0
20 Aug 2019
U-CAM: Visual Explanation using Uncertainty based Class Activation Maps
Badri N. Patro
Mayank Lunayach
Shivansh Patel
Vinay P. Namboodiri
FAtt
UQCV
27
76
0
17 Aug 2019
Multimodal Unified Attention Networks for Vision-and-Language Interactions
Zhou Yu
Yuhao Cui
Jun Yu
Dacheng Tao
Q. Tian
27
38
0
12 Aug 2019
Multi-modality Latent Interaction Network for Visual Question Answering
Peng Gao
Haoxuan You
Zhanpeng Zhang
Xiaogang Wang
Hongsheng Li
25
82
0
10 Aug 2019
VisualBERT: A Simple and Performant Baseline for Vision and Language
Liunian Harold Li
Mark Yatskar
Da Yin
Cho-Jui Hsieh
Kai-Wei Chang
VLM
82
1,920
0
09 Aug 2019
Answering Questions about Data Visualizations using Efficient Bimodal Fusion
Kushal Kafle
Robik Shrestha
Brian L. Price
Scott D. Cohen
Christopher Kanan
25
58
0
05 Aug 2019
OmniNet: A unified architecture for multi-modal multi-task learning
Subhojeet Pramanik
Priyanka Agrawal
A. Hussain
27
41
0
17 Jul 2019
Deep Modular Co-Attention Networks for Visual Question Answering
Zhou Yu
Jun Yu
Yuhao Cui
Dacheng Tao
Q. Tian
36
798
0
25 Jun 2019
RUBi: Reducing Unimodal Biases in Visual Question Answering
Rémi Cadène
Corentin Dancette
H. Ben-younes
Matthieu Cord
Devi Parikh
CML
19
369
0
24 Jun 2019
Frontal Low-rank Random Tensors for Fine-grained Action Segmentation
Yan Zhang
Krikamol Muandet
Qianli Ma
Heiko Neumann
Siyu Tang
37
3
0
03 Jun 2019