ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 1611.00471
  4. Cited By
Dual Attention Networks for Multimodal Reasoning and Matching

Dual Attention Networks for Multimodal Reasoning and Matching

2 November 2016
Hyeonseob Nam
Jung-Woo Ha
Jeonghee Kim
ArXivPDFHTML

Papers citing "Dual Attention Networks for Multimodal Reasoning and Matching"

50 / 97 papers shown
Title
Hadamard product in deep learning: Introduction, Advances and Challenges
Hadamard product in deep learning: Introduction, Advances and Challenges
Grigorios G. Chrysos
Yongtao Wu
Razvan Pascanu
Philip Torr
V. Cevher
AAML
98
1
0
17 Apr 2025
PROSE-FD: A Multimodal PDE Foundation Model for Learning Multiple
  Operators for Forecasting Fluid Dynamics
PROSE-FD: A Multimodal PDE Foundation Model for Learning Multiple Operators for Forecasting Fluid Dynamics
Yuxuan Liu
Jingmin Sun
Xinjie He
Griffin Pinney
Zecheng Zhang
Hayden Schaeffer
AI4CE
43
7
0
15 Sep 2024
Describe-then-Reason: Improving Multimodal Mathematical Reasoning
  through Visual Comprehension Training
Describe-then-Reason: Improving Multimodal Mathematical Reasoning through Visual Comprehension Training
Mengzhao Jia
Zhihan Zhang
Wenhao Yu
Fangkai Jiao
Meng Jiang
VLM
ReLM
LRM
68
8
0
22 Apr 2024
PVLR: Prompt-driven Visual-Linguistic Representation Learning for
  Multi-Label Image Recognition
PVLR: Prompt-driven Visual-Linguistic Representation Learning for Multi-Label Image Recognition
Hao Tan
Zichang Tan
Jun Li
Jun Wan
Zhen Lei
VLM
38
0
0
31 Jan 2024
BARTPhoBEiT: Pre-trained Sequence-to-Sequence and Image Transformers
  Models for Vietnamese Visual Question Answering
BARTPhoBEiT: Pre-trained Sequence-to-Sequence and Image Transformers Models for Vietnamese Visual Question Answering
Khiem Vinh Tran
Kiet Van Nguyen
Ngan Luu-Thuy Nguyen
ViT
33
2
0
28 Jul 2023
Multimodal Graph Transformer for Multimodal Question Answering
Multimodal Graph Transformer for Multimodal Question Answering
Xuehai He
Xin Eric Wang
36
7
0
30 Apr 2023
Understanding Social Media Cross-Modality Discourse in Linguistic Space
Understanding Social Media Cross-Modality Discourse in Linguistic Space
Chunpu Xu
Hanzhuo Tan
Jing Li
Piji Li
24
6
0
26 Feb 2023
Oriented Object Detection in Optical Remote Sensing Images using Deep Learning: A Survey
Oriented Object Detection in Optical Remote Sensing Images using Deep Learning: A Survey
Kunlin Wang
Zi Wang
Zhang Li
Ang Su
Xichao Teng
Minhao Liu
Qifeng Yu
Qifeng Yu
ObjD
89
9
0
21 Feb 2023
Scene-centric vs. Object-centric Image-Text Cross-modal Retrieval: A
  Reproducibility Study
Scene-centric vs. Object-centric Image-Text Cross-modal Retrieval: A Reproducibility Study
Mariya Hendriksen
Svitlana Vakulenko
E. Kuiper
Maarten de Rijke
39
5
0
12 Jan 2023
Improving Cross-Modal Retrieval with Set of Diverse Embeddings
Improving Cross-Modal Retrieval with Set of Diverse Embeddings
Dongwon Kim
Nam-Won Kim
Suha Kwak
26
37
0
30 Nov 2022
Cross-modal Semantic Enhanced Interaction for Image-Sentence Retrieval
Cross-modal Semantic Enhanced Interaction for Image-Sentence Retrieval
Xuri Ge
Fuhai Chen
Songpei Xu
Fuxiang Tao
J. Jose
30
26
0
17 Oct 2022
MKANet: A Lightweight Network with Sobel Boundary Loss for Efficient
  Land-cover Classification of Satellite Remote Sensing Imagery
MKANet: A Lightweight Network with Sobel Boundary Loss for Efficient Land-cover Classification of Satellite Remote Sensing Imagery
Zhiqi Zhang
W. Lu
Jinshan Cao
Guangqi Xie
35
15
0
28 Jul 2022
CAMS: An Annotated Corpus for Causal Analysis of Mental Health Issues in
  Social Media Posts
CAMS: An Annotated Corpus for Causal Analysis of Mental Health Issues in Social Media Posts
Muskan Garg
Chandni Saxena
V. Krishnan
R. Joshi
S. Saha
Vijay K. Mago
Bonnie J. Dorr
CML
32
36
0
11 Jul 2022
A cross-corpus study on speech emotion recognition
A cross-corpus study on speech emotion recognition
R. Milner
Md. Asif Jalal
Raymond W. M. Ng
Thomas Hain
23
30
0
05 Jul 2022
From Pixels to Objects: Cubic Visual Attention for Visual Question
  Answering
From Pixels to Objects: Cubic Visual Attention for Visual Question Answering
Jingkuan Song
Pengpeng Zeng
Lianli Gao
Heng Tao Shen
32
62
0
04 Jun 2022
Structured Two-stream Attention Network for Video Question Answering
Structured Two-stream Attention Network for Video Question Answering
Lianli Gao
Pengpeng Zeng
Jingkuan Song
Yuan-Fang Li
Wu Liu
Tao Mei
Heng Tao Shen
43
68
0
02 Jun 2022
VLCDoC: Vision-Language Contrastive Pre-Training Model for Cross-Modal
  Document Classification
VLCDoC: Vision-Language Contrastive Pre-Training Model for Cross-Modal Document Classification
Souhail Bakkali
Zuheng Ming
Mickael Coustaty
Marccal Rusinol
O. R. Terrades
VLM
54
30
0
24 May 2022
Co-VQA : Answering by Interactive Sub Question Sequence
Co-VQA : Answering by Interactive Sub Question Sequence
Ruonan Wang
Yuxi Qian
Fangxiang Feng
Xiaojie Wang
Huixing Jiang
LRM
29
16
0
02 Apr 2022
Beyond Fixation: Dynamic Window Visual Transformer
Beyond Fixation: Dynamic Window Visual Transformer
Pengzhen Ren
Changlin Li
Guangrun Wang
Yun Xiao
Qing Du
Xiaodan Liang
Qing Du Xiaodan Liang Xiaojun Chang
ViT
36
32
0
24 Mar 2022
Two-stream Hierarchical Similarity Reasoning for Image-text Matching
Two-stream Hierarchical Similarity Reasoning for Image-text Matching
Ran Chen
Hanli Wang
Lei Wang
Sam Kwong
21
9
0
10 Mar 2022
SA-VQA: Structured Alignment of Visual and Semantic Representations for
  Visual Question Answering
SA-VQA: Structured Alignment of Visual and Semantic Representations for Visual Question Answering
Peixi Xiong
Quanzeng You
Pei Yu
Zicheng Liu
Ying Wu
24
5
0
25 Jan 2022
Show, Write, and Retrieve: Entity-aware Article Generation and Retrieval
Show, Write, and Retrieve: Entity-aware Article Generation and Retrieval
Zhongping Zhang
Yiwen Gu
Bryan A. Plummer
48
2
0
11 Dec 2021
Contrastive Learning of Visual-Semantic Embeddings
Contrastive Learning of Visual-Semantic Embeddings
Anurag Jain
Yashaswi Verma
SSL
33
1
0
17 Oct 2021
Quantifying the Suicidal Tendency on Social Media: A Survey
Quantifying the Suicidal Tendency on Social Media: A Survey
Muskan Garg
22
4
0
04 Oct 2021
Multimodal Integration of Human-Like Attention in Visual Question
  Answering
Multimodal Integration of Human-Like Attention in Visual Question Answering
Ekta Sood
Fabian Kögel
Philippe Muller
Dominike Thomas
Mihai Bâce
Andreas Bulling
41
16
0
27 Sep 2021
DSSL: Deep Surroundings-person Separation Learning for Text-based Person
  Retrieval
DSSL: Deep Surroundings-person Separation Learning for Text-based Person Retrieval
A. Zhu
Zijie Wang
Yifeng Li
Xili Wan
Jing Jin
Tian Wang
Fangqiang Hu
G. Hua
95
162
0
12 Sep 2021
Structured Multi-modal Feature Embedding and Alignment for
  Image-Sentence Retrieval
Structured Multi-modal Feature Embedding and Alignment for Image-Sentence Retrieval
Xuri Ge
Fuhai Chen
J. Jose
Zhilong Ji
Zhongqin Wu
Xiao-Chang Liu
34
55
0
05 Aug 2021
ICECAP: Information Concentrated Entity-aware Image Captioning
ICECAP: Information Concentrated Entity-aware Image Captioning
Anwen Hu
Shizhe Chen
Qin Jin
28
20
0
04 Aug 2021
Spatio-Temporal Representation Factorization for Video-based Person
  Re-Identification
Spatio-Temporal Representation Factorization for Video-based Person Re-Identification
Abhishek Aich
Meng Zheng
Srikrishna Karanam
Terrence Chen
Amit K. Roy-Chowdhury
Ziyan Wu
37
70
0
25 Jul 2021
Focal Self-attention for Local-Global Interactions in Vision
  Transformers
Focal Self-attention for Local-Global Interactions in Vision Transformers
Jianwei Yang
Chunyuan Li
Pengchuan Zhang
Xiyang Dai
Bin Xiao
Lu Yuan
Jianfeng Gao
ViT
47
428
0
01 Jul 2021
Looking Outside the Window: Wide-Context Transformer for the Semantic
  Segmentation of High-Resolution Remote Sensing Images
Looking Outside the Window: Wide-Context Transformer for the Semantic Segmentation of High-Resolution Remote Sensing Images
L. Ding
Dong Lin
Shaofu Lin
Jing Zhang
Xiaojie Cui
Yuebin Wang
Hao Tang
Lorenzo Bruzzone
ViT
33
98
0
29 Jun 2021
Step-Wise Hierarchical Alignment Network for Image-Text Matching
Step-Wise Hierarchical Alignment Network for Image-Text Matching
Zhong Ji
Kexin Chen
Haoran Wang
22
93
0
11 Jun 2021
ViTAE: Vision Transformer Advanced by Exploring Intrinsic Inductive Bias
ViTAE: Vision Transformer Advanced by Exploring Intrinsic Inductive Bias
Yufei Xu
Qiming Zhang
Jing Zhang
Dacheng Tao
ViT
65
330
0
07 Jun 2021
Bridge to Answer: Structure-aware Graph Interaction Network for Video
  Question Answering
Bridge to Answer: Structure-aware Graph Interaction Network for Video Question Answering
Jungin Park
Jiyoung Lee
Kwanghoon Sohn
167
100
0
29 Apr 2021
Retrieve Fast, Rerank Smart: Cooperative and Joint Approaches for
  Improved Cross-Modal Retrieval
Retrieve Fast, Rerank Smart: Cooperative and Joint Approaches for Improved Cross-Modal Retrieval
Gregor Geigle
Jonas Pfeiffer
Nils Reimers
Ivan Vulić
Iryna Gurevych
37
59
0
22 Mar 2021
Adaptive Multi-Teacher Multi-level Knowledge Distillation
Adaptive Multi-Teacher Multi-level Knowledge Distillation
Yuang Liu
Wei Zhang
Jun Wang
28
157
0
06 Mar 2021
Probabilistic Embeddings for Cross-Modal Retrieval
Probabilistic Embeddings for Cross-Modal Retrieval
Sanghyuk Chun
Seong Joon Oh
Rafael Sampaio de Rezende
Yannis Kalantidis
Diane Larlus
UQCV
415
203
0
13 Jan 2021
Similarity Reasoning and Filtration for Image-Text Matching
Similarity Reasoning and Filtration for Image-Text Matching
Haiwen Diao
Ying Zhang
Lingyun Ma
Huchuan Lu
240
332
0
05 Jan 2021
An Improved Attention for Visual Question Answering
An Improved Attention for Visual Question Answering
Tanzila Rahman
Shih-Han Chou
Leonid Sigal
Giuseppe Carenini
13
42
0
04 Nov 2020
Learning Dual Semantic Relations with Graph Attention for Image-Text
  Matching
Learning Dual Semantic Relations with Graph Attention for Image-Text Matching
Keyu Wen
Xiaodong Gu
Qingrong Cheng
27
95
0
22 Oct 2020
Multifaceted Context Representation using Dual Attention for Ontology
  Alignment
Multifaceted Context Representation using Dual Attention for Ontology Alignment
Vivek Iyer
Arvind Agarwal
Harshit Kumar
25
17
0
16 Oct 2020
New Ideas and Trends in Deep Multimodal Content Understanding: A Review
New Ideas and Trends in Deep Multimodal Content Understanding: A Review
Wei Chen
Weiping Wang
Li Liu
M. Lew
VLM
118
31
0
16 Oct 2020
Audio-Visual Event Localization via Recursive Fusion by Joint
  Co-Attention
Audio-Visual Event Localization via Recursive Fusion by Joint Co-Attention
Bin Duan
Hao Tang
Wei Wang
Ziliang Zong
Guowei Yang
Yan Yan
33
59
0
14 Aug 2020
Graph Optimal Transport for Cross-Domain Alignment
Graph Optimal Transport for Cross-Domain Alignment
Liqun Chen
Zhe Gan
Yu Cheng
Linjie Li
Lawrence Carin
Jingjing Liu
OT
25
148
0
26 Jun 2020
DAPnet: A Double Self-attention Convolutional Network for Point Cloud
  Semantic Labeling
DAPnet: A Double Self-attention Convolutional Network for Point Cloud Semantic Labeling
Li Chen
Zewei Xu
Qing Zhu
Haozhe Huang
Shaowen Wang
Haifeng Li
3DPC
27
12
0
18 Apr 2020
Graph Structured Network for Image-Text Matching
Graph Structured Network for Image-Text Matching
Chunxiao Liu
Zhendong Mao
Tianzhu Zhang
Hongtao Xie
Bin Wang
Yongdong Zhang
25
232
0
01 Apr 2020
IMRAM: Iterative Matching with Recurrent Attention Memory for
  Cross-Modal Image-Text Retrieval
IMRAM: Iterative Matching with Recurrent Attention Memory for Cross-Modal Image-Text Retrieval
Hui Chen
Guiguang Ding
Xudong Liu
Zijia Lin
Ji Liu
Jungong Han
22
318
0
08 Mar 2020
Adaptive Offline Quintuplet Loss for Image-Text Matching
Adaptive Offline Quintuplet Loss for Image-Text Matching
Tianlang Chen
Jiajun Deng
Jiebo Luo
181
68
0
07 Mar 2020
Two Causal Principles for Improving Visual Dialog
Two Causal Principles for Improving Visual Dialog
Jiaxin Qi
Yulei Niu
Jianqiang Huang
Hanwang Zhang
CML
16
146
0
24 Nov 2019
Neural Storyboard Artist: Visualizing Stories with Coherent Image
  Sequences
Neural Storyboard Artist: Visualizing Stories with Coherent Image Sequences
Shizhe Chen
Bei Liu
Jianlong Fu
Ruihua Song
Qin Jin
Pingping Lin
Xiaoyu Qi
Chunting Wang
Jin Zhou
DiffM
25
33
0
24 Nov 2019
12
Next