arXiv:1711.04323
High-Order Attention Models for Visual Question Answering
12 November 2017
Idan Schwartz
A. Schwing
Tamir Hazan
Papers citing "High-Order Attention Models for Visual Question Answering"
23 / 23 papers shown
Title
Tensor Sketch: Fast and Scalable Polynomial Kernel Approximation
Ninh Pham
Rasmus Pagh
39
0
0
13 May 2025
A Comprehensive Survey of 3D Dense Captioning: Localizing and Describing Objects in 3D Scenes
Ting Yu
Xiaojun Lin
Shuhui Wang
Weiguo Sheng
Qingming Huang
Jun-chen Yu
3DV
54
10
0
12 Mar 2024
MeaCap: Memory-Augmented Zero-shot Image Captioning
Zequn Zeng
Yan Xie
Hao Zhang
Chiyu Chen
Zhengjue Wang
Boli Chen
VLM
39
14
0
06 Mar 2024
Monotone deep Boltzmann machines
Zhili Feng
Ezra Winston
J. Zico Kolter
30
1
0
11 Jul 2023
ConZIC: Controllable Zero-shot Image Captioning by Sampling-Based Polishing
Zequn Zeng
Hao Zhang
Zhengjue Wang
Ruiying Lu
Dongsheng Wang
Bo Chen
BDL
DiffM
24
33
0
04 Mar 2023
Semantic Segmentation Enhanced Transformer Model for Human Attention Prediction
Shuo Zhang
ViT
31
0
0
26 Jan 2023
AlignVE: Visual Entailment Recognition Based on Alignment Relations
Biwei Cao
Jiuxin Cao
Jie Gui
Jiayun Shen
Bo Liu
Lei He
Yuan Yan Tang
James T. Kwok
26
7
0
16 Nov 2022
Dual Decomposition of Convex Optimization Layers for Consistent Attention in Medical Images
Tom Ron
M. Weiler-Sagie
Tamir Hazan
FAtt
MedIm
27
6
0
06 Jun 2022
ZeroCap: Zero-Shot Image-to-Text Generation for Visual-Semantic Arithmetic
Yoad Tewel
Yoav Shalev
Idan Schwartz
Lior Wolf
VLM
34
192
0
29 Nov 2021
Learning Spatial Attention for Face Super-Resolution
Chaofeng Chen
Dihong Gong
Hao Wang
Zhifeng Li
Kwan-Yee K. Wong
CVBM
SupR
3DH
17
158
0
02 Dec 2020
An Improved Attention for Visual Question Answering
Tanzila Rahman
Shih-Han Chou
Leonid Sigal
Giuseppe Carenini
13
42
0
04 Nov 2020
Removing Bias in Multi-modal Classifiers: Regularization by Maximizing Functional Entropies
Itai Gat
Idan Schwartz
A. Schwing
Tamir Hazan
60
90
0
21 Oct 2020
New Ideas and Trends in Deep Multimodal Content Understanding: A Review
Wei Chen
Weiping Wang
Li Liu
M. Lew
VLM
118
31
0
16 Oct 2020
TAB-VCR: Tags and Attributes based Visual Commonsense Reasoning Baselines
Jingxiang Lin
Unnat Jain
A. Schwing
LRM
ReLM
34
9
0
31 Oct 2019
Compact Trilinear Interaction for Visual Question Answering
Tuong Khanh Long Do
Thanh-Toan Do
Huy Tran
Erman Tjiputra
Quang-Dieu Tran
36
59
0
26 Sep 2019
Hybrid-Attention based Decoupled Metric Learning for Zero-Shot Image Retrieval
Binghui Chen
Weihong Deng
VLM
FedML
26
55
0
27 Jul 2019
Factor Graph Attention
Idan Schwartz
Seunghak Yu
Tamir Hazan
A. Schwing
24
110
0
11 Apr 2019
A Simple Baseline for Audio-Visual Scene-Aware Dialog
Idan Schwartz
A. Schwing
Tamir Hazan
27
69
0
11 Apr 2019
Diverse and Coherent Paragraph Generation from Images
Moitreya Chatterjee
A. Schwing
19
66
0
03 Sep 2018
Learning Conditioned Graph Structures for Interpretable Visual Question Answering
Will Norcliffe-Brown
Efstathios Vafeias
Sarah Parisot
GNN
21
236
0
19 Jun 2018
Unsupervised Textual Grounding: Linking Words to Image Concepts
Raymond A. Yeh
Minh Do
A. Schwing
22
40
0
29 Mar 2018
Convolutional Image Captioning
J. Aneja
Aditya Deshpande
A. Schwing
VLM
37
360
0
24 Nov 2017
Multimodal Compact Bilinear Pooling for Visual Question Answering and Visual Grounding
Akira Fukui
Dong Huk Park
Daylen Yang
Anna Rohrbach
Trevor Darrell
Marcus Rohrbach
167
1,465
0
06 Jun 2016