1902.00579
Multi-step Reasoning via Recurrent Dual Attention for Visual Dialog
1 February 2019
Zhe Gan
Yu Cheng
Ahmed El Kholy
Linjie Li
Jingjing Liu
Jianfeng Gao
Papers citing "Multi-step Reasoning via Recurrent Dual Attention for Visual Dialog"
43 / 43 papers shown
VideoLLM-online: Online Video Large Language Model for Streaming Video
Joya Chen
Zhaoyang Lv
Shiwei Wu
Kevin Qinghong Lin
Chenan Song
Difei Gao
Jia-Wei Liu
Ziteng Gao
Dongxing Mao
Mike Zheng Shou
MLLM
MoMe
140
59
0
17 Jun 2024
FaceChat: An Emotion-Aware Face-to-face Dialogue Framework
Deema Alnuhait
Qingyang Wu
Zhou Yu
61
7
0
08 Mar 2023
Unified Multimodal Model with Unlikelihood Training for Visual Dialog
Zihao Wang
Junli Wang
Changjun Jiang
MLLM
67
10
0
23 Nov 2022
MMDialog: A Large-scale Multi-turn Dialogue Dataset Towards Multi-modal Open-domain Conversation
Jiazhan Feng
Qingfeng Sun
Can Xu
Pu Zhao
Yaming Yang
Chongyang Tao
Dongyan Zhao
Qingwei Lin
103
59
0
10 Nov 2022
Dual Decomposition of Convex Optimization Layers for Consistent Attention in Medical Images
Tom Ron
M. Weiler-Sagie
Tamir Hazan
FAtt
MedIm
83
6
0
06 Jun 2022
The Dialog Must Go On: Improving Visual Dialog via Generative Self-Training
Gi-Cheon Kang
Sungdong Kim
Jin-Hwa Kim
Donghyun Kwak
Byoung-Tak Zhang
99
10
0
25 May 2022
META-GUI: Towards Multi-modal Conversational Agents on Mobile GUI
Liangtai Sun
Xingyu Chen
Lu Chen
Tianle Dai
Zichen Zhu
Kai Yu
LLMAG
99
62
0
23 May 2022
UTC: A Unified Transformer with Inter-Task Contrastive Learning for Visual Dialog
Cheng Chen
Yudong Zhu
Zhenshan Tan
Qingrong Cheng
Xin Jiang
Qun Liu
X. Gu
75
39
0
01 May 2022
Improving Cross-Modal Understanding in Visual Dialog via Contrastive Learning
Feilong Chen
Xiuyi Chen
Shuang Xu
Bo Xu
VLM
92
19
0
15 Apr 2022
Spot the Difference: A Cooperative Object-Referring Game in Non-Perfectly Co-Observable Scene
Duo Zheng
Fandong Meng
Q. Si
Hairun Fan
Zipeng Xu
Jie Zhou
Fangxiang Feng
Xiaojie Wang
80
0
0
16 Mar 2022
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Junnan Li
Dongxu Li
Caiming Xiong
Steven Hoi
MLLM
BDL
VLM
CLIP
586
4,444
0
28 Jan 2022
Multimodal Incremental Transformer with Visual Grounding for Visual Dialogue Generation
Feilong Chen
Fandong Meng
Xiuyi Chen
Peng Li
Jie Zhou
105
23
0
17 Sep 2021
GoG: Relation-aware Graph-over-Graph Network for Visual Dialog
Feilong Chen
Xiuyi Chen
Fandong Meng
Peng Li
Jie Zhou
147
35
0
17 Sep 2021
Learning to Ground Visual Objects for Visual Dialog
Feilong Chen
Xiuyi Chen
Can Xu
Daxin Jiang
OOD
96
18
0
13 Sep 2021
Enhancing Visual Dialog Questioner with Entity-based Strategy Learning and Augmented Guesser
Duo Zheng
Zipeng Xu
Fandong Meng
Xiaojie Wang
Jiaan Wang
Jie Zhou
50
13
0
06 Sep 2021
Productivity, Portability, Performance: Data-Centric Python
Yiheng Wang
Yao Zhang
Yanzhang Wang
Yan Wan
Jiao Wang
Zhongyuan Wu
Yuhao Yang
Bowen She
169
101
0
01 Jul 2021
Modeling Text-visual Mutual Dependency for Multi-modal Dialog Generation
Shuhe Wang
Yuxian Meng
Xiaofei Sun
Leilei Gan
Rongbin Ouyang
Rui Yan
Tianwei Zhang
Jiwei Li
73
15
0
30 May 2021
Recent Advances in Deep Learning Based Dialogue Systems: A Systematic Survey
Jinjie Ni
Tom Young
Vlad Pandelea
Fuzhao Xue
Min Zhang
225
280
0
10 May 2021
Ensemble of MRR and NDCG models for Visual Dialog
Idan Schwartz
58
9
0
15 Apr 2021
Attention, please! A survey of Neural Attention Models in Deep Learning
Alana de Santana Correia
Esther Luna Colombini
HAI
132
198
0
31 Mar 2021
Prompt Programming for Large Language Models: Beyond the Few-Shot Paradigm
Laria Reynolds
Kyle McDonell
137
932
0
15 Feb 2021
OpenViDial: A Large-Scale, Open-Domain Dialogue Dataset with Visual Contexts
Yuxian Meng
Shuhe Wang
Qinghong Han
Xiaofei Sun
Leilei Gan
Rui Yan
Jiwei Li
93
30
0
30 Dec 2020
DTGAN: Dual Attention Generative Adversarial Networks for Text-to-Image Generation
Zhenxing Zhang
Lambert Schomaker
GAN
67
35
0
05 Nov 2020
Beyond VQA: Generating Multi-word Answer and Rationale to Visual Questions
Radhika Dua
Sai Srinivas Kancheti
V. Balasubramanian
LRM
88
22
0
24 Oct 2020
Deep Reinforcement Learning with Stacked Hierarchical Attention for Text-based Games
Yunqiu Xu
Meng Fang
Ling-Hao Chen
Yali Du
Qiufeng Wang
Chengqi Zhang
OffRL
88
44
0
22 Oct 2020
New Ideas and Trends in Deep Multimodal Content Understanding: A Review
Wei Chen
Weiping Wang
Li Liu
M. Lew
VLM
174
33
0
16 Oct 2020
Describing Unseen Videos via Multi-Modal Cooperative Dialog Agents
Ye Zhu
Yu Wu
Yi Yang
Yan Yan
82
13
0
18 Aug 2020
KBGN: Knowledge-Bridge Graph Network for Adaptive Vision-Text Reasoning in Visual Dialogue
X. Jiang
Siyi Du
Zengchang Qin
Yajing Sun
Jiahao Yu
91
37
0
11 Aug 2020
Dynamic Graph Representation Learning for Video Dialog via Multi-Modal Shuffled Transformers
Shijie Geng
Peng Gao
Moitreya Chatterjee
Chiori Hori
Jonathan Le Roux
Yongfeng Zhang
Hongsheng Li
A. Cherian
101
11
0
08 Jul 2020
DAM: Deliberation, Abandon and Memory Networks for Generating Detailed and Non-repetitive Responses in Visual Dialogue
X. Jiang
Jiahao Yu
Yajing Sun
Zengchang Qin
Zihao Zhu
Yue Hu
Qi Wu
MLLM
119
19
0
07 Jul 2020
Large-Scale Adversarial Training for Vision-and-Language Representation Learning
Zhe Gan
Yen-Chun Chen
Linjie Li
Chen Zhu
Yu Cheng
Jingjing Liu
ObjD
VLM
171
501
0
11 Jun 2020
History for Visual Dialog: Do we really need it?
Shubham Agarwal
Trung Bui
Joon-Young Lee
Ioannis Konstas
Verena Rieser
VLM
38
71
0
08 May 2020
VD-BERT: A Unified Vision and Dialog Transformer with BERT
Yue Wang
Shafiq Joty
Michael R. Lyu
Irwin King
Caiming Xiong
Steven C.H. Hoi
125
104
0
28 Apr 2020
Iterative Context-Aware Graph Inference for Visual Dialog
Dan Guo
Haibo Wang
Hanwang Zhang
Zhengjun Zha
Meng Wang
79
49
0
05 Apr 2020
VIOLIN: A Large-Scale Dataset for Video-and-Language Inference
J. Liu
Wenhu Chen
Yu Cheng
Zhe Gan
Licheng Yu
Yiming Yang
Jingjing Liu
MLLM
VGen
124
70
0
25 Mar 2020
Vision-Dialog Navigation by Exploring Cross-modal Memory
Yi Zhu
Fengda Zhu
Zhaohuan Zhan
Bingqian Lin
Jianbin Jiao
Xiaojun Chang
Xiaodan Liang
VLM
91
49
0
15 Mar 2020
Modality-Balanced Models for Visual Dialogue
Hyounghun Kim
Hao Tan
Mohit Bansal
61
27
0
17 Jan 2020
Multi-step Joint-Modality Attention Network for Scene-Aware Dialogue System
Yun-Wei Chu
Kuan-Yen Lin
Chao-Chun Hsu
Lun-Wei Ku
137
22
0
17 Jan 2020
DMRM: A Dual-channel Multi-hop Reasoning Model for Visual Dialog
Feilong Chen
Fandong Meng
Jiaming Xu
Peng Li
Bo Xu
Jie Zhou
95
34
0
18 Dec 2019
Large-scale Pretraining for Visual Dialog: A Simple State-of-the-Art Baseline
Vishvak Murahari
Dhruv Batra
Devi Parikh
Abhishek Das
VLM
119
117
0
05 Dec 2019
Two Causal Principles for Improving Visual Dialog
Jiaxin Qi
Yulei Niu
Jianqiang Huang
Hanwang Zhang
CML
112
149
0
24 Nov 2019
DualVD: An Adaptive Dual Encoding Model for Deep Visual Understanding in Visual Dialogue
X. Jiang
Jiahao Yu
Zengchang Qin
Yingying Zhuang
Xingxing Zhang
Yue Hu
Qi Wu
92
70
0
17 Nov 2019
Attend To Count: Crowd Counting with Adaptive Capacity Multi-scale CNNs
Zhikang Zou
Yu Cheng
Xiaoye Qu
S. Ji
Xiaoxiao Guo
Pan Zhou
91
51
0
07 Aug 2019