Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1611.08669
Cited By
Visual Dialog
26 November 2016
Abhishek Das
Satwik Kottur
Khushi Gupta
Avi Singh
Deshraj Yadav
José M. F. Moura
Devi Parikh
Dhruv Batra
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Visual Dialog"
50 / 194 papers shown
Title
Deep Sequence Models for Text Classification Tasks
S. S. Abdullahi
Su Yiming
Shamsuddeen Hassan Muhammad
A. Mustapha
Ahmad Muhammad Aminu
Abdulkadir Abdullahi
Musa Bello
Saminu Mohammad Aliyu
27
3
0
18 Jul 2022
Enabling Harmonious Human-Machine Interaction with Visual-Context Augmented Dialogue System: A Review
Hao Wang
Bin Guo
Y. Zeng
Yasan Ding
Chen Qiu
Ying Zhang
Li Yao
Zhiwen Yu
32
2
0
02 Jul 2022
Multimodal Learning with Transformers: A Survey
P. Xu
Xiatian Zhu
David A. Clifton
ViT
72
528
0
13 Jun 2022
VD-PCR: Improving Visual Dialog with Pronoun Coreference Resolution
Xintong Yu
Hongming Zhang
Ruixin Hong
Yangqiu Song
Changshui Zhang
17
13
0
29 May 2022
Prompt-based Learning for Unpaired Image Captioning
Peipei Zhu
Tianlin Li
Lin Zhu
Zhenglong Sun
Weishi Zheng
Yaowei Wang
Chia-Ju Chen
VLM
25
31
0
26 May 2022
The Dialog Must Go On: Improving Visual Dialog via Generative Self-Training
Gi-Cheon Kang
Sungdong Kim
Jin-Hwa Kim
Donghyun Kwak
Byoung-Tak Zhang
32
10
0
25 May 2022
Multimodal Conversational AI: A Survey of Datasets and Approaches
Anirudh S. Sundar
Larry Heck
38
29
0
13 May 2022
Learning to Retrieve Videos by Asking Questions
Avinash Madasu
Junier Oliva
Gedas Bertasius
VGen
32
16
0
11 May 2022
Answer-Me: Multi-Task Open-Vocabulary Visual Question Answering
A. Piergiovanni
Wei Li
Weicheng Kuo
M. Saffar
Fred Bertsch
A. Angelova
17
16
0
02 May 2022
UTC: A Unified Transformer with Inter-Task Contrastive Learning for Visual Dialog
Cheng Chen
Yudong Zhu
Zhenshan Tan
Qingrong Cheng
Xin Jiang
Qun Liu
X. Gu
31
39
0
01 May 2022
Flamingo: a Visual Language Model for Few-Shot Learning
Jean-Baptiste Alayrac
Jeff Donahue
Pauline Luc
Antoine Miech
Iain Barr
...
Mikolaj Binkowski
Ricardo Barreira
Oriol Vinyals
Andrew Zisserman
Karen Simonyan
MLLM
VLM
46
3,349
0
29 Apr 2022
Can Visual Dialogue Models Do Scorekeeping? Exploring How Dialogue Representations Incrementally Encode Shared Knowledge
Brielen Madureira
David Schlangen
25
4
0
14 Apr 2022
There Are a Thousand Hamlets in a Thousand People's Eyes: Enhancing Knowledge-grounded Dialogue with Personal Memory
Tingchen Fu
Xueliang Zhao
Chongyang Tao
Jiaxin Wen
Rui Yan
22
26
0
06 Apr 2022
Co-VQA : Answering by Interactive Sub Question Sequence
Ruonan Wang
Yuxi Qian
Fangxiang Feng
Xiaojie Wang
Huixing Jiang
LRM
26
16
0
02 Apr 2022
Image Retrieval from Contextual Descriptions
Benno Krojer
Vaibhav Adlakha
Vibhav Vineet
Yash Goyal
E. Ponti
Siva Reddy
19
29
0
29 Mar 2022
Spot the Difference: A Cooperative Object-Referring Game in Non-Perfectly Co-Observable Scene
Duo Zheng
Fandong Meng
Q. Si
Hairun Fan
Zipeng Xu
Jie Zhou
Fangxiang Feng
Xiaojie Wang
21
0
0
16 Mar 2022
AssistQ: Affordance-centric Question-driven Task Completion for Egocentric Assistant
B. Wong
Joya Chen
You Wu
Stan Weixian Lei
Dongxing Mao
Difei Gao
Mike Zheng Shou
EgoV
35
27
0
08 Mar 2022
Unpaired Image Captioning by Image-level Weakly-Supervised Visual Concept Recognition
Peipei Zhu
Tianlin Li
Yong Luo
Zhenglong Sun
Wei-Shi Zheng
Yaowei Wang
Chia-Ju Chen
30
12
0
07 Mar 2022
Modeling Coreference Relations in Visual Dialog
Mingxiao Li
Marie-Francine Moens
19
9
0
06 Mar 2022
CAISE: Conversational Agent for Image Search and Editing
Hyounghun Kim
Doo Soon Kim
Seunghyun Yoon
Franck Dernoncourt
Trung Bui
Joey Tianyi Zhou
24
6
0
24 Feb 2022
VU-BERT: A Unified framework for Visual Dialog
Tong Ye
Shijing Si
Jianzong Wang
Rui Wang
Ning Cheng
Jing Xiao
MLLM
38
5
0
22 Feb 2022
VLP: A Survey on Vision-Language Pre-training
Feilong Chen
Duzhen Zhang
Minglun Han
Xiuyi Chen
Jing Shi
Shuang Xu
Bo Xu
VLM
82
213
0
18 Feb 2022
The slurk Interaction Server Framework: Better Data for Better Dialog Models
Jana Gotze
Maike Paetzel-Prusmann
Wencke Liermann
Tim Diekmann
David Schlangen
VLM
34
11
0
02 Feb 2022
Debiased-CAM to mitigate systematic error with faithful visual explanations of machine learning
Wencan Zhang
Mariella Dimiccoli
Brian Y. Lim
FAtt
19
1
0
30 Jan 2022
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Junnan Li
Dongxu Li
Caiming Xiong
Guosheng Lin
MLLM
BDL
VLM
CLIP
392
4,154
0
28 Jan 2022
Interpretable Learned Emergent Communication for Human-Agent Teams
Seth Karten
Mycal Tucker
Huao Li
Siva Kailas
Michael Lewis
Katia P. Sycara
AI4CE
23
10
0
19 Jan 2022
3D Question Answering
Shuquan Ye
Dongdong Chen
Songfang Han
Jing Liao
ViT
26
46
0
15 Dec 2021
Multimodal Interactions Using Pretrained Unimodal Models for SIMMC 2.0
Joosung Lee
Kijong Han
39
6
0
10 Dec 2021
Iconary: A Pictionary-Based Game for Testing Multimodal Communication with Drawings and Text
Christopher Clark
Jordi Salvador
Dustin Schwenk
Derrick Bonafilia
Mark Yatskar
...
Aaron Sarnat
Hannaneh Hajishirzi
Aniruddha Kembhavi
Oren Etzioni
Ali Farhadi
MLLM
20
3
0
01 Dec 2021
Classification-Regression for Chart Comprehension
Matan Levy
Rami Ben-Ari
Dani Lischinski
25
15
0
29 Nov 2021
Building Goal-Oriented Dialogue Systems with Situated Visual Context
Sanchit Agarwal
Jan Jezabek
Arijit Biswas
Emre Barut
Shuyang Gao
Tagyoung Chung
23
1
0
22 Nov 2021
Multimodal Dialogue Response Generation
Qingfeng Sun
Yujing Wang
Can Xu
Kai Zheng
Yaming Yang
Huang Hu
Fei Xu
Jessica Zhang
Xiubo Geng
Daxin Jiang
26
43
0
16 Oct 2021
A Framework for Learning to Request Rich and Contextually Useful Information from Humans
Khanh Nguyen
Yonatan Bisk
Hal Daumé
47
16
0
14 Oct 2021
Collecting and Characterizing Natural Language Utterances for Specifying Data Visualizations
Arjun Srinivasan
Nikhila Nyapathy
Bongshin Lee
Steven Drucker
J. Stasko
64
69
0
01 Oct 2021
CPT: Colorful Prompt Tuning for Pre-trained Vision-Language Models
Yuan Yao
Ao Zhang
Zhengyan Zhang
Zhiyuan Liu
Tat-Seng Chua
Maosong Sun
MLLM
VPVLM
VLM
208
221
0
24 Sep 2021
GoG: Relation-aware Graph-over-Graph Network for Visual Dialog
Feilong Chen
Xiuyi Chen
Fandong Meng
Peng Li
Jie Zhou
76
34
0
17 Sep 2021
Knowledge-based Embodied Question Answering
Sinan Tan
Mengmeng Ge
Di Guo
Huaping Liu
F. Sun
30
20
0
16 Sep 2021
Hybrid Reasoning Network for Video-based Commonsense Captioning
Weijiang Yu
Jian Liang
Lei Ji
Lu Li
Yuejian Fang
Nong Xiao
Nan Duan
14
10
0
05 Aug 2021
Chest ImaGenome Dataset for Clinical Reasoning
Joy T. Wu
Nkechinyere N. Agu
Ismini Lourentzou
Arjun Sharma
J. Paguio
...
William Mitchell
Satyananda Kashyap
Andrea Giovannini
Leo Anthony Celi
Mehdi Moradi
21
64
0
31 Jul 2021
Adversarial Reinforced Instruction Attacker for Robust Vision-Language Navigation
Bingqian Lin
Yi Zhu
Yanxin Long
Xiaodan Liang
QiXiang Ye
Liang Lin
AAML
39
16
0
23 Jul 2021
Constructing Multi-Modal Dialogue Dataset by Replacing Text with Semantically Relevant Images
Nyoungwoo Lee
Suwon Shin
Jaegul Choo
Ho‐Jin Choi
S. Myaeng
19
25
0
19 Jul 2021
Evaluating Large Language Models Trained on Code
Mark Chen
Jerry Tworek
Heewoo Jun
Qiming Yuan
Henrique Pondé
...
Bob McGrew
Dario Amodei
Sam McCandlish
Ilya Sutskever
Wojciech Zaremba
ELM
ALM
78
5,082
0
07 Jul 2021
Productivity, Portability, Performance: Data-Centric Python
Yiheng Wang
Yao Zhang
Yanzhang Wang
Yan Wan
Jiao Wang
Zhongyuan Wu
Yuhao Yang
Bowen She
54
94
0
01 Jul 2021
Are VQA Systems RAD? Measuring Robustness to Augmented Data with Focused Interventions
Daniel Rosenberg
Itai Gat
Amir Feder
Roi Reichart
AAML
39
16
0
08 Jun 2021
Worst of Both Worlds: Biases Compound in Pre-trained Vision-and-Language Models
Tejas Srinivasan
Yonatan Bisk
VLM
26
55
0
18 Apr 2021
Action-Based Conversations Dataset: A Corpus for Building More In-Depth Task-Oriented Dialogue Systems
Derek Chen
Howard Chen
Yi Yang
A. Lin
Zhou Yu
17
65
0
01 Apr 2021
Towards General Purpose Vision Systems
Tanmay Gupta
Amita Kamath
Aniruddha Kembhavi
Derek Hoiem
11
50
0
01 Apr 2021
Structured Co-reference Graph Attention for Video-grounded Dialogue
Junyeong Kim
Sunjae Yoon
Dahyun Kim
Chang D. Yoo
23
26
0
24 Mar 2021
What is Multimodality?
Letitia Parcalabescu
Nils Trost
Anette Frank
21
0
0
10 Mar 2021
MultiSubs: A Large-scale Multimodal and Multilingual Dataset
Josiah Wang
Pranava Madhyastha
J. Figueiredo
Chiraag Lala
Lucia Specia
VGen
22
11
0
02 Mar 2021
Previous
1
2
3
4
Next