Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2402.15933
Cited By
Bridging the Gap between 2D and 3D Visual Question Answering: A Fusion Approach for 3D VQA
24 February 2024
Wentao Mo
Yang Liu
Re-assign community
ArXiv (abs)
PDF
HTML
Github (21★)
Papers citing
"Bridging the Gap between 2D and 3D Visual Question Answering: A Fusion Approach for 3D VQA"
9 / 9 papers shown
Title
Multi-CLIP: Contrastive Vision-Language Pre-training for Question Answering tasks in 3D Scenes
Alexandros Delitzas
Maria Parelli
Nikolas Hars
G. Vlassis
Sotiris Anagnostidis
Gregor Bachmann
Thomas Hofmann
CLIP
45
20
0
04 Jun 2023
SQA3D: Situated Question Answering in 3D Scenes
Xiaojian Ma
Silong Yong
Zilong Zheng
Qing Li
Yitao Liang
Song-Chun Zhu
Siyuan Huang
LM&Ro
72
158
0
14 Oct 2022
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Junnan Li
Dongxu Li
Caiming Xiong
Guosheng Lin
MLLM
BDL
VLM
CLIP
555
4,409
0
28 Jan 2022
SimVLM: Simple Visual Language Model Pretraining with Weak Supervision
Zirui Wang
Jiahui Yu
Adams Wei Yu
Zihang Dai
Yulia Tsvetkov
Yuan Cao
VLM
MLLM
136
799
0
24 Aug 2021
Oscar: Object-Semantics Aligned Pre-training for Vision-Language Tasks
Xiujun Li
Xi Yin
Chunyuan Li
Pengchuan Zhang
Xiaowei Hu
...
Houdong Hu
Li Dong
Furu Wei
Yejin Choi
Jianfeng Gao
VLM
132
1,944
0
13 Apr 2020
VQA-LOL: Visual Question Answering under the Lens of Logic
Tejas Gokhale
Pratyay Banerjee
Chitta Baral
Yezhou Yang
CoGe
55
75
0
19 Feb 2020
LXMERT: Learning Cross-Modality Encoder Representations from Transformers
Hao Hao Tan
Joey Tianyi Zhou
VLM
MLLM
250
2,488
0
20 Aug 2019
Deep Modular Co-Attention Networks for Visual Question Answering
Zhou Yu
Jun Yu
Yuhao Cui
Dacheng Tao
Q. Tian
87
808
0
25 Jun 2019
Deep Hough Voting for 3D Object Detection in Point Clouds
C. Qi
Or Litany
Kaiming He
Leonidas Guibas
3DPC
108
1,290
0
21 Apr 2019
1