2406.01210
GeminiFusion: Efficient Pixel-wise Multimodal Fusion for Vision Transformer
3 June 2024
Ding Jia, Jianyuan Guo, Kai Han, Han Wu, Chao Zhang, Chang Xu, Xinghao Chen
ViT
Papers citing
"GeminiFusion: Efficient Pixel-wise Multimodal Fusion for Vision Transformer"
7 / 7 papers shown
Title
Benchmarking Multi-modal Semantic Segmentation under Sensor Failures: Missing and Noisy Modality Robustness
Chenfei Liao, Kaiyu Lei, Xu Zheng, Junha Moon, Zhixiong Wang, Y. Wang, Danda Pani Paudel, Luc Van Gool, Xuming Hu
VLM | 68 | 3 | 0 | 24 Mar 2025

Multimodal-Aware Fusion Network for Referring Remote Sensing Image Segmentation
Leideng Shi, Juan Zhang
48 | 1 | 0 | 14 Mar 2025

Sitcom-Crafter: A Plot-Driven Human Motion Generation System in 3D Scenes
Jianqi Chen, Panwen Hu, Xiaojun Chang, Z. Shi, Michael C. Kampffmeyer, Xiaodan Liang
48 | 4 | 0 | 14 Oct 2024

DFormer: Rethinking RGBD Representation Learning for Semantic Segmentation
Bo Yin, Xuying Zhang, Zhongyu Li, Li Liu, Ming-Ming Cheng, Qibin Hou
24 | 43 | 0 | 18 Sep 2023

Omnivore: A Single Model for Many Visual Modalities
Rohit Girdhar, Mannat Singh, Nikhil Ravi, L. V. D. van der Maaten, Armand Joulin, Ishan Misra
223 | 225 | 0 | 20 Jan 2022

CMT: Convolutional Neural Networks Meet Vision Transformers
Jianyuan Guo, Kai Han, Han Wu, Yehui Tang, Chunjing Xu, Yunhe Wang, Chang Xu
ViT | 349 | 500 | 0 | 13 Jul 2021

Visual Saliency Transformer
Nian Liu, Ni Zhang, Kaiyuan Wan, Ling Shao, Junwei Han
ViT | 253 | 352 | 0 | 25 Apr 2021