2203.04838
CMX: Cross-Modal Fusion for RGB-X Semantic Segmentation with Transformers
9 March 2022
Jiaming Zhang
Huayao Liu
Kailun Yang
Xinxin Hu
Ruiping Liu
Rainer Stiefelhagen
ViT
Papers citing
"CMX: Cross-Modal Fusion for RGB-X Semantic Segmentation with Transformers"
46 / 46 papers shown
Title
Boosting Cross-spectral Unsupervised Domain Adaptation for Thermal Semantic Segmentation
Seokjun Kwon
Jeongmin Shin
Namil Kim
Soonmin Hwang
Yukyung Choi
31
0
0
11 May 2025
Depth-Sensitive Soft Suppression with RGB-D Inter-Modal Stylization Flow for Domain Generalization Semantic Segmentation
Binbin Wei
Yuhang Zhang
Shishun Tian
Muxin Liao
Wei Li
Wenbin Zou
MDE
36
0
0
11 May 2025
Reducing Unimodal Bias in Multi-Modal Semantic Segmentation with Multi-Scale Functional Entropy Regularization
Xu Zheng
Yuanhuiyi Lyu
Lutao Jiang
Danda Pani Paudel
Luc Van Gool
Xuming Hu
29
0
0
10 May 2025
Segment Any RGB-Thermal Model with Language-aided Distillation
Dong Xing
Xianxun Zhu
Wei Zhou
Qika Lin
Hang Yang
Yuqing Wang
VLM
61
0
0
04 May 2025
HDBFormer: Efficient RGB-D Semantic Segmentation with A Heterogeneous Dual-Branch Framework
Shuobin Wei
Zhuang Zhou
Zhengan Lu
Zizhao Yuan
Binghua Su
MDE
47
0
0
18 Apr 2025
ControlFusion: A Controllable Image Fusion Framework with Language-Vision Degradation Prompts
Linfeng Tang
Yeda Wang
Z. Cai
Junjun Jiang
Jiayi Ma
45
1
0
30 Mar 2025
MemorySAM: Memorize Modalities and Semantics with Segment Anything Model 2 for Multi-modal Semantic Segmentation
Chenfei Liao
Xu Zheng
Yuanhuiyi Lyu
Haiwei Xue
Yihong Cao
Jiawen Wang
Kailun Yang
Xuming Hu
VLM
63
4
0
09 Mar 2025
3D-Grounded Vision-Language Framework for Robotic Task Planning: Automated Prompt Synthesis and Supervised Reasoning
Guoqin Tang
Qingxuan Jia
Zeyuan Huang
Gang Chen
Ning Ji
Zhipeng Yao
66
0
0
13 Feb 2025
Rethinking Early-Fusion Strategies for Improved Multimodal Image Segmentation
Zhengwen Shen
Yulian Li
Han Zhang
Yuchen Weng
Jun Wang
37
0
0
19 Jan 2025
Learning Motion and Temporal Cues for Unsupervised Video Object Segmentation
Yunzhi Zhuge
Hongyu Gu
Lu Zhang
Jinqing Qi
Huchuan Lu
VOS
69
2
0
14 Jan 2025
IRFusionFormer: Enhancing Pavement Crack Segmentation with RGB-T Fusion and Topological-Based Loss
Ruiqiang Xiao
Xiaohu Chen
34
0
0
31 Dec 2024
IV-tuning: Parameter-Efficient Transfer Learning for Infrared-Visible Tasks
Yaming Zhang
Chenqiang Gao
Fangcen Liu
Junjie Guo
Lan Wang
Xinggan Peng
Deyu Meng
106
0
0
21 Dec 2024
Part-Whole Relational Fusion Towards Multi-Modal Scene Understanding
Yi Liu
Chengxin Li
Shoukun Xu
J. Han
ViT
42
2
0
19 Oct 2024
Order-aware Interactive Segmentation
Bin Wang
Anwesa Choudhuri
Meng Zheng
Zhongpai Gao
Benjamin Planche
Andong Deng
Qin Liu
Terrence Chen
Ulas Bagci
Ziyan Wu
VLM
146
1
0
16 Oct 2024
Sitcom-Crafter: A Plot-Driven Human Motion Generation System in 3D Scenes
Jianqi Chen
Panwen Hu
Xiaojun Chang
Z. Shi
Michael C. Kampffmeyer
Xiaodan Liang
48
4
0
14 Oct 2024
IVGF: The Fusion-Guided Infrared and Visible General Framework
Fangcen Liu
Chenqiang Gao
Fang Chen
Pengcheng Li
Junjie Guo
Deyu Meng
33
0
0
02 Sep 2024
CSFNet: A Cosine Similarity Fusion Network for Real-Time RGB-X Semantic Segmentation of Driving Scenes
Danial Qashqai
Emad Mousavian
S. B. Shokouhi
S. Mirzakuchaki
48
0
0
01 Jul 2024
GIM: A Million-scale Benchmark for Generative Image Manipulation Detection and Localization
Y. Chen
X. Huang
Quan Zhang
Wei Li
Mingjian Zhu
...
Hanting Chen
Hailin Hu
J. Yang
Wei Liu
Jie Hu
EGVM
61
1
0
24 Jun 2024
OmniBind: Teach to Build Unequal-Scale Modality Interaction for Omni-Bind of All
Yuanhuiyi Lyu
Xueye Zheng
Dahun Kim
Lin Wang
51
13
0
25 May 2024
LIX: Implicitly Infusing Spatial Geometric Prior Knowledge into Visual Semantic Segmentation for Autonomous Driving
Sicen Guo
Zhiyuan Wu
Qijun Chen
Ioannis Pitas
Rui Fan
37
1
0
13 Mar 2024
RABBIT: A Robot-Assisted Bed Bathing System with Multimodal Perception and Integrated Compliance
Rishabh Madan
Skyler Valdez
David Kim
Sujie Fang
Luoyan Zhong
Diego Virtue
T. Bhattacharjee
28
14
0
26 Jan 2024
Unleashing the Power of CNN and Transformer for Balanced RGB-Event Video Recognition
Tianlin Li
Yao Rong
Shiao Wang
Yuan Chen
Zhe Wu
Bowei Jiang
Yonghong Tian
Jin Tang
ViT
81
3
0
18 Dec 2023
LABELMAKER: Automatic Semantic Label Generation from RGB-D Trajectories
Silvan Weder
Hermann Blum
Francis Engelmann
Marc Pollefeys
VLM
19
11
0
20 Nov 2023
PolyMaX: General Dense Prediction with Mask Transformer
Xuan S. Yang
Liangzhe Yuan
Kimberly Wilber
Astuti Sharma
Xiuye Gu
...
Stephanie Debats
Huisheng Wang
Hartwig Adam
Mikhail Sirotenko
Liang-Chieh Chen
28
14
0
09 Nov 2023
OmniVec: Learning robust representations with cross modal sharing
Siddharth Srivastava
Gaurav Sharma
SSL
27
64
0
07 Nov 2023
GAMUS: A Geometry-aware Multi-modal Semantic Segmentation Benchmark for Remote Sensing Data
Zhitong Xiong
Sining Chen
Yi Wang
Lichao Mou
Xiao Xiang Zhu
26
4
0
24 May 2023
Impact of Pseudo Depth on Open World Object Segmentation with Minimal User Guidance
Robin Schon
K. Ludwig
Rainer Lienhart
VLM
MDE
41
2
0
12 Apr 2023
Breaking Modality Disparity: Harmonized Representation for Infrared and Visible Image Registration
Zhiying Jiang
Zengxi Zhang
Jinyuan Liu
Xin-Yue Fan
Risheng Liu
27
2
0
12 Apr 2023
A Neuromorphic Dataset for Object Segmentation in Indoor Cluttered Environment
Xiaoqian Huang
Sanket Kachole
Abdulla Ayyad
F. B. Naeini
Dimitrios Makris
Yahya Zweiri
3DV
3DPC
22
9
0
13 Feb 2023
Lightweight integration of 3D features to improve 2D image segmentation
Olivier Pradelle
R. Chaine
D. Wendland
Julie Digne
3DV
3DPC
30
2
0
16 Dec 2022
InternImage: Exploring Large-Scale Vision Foundation Models with Deformable Convolutions
Wenhai Wang
Jifeng Dai
Zhe Chen
Zhenhang Huang
Zhiqi Li
...
Tong Lu
Lewei Lu
Hongsheng Li
Xiaogang Wang
Yu Qiao
VLM
36
657
0
10 Nov 2022
DepthFormer: Multimodal Positional Encodings and Cross-Input Attention for Transformer-Based Segmentation Networks
F. Barbato
Giulia Rizzoli
Pietro Zanuttigh
MDE
ViT
28
4
0
08 Nov 2022
Early or Late Fusion Matters: Efficient RGB-D Fusion in Vision Transformers for 3D Object Recognition
Georgios Tziafas
H. Kasaei
ViT
40
10
0
03 Oct 2022
HRFuser: A Multi-resolution Sensor Fusion Architecture for 2D Object Detection
Tim Broedermann
Christos Sakaridis
Dengxin Dai
Luc Van Gool
52
31
0
30 Jun 2022
Semantic Segmentation by Early Region Proxy
Yifan Zhang
Bo Pang
Cewu Lu
ViT
49
29
0
26 Mar 2022
UniFormer: Unifying Convolution and Self-attention for Visual Recognition
Kunchang Li
Yali Wang
Junhao Zhang
Peng Gao
Guanglu Song
Yu Liu
Hongsheng Li
Yu Qiao
ViT
153
361
0
24 Jan 2022
Omnivore: A Single Model for Many Visual Modalities
Rohit Girdhar
Mannat Singh
Nikhil Ravi
L. van der Maaten
Armand Joulin
Ishan Misra
223
225
0
20 Jan 2022
CAVER: Cross-Modal View-Mixed Transformer for Bi-Modal Salient Object Detection
Youwei Pang
Xiaoqi Zhao
Lihe Zhang
Huchuan Lu
39
91
0
04 Dec 2021
Channel Exchanging Networks for Multimodal and Multitask Dense Image Prediction
Yikai Wang
Gang Hua
Wenbing Huang
Fengxiang He
Dacheng Tao
54
28
0
04 Dec 2021
ConvMLP: Hierarchical Convolutional MLPs for Vision
Jiachen Li
Ali Hassani
Steven Walton
Humphrey Shi
43
55
0
09 Sep 2021
Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions
Wenhai Wang
Enze Xie
Xiang Li
Deng-Ping Fan
Kaitao Song
Ding Liang
Tong Lu
Ping Luo
Ling Shao
ViT
277
3,623
0
24 Feb 2021
Trear: Transformer-based RGB-D Egocentric Action Recognition
Xiangyu Li
Yonghong Hou
Pichao Wang
Zhimin Gao
Mingliang Xu
Wanqing Li
ViT
188
88
0
05 Jan 2021
Boundary-Aware Feature Propagation for Scene Segmentation
Henghui Ding
Xudong Jiang
A. Liu
N. Magnenat-Thalmann
G. Wang
137
255
0
31 Aug 2019
Deep High-Resolution Representation Learning for Visual Recognition
Jingdong Wang
Ke Sun
Tianheng Cheng
Borui Jiang
Chaorui Deng
...
Yadong Mu
Mingkui Tan
Xinggang Wang
Wenyu Liu
Bin Xiao
195
3,531
0
20 Aug 2019
Joint 2D-3D-Semantic Data for Indoor Scene Understanding
Iro Armeni
S. Sax
Amir Zamir
Silvio Savarese
3DV
3DPC
115
876
0
03 Feb 2017
SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation
Vijay Badrinarayanan
Alex Kendall
R. Cipolla
SSeg
446
15,639
0
02 Nov 2015