DFormerv2: Geometry Self-Attention for RGBD Semantic Segmentation

7 April 2025
Bo Yin
Jiao-Long Cao
Ming-Ming Cheng
Qibin Hou
3DPC, MDE

Papers citing "DFormerv2: Geometry Self-Attention for RGBD Semantic Segmentation"

34 / 34 papers shown
SPT: Sequence Prompt Transformer for Interactive Image Segmentation
Senlin Cheng
Haopeng Sun
VLM
72
3
0
13 Dec 2024
Agent Attention: On the Integration of Softmax and Linear Attention
Dongchen Han
Tianzhu Ye
Yizeng Han
Zhuofan Xia
Siyuan Pan
Pengfei Wan
Shiji Song
Gao Huang
90
83
0
14 Dec 2023
BiFormer: Vision Transformer with Bi-Level Routing Attention
Lei Zhu
Xinjiang Wang
Zhanghan Ke
Wayne Zhang
Rynson W. H. Lau
179
524
0
15 Mar 2023
Delivering Arbitrary-Modal Semantic Segmentation
Jiaming Zhang
R. Liu
Haowen Shi
Kailun Yang
Simon Reiß
Kunyu Peng
Haodong Fu
Kaiwei Wang
Rainer Stiefelhagen
VLM
99
99
0
02 Mar 2023
Conv2Former: A Simple Transformer-Style ConvNet for Visual Recognition
Qibin Hou
Cheng Lu
Ming-Ming Cheng
Jiashi Feng
ViT
109
140
0
22 Nov 2022
SegNeXt: Rethinking Convolutional Attention Design for Semantic Segmentation
Meng-Hao Guo
Chenggang Lu
Qibin Hou
Zheng Liu
Ming-Ming Cheng
Shiyong Hu
SSeg, ViT, VLM
79
653
0
18 Sep 2022
Multimodal Token Fusion for Vision Transformers
Yikai Wang
Xinghao Chen
Lele Cao
Wen-bing Huang
Gang Hua
Yunhe Wang
ViT
87
182
0
19 Apr 2022
Not All Tokens Are Equal: Human-centric Visual Analysis via Token Clustering Transformer
Wang Zeng
Sheng Jin
Wentao Liu
Chao Qian
Ping Luo
Ouyang Wanli
Xiaogang Wang
ViT
86
127
0
19 Apr 2022
Neighborhood Attention Transformer
Ali Hassani
Steven Walton
Jiacheng Li
Shengjia Li
Humphrey Shi
ViT, AI4TS
94
274
0
14 Apr 2022
MultiMAE: Multi-modal Multi-task Masked Autoencoders
Roman Bachmann
David Mizrahi
Andrei Atanov
Amir Zamir
134
278
0
04 Apr 2022
CMX: Cross-Modal Fusion for RGB-X Semantic Segmentation with Transformers
Jiaming Zhang
Huayao Liu
Kailun Yang
Xinxin Hu
Ruiping Liu
Rainer Stiefelhagen
ViT
85
328
0
09 Mar 2022
Omnivore: A Single Model for Many Visual Modalities
Rohit Girdhar
Mannat Singh
Nikhil Ravi
Laurens van der Maaten
Armand Joulin
Ishan Misra
268
237
0
20 Jan 2022
A ConvNet for the 2020s
Zhuang Liu
Hanzi Mao
Chao-Yuan Wu
Christoph Feichtenhofer
Trevor Darrell
Saining Xie
ViT
189
5,226
0
10 Jan 2022
Vision Transformer with Deformable Attention
Zhuofan Xia
Xuran Pan
S. Song
Li Erran Li
Gao Huang
ViT
95
484
0
03 Jan 2022
Masked-attention Mask Transformer for Universal Image Segmentation
Bowen Cheng
Ishan Misra
Alex Schwing
Alexander Kirillov
Rohit Girdhar
ISeg
272
2,385
0
02 Dec 2021
Is Attention Better Than Matrix Decomposition?
Zhengyang Geng
Meng-Hao Guo
Hongxu Chen
Xia Li
Ke Wei
Zhouchen Lin
109
142
0
09 Sep 2021
Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation
Ofir Press
Noah A. Smith
M. Lewis
339
775
0
27 Aug 2021
ShapeConv: Shape-aware Convolutional Layer for Indoor RGB-D Semantic Segmentation
Jinming Cao
Hanchao Leng
Dani Lischinski
Danny Cohen-Or
Changhe Tu
Yangyan Li
SSeg, MDE, 3DV
100
136
0
24 Aug 2021
Per-Pixel Classification is Not All You Need for Semantic Segmentation
Bowen Cheng
Alex Schwing
Alexander Kirillov
VLM, ViT
212
1,551
0
13 Jul 2021
CSWin Transformer: A General Vision Transformer Backbone with Cross-Shaped Windows
Xiaoyi Dong
Jianmin Bao
Dongdong Chen
Weiming Zhang
Nenghai Yu
Lu Yuan
Dong Chen
B. Guo
ViT
154
986
0
01 Jul 2021
Focal Self-attention for Local-Global Interactions in Vision Transformers
Jianwei Yang
Chunyuan Li
Pengchuan Zhang
Xiyang Dai
Bin Xiao
Lu Yuan
Jianfeng Gao
ViT
80
436
0
01 Jul 2021
PVT v2: Improved Baselines with Pyramid Vision Transformer
Wenhai Wang
Enze Xie
Xiang Li
Deng-Ping Fan
Kaitao Song
Ding Liang
Tong Lu
Ping Luo
Ling Shao
ViT, AI4TS
122
1,682
0
25 Jun 2021
Segmenter: Transformer for Semantic Segmentation
Robin Strudel
Ricardo Garcia Pinel
Ivan Laptev
Cordelia Schmid
ViT
215
1,473
0
12 May 2021
Swin Transformer: Hierarchical Vision Transformer using Shifted Windows
Ze Liu
Yutong Lin
Yue Cao
Han Hu
Yixuan Wei
Zheng Zhang
Stephen Lin
B. Guo
ViT
467
21,603
0
25 Mar 2021
Efficient RGB-D Semantic Segmentation for Indoor Scene Analysis
Daniel Seichter
Mona Köhler
Benjamin Lewandowski
Tim Wengefeld
H. Groß
100
223
0
13 Nov 2020
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
Alexey Dosovitskiy
Lucas Beyer
Alexander Kolesnikov
Dirk Weissenborn
Xiaohua Zhai
...
Matthias Minderer
G. Heigold
Sylvain Gelly
Jakob Uszkoreit
N. Houlsby
ViT
679
41,483
0
22 Oct 2020
Bi-directional Cross-Modality Feature Propagation with Separation-and-Aggregation Gate for RGB-D Semantic Segmentation
Xiaokang Chen
Kwan-Yee Lin
Jingbo Wang
Wayne Wu
Chao Qian
Hongsheng Li
Gang Zeng
MDE
94
315
0
17 Jul 2020
Momentum Contrast for Unsupervised Visual Representation Learning
Kaiming He
Haoqi Fan
Yuxin Wu
Saining Xie
Ross B. Girshick
SSL
216
12,136
0
13 Nov 2019
Pattern-Affinitive Propagation across Depth, Surface Normal and Semantic Segmentation
Zhenyu Zhang
Zhen Cui
Chunyan Xu
Yan Yan
N. Sebe
Jian Yang
MDE
70
285
0
08 Jun 2019
ACNet: Attention Based Network to Exploit Complementary Features for RGBD Semantic Segmentation
Xinxin Hu
Kailun Yang
Lei Fei
Kaiwei Wang
3DPC
110
355
0
24 May 2019
Self-Attention with Relative Position Representations
Peter Shaw
Jakob Uszkoreit
Ashish Vaswani
182
2,299
0
06 Mar 2018
Deep Residual Learning for Image Recognition
Kaiming He
Xiangyu Zhang
Shaoqing Ren
Jian Sun
MedIm
2.2K
194,510
0
10 Dec 2015
Salient Object Detection: A Discriminative Regional Feature Integration Approach
Huaizu Jiang
Zejian Yuan
Ming-Ming Cheng
Yihong Gong
Nanning Zheng
Jingdong Wang
FAtt
103
1,270
0
22 Oct 2014
ImageNet Large Scale Visual Recognition Challenge
Olga Russakovsky
Jia Deng
Hao Su
J. Krause
S. Satheesh
...
A. Karpathy
A. Khosla
Michael S. Bernstein
Alexander C. Berg
Li Fei-Fei
VLM, ObjD
1.7K
39,615
0
01 Sep 2014