ResearchTrend.AI
Masked-attention Mask Transformer for Universal Image Segmentation

2 December 2021
Bowen Cheng
Ishan Misra
Alex Schwing
Alexander Kirillov
Rohit Girdhar
    ISeg

Papers citing "Masked-attention Mask Transformer for Universal Image Segmentation"

50 / 1,408 papers shown
CAM-Seg: A Continuous-valued Embedding Approach for Semantic Image Generation
Masud Ahmed
Zahid Hasan
Syed Arefinul Haque
A. Faridee
S. Purushotham
Suya You
Nirmalya Roy
186
0
0
19 Mar 2025
EgoDTM: Towards 3D-Aware Egocentric Video-Language Pretraining
Boshen Xu
Yuting Mei
Xinbi Liu
Sipeng Zheng
Qin Jin
VLM, MDE
108
0
0
19 Mar 2025
Universal Scene Graph Generation
Shengqiong Wu
Hao Fei
Tat-Seng Chua
139
0
0
19 Mar 2025
Rethinking End-to-End 2D to 3D Scene Segmentation in Gaussian Splatting
Runsong Zhu
Shi Qiu
Zhengzhe Liu
Ka-Hei Hui
Qianyi Wu
Pheng Ann Heng
Chi-Wing Fu
3DGS, 3DV
128
2
0
18 Mar 2025
The Power of Context: How Multimodality Improves Image Super-Resolution
Kangfu Mei
Hossein Talebi
Mojtaba Ardakani
Vishal M. Patel
P. Milanfar
M. Delbracio
DiffM
126
2
0
18 Mar 2025
HiMTok: Learning Hierarchical Mask Tokens for Image Segmentation with Large Multimodal Model
Tao Wang
Changxu Cheng
Lingfeng Wang
Senda Chen
Wuyue Zhao
VLM
100
1
0
17 Mar 2025
Dynamic Derivation and Elimination: Audio Visual Segmentation with Enhanced Audio Semantics
Chen Liu
Liying Yang
Peike Li
Dadong Wang
Lincheng Li
Xin Yu
VOS
157
0
0
17 Mar 2025
Shape Bias and Robustness Evaluation via Cue Decomposition for Image Classification and Segmentation
Edgar Heinert
Thomas Gottwald
Annika Mütze
Matthias Rottmann
147
0
0
16 Mar 2025
MTGS: Multi-Traversal Gaussian Splatting
Tianyu Li
Yihang Qiu
Zhenhua Wu
Carl Lindström
Peng Su
Matthias Nießner
Hongyang Li
3DGS
270
2
0
16 Mar 2025
Logic-RAG: Augmenting Large Multimodal Models with Visual-Spatial Knowledge for Road Scene Understanding
Imran Kabir
Md. Alimoor Reza
Syed Masum Billah
ReLM, VLM, LRM
115
1
0
16 Mar 2025
E-SAM: Training-Free Segment Every Entity Model
Weiming Zhang
Dingwen Xiao
Lei Chen
Lin Wang
VLM
81
1
0
15 Mar 2025
SpaceSeg: A High-Precision Intelligent Perception Segmentation Method for Multi-Spacecraft On-Orbit Targets
Hao Liu
Pengyu Guo
Siyuan Yang
Zeqing Jiang
Qinglei Hu
Dongyu Li
60
0
0
14 Mar 2025
VGGT: Visual Geometry Grounded Transformer
Jianyuan Wang
Minghao Chen
Nikita Karaev
Andrea Vedaldi
Christian Rupprecht
David Novotny
ViT
133
38
0
14 Mar 2025
Human-in-the-Loop Local Corrections of 3D Scene Layouts via Infilling
Christopher Xie
A. Avetisyan
Henry Howard-Jenkins
Yawar Siddiqui
Julian Straub
Richard Newcombe
Vasileios Balntas
Jakob Julian Engel
3DH, 3DV
157
0
0
14 Mar 2025
Learning Appearance and Motion Cues for Panoptic Tracking
Juana Valeria Hurtado
Sajad Marvi
Rohit Mohan
Abhinav Valada
112
0
0
12 Mar 2025
Hybrid Rendering for Multimodal Autonomous Driving: Merging Neural and Physics-Based Simulation
Máté Tóth
Péter Kovács
Zoltán Bendefy
Zoltán Hortsin
Balázs Teréki
Tamás Matuszka
3DGS, AI4CE
439
0
0
12 Mar 2025
TrackOcc: Camera-based 4D Panoptic Occupancy Tracking
Zhuoguang Chen
Kenan Li
Xiuyu Yang
Tao Jiang
Yongqian Li
Hang Zhao
125
0
0
11 Mar 2025
SAS: Segment Any 3D Scene with Integrated 2D Priors
Zechao Li
Jiahao Lu
Jiacheng Deng
Hanzhi Chang
Lifan Wu
Yanzhe Liang
Tianzhu Zhang
112
0
0
11 Mar 2025
3D Medical Imaging Segmentation on Non-Contrast CT
Canxuan Gang
Yuhan Peng
111
0
0
11 Mar 2025
DiffEGG: Diffusion-Driven Edge Generation as a Pixel-Annotation-Free Alternative for Instance Annotation
Sanghyun Jo
Ziseok Lee
Wooyeol Lee
Kyungsu Kim
131
2
0
11 Mar 2025
MaskAttn-UNet: A Mask Attention-Driven Framework for Universal Low-Resolution Image Segmentation
Anzhe Cheng
Chenzhong Yin
Yu Chang
Heng Ping
Shixuan Li
Shahin Nazarian
Paul Bogdan
SSeg
285
0
0
11 Mar 2025
From Slices to Sequences: Autoregressive Tracking Transformer for Cohesive and Consistent 3D Lymph Node Detection in CT Scans
Qinji Yu
Yirui Wang
K. Yan
Dandan Zheng
Dashan Ai
...
N. Shen
Xiaowei Ding
Le Lu
X. Ye
Dakai Jin
ViT, MedIm
183
0
0
11 Mar 2025
VRMDiff: Text-Guided Video Referring Matting Generation of Diffusion
Lehan Yang
Jincen Song
Tianlong Wang
Daiqing Qi
Weili Shi
Yuheng Liu
Sheng Li
DiffM, VOS, VGen
131
0
0
11 Mar 2025
Seeing and Reasoning with Confidence: Supercharging Multimodal LLMs with an Uncertainty-Aware Agentic Framework
Zhuo Zhi
Chen Feng
Adam Daneshmend
Mine Orlu
Andreas Demosthenous
L. Yin
Da Li
Ziquan Liu
Miguel R. D. Rodrigues
LRM
123
1
0
11 Mar 2025
MegaSR: Mining Customized Semantics and Expressive Guidance for Image Super-Resolution
Xiaochen Li
Jianlong Wu
Xinchuan Huang
C. L. Philip Chen
Weili Guan
Xian-Sheng Hua
Liqiang Nie
DiffM
81
0
0
11 Mar 2025
FastInstShadow: A Simple Query-Based Model for Instance Shadow Detection
Takeru Inoue
Ryusuke Miyamoto
71
0
0
10 Mar 2025
Dynamic Dictionary Learning for Remote Sensing Image Segmentation
Xuechao Zou
Yue Li
Shun Zhang
Kai Li
Shiying Wang
Pin Tao
Junliang Xing
Congyan Lang
90
0
0
09 Mar 2025
Is Pre-training Applicable to the Decoder for Dense Prediction?
Chao Ning
Wanshui Gan
Weihao Xuan
Naoto Yokoya
281
0
0
05 Mar 2025
Golden Cudgel Network for Real-Time Semantic Segmentation
Guoyu Yang
Yuan Wang
Daming Shi
Yanjie Wang
67
0
0
05 Mar 2025
COARSE: Collaborative Pseudo-Labeling with Coarse Real Labels for Off-Road Semantic Segmentation
Aurelio Noca
Xianmei Lei
Jonathan Becktor
J. Edlund
Anna Sabel
Patrick Spieler
Curtis Padgett
Alexandre Alahi
Deegan Atha
151
0
0
05 Mar 2025
Out-of-Distribution Segmentation in Autonomous Driving: Problems and State of the Art
Youssef Shoeb
Azarm Nowzad
Hanno Gottschalk
UQCV
284
2
0
04 Mar 2025
Boltzmann Attention Sampling for Image Analysis with Small Objects
Theodore Zhao
Sid Kiblawi
Naoto Usuyama
Ho Hin Lee
Sam Preston
Hoifung Poon
Mu-Hsin Wei
MedIm
205
0
0
04 Mar 2025
Object-Aware Video Matting with Cross-Frame Guidance
Han Zhang
Dongyue Wu
Yuanjie Shao
Nong Sang
Changxin Gao
VOS
114
0
0
03 Mar 2025
One-shot In-context Part Segmentation
Zhenqi Dai
Ting Liu
Xinyu Zhang
Y. X. Wei
Yanning Zhang
VLM
176
1
0
03 Mar 2025
UFO: A Unified Approach to Fine-grained Visual Perception via Open-ended Language Interface
Hao Tang
Chenwei Xie
Haiyang Wang
Xiaoyi Bao
Tingyu Weng
Pandeng Li
Yun Zheng
Liwei Wang
ObjD, VLM
134
1
0
03 Mar 2025
IteRPrimE: Zero-shot Referring Image Segmentation with Iterative Grad-CAM Refinement and Primary Word Emphasis
Yun Wang
Jingchen Ni
Yong-Jin Liu
Chun Yuan
Yansong Tang
96
4
0
02 Mar 2025
Training-Free Dataset Pruning for Instance Segmentation
Yalun Dai
Lingao Xiao
Ivor W. Tsang
Yang He
ISeg
107
2
0
02 Mar 2025
QORT-Former: Query-optimized Real-time Transformer for Understanding Two Hands Manipulating Objects
Elkhan Ismayilzada
MD Khalequzzaman Chowdhury Sayem
Yihalem Yimolal Tiruneh
Mubarrat Chowdhury
Muhammadjon Boboev
Seungryul Baek
ViT
132
1
0
27 Feb 2025
R2-T2: Re-Routing in Test-Time for Multimodal Mixture-of-Experts
Zhongyang Li
Ziyue Li
Dinesh Manocha
MoE
148
0
0
27 Feb 2025
Open-Vocabulary Semantic Part Segmentation of 3D Human
Keito Suzuki
Bang Du
Girish Krishnan
Kunyao Chen
Runfa Li
Truong Thao Nguyen
3DH, VLM
152
0
0
27 Feb 2025
Knowledge Distillation for Semantic Segmentation: A Label Space Unification Approach
Anton Backhaus
Thorsten Luettel
Mirko Maehlisch
99
0
0
26 Feb 2025
A Lightweight and Extensible Cell Segmentation and Classification Model for Whole Slide Images
N. Shvetsov
T. Kilvaer
M. Tafavvoghi
Anders Sildnes
Kajsa Møllersen
Lill-Tove Rasmussen Busund
L. A. Bongo
VLM
131
1
0
26 Feb 2025
Enhancing Image Matting in Real-World Scenes with Mask-Guided Iterative Refinement
Rui Liu
72
0
0
24 Feb 2025
Vision-LSTM: xLSTM as Generic Vision Backbone
Benedikt Alkin
M. Beck
Korbinian Poppel
Sepp Hochreiter
Johannes Brandstetter
VLM
235
49
0
24 Feb 2025
CLIMB-3D: Continual Learning for Imbalanced 3D Instance Segmentation
Vishal Thengane
Jean Lahoud
Hisham Cholakkal
Rao Muhammad Anwer
L. Yin
Xiatian Zhu
Salman Khan
CLL
491
0
0
24 Feb 2025
NPSim: Nighttime Photorealistic Simulation From Daytime Images With Monocular Inverse Rendering and Ray Tracing
Shutong Zhang
76
0
0
15 Feb 2025
A Survey on Mamba Architecture for Vision Applications
Fady Ibrahim
Guangjun Liu
Guanghui Wang
Mamba
166
3
0
11 Feb 2025
Fully Exploiting Vision Foundation Model's Profound Prior Knowledge for Generalizable RGB-Depth Driving Scene Parsing
Sicen Guo
Tianyou Wen
Chuang-Wei Liu
Qijun Chen
Rui Fan
105
0
0
10 Feb 2025
A Novel Convolutional-Free Method for 3D Medical Imaging Segmentation
Canxuan Gang
MedIm, ViT
89
0
0
08 Feb 2025
Beyond the Final Layer: Hierarchical Query Fusion Transformer with Agent-Interpolation Initialization for 3D Instance Segmentation
Jiahao Lu
Jiacheng Deng
Tianzhu Zhang
150
2
0
06 Feb 2025