Masked-attention Mask Transformer for Universal Image Segmentation

2 December 2021
Bowen Cheng
Ishan Misra
Alex Schwing
Alexander Kirillov
Rohit Girdhar
    ISeg

Papers citing "Masked-attention Mask Transformer for Universal Image Segmentation"

50 / 1,408 papers shown
Title
The Revolution of Multimodal Large Language Models: A Survey
Davide Caffagni
Federico Cocchi
Luca Barsellotti
Nicholas Moratelli
Sara Sarto
Lorenzo Baraldi
Lorenzo Baraldi
Marcella Cornia
Rita Cucchiara
LRM VLM
139
64
0
19 Feb 2024
MAL: Motion-Aware Loss with Temporal and Distillation Hints for Self-Supervised Depth Estimation
Yue-Jiang Dong
Fang-Lue Zhang
Song-Hai Zhang
57
1
0
18 Feb 2024
CoLLaVO: Crayon Large Language and Vision mOdel
Byung-Kwan Lee
Beomchan Park
Chae Won Kim
Yonghyun Ro
VLM MLLM
117
18
0
17 Feb 2024
A Decoding Scheme with Successive Aggregation of Multi-Level Features for Light-Weight Semantic Segmentation
Jiwon Yoo
Jangwon Lee
Gyeonghwan Kim
66
0
0
17 Feb 2024
Is Continual Learning Ready for Real-world Challenges?
Theodora Kontogianni
Yuanwen Yue
Siyu Tang
Konrad Schindler
CLL
124
3
0
15 Feb 2024
Open-Vocabulary Segmentation with Unpaired Mask-Text Supervision
Zhaoqing Wang
Xiaobo Xia
Ziye Chen
Xiao He
Yandong Guo
Biwei Huang
Tongliang Liu
VLM
98
13
0
14 Feb 2024
M2fNet: Multi-modal Forest Monitoring Network on Large-scale Virtual Dataset
Yawen Lu
Yunhan Huang
Su Sun
Tansi Zhang
Xuewen Zhang
Songlin Fei
Yingjie Victor Chen
102
5
0
07 Feb 2024
Spatio-temporal Prompting Network for Robust Video Feature Extraction
Guanxiong Sun
Chi Wang
Zhaoyu Zhang
Jiankang Deng
Stefanos Zafeiriou
Yang Hua
ViT
62
4
0
04 Feb 2024
Generalizable Entity Grounding via Assistance of Large Language Model
Lu Qi
Yi-Wen Chen
Lehan Yang
Tiancheng Shen
Xiangtai Li
Weidong Guo
Yu-Syuan Xu
Ming-Hsuan Yang
VLM
133
9
0
04 Feb 2024
Region-Based Representations Revisited
Michal Shlapentokh-Rothman
Ansel Blume
Yao Xiao
Yuqun Wu
TV Sethuraman
Heyi Tao
Jae Yong Lee
Wilfredo Torres
Yu-Xiong Wang
Derek Hoiem
124
12
0
04 Feb 2024
Theoretical Understanding of In-Context Learning in Shallow Transformers with Unstructured Data
Yue Xing
Xiaofeng Lin
Chenheng Xu
Namjoon Suh
Qifan Song
Guang Cheng
105
3
0
01 Feb 2024
Convolution Meets LoRA: Parameter Efficient Finetuning for Segment Anything Model
Zihan Zhong
Zhiqiang Tang
Tong He
Haoyang Fang
Chun Yuan
104
49
0
31 Jan 2024
SAGD: Boundary-Enhanced Segment Anything in 3D Gaussian via Gaussian Decomposition
Xu Hu
Yuxi Wang
Lue Fan
Junsong Fan
Junran Peng
Zhen Lei
Qing Li
Zhaoxiang Zhang
3DGS
173
9
0
31 Jan 2024
Bridging Generative and Discriminative Models for Unified Visual Perception with Diffusion Priors
Shiyin Dong
Mingrui Zhu
Kun Cheng
Nannan Wang
Xinbo Gao
DiffM
45
3
0
29 Jan 2024
GEM: Boost Simple Network for Glass Surface Segmentation via Segment Anything Model and Data Synthesis
Jing Hao
Moyun Liu
Kuo Feng Hung
DiffM
59
2
0
27 Jan 2024
SAM-based instance segmentation models for the automation of structural damage detection
Zehao Ye
Lucy Lovell
A. Faramarzi
Jelena Ninić
96
15
0
27 Jan 2024
SSR: SAM is a Strong Regularizer for domain adaptive semantic segmentation
Yanqi Ge
Ye Huang
Wen Li
Lixin Duan
48
0
0
26 Jan 2024
Rethinking Patch Dependence for Masked Autoencoders
Letian Fu
Long Lian
Renhao Wang
Baifeng Shi
Xudong Wang
Adam Yala
Trevor Darrell
Alexei A. Efros
Ken Goldberg
144
17
0
25 Jan 2024
MUSES: The Multi-Sensor Semantic Perception Dataset for Driving under Uncertainty
Tim Brödermann
David Brüggemann
Daniel Gehrig
Kevin Ta
Odysseas Liagouris
Jason Corkill
Luc Van Gool
100
13
0
23 Jan 2024
EEND-M2F: Masked-attention mask transformers for speaker diarization
Marc Härkönen
Samuel J. Broughton
Lahiru Samarakoon
109
9
0
23 Jan 2024
IRIS: Inverse Rendering of Indoor Scenes from Low Dynamic Range Images
Zhi-Hao Lin
Jia-Bin Huang
Zhengqin Li
Zhao Dong
Christian Richardt
Tuotuo Li
Michael Zollhöfer
Johannes Kopf
Shenlong Wang
Changil Kim
3DV
139
2
0
23 Jan 2024
Detect-Order-Construct: A Tree Construction based Approach for Hierarchical Document Structure Analysis
Jiawei Wang
Kai Hu
Zhuoyao Zhong
Lei-huan Sun
Qiang Huo
74
7
0
22 Jan 2024
Semantic Prompt Learning for Weakly-Supervised Semantic Segmentation
Ci-Siang Lin
Chien-Yi Wang
Yu-Chiang Frank Wang
Min-Hung Chen
VLM
252
0
0
22 Jan 2024
S$^3$M-Net: Joint Learning of Semantic Segmentation and Stereo Matching for Autonomous Driving
Zhiyuan Wu
Yi Feng
Chuangwei Liu
Fisher Yu
Qijun Chen
Rui Fan
111
13
0
21 Jan 2024
Pixel-Wise Recognition for Holistic Surgical Scene Understanding
Nicolás Ayobi
Santiago Rodríguez
Alejandra Pérez
Isabela Hernández
Nicolás Aparicio
...
Sebastián Pena
J. Santander
J. Caicedo
Nicolás Fernández
Pablo Arbelaez
ViT MedIm
95
13
0
20 Jan 2024
Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data
Lihe Yang
Bingyi Kang
Zilong Huang
Xiaogang Xu
Jiashi Feng
Hengshuang Zhao
VLM
278
826
0
19 Jan 2024
Sat2Scene: 3D Urban Scene Generation from Satellite Images with Diffusion
Zuoyue Li
Zhenqiang Li
Zhaopeng Cui
Marc Pollefeys
Martin R. Oswald
98
16
0
19 Jan 2024
Symbol as Points: Panoptic Symbol Spotting via Point-based Representation
Wenlong Liu
Tianyu Yang
Yuhan Wang
Qizhi Yu
Lei Zhang
3DPC
58
6
0
19 Jan 2024
OMG-Seg: Is One Model Good Enough For All Segmentation?
Xiangtai Li
Haobo Yuan
Wei Li
Henghui Ding
Size Wu
Wenwei Zhang
Yining Li
Kai Chen
Chen Change Loy
VLM MLLM ViT
150
64
0
18 Jan 2024
A Simple Latent Diffusion Approach for Panoptic Segmentation and Mask Inpainting
Wouter Van Gansbeke
Bert De Brabandere
DiffM
128
11
0
18 Jan 2024
Supervised Fine-tuning in turn Improves Visual Foundation Models
Xiaohu Jiang
Yixiao Ge
Yuying Ge
Dachuan Shi
Chun Yuan
Ying Shan
VLM CLIP
94
9
0
18 Jan 2024
Image Translation as Diffusion Visual Programmers
Cheng Han
James Liang
Qifan Wang
Majid Rabbani
S. Dianat
Raghuveer M. Rao
Ying Nian Wu
Dongfang Liu
79
8
0
18 Jan 2024
Instance Brownian Bridge as Texts for Open-vocabulary Video Instance Segmentation
Ze-Long Cheng
Kehan Li
Hao Li
Peng Jin
Chang Liu
Xiawu Zheng
Rongrong Ji
Jie Chen
VOS
91
2
0
18 Jan 2024
Dynamic Relation Transformer for Contextual Text Block Detection
Jiawei Wang
Shunchi Zhang
Kai Hu
Chixiang Ma
Zhuoyao Zhong
Lei-huan Sun
Qiang Huo
58
0
0
17 Jan 2024
MaskClustering: View Consensus based Mask Graph Clustering for Open-Vocabulary 3D Instance Segmentation
Mi Yan
Jiazhao Zhang
Yan Zhu
Hongan Wang
3DV ISeg
97
29
0
15 Jan 2024
MapNeXt: Revisiting Training and Scaling Practices for Online Vectorized HD Map Construction
Toyota Li
69
6
0
14 Jan 2024
Scalable 3D Panoptic Segmentation As Superpoint Graph Clustering
Damien Robert
Hugo Raguet
Loic Landrieu
75
13
0
12 Jan 2024
Hyper-STTN: Social Group-aware Spatial-Temporal Transformer Network for Human Trajectory Prediction with Hypergraph Reasoning
Weizheng Wang
Le Mao
Baijian Yang
Guohua Chen
Byung-Cheol Min
ViT HAI
85
3
0
12 Jan 2024
Efficient Deformable ConvNets: Rethinking Dynamic and Sparse Operator for Vision Applications
Yuwen Xiong
Zhiqi Li
Yuntao Chen
Feng Wang
Xizhou Zhu
...
Hongsheng Li
Yu Qiao
Lewei Lu
Jie Zhou
Jifeng Dai
69
63
0
11 Jan 2024
Distribution-aware Interactive Attention Network and Large-scale Cloud Recognition Benchmark on FY-4A Satellite Image
Jiaqing Zhang
Jie Lei
Weiying Xie
Kai Jiang
Mingxiang Cao
Yunsong Li
55
3
0
06 Jan 2024
Open-Vocabulary SAM: Segment and Recognize Twenty-thousand Classes Interactively
Haobo Yuan
Xiangtai Li
Chong Zhou
Yining Li
Kai Chen
Chen Change Loy
VLM
118
51
0
05 Jan 2024
ODIN: A Single Model for 2D and 3D Segmentation
Ayush Jain
Pushkal Katara
N. Gkanatsios
Adam W. Harley
Gabriel H. Sarch
Kriti Aggarwal
Vishrav Chaudhary
Katerina Fragkiadaki
3DPC
121
9
0
04 Jan 2024
Towards Robust Semantic Segmentation against Patch-based Attack via Attention Refinement
Zheng Yuan
Jie Zhang
Yude Wang
Shiguang Shan
Xilin Chen
AAML
135
1
0
03 Jan 2024
FullLoRA: Efficiently Boosting the Robustness of Pretrained Vision Transformers
Zheng Yuan
Jie Zhang
Shiguang Shan
Xilin Chen
110
4
0
03 Jan 2024
PROMPT-IML: Image Manipulation Localization with Pre-trained Foundation Models Through Prompt Tuning
Xuntao Liu
Yuzhou Yang
Qichao Ying
Zhenxing Qian
Xinpeng Zhang
Sheng Li
VLM
68
4
0
01 Jan 2024
WoodScape Motion Segmentation for Autonomous Driving -- CVPR 2023 OmniCV Workshop Challenge
Saravanabalagi Ramachandran
Nathaniel Cibik
Ganesh Sistu
John L McDonald
103
0
0
31 Dec 2023
SAR-RARP50: Segmentation of surgical instrumentation and Action Recognition on Robot-Assisted Radical Prostatectomy Challenge
Dimitrios Psychogyios
Emanuele Colleoni
Beatrice van Amsterdam
Chih-Yang Li
Shu-Yu Huang
...
Santiago Rodriguez
Juanita Puentes
Pablo Arbelaez
Omid Mohareri
Danail Stoyanov
76
28
0
31 Dec 2023
Analyzing Local Representations of Self-supervised Vision Transformers
Ani Vanyan
Alvard Barseghyan
Hakob Tamazyan
Vahan Huroyan
Hrant Khachatrian
Martin Danelljan
112
3
0
31 Dec 2023
Generalizing Single-View 3D Shape Retrieval to Occlusions and Unseen Objects
Qirui Wu
Daniel E. Ritchie
Manolis Savva
Angel X. Chang
3DPC
69
3
0
31 Dec 2023
PlanarNeRF: Online Learning of Planar Primitives with Neural Radiance Fields
Zheng Chen
Qingan Yan
Huangying Zhan
Changjiang Cai
Xiangyu Xu
Yuzhong Huang
Weihan Wang
Ziyue Feng
Lantao Liu
Yi Tian Xu
3DV
126
3
0
30 Dec 2023