Hi-SAM: Marrying Segment Anything Model for Hierarchical Text Segmentation

31 January 2024
Maoyuan Ye
Jing Zhang
Juhua Liu
Chenyu Liu
Baocai Yin
Cong Liu
Bo Du
Dacheng Tao
    VLM
arXiv: 2401.17904 | GitHub (271★)

Papers citing "Hi-SAM: Marrying Segment Anything Model for Hierarchical Text Segmentation"

43 / 43 papers shown
MG-Gen: Single Image to Motion Graphics Generation with Layer Decomposition
Takahiro Shirakawa
Tomoyuki Suzuki
Daichi Haraguchi
VGen
95
0
0
03 Apr 2025
A Token-level Text Image Foundation Model for Document Understanding
Tongkun Guan
Zining Wang
Pei Fu
Zhengtao Guo
Wei Shen
...
Chen Duan
Hao Sun
Qianyi Jiang
Junfeng Luo
Xiaokang Yang
VLM
138
2
0
04 Mar 2025
Type-R: Automatically Retouching Typos for Text-to-Image Generation
Wataru Shimoda
Naoto Inoue
Daichi Haraguchi
Hayato Mitani
S. Uchida
Kota Yamaguchi
DiffM
185
0
0
27 Nov 2024
JoyType: A Robust Design for Multilingual Visual Text Creation
Chao Li
Chen Jiang
Xiaolong Liu
Jun Zhao
Guoxin Wang
DiffM
98
7
0
26 Sep 2024
Hierarchical Text Spotter for Joint Text Spotting and Layout Analysis
Shangbang Long
Siyang Qin
Yasuhisa Fujii
Alessandro Bissacco
Michalis Raptis
59
5
0
25 Oct 2023
Matting Anything
Jiacheng Li
Jitesh Jain
Humphrey Shi
VLM
70
18
0
08 Jun 2023
Segment Anything in High Quality
Lei Ke
Mingqiao Ye
Martin Danelljan
Yifan Liu
Yu-Wing Tai
Chi-Keung Tang
Feng Yu
VLM
107
337
0
02 Jun 2023
Matcher: Segment Anything with One Shot Using All-Purpose Feature Matching
Yang Liu
Muzhi Zhu
Hengtao Li
Hao Chen
Xinlong Wang
Chunhua Shen
VLM, MLLM
156
90
0
22 May 2023
Segment and Track Anything
Yangming Cheng
Liulei Li
Yuanyou Xu
Xiaodi Li
Zongxin Yang
Wenguan Wang
Yi Yang
VOS
76
201
0
11 May 2023
Personalize Segment Anything Model with One Shot
Renrui Zhang
Zhengkai Jiang
Ziyu Guo
Shilin Yan
Junting Pan
Xianzheng Ma
Hao Dong
Peng Gao
Hongsheng Li
MLLM, VLM
102
219
0
04 May 2023
DeepSolo: Let Transformer Decoder with Explicit Points Solo for Text Spotting
Maoyuan Ye
Jing Zhang
Shanshan Zhao
Juhua Liu
Tongliang Liu
Bo Du
Dacheng Tao
113
77
0
19 Nov 2022
DPText-DETR: Towards Better Scene Text Detection with Dynamic Points in Transformer
Maoyuan Ye
Jing Zhang
Shanshan Zhao
Juhua Liu
Bo Du
Dacheng Tao
ViT
78
76
0
10 Jul 2022
Arbitrary Shape Text Detection via Boundary Transformer
Shi-Xue Zhang
Chun Yang
Xiaobin Zhu
Xu-Cheng Yin
71
40
0
11 May 2022
Activating More Pixels in Image Super-Resolution Transformer
Xiangyu Chen
Xintao Wang
Jiantao Zhou
Yu Qiao
Chao Dong
ViT
144
634
0
09 May 2022
Vision-Language Pre-Training for Boosting Scene Text Detectors
Sibo Song
Jianqiang Wan
Zhibo Yang
Jun Tang
Wenqing Cheng
Xiang Bai
Cong Yao
VLM
95
24
0
29 Apr 2022
LayoutLMv3: Pre-training for Document AI with Unified Text and Image Masking
Yupan Huang
Tengchao Lv
Lei Cui
Yutong Lu
Furu Wei
92
458
0
18 Apr 2022
Text Spotting Transformers
Xiang Zhang
Yongwen Su
Subarna Tripathi
Zhuowen Tu
ViT
86
95
0
05 Apr 2022
End-to-end Document Recognition and Understanding with Dessurt
Brian L. Davis
B. Morse
Brian L. Price
Chris Tensmeyer
Curtis Wigington
Vlad I. Morariu
VLM, ViT
90
73
0
30 Mar 2022
Towards Weakly-Supervised Text Spotting using a Multi-Task Transformer
Yair Kittenplon
I. Lavi
Sharon Fogel
Yarin Bar
R. Manmatha
Pietro Perona
ViT
48
55
0
11 Feb 2022
DocSegTr: An Instance-Level End-to-End Document Image Segmentation Transformer
Sanket Biswas
Ayan Banerjee
Josep Lladós
Umapada Pal
ViT
76
23
0
27 Jan 2022
Masked Autoencoders Are Scalable Vision Learners
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT, TPM
467
7,819
0
11 Nov 2021
Adaptive Boundary Proposal Network for Arbitrary Shape Text Detection
Shi-Xue Zhang
Xiaobin Zhu
Chun Yang
Hongfa Wang
Xu-Cheng Yin
61
84
0
27 Jul 2021
DocFormer: End-to-End Transformer for Document Understanding
Srikar Appalaraju
Bhavan A. Jasani
Bhargava Urala Kota
Yusheng Xie
R. Manmatha
ViT
88
279
0
22 Jun 2021
SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers
Enze Xie
Wenhai Wang
Zhiding Yu
Anima Anandkumar
J. Álvarez
Ping Luo
ViT
314
5,065
0
31 May 2021
StructuralLM: Structural Pre-training for Form Understanding
Chenliang Li
Bin Bi
Ming Yan
Wei Wang
Songfang Huang
Fei Huang
Luo Si
LMTD, AI4CE
83
134
0
24 May 2021
TextOCR: Towards large-scale end-to-end reasoning for arbitrary-shaped scene text
Amanpreet Singh
Guan Pang
Mandy Toh
Jing Huang
Wojciech Galuba
Tal Hassner
64
174
0
12 May 2021
Fourier Contour Embedding for Arbitrary-Shaped Text Detection
Yiqin Zhu
Jianyong Chen
Lingyu Liang
Zhuanghui Kuang
Lianwen Jin
Wayne Zhang
62
194
0
21 Apr 2021
Simple Copy-Paste is a Strong Data Augmentation Method for Instance Segmentation
Golnaz Ghiasi
Huayu Chen
A. Srinivas
Rui Qian
Nayeon Lee
E. D. Cubuk
Quoc V. Le
Barret Zoph
ISeg
304
992
0
13 Dec 2020
MaX-DeepLab: End-to-End Panoptic Segmentation with Mask Transformers
Huiyu Wang
Yukun Zhu
Hartwig Adam
Alan Yuille
Liang-Chieh Chen
ViT
126
531
0
01 Dec 2020
End-to-End Object Detection with Transformers
Nicolas Carion
Francisco Massa
Gabriel Synnaeve
Nicolas Usunier
Alexander Kirillov
Sergey Zagoruyko
ViT, 3DV, PINN
432
13,094
0
26 May 2020
ABCNet: Real-time Scene Text Spotting with Adaptive Bezier-Curve Network
Yuliang Liu
Hao Chen
Chunhua Shen
Tong He
Lianwen Jin
Liangwei Wang
93
334
0
24 Feb 2020
Real-time Scene Text Detection with Differentiable Binarization
Minghui Liao
Zhaoyi Wan
Cong Yao
Kai Chen
X. Bai
78
681
0
20 Nov 2019
ICDAR 2019 Competition on Large-scale Street View Text with Partial Labeling -- RRC-LSVT
Yipeng Sun
Zihan Ni
Chee-Kheng Chng
Yuliang Liu
Canjie Luo
...
Errui Ding
Jingtuo Liu
Dimosthenis Karatzas
Chee Seng Chan
Lianwen Jin
3DV
100
158
0
17 Sep 2019
ICDAR2019 Robust Reading Challenge on Arbitrary-Shaped Text (RRC-ArT)
Chee-Kheng Chng
Yuliang Liu
Yipeng Sun
Chun Chet Ng
Canjie Luo
...
Errui Ding
Jingtuo Liu
Dimosthenis Karatzas
Chee Seng Chan
Lianwen Jin
3DV
92
215
0
16 Sep 2019
Deep High-Resolution Representation Learning for Visual Recognition
Jingdong Wang
Ke Sun
Tianheng Cheng
Borui Jiang
Chaorui Deng
...
Yadong Mu
Mingkui Tan
Xinggang Wang
Wenyu Liu
Bin Xiao
393
3,627
0
20 Aug 2019
Efficient and Accurate Arbitrary-Shaped Text Detection with Pixel Aggregation Network
Wenhai Wang
Enze Xie
Xiaoge Song
Yuhang Zang
Wenjia Wang
Tong Lu
Gang Yu
Chunhua Shen
62
418
0
16 Aug 2019
Character Region Awareness for Text Detection
Youngmin Baek
Bado Lee
Dongyoon Han
Sangdoo Yun
Hwalsuk Lee
64
784
0
03 Apr 2019
Shape Robust Text Detection with Progressive Scale Expansion Network
Xiang Li
Wenhai Wang
Wenbo Hou
Ruo-Ze Liu
Tong Lu
Jian Yang
92
611
0
07 Jun 2018
Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation
Liang-Chieh Chen
Yukun Zhu
George Papandreou
Florian Schroff
Hartwig Adam
SSeg
480
13,168
0
07 Feb 2018
Total-Text: A Comprehensive Dataset for Scene Text Detection and Recognition
Chee-Kheng Chng
Chee Seng Chan
70
462
0
28 Oct 2017
Mask R-CNN
Kaiming He
Georgia Gkioxari
Piotr Dollár
Ross B. Girshick
ObjD
360
27,244
0
20 Mar 2017
TextBoxes: A Fast Text Detector with a Single Deep Neural Network
Minghui Liao
Baoguang Shi
X. Bai
Xinggang Wang
Wenyu Liu
67
868
0
21 Nov 2016
V-Net: Fully Convolutional Neural Networks for Volumetric Medical Image Segmentation
Fausto Milletari
Nassir Navab
Seyed-Ahmad Ahmadi
235
8,716
0
15 Jun 2016