ResearchTrend.AI
DeepSolo: Let Transformer Decoder with Explicit Points Solo for Text Spotting

19 November 2022
Maoyuan Ye
Jing Zhang
Shanshan Zhao
Juhua Liu
Tongliang Liu
Bo Du
Dacheng Tao

Papers citing "DeepSolo: Let Transformer Decoder with Explicit Points Solo for Text Spotting"

46 / 46 papers shown
SemiETS: Integrating Spatial and Content Consistencies for Semi-Supervised End-to-end Text Spotting
Dongliang Luo
Hanshen Zhu
Ziyang Zhang
Dingkang Liang
Xudong Xie
Y. Liu
Xiang Bai
VLM
39
0
0
14 Apr 2025
Edge Approximation Text Detector
Chuang Yang
Xu Han
T. Han
Han Han
Bingxuan Zhao
Qi Wang
43
0
0
05 Apr 2025
A Context-Driven Training-Free Network for Lightweight Scene Text Segmentation and Recognition
Ritabrata Chakraborty
Shivakumara Palaiahnakote
Umapada Pal
Cheng-Lin Liu
VLM
47
0
0
19 Mar 2025
OmniParser V2: Structured-Points-of-Thought for Unified Visual Text Parsing and Its Generality to Multimodal Large Language Models
Wenwen Yu
Zhibo Yang
Jianqiang Wan
Sibo Song
J. Tang
Wenqing Cheng
Y. Liu
Xiang Bai
51
1
0
22 Feb 2025
Type-R: Automatically Retouching Typos for Text-to-Image Generation
Wataru Shimoda
Naoto Inoue
Daichi Haraguchi
Hayato Mitani
S. Uchida
Kota Yamaguchi
DiffM
99
0
0
27 Nov 2024
SignEye: Traffic Sign Interpretation from Vehicle First-Person View
Chuang Yang
Xu Han
T. Han
Yuejiao Su
Junyu Gao
Hongyuan Zhang
Yi Wang
Lap-Pui Chau
84
2
0
18 Nov 2024
HIP: Hierarchical Point Modeling and Pre-training for Visual Information Extraction
Rujiao Long
Pengfei Wang
Zhibo Yang
Cong Yao
41
0
0
02 Nov 2024
Platypus: A Generalized Specialist Model for Reading Text in Various Forms
Peng Wang
Zhaohai Li
Jun Tang
Humen Zhong
Fei Huang
Zhibo Yang
Cong Yao
VLM
ObjD
40
2
0
27 Aug 2024
FastTextSpotter: A High-Efficiency Transformer for Multilingual Scene Text Spotting
Alloy Das
Sanket Biswas
Umapada Pal
Josep Lladós
Saumik Bhattacharya
57
2
0
27 Aug 2024
DNTextSpotter: Arbitrary-Shaped Scene Text Spotting via Improved Denoising Training
Xi Chen
Qian Qiao
Jun Gao
Tianxiang Wu
Rahul Bhadani
...
Ziqiang Cao
Larry Head
Yue Zhang
Jielei Zhang
Huyang Sun
DiffM
30
5
0
01 Aug 2024
WeCromCL: Weakly Supervised Cross-Modality Contrastive Learning for Transcription-only Supervised Text Spotting
Jingjing Wu
Zhengyao Fang
Pengyuan Lyu
Chengquan Zhang
Fanglin Chen
Guangming Lu
Wenjie Pei
50
2
0
28 Jul 2024
CLII: Visual-Text Inpainting via Cross-Modal Predictive Interaction
Liang Zhao
Qing-Wu Guo
Xiaoguang Li
Song Wang
DiffM
44
0
0
23 Jul 2024
Fine-Grained Scene Image Classification with Modality-Agnostic Adapter
Yiqun Wang
Zhao Zhou
Xiangcheng Du
Xingjiao Wu
Yingbin Zheng
Cheng Jin
46
0
0
03 Jul 2024
LOGO: Video Text Spotting with Language Collaboration and Glyph Perception Model
Hongen Liu
Di Sun
Jiahao Wang
Yi Liu
Gang Pan
48
0
0
29 May 2024
Text Grouping Adapter: Adapting Pre-trained Text Detector for Layout Analysis
Tianci Bi
Xiaoyi Zhang
Zhizheng Zhang
Wenxuan Xie
Cuiling Lan
Yan Lu
Nanning Zheng
VLM
55
1
0
13 May 2024
VimTS: A Unified Video and Image Text Spotter for Enhancing the Cross-domain Generalization
Yuliang Liu
Mingxin Huang
Hao Yan
Linger Deng
Weijia Wu
Hao Lu
Chunhua Shen
Lianwen Jin
Xiang Bai
37
0
0
30 Apr 2024
Bridging the Gap Between End-to-End and Two-Step Text Spotting
Mingxin Huang
Hongliang Li
Yuliang Liu
Xiang Bai
Lianwen Jin
35
3
0
06 Apr 2024
OmniParser: A Unified Framework for Text Spotting, Key Information Extraction and Table Recognition
Jianqiang Wan
Sibo Song
Wenwen Yu
Yuliang Liu
Wenqing Cheng
Fei Huang
Xiang Bai
Cong Yao
Zhibo Yang
51
26
0
28 Mar 2024
TextBlockV2: Towards Precise-Detection-Free Scene Text Spotting with Pre-trained Language Model
Jiahao Lyu
Jin Wei
Gangyan Zeng
Zeng Li
Enze Xie
Wei Wang
Yu Zhou
VLM
29
3
0
15 Mar 2024
ODM: A Text-Image Further Alignment Pre-training Approach for Scene Text Detection and Spotting
Chen Duan
Pei Fu
Shan Guo
Qianyi Jiang
Xiaoming Wei
VLM
46
5
0
01 Mar 2024
Efficiently Leveraging Linguistic Priors for Scene Text Spotting
Nguyen Nguyen
Yapeng Tian
Chenliang Xu
49
1
0
27 Feb 2024
Beyond the Mud: Datasets and Benchmarks for Computer Vision in Off-Road Racing
Jacob Tyo
Motolani Olarinre
Youngseog Chung
Zachary Chase Lipton
38
0
0
12 Feb 2024
Hi-SAM: Marrying Segment Anything Model for Hierarchical Text Segmentation
Maoyuan Ye
Jing Zhang
Juhua Liu
Chenyu Liu
Baocai Yin
Cong Liu
Bo Du
Dacheng Tao
VLM
37
11
0
31 Jan 2024
SwinTextSpotter v2: Towards Better Synergy for Scene Text Spotting
Mingxin Huang
Dezhi Peng
Hongliang Li
Zhenghao Peng
Chongyu Liu
Dahua Lin
Yuliang Liu
Xiang Bai
Lianwen Jin
77
1
0
15 Jan 2024
GoMatching: A Simple Baseline for Video Text Spotting via Long and Short Term Matching
Haibin He
Maoyuan Ye
Jing Zhang
Juhua Liu
Dacheng Tao
VLM
49
3
0
13 Jan 2024
Inverse-like Antagonistic Scene Text Spotting via Reading-Order Estimation and Dynamic Sampling
Shi-Xue Zhang
Chun Yang
Xiaobin Zhu
Hongyang Zhou
Hongfa Wang
Xu-Cheng Yin
34
6
0
08 Jan 2024
Parrot Captions Teach CLIP to Spot Text
Yiqi Lin
Conghui He
Alex Jinpeng Wang
Bin Wang
Weijia Li
Mike Zheng Shou
36
7
0
21 Dec 2023
Progressive Evolution from Single-Point to Polygon for Scene Text
Linger Deng
Mingxin Huang
Xudong Xie
Yuliang Liu
Lianwen Jin
Xiang Bai
31
1
0
21 Dec 2023
Bridging Synthetic and Real Worlds for Pre-training Scene Text Detectors
Tongkun Guan
Wei Shen
Xuehang Yang
Xuehui Wang
Xiaokang Yang
34
7
0
08 Dec 2023
UDiffText: A Unified Framework for High-quality Text Synthesis in Arbitrary Images via Character-aware Diffusion Models
Yiming Zhao
Zhouhui Lian
76
27
0
08 Dec 2023
Reading Between the Mud: A Challenging Motorcycle Racer Number Dataset
Jacob Tyo
Youngseog Chung
Motolani Olarinre
Zachary Chase Lipton
29
0
0
14 Nov 2023
Exploring OCR Capabilities of GPT-4V(ision) : A Quantitative and In-depth Evaluation
Yongxin Shi
Dezhi Peng
Wenhui Liao
Zening Lin
Xinhong Chen
Chongyu Liu
Yuyi Zhang
Lianwen Jin
MLLM
30
44
0
25 Oct 2023
Harnessing the Power of Multi-Lingual Datasets for Pre-training: Towards Enhancing Text Spotting Performance
Alloy Das
Sanket Biswas
Ayan Banerjee
Josep Lladós
Umapada Pal
Saumik Bhattacharya
30
3
0
02 Oct 2023
Box2Poly: Memory-Efficient Polygon Prediction of Arbitrarily Shaped and Rotated Text
Xuyang Chen
Dong Wang
Konrad Schindler
Mingwei Sun
Yongliang Wang
Nicolo Savioli
Liqiu Meng
32
0
0
20 Sep 2023
SRFormer: Text Detection Transformer with Incorporated Segmentation and Regression
Qingwen Bu
Sungrae Park
Minsoo Khang
Yi-Bin Cheng
29
3
0
21 Aug 2023
Turning a CLIP Model into a Scene Text Spotter
Wenwen Yu
Yuliang Liu
Xingkui Zhu
H. Cao
Xing Sun
Xiang Bai
VLM
CLIP
24
12
0
21 Aug 2023
ESTextSpotter: Towards Better Scene Text Spotting with Explicit Synergy in Transformer
Mingxin Huang
Jiaxin Zhang
Dezhi Peng
Hao Lu
Can Huang
Yuliang Liu
Xiang Bai
Lianwen Jin
38
26
0
20 Aug 2023
DeepSolo++: Let Transformer Decoder with Explicit Points Solo for Multilingual Text Spotting
Maoyuan Ye
Jing Zhang
Shanshan Zhao
Juhua Liu
Tongliang Liu
Bo Du
Dacheng Tao
46
2
0
31 May 2023
ICDAR 2023 Competition on Hierarchical Text Detection and Recognition
Shangbang Long
Siyang Qin
Dmitry Panteleev
Alessandro Bissacco
Yasuhisa Fujii
Michalis Raptis
VLM
39
17
0
16 May 2023
Scalable Mask Annotation for Video Text Spotting
Haibin He
Jing Zhang
Mengyang Xu
Juhua Liu
Bo Du
Dacheng Tao
95
14
0
02 May 2023
GLT-T++: Global-Local Transformer for 3D Siamese Tracking with Ranking Loss
Jiahao Nie
Zhiwei He
Yuxiang Yang
Xudong Lv
Mingchen Gao
Jing Zhang
ViT
3DPC
39
7
0
01 Apr 2023
OmniForce: On Human-Centered, Large Model Empowered and Cloud-Edge Collaborative AutoML System
Chao Xue
Wei Liu
Shunxing Xie
Zhenfang Wang
Jiaxing Li
...
Shi-Yong Chen
Yibing Zhan
Jing Zhang
Chaoyue Wang
Dacheng Tao
43
2
0
01 Mar 2023
DAB-DETR: Dynamic Anchor Boxes are Better Queries for DETR
Shilong Liu
Feng Li
Hao Zhang
X. Yang
Xianbiao Qi
Hang Su
Jun Zhu
Lei Zhang
ViT
155
728
0
28 Jan 2022
Pix2seq: A Language Modeling Framework for Object Detection
Ting Chen
Saurabh Saxena
Lala Li
David J. Fleet
Geoffrey E. Hinton
MLLM
ViT
VLM
244
344
0
22 Sep 2021
Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions
Wenhai Wang
Enze Xie
Xiang Li
Deng-Ping Fan
Kaitao Song
Ding Liang
Tong Lu
Ping Luo
Ling Shao
ViT
283
3,623
0
24 Feb 2021
Convolutional Character Networks
Linjie Xing
Zhi Tian
Weilin Huang
Matthew R. Scott
57
157
0
17 Oct 2019