YOLO-MS: Rethinking Multi-Scale Representation Learning for Real-time Object Detection

21 February 2025
Yuming Chen
Xinbin Yuan
Jiabao Wang
Ruiqi Wu
Xiang Li
Qibin Hou
Ming-Ming Cheng
Abstract

We aim at providing the object detection community with an efficient and performant object detector, termed YOLO-MS. The core design is based on a series of investigations on how multi-branch features of the basic block and convolutions with different kernel sizes affect the detection performance of objects at different scales. The outcome is a new strategy that can significantly enhance the multi-scale feature representations of real-time object detectors. To verify the effectiveness of our work, we train our YOLO-MS on the MS COCO dataset from scratch, without relying on any other large-scale datasets, like ImageNet, or pre-trained weights. Without bells and whistles, our YOLO-MS outperforms recent state-of-the-art real-time object detectors, including YOLO-v7, RTMDet, and YOLO-v8. Taking the XS version of YOLO-MS as an example, it can achieve an AP score of 42+% on MS COCO, which is about 2% higher than RTMDet with the same model size. Furthermore, our work can also serve as a plug-and-play module for other YOLO models. For instance, our method significantly advances the APs (small-object AP), APl (large-object AP), and overall AP of YOLOv8-N from 18%+, 52%+, and 37%+ to 20%+, 55%+, and 40%+, respectively, with even fewer parameters and MACs. Code and trained models are publicly available at this https URL. We also provide the Jittor version at this https URL.
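To make the core idea in the abstract concrete, below is a minimal PyTorch sketch of a basic block whose parallel branches apply convolutions with different kernel sizes, so that large-kernel branches see larger objects while small-kernel branches preserve fine detail. This is not the authors' exact MS-Block: the class name, channel split, branch count, kernel sizes, and activation are illustrative assumptions, not values taken from the paper.

```python
# Hypothetical multi-branch block with per-branch kernel sizes; a sketch of the
# idea described in the abstract, not the official YOLO-MS implementation.
import torch
import torch.nn as nn


class MultiBranchBlock(nn.Module):
    def __init__(self, channels: int, kernel_sizes=(3, 5, 7)):
        super().__init__()
        assert channels % len(kernel_sizes) == 0
        split = channels // len(kernel_sizes)
        # One depthwise conv per branch, each branch with its own kernel size.
        self.branches = nn.ModuleList(
            nn.Sequential(
                nn.Conv2d(split, split, k, padding=k // 2, groups=split, bias=False),
                nn.BatchNorm2d(split),
                nn.SiLU(inplace=True),
            )
            for k in kernel_sizes
        )
        # 1x1 conv to fuse the concatenated branch outputs.
        self.fuse = nn.Sequential(
            nn.Conv2d(channels, channels, 1, bias=False),
            nn.BatchNorm2d(channels),
            nn.SiLU(inplace=True),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Split channels across branches, process each at its own kernel size.
        chunks = torch.chunk(x, len(self.branches), dim=1)
        outs = [branch(c) for branch, c in zip(self.branches, chunks)]
        # Residual connection keeps the block easy to drop into an existing YOLO stage.
        return x + self.fuse(torch.cat(outs, dim=1))


if __name__ == "__main__":
    block = MultiBranchBlock(96)
    y = block(torch.randn(1, 96, 64, 64))
    print(y.shape)  # torch.Size([1, 96, 64, 64])
```

A block of this shape could, in principle, replace the basic block of another YOLO backbone, which is the plug-and-play usage the abstract describes; the paper's actual block design, kernel-size schedule, and placement should be taken from the released code.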

View on arXiv
@article{chen2025_2308.05480,
  title={YOLO-MS: Rethinking Multi-Scale Representation Learning for Real-time Object Detection},
  author={Yuming Chen and Xinbin Yuan and Jiabao Wang and Ruiqi Wu and Xiang Li and Qibin Hou and Ming-Ming Cheng},
  journal={arXiv preprint arXiv:2308.05480},
  year={2025}
}