SpermDet: Structure-Aware Network With Local Context Enhancement and Dual-Path Fusion for Object Detection in Sperm Images

Hongyu Zhang; Zhujun Hu; Huaying Huang; Shuang Liu; Yunbo Rao; Qifei Wang; Naveed Ahmad

IEEE Transactions on Instrumentation and Measurement, vol. 74, pp. 1-14 (JCR Q1, Engineering, Electrical & Electronic; impact factor 5.6)
DOI: 10.1109/TIM.2025.3544697
Published: 2025-03-05
URL: https://ieeexplore.ieee.org/document/10910089/
Citation count: 0
Abstract
Recently, deep-learning-based object detection models have been used to improve detection performance on sperm images. However, these models encounter three primary challenges: 1) visual similarity between sperm and background noise; 2) neglect of critical local contextual features; and 3) loss of tiny sperm features when deep features are fused with an inaccurate foreground. In this article, we propose SpermDet to alleviate these issues. Specifically, we first develop a structure-aware alignment fusion (SAF) module to align and integrate structural information with RGB images, thus improving feature discriminability. Then, we introduce a local context enhancement block (LCEB) to effectively capture the crucial local contextual features of sperm. Furthermore, we design a semantic-aware dual-path fusion (SDF) module that uses both foreground and background information of deep layers to enhance semantic information and preserve detailed sperm features in shallow layers. Finally, we construct a dataset for sperm detection in the testicular biopsy scene, termed SDTB. It contains 1341 images with 5548 instances, characterized by the challenges of tiny sperm size and complex backgrounds. Extensive experiments on the SVIA, VISEM, and the proposed SDTB datasets demonstrate that SpermDet outperforms 24 recent models, achieving $\rm mAP_{50}$ scores of 77.7%, 26.3%, and 53.2%, respectively. Compared with the baseline model, SpermDet improves $\rm mAP_{50}$ by 5.0%, 6.9%, and 9.2% on these datasets while reducing the parameter count by 16.3%, showing its potential for application in clinical instruments. The code and dataset are publicly available at: https://github.com/Hong-yu-Zhang/SpermDet.
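The core idea behind the semantic-aware dual-path fusion (SDF) module, as described in the abstract, is to route deep-layer features through both a foreground path and a complementary background path before fusing them with shallow layers, so that semantic cues are strengthened without erasing tiny sperm details. A minimal NumPy sketch of that idea follows; the channel-averaged foreground estimator, the `w_fg`/`w_bg` weights, and the 2x nearest-neighbor upsampling are illustrative assumptions for exposition, not the paper's actual implementation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def dual_path_fusion(shallow, deep, w_fg=1.0, w_bg=1.0):
    """Conceptual dual-path fusion of a deep feature map (C, H, W)
    into a shallow feature map (C, 2H, 2W).

    A foreground probability map is derived from the deep features;
    the deep semantics are split into a foreground-weighted path and
    a background-weighted path, recombined, upsampled, and added to
    the shallow features.
    """
    # Foreground probability from the channel-averaged deep response
    # (a stand-in for a learned foreground predictor).
    p_fg = sigmoid(deep.mean(axis=0, keepdims=True))  # (1, H, W)
    p_bg = 1.0 - p_fg

    fg_path = w_fg * (deep * p_fg)  # emphasize sperm-like regions
    bg_path = w_bg * (deep * p_bg)  # retain background context
    fused = fg_path + bg_path

    # Nearest-neighbor upsample by 2x to match the shallow resolution.
    fused_up = fused.repeat(2, axis=1).repeat(2, axis=2)
    return shallow + fused_up
```

With equal path weights the two paths sum back to the original deep features, so the sketch reduces to plain feature addition; unequal weights (or a learned foreground predictor) are what let the fusion suppress background noise while keeping shallow detail intact.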
Journal Introduction:
Papers are sought that address innovative solutions to the development and use of electrical and electronic instruments and equipment to measure, monitor and/or record physical phenomena for the purpose of advancing measurement science, methods, functionality and applications. The scope of these papers may encompass: (1) theory, methodology, and practice of measurement; (2) design, development and evaluation of instrumentation and measurement systems and components used in generating, acquiring, conditioning and processing signals; (3) analysis, representation, display, and preservation of the information obtained from a set of measurements; and (4) scientific and technical support to establishment and maintenance of technical standards in the field of Instrumentation and Measurement.