STATNet: One-stage coal-gangue detector based on deep learning algorithm for real industrial application
Kefei Zhang, Teng Wang, Xiaolin Yang, Liang Xu, Jesse Thé, Zhongchao Tan, Hesheng Yu
Energy and AI, volume 17, Article 100388. Published 2024-06-13. DOI: 10.1016/j.egyai.2024.100388
https://www.sciencedirect.com/science/article/pii/S2666546824000545
Citations: 0
Abstract
Coal-gangue object detection has attracted substantial attention because it is central to vision-based intelligent and green coal separation. However, most existing studies have focused on laboratory datasets and prioritized lightweight models, which makes coal-gangue detection difficult to adapt to the complex and harsh scenes of real production environments. We therefore collected and labeled image datasets of coal and gangue under real production conditions at a coal preparation plant. We then designed a one-stage object detection model, named STATNet, that follows the “backbone-neck-head” architecture and aims to improve detection accuracy in industrial coal preparation scenarios. The proposed model uses Swin Transformer as the backbone to extract multi-scale features, an improved path augmentation feature pyramid network (iPAFPN) as the neck to enrich feature fusion, and a task-aligned head (TAH) as the head to mitigate conflicts and misalignments between the classification and localization tasks. Experimental results on a real-world industrial dataset show that STATNet achieves an AP50 of 89.27%, surpassing several state-of-the-art baseline models by 2.02% to 5.58%. It is also more robust to image corruption and perturbation. These findings indicate promising prospects for practical coal and gangue separation.
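The abstract reports AP50 but does not describe the evaluation protocol. For readers unfamiliar with the metric, the following is a minimal sketch of how average precision at an IoU threshold of 0.5 is commonly computed for one class. It is not the authors' evaluation code. The function names, the `(x1, y1, x2, y2)` box format, the greedy one-to-one matching rule, and the all-point interpolation of the precision-recall curve are all assumptions made for illustration.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def average_precision_50(detections, ground_truths, iou_threshold=0.5):
    """Average precision for one class at a fixed IoU threshold.

    detections: list of (confidence, box) pairs (hypothetical format).
    ground_truths: list of boxes.
    """
    if not ground_truths:
        return 0.0

    matched = set()  # indices of ground-truth boxes already claimed
    hits = []        # True for a true positive, False for a false positive
    # Visit detections from most to least confident.
    for _, box in sorted(detections, key=lambda d: d[0], reverse=True):
        best_iou, best_idx = 0.0, None
        for idx, gt_box in enumerate(ground_truths):
            if idx in matched:
                continue
            overlap = iou(box, gt_box)
            if overlap > best_iou:
                best_iou, best_idx = overlap, idx
        if best_idx is not None and best_iou >= iou_threshold:
            matched.add(best_idx)
            hits.append(True)
        else:
            hits.append(False)

    # Precision and recall after each detection in ranked order.
    precisions, recalls = [], []
    true_positives = 0
    for rank, hit in enumerate(hits, start=1):
        true_positives += int(hit)
        precisions.append(true_positives / rank)
        recalls.append(true_positives / len(ground_truths))

    # Make precision non-increasing in recall (all-point interpolation).
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])

    # Area under the interpolated precision-recall curve.
    ap, previous_recall = 0.0, 0.0
    for precision, recall in zip(precisions, recalls):
        ap += (recall - previous_recall) * precision
        previous_recall = recall
    return ap
```

In this sketch a detection counts as correct only if it overlaps a not-yet-matched ground-truth box with IoU of at least 0.5. Benchmark toolkits differ in matching and interpolation details, so values from this function are not directly comparable to the 89.27% reported above.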