Biao Xu, Xiaobao Liu, Wenjuan Gu, Jia Liu, Hongcheng Wang
Soft Computing (Journal Article), published 2024-08-12. DOI: 10.1007/s00500-024-09833-6 (https://doi.org/10.1007/s00500-024-09833-6). Impact factor 3.1; JCR Q2 (Computer Science, Artificial Intelligence).
A fine segmentation model of flue-cured tobacco’s main veins based on multi-level-scale features of hybrid fusion
Flue-cured tobacco (FCT) can be classified into upper (B), middle (C), and lower (X) parts based on characteristics such as the main veins, leaf shape, color, and thickness. Accurately measuring the geometric parameters of the main veins is crucial for identifying the different parts, but this measurement has proven challenging. Segmenting the main veins is therefore a prerequisite for reducing calculation errors and improving the precision of part identification. To obtain enough semantic information and improve segmentation accuracy, we propose a fine segmentation model (MSHF-Net) of FCT's main veins based on multi-level-scale features of hybrid fusion. First, MobileNetV2 with a dilated convolution layer (DMobileNetV2) is selected as the backbone network for feature extraction, which improves training and inference speed and reduces computing costs. Next, Hybrid Fusion Atrous Spatial Pyramid Pooling (HFASPP) is designed as the strengthened backbone module to capture more high-level semantic information, effectively preventing intermittent segmentation of some main veins. Because main vein targets occupy a low proportion of the original image, double shallow feature branches (DSFBS) are included to obtain more low-level semantic information. Finally, a channel attention mechanism (ECANet) is added after the hybrid fusion of high- and low-level semantic information to enhance useful information and suppress redundant information, preventing mis-segmentation of regions. MSHF-Net has only 7.92 M parameters, which keeps computational requirements low. The model achieves a mean intersection over union (MIoU) of 85.57% and a mean pixel accuracy (mPA) of 93.10% on a diverse test set of FCT parts. Segmenting the main veins in a 2296 × 1548 × 3 tobacco image takes just over 0.1 s. Notably, none of the 291 randomly segmented tobacco leaf main veins shows mis-segmentation, which indicates the model's robustness and practical applicability. These results show the strong segmentation performance of the proposed model and establish a foundation for accurately discriminating FCT parts.
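The abstract reports MIoU and mPA without defining them. The sketch below uses the common confusion-matrix definitions (per-class IoU = TP / (TP + FP + FN), per-class pixel accuracy = TP / (TP + FN), each averaged over classes). This is an assumption, since the abstract does not state the exact formulas used. The function names, the two-class labelling (0 = background, 1 = main vein), and the toy label arrays are hypothetical and serve only as illustration.

```python
from typing import List, Sequence, Tuple


def confusion_matrix(truth: Sequence[int], pred: Sequence[int],
                     num_classes: int) -> List[List[int]]:
    """Count pixels; rows index the ground-truth class, columns the predicted class."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(truth, pred):
        matrix[t][p] += 1
    return matrix


def _mean(values: List[float]) -> float:
    """Arithmetic mean that returns 0.0 for an empty list."""
    return sum(values) / len(values) if values else 0.0


def miou_and_mpa(matrix: List[List[int]]) -> Tuple[float, float]:
    """Return (MIoU, mPA) under the assumed common definitions."""
    ious, accs = [], []
    for c in range(len(matrix)):
        tp = matrix[c][c]
        truth_total = sum(matrix[c])                # TP + FN
        pred_total = sum(row[c] for row in matrix)  # TP + FP
        union = truth_total + pred_total - tp       # TP + FP + FN
        if union > 0:            # skip classes absent from both maps
            ious.append(tp / union)
        if truth_total > 0:      # skip classes absent from the ground truth
            accs.append(tp / truth_total)
    return _mean(ious), _mean(accs)


# Toy flattened label maps (hypothetical): 0 = background, 1 = main vein.
truth = [0, 0, 0, 0, 1, 1]
pred = [0, 0, 0, 1, 1, 0]

miou, mpa = miou_and_mpa(confusion_matrix(truth, pred, num_classes=2))
print(f"MIoU = {miou:.4f}, mPA = {mpa:.4f}")  # MIoU = 0.4667, mPA = 0.6250
```

With a small foreground class such as a main vein, MIoU is typically the stricter of the two numbers because false positives enlarge the union, which is consistent with the reported MIoU (85.57%) being lower than the reported mPA (93.10%).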
About the journal:
Soft Computing is dedicated to system solutions based on soft computing techniques. It provides rapid dissemination of important results in soft computing technologies, a fusion of research in evolutionary algorithms and genetic programming, neural science and neural net systems, fuzzy set theory and fuzzy systems, and chaos theory and chaotic systems.
Soft Computing encourages the integration of soft computing techniques and tools into both everyday and advanced applications. By linking the ideas and techniques of soft computing with other disciplines, the journal serves as a unifying platform that fosters comparisons, extensions, and new applications. As a result, the journal is an international forum for all scientists and engineers engaged in research and development in this fast-growing field.