A fine segmentation model of flue-cured tobacco's main veins based on multi-level-scale features of hybrid fusion

Soft Computing (IF 3.1; Zone 3, Computer Science; JCR Q2, Computer Science, Artificial Intelligence) | Pub Date: 2024-08-12 | DOI: 10.1007/s00500-024-09833-6

Biao Xu, Xiaobao Liu, Wenjuan Gu, Jia Liu, Hongcheng Wang


Citations: 0



A fine segmentation model of flue-cured tobacco’s main veins based on multi-level-scale features of hybrid fusion

Flue-cured tobacco (FCT) leaves are classified as upper (B), middle (C), or lower (X) parts according to characteristics such as the main veins, leaf shape, color, and thickness. Accurately measuring the geometric parameters of the main veins is crucial for identifying the part, but it is difficult in practice, so segmenting the main veins first is a prerequisite for reducing calculation errors and improving the precision of part identification. To obtain sufficient semantic information and improve segmentation accuracy, we propose MSHF-Net, a fine segmentation model of FCT's main veins based on multi-level-scale features of hybrid fusion. First, MobileNetV2 with a dilated convolution layer (DMobileNetV2) is selected as the backbone network for feature extraction, which speeds up training and inference and reduces computing cost. Second, a Hybrid Fusion Atrous Spatial Pyramid Pooling (HFASPP) module strengthens the backbone by capturing more high-level semantic information, which effectively prevents intermittent segmentation of some main veins. Third, because main veins occupy only a small proportion of the original image, double shallow feature branches (DSFBS) are added to obtain more low-level semantic information. Finally, after the hybrid fusion of high- and low-level semantic information, a channel attention mechanism (ECANet) enhances useful information and suppresses redundant information, which prevents mis-segmentation of regions. Experiments show that MSHF-Net is efficient: it has only 7.92 M parameters, which keeps computational requirements low. On a diverse test set of FCT parts, the model achieves a mean intersection over union (MIoU) of 85.57% and a mean pixel accuracy (mPA) of 93.10%, and it segments the main veins in a 2296 × 1548 × 3 tobacco image in just over 0.1 s. None of the 291 randomly selected tobacco leaf main veins showed mis-segmentation, which points to the model's robustness and practical applicability. These results indicate strong segmentation performance and provide a foundation for accurately discriminating FCT parts.
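The two reported metrics, MIoU and mPA, can both be derived from a per-class confusion matrix. The sketch below uses the conventional definitions (per-class IoU = TP / (TP + FP + FN), per-class pixel accuracy = TP / (TP + FN), each averaged over classes). The abstract does not describe the authors' evaluation code, so the function names `confusion_matrix` and `miou_and_mpa`, the two-class background/vein labeling, and the toy masks are all illustrative assumptions.

```python
from typing import List, Sequence, Tuple


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int],
                     num_classes: int) -> List[List[int]]:
    """Count pixels: cm[t][p] = number of pixels with true class t predicted as p."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def miou_and_mpa(cm: List[List[int]]) -> Tuple[float, float]:
    """Return (mean IoU, mean pixel accuracy), averaged over classes that occur."""
    n = len(cm)
    ious, accs = [], []
    for c in range(n):
        tp = cm[c][c]
        true_c = sum(cm[c])                     # pixels whose label is c (TP + FN)
        pred_c = sum(cm[r][c] for r in range(n))  # pixels predicted as c (TP + FP)
        union = true_c + pred_c - tp            # TP + FP + FN
        if union > 0:
            ious.append(tp / union)
        if true_c > 0:
            accs.append(tp / true_c)
    if not ious or not accs:
        raise ValueError("confusion matrix contains no pixels")
    return sum(ious) / len(ious), sum(accs) / len(accs)


if __name__ == "__main__":
    # Hypothetical flattened masks: 0 = background, 1 = main vein.
    truth = [0, 0, 0, 0, 1, 1, 1, 0]
    pred = [0, 0, 0, 1, 1, 1, 0, 0]
    miou, mpa = miou_and_mpa(confusion_matrix(truth, pred, num_classes=2))
    print(f"MIoU = {miou:.4f}, mPA = {mpa:.4f}")
```

Because main veins cover only a small fraction of each image, plain overall pixel accuracy would be dominated by the background class; averaging per class, as both metrics do here, gives the vein class equal weight.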


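Source journal

Soft Computing (Engineering & Technology; Computer Science, Interdisciplinary Applications)

CiteScore: 8.10

Self-citation rate: 9.80%

Articles published: 927

Review time: 7.3 months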

Journal description: Soft Computing is dedicated to system solutions based on soft computing techniques. It provides rapid dissemination of important results in soft computing technologies, a fusion of research in evolutionary algorithms and genetic programming, neural science and neural net systems, fuzzy set theory and fuzzy systems, and chaos theory and chaotic systems. Soft Computing encourages the integration of soft computing techniques and tools into both everyday and advanced applications. By linking the ideas and techniques of soft computing with other disciplines, the journal serves as a unifying platform that fosters comparisons, extensions, and new applications. As a result, the journal is an international forum for all scientists and engineers engaged in research and development in this fast-growing field.