A fine segmentation model of flue-cured tobacco's main veins based on multi-level-scale features of hybrid fusion

Soft Computing | IF 3.1 | Zone 3 (Computer Science) | JCR Q2 (Computer Science, Artificial Intelligence) | Pub Date: 2024-08-12 | DOI: 10.1007/s00500-024-09833-6
Biao Xu, Xiaobao Liu, Wenjuan Gu, Jia Liu, Hongcheng Wang
Citations: 0

Abstract


A fine segmentation model of flue-cured tobacco’s main veins based on multi-level-scale features of hybrid fusion

Flue-cured tobacco (FCT) can be classified into upper (B), middle (C), and lower (X) parts based on characteristics such as the main veins, leaf shape, color, and thickness. Accurately measuring the geometric parameters of the main veins is crucial for identifying the different parts, but this task has proven challenging. Segmenting the main veins is therefore a prerequisite for reducing calculation errors and improving the precision of part identification. To obtain enough semantic information and improve segmentation accuracy, we propose a fine segmentation model (MSHF-Net) of FCT's main veins based on multi-level-scale features of hybrid fusion. First, MobileNetV2 with a dilated convolution layer (DMobileNetV2) is selected as the backbone network for feature extraction, which optimizes training and inference speed to minimize computing costs. Next, Hybrid Fusion Atrous Spatial Pyramid Pooling (HFASPP) is designed as the strengthened backbone module to capture more high-level semantic information, effectively preventing intermittent segmentation of some main veins. Additionally, because main vein targets occupy a low proportion of the original image, double shallow feature branches (DSFBS) are included to obtain more low-level semantic information. Finally, a channel attention mechanism (ECANet) is added after the hybrid fusion of high- and low-level semantic information to enhance useful information and eliminate redundant information, preventing mis-segmentation of regions. Experimental validation demonstrates the efficiency of MSHF-Net, which has only 7.92 M parameters and therefore low computational requirements. The model achieves a mean intersection over union (MIoU) of 85.57% and a mean pixel accuracy (mPA) of 93.10% on a diverse test set of FCT parts. Segmenting the main veins in a 2296 × 1548 × 3 tobacco image takes just over 0.1 s. Notably, none of the 291 randomly segmented tobacco leaf main veins shows mis-segmentation, highlighting the model's robustness and practical applicability in various scenarios. These results emphasize the superior segmentation performance of the proposed model, establishing a crucial foundation for accurately discriminating FCT parts.
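
The abstract reports MIoU and mPA without defining them. As a minimal sketch, assuming the standard definitions (per-class intersection over union and per-class pixel accuracy, each averaged over classes), the two metrics can be computed from a pixel-level confusion matrix. The function names `confusion_matrix` and `miou_and_mpa` and the toy 4 × 4 masks are hypothetical illustrations, not the authors' code or data.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, num_classes):
    """Count pixels: rows are ground-truth classes, columns are predicted classes."""
    y_true = np.asarray(y_true).ravel()
    y_pred = np.asarray(y_pred).ravel()
    # Encode each (truth, prediction) pair as one integer, then count occurrences.
    idx = y_true * num_classes + y_pred
    return np.bincount(idx, minlength=num_classes ** 2).reshape(num_classes, num_classes)

def miou_and_mpa(cm):
    """Return (MIoU, mPA) under the assumed standard definitions."""
    tp = np.diag(cm).astype(float)      # correctly labelled pixels per class
    gt = cm.sum(axis=1)                 # ground-truth pixels per class
    pred = cm.sum(axis=0)               # predicted pixels per class
    union = gt + pred - tp
    # Classes that never occur get NaN and are skipped by nanmean.
    iou = np.divide(tp, union, out=np.full_like(tp, np.nan), where=union > 0)
    pa = np.divide(tp, gt, out=np.full_like(tp, np.nan), where=gt > 0)
    return float(np.nanmean(iou)), float(np.nanmean(pa))

# Toy masks (made up for illustration): 0 = background, 1 = main vein.
truth = [[0, 0, 1, 0],
         [0, 1, 1, 0],
         [0, 1, 0, 0],
         [0, 1, 0, 0]]
pred = [[0, 0, 1, 0],
        [0, 1, 1, 0],
        [0, 0, 0, 0],
        [0, 1, 1, 0]]

cm = confusion_matrix(truth, pred, num_classes=2)
miou, mpa = miou_and_mpa(cm)
print(f"MIoU = {miou:.4f}, mPA = {mpa:.4f}")  # MIoU = 0.7500, mPA = 0.8545
```

In this toy case the thin vein class scores a much lower IoU (4/6) than the background (10/12). Averaging per class rather than per pixel keeps a small target such as a main vein from being swamped by the background in the final score.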

Source journal
Soft Computing (Engineering & Technology; Computer Science, Interdisciplinary Applications)
CiteScore: 8.10
Self-citation rate: 9.80%
Articles published: 927
Review time: 7.3 months
About the journal: Soft Computing is dedicated to system solutions based on soft computing techniques. It provides rapid dissemination of important results in soft computing technologies, a fusion of research in evolutionary algorithms and genetic programming, neural science and neural net systems, fuzzy set theory and fuzzy systems, and chaos theory and chaotic systems. Soft Computing encourages the integration of soft computing techniques and tools into both everyday and advanced applications. By linking the ideas and techniques of soft computing with other disciplines, the journal serves as a unifying platform that fosters comparisons, extensions, and new applications. As a result, the journal is an international forum for all scientists and engineers engaged in research and development in this fast-growing field.