Analysis of blood cell image recognition methods based on improved CNN and Vision Transformer

IF: 0.4; CAS Zone 4, Computer Science; JCR Q4, COMPUTER SCIENCE, HARDWARE & ARCHITECTURE; Journal: IEICE Transactions on Fundamentals of Electronics Communications and Computer Sciences; Pub Date: 2023-01-01; DOI: 10.1587/transfun.2023eap1056

Pingping WANG, Xinyi ZHANG, Yuyan ZHAO, Yueti LI, Kaisheng XU, Shuaiyin ZHAO


Citations: 0

Abstract

Leukemia is a common and highly dangerous blood disease that requires early detection and treatment. Currently, the diagnosis of leukemia types relies mainly on a pathologist's morphological examination of blood cell images, a tedious and time-consuming process whose results are highly subjective and prone to misdiagnosis and missed diagnosis. To address these problems, this paper proposes a blood cell image recognition method based on an enhanced Vision Transformer. First, convolutions are incorporated into the token embedding to replace the positional encoding, which represents only coarse spatial information. Second, building on the Transformer's self-attention mechanism, a sparse attention module is proposed that selects discriminative regions in the image, further enhancing the model's fine-grained feature representation. Finally, a contrastive loss function is used to further increase the intra-class consistency and inter-class difference of the classification features. In experiments, the model achieves a recognition accuracy of 92.49% on the Munich single-cell morphological dataset, an improvement of 1.41% over the baseline, and it also outperforms the state-of-the-art Swin Transformer. The method therefore has the potential to provide a reference for physicians in clinical diagnosis.
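
The abstract does not state the exact form of the contrastive loss, so the following is only a minimal sketch of one common pairwise formulation, not the authors' implementation. It assumes cosine similarity between classification features, a penalty of (1 - similarity) for same-class pairs, and a penalty for different-class pairs only when their similarity exceeds a margin. The function names and the margin value of 0.4 are hypothetical choices made for illustration.

```python
import math
from typing import Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two non-zero feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def contrastive_loss(
    features: Sequence[Sequence[float]],
    labels: Sequence[int],
    margin: float = 0.4,  # hypothetical value, not taken from the paper
) -> float:
    """Pairwise contrastive loss averaged over all ordered pairs in a batch.

    Same-class pairs contribute (1 - similarity), which rewards
    intra-class consistency. Different-class pairs contribute
    max(similarity - margin, 0), which rewards inter-class difference.
    """
    n = len(features)
    total = 0.0
    for i in range(n):
        for j in range(n):
            sim = cosine_similarity(features[i], features[j])
            if labels[i] == labels[j]:
                total += 1.0 - sim
            else:
                total += max(sim - margin, 0.0)
    return total / (n * n)


# Toy batch: two classes with 2-D features (illustrative data only).
separated = contrastive_loss([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]], [0, 0, 1])
mixed = contrastive_loss([[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]], [0, 0, 1])
print(separated, mixed)
```

In this toy batch the loss is lower when features of the same class point in the same direction and features of different classes are orthogonal, which is the behavior the abstract attributes to the contrastive term. In training, such a term would typically be added to the usual classification loss.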

Source journal

IEICE Transactions on Fundamentals of Electronics Communications and Computer Sciences (Engineering & Technology: Engineering, Electrical & Electronic)

CiteScore: 1.10

Self-citation rate: 20.00%

Articles published: 137

Review time: 3.9 months

About the journal: Includes reports on research, developments, and examinations performed by the Society's members in the specific fields shown in the category list, as detailed below, the contents of which may advance the development of science and industry: (1) Reports on new theories, experiments with new contents, or extensions of and supplements to conventional theories and experiments. (2) Reports on the development of measurement technology and various applied technologies. (3) Reports on the planning, design, manufacture, testing, or operation of facilities, machinery, parts, materials, etc. (4) Presentation of new methods, suggestion of new angles, ideas, systematization, software, or any new facts regarding the above.