Prasenjit Dhar, K. Suganya Devi, Ramanuj Bhattacharjee, P. Srinivasan
Journal: Microscopy Research and Technique, Volume 88, Issue 5, pp. 1566-1581
Published: 2025-01-27
Publication type: Journal Article
DOI: 10.1002/jemt.24786
URL: https://analyticalsciencejournals.onlinelibrary.wiley.com/doi/10.1002/jemt.24786
Impact factor: 2.1 (JCR Q2, Anatomy & Morphology)
Region category: Engineering Technology (region 3)
Open access: No
Citations: 0
Morphological Abnormalities Classification of Red Blood Cells Using Fusion Method on Imbalance Datasets
Abstract
Red blood cells (RBCs), or erythrocytes, are essential components of the human body: they transport oxygen (O2) from the lungs to the body's tissues, regulate pH balance, and support the immune system. Abnormalities in RBC shape (poikilocytosis) and size (anisocytosis) can impede oxygen-carrying capacity, leading to conditions such as anemia, thalassemia, McLeod syndrome, and liver disease. Hematologists typically spend considerable time manually examining RBC shapes and sizes under a microscope. The proposed LSTM-based neural network (NN) deep-learning strategy classifies abnormal RBCs automatically and accurately, helping to address blood-related disorders at an early stage. After data processing, traditional and high-level features are fused to distinguish clearly between abnormal RBC classes. Class imbalance favors the dominant class, resulting in biased predictions. To address it, a custom loss function is built by combining class weights with the loss function before the fused features are fed to the NN classifier. Specifically, the loss function assigns higher penalties to the misclassification of underrepresented classes, so that the model is more sensitive to these classes during training. This is achieved by integrating class weights directly into the cross-entropy loss calculation, thereby balancing the influence of each class on the model's learning process. The approach is evaluated on the publicly accessible Chula-PIC-Lab dataset and on a privately gathered dataset from the Cachar Cancer Hospital and Research Centre (CCHRC) in Assam, India. It achieves an average F1-score of 97.83% and accuracy of 98.62% on the Chula-PIC-Lab dataset, and an average F1-score of 99.56% and accuracy of 99.65% on the CCHRC dataset, for 12 and 6 classes respectively, and surpasses benchmark models including Custom CNN, Custom LSTM, EfficientNet-B1, SMOTE, Hybrid NN, and HPKNN.
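The abstract describes folding class weights directly into the cross-entropy loss so that errors on underrepresented classes cost more. The following is a minimal sketch of that general idea, not the authors' implementation: the abstract does not state how the weights are computed, so the inverse-frequency weighting, the normalisation by the sum of weights, and all function names here are assumptions made for illustration.

```python
import math
from collections import Counter


def inverse_frequency_weights(labels, num_classes):
    """Assumed weighting scheme: w_c = N / (K * n_c), so rarer classes get larger weights."""
    counts = Counter(labels)
    n = len(labels)
    # Classes absent from the data get weight 0.0 (an arbitrary choice for this sketch).
    return [n / (num_classes * counts[c]) if counts[c] else 0.0
            for c in range(num_classes)]


def log_softmax(logits):
    """Numerically stable log-softmax over one sample's class scores."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - log_sum for z in logits]


def weighted_cross_entropy(batch_logits, labels, class_weights):
    """Cross-entropy in which each sample is scaled by the weight of its true class.

    The result is normalised by the sum of the applied weights (an assumption;
    dividing by the batch size is another common convention).
    """
    total, weight_sum = 0.0, 0.0
    for logits, y in zip(batch_logits, labels):
        w = class_weights[y]
        total += -w * log_softmax(logits)[y]
        weight_sum += w
    return total / weight_sum


if __name__ == "__main__":
    # Hypothetical imbalanced toy batch: 6 samples of class 0, 2 of class 1, 1 of class 2.
    labels = [0] * 6 + [1] * 2 + [2]
    weights = inverse_frequency_weights(labels, 3)
    print(weights)  # [0.5, 1.5, 3.0]

    # A classifier that always favours the majority class.
    biased_logits = [[5.0, 0.0, 0.0] for _ in labels]
    print(weighted_cross_entropy(biased_logits, labels, [1.0, 1.0, 1.0]))
    print(weighted_cross_entropy(biased_logits, labels, weights))
```

On the toy batch, the classifier that always predicts the majority class receives a larger loss under the weighted version than under uniform weights, which is the pressure toward minority-class sensitivity that the abstract describes.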
About the journal:
Microscopy Research and Technique (MRT) publishes articles on all aspects of advanced microscopy, including original architectures and methodologies, with applications in the biological, clinical, chemical, and materials sciences. Original basic and applied research, as well as technical papers dealing with the various subsets of microscopy, are encouraged. MRT is the right forum for those developing new microscopy methods or using the microscope to answer key questions in basic and applied research.