Comparing Human-Level and Machine Learning Model Performance in White Blood Cell Morphology Assessment
Authors: Patrick Lawrence, Christina Brown
Journal: European Journal of Haematology, volume 114, issue 1, pages 115-119
Published: 2024-10-06
DOI: 10.1111/ejh.14318
URL: https://onlinelibrary.wiley.com/doi/10.1111/ejh.14318
Introduction
There is an increasing research focus on the role of machine learning in the haematology laboratory, particularly in blood cell morphologic assessment. Human-level performance is an important baseline and goal for machine learning. This study aims to assess the interobserver variability and human-level performance in blood cell morphologic assessment.
Methods
A dataset of 1000 single white blood cell images was independently labelled by 10 doctors and morphology scientists. Interobserver variability was calculated using Fleiss' kappa. Observers' labels were then separated into consensus labels, used to determine ground truth, and performance labels, used to assess observer performance. A machine learning model was trained and assessed using the same cell images. Explainability images (XRAI and IG) were generated for each of the test images.
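As an illustration of the agreement statistic named above, the following is a minimal sketch of Fleiss' kappa for a set of items that each receive the same number of ratings. The function name `fleiss_kappa`, the input layout (one list of labels per image), and the cell labels in the usage note are assumptions made for this example; the abstract does not describe the authors' implementation or label set.

```python
from collections import Counter


def fleiss_kappa(ratings):
    """Fleiss' kappa for a list of items, each rated by the same number of raters.

    `ratings` is a list of lists: ratings[i] holds the labels that the raters
    assigned to item i (hypothetical layout chosen for this sketch).
    """
    n_items = len(ratings)
    if n_items == 0:
        raise ValueError("at least one item is required")
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(item) != n_raters for item in ratings):
        raise ValueError("every item needs the same number (>= 2) of ratings")

    counts = [Counter(item) for item in ratings]
    categories = sorted({label for item in ratings for label in item})

    # Observed agreement: mean over items of the fraction of agreeing rater pairs.
    per_item = [
        (sum(c[k] ** 2 for k in categories) - n_raters) / (n_raters * (n_raters - 1))
        for c in counts
    ]
    p_observed = sum(per_item) / n_items

    # Chance agreement: sum of squared overall category proportions.
    total = n_items * n_raters
    p_expected = sum((sum(c[k] for c in counts) / total) ** 2 for k in categories)

    if p_expected == 1.0:
        # Only one category was ever used, so kappa is undefined.
        return float("nan")
    return (p_observed - p_expected) / (1.0 - p_expected)
```

With 10 observers, each inner list would hold 10 labels for one image, for example `fleiss_kappa([["neutrophil"] * 9 + ["monocyte"], ...])`.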
Results
The Fleiss' kappa for all 10 observers was 0.608, indicating substantial agreement between observers. The accuracy of human observers was 95%, with sensitivity 72% and specificity 97%. The accuracy of the machine learning model was 95%, with sensitivity 71% and specificity 97%. The model performed similarly to humans across labels. Explainability metrics demonstrated that the machine learning model was able to differentiate between the cytoplasm and nucleus of the cells, and used these features to perform predictions.
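The abstract does not state how accuracy, sensitivity and specificity were aggregated across cell classes. One common approach, assumed here purely for illustration, is to score each class one-vs-rest and then average over classes; under that scheme the many true negatives per class can keep accuracy and specificity high even when sensitivity is lower. The function names and the cell labels below are hypothetical.

```python
def one_vs_rest_metrics(truth, predicted):
    """Per-class accuracy, sensitivity and specificity, scoring each class against the rest."""
    if len(truth) != len(predicted) or not truth:
        raise ValueError("truth and predicted must be non-empty and the same length")

    n = len(truth)
    per_class = {}
    for cls in sorted(set(truth) | set(predicted)):
        tp = sum(t == cls and p == cls for t, p in zip(truth, predicted))
        fn = sum(t == cls and p != cls for t, p in zip(truth, predicted))
        fp = sum(t != cls and p == cls for t, p in zip(truth, predicted))
        tn = n - tp - fn - fp
        per_class[cls] = {
            "accuracy": (tp + tn) / n,
            # A class with no positives (or no negatives) has an undefined rate.
            "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
            "specificity": tn / (tn + fp) if tn + fp else float("nan"),
        }
    return per_class


def macro_average(per_class, metric):
    """Unweighted mean of one metric over classes, skipping undefined (NaN) values."""
    values = [m[metric] for m in per_class.values() if m[metric] == m[metric]]
    return sum(values) / len(values) if values else float("nan")


# Toy example with made-up labels, not data from the study.
truth = ["neutrophil", "neutrophil", "lymphocyte", "monocyte"]
predicted = ["neutrophil", "lymphocyte", "lymphocyte", "monocyte"]
scores = one_vs_rest_metrics(truth, predicted)
```

In this toy example the neutrophil class has sensitivity 0.5 and specificity 1.0, which shows how a missed cell lowers sensitivity for its true class while lowering specificity only for the class it was mistaken for.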
Conclusion
The substantial, though not perfect, agreement between human observers highlights the inherent subjectivity in white blood cell morphologic assessment. A machine learning model performed similarly to human observers in single white blood cell identification. Further research is needed to compare human-level and machine learning performance in ways that more closely reflect the typical process of morphologic assessment.
About the journal:
European Journal of Haematology is an international journal for communication of basic and clinical research in haematology. The journal welcomes manuscripts on molecular, cellular and clinical research on diseases of the blood, vascular and lymphatic tissue, and on basic molecular and cellular research related to normal development and function of the blood, vascular and lymphatic tissue. The journal also welcomes reviews on clinical haematology and basic research, case reports, and clinical pictures.