Engineered feature embeddings meet deep learning: A novel strategy to improve bone marrow cell classification and model transparency

Jonathan Tarquino, Jhonathan Rodríguez, David Becerra, Lucia Roa-Peña, Eduardo Romero
Journal of Pathology Informatics, volume 15, Article 100390. Published 2024-07-03.
DOI: 10.1016/j.jpi.2024.100390
URL: https://www.sciencedirect.com/science/article/pii/S2153353924000294

Abstract

Cytomorphological evaluation of bone marrow cells is the initial step in diagnosing many hematological diseases. This assessment is still performed manually by trained specialists, which can create a bottleneck in the clinical workflow. Deep learning algorithms are a promising approach to automating bone marrow cell evaluation. However, existing models have focused on a limited number of cell subtypes, mainly those associated with a particular disease, and are frequently presented as black boxes. The strategy introduced here presents an engineered feature representation, the region-attention embedding, which improves deep learning classification performance for cytomorphology with 21 bone marrow cell subtypes. The embedding is built by arranging cytology features within a square matrix according to the pre-segmented cell regions they come from, i.e., cytoplasm, nucleus, and whole cell. This novel cell image representation, intended to preserve spatial/regional relations, is used as the input to the network. Combining the region-attention embedding with deep learning networks (Xception and ResNet50) provides local relevance associated with image regions, adding interpretable information to the prediction. The approach is evaluated on a public database with the largest number of cell subtypes (21) using three iterations of 3-fold cross-validation performed on 80% of the images (n = 89,484), followed by testing on an unseen set composed of the remaining 20% of the images (n = 22,371). This evaluation shows that the introduced strategy outperforms previously published approaches on an equivalent validation set, with an F1-score of 0.82, and achieves competitive results on the unseen data partition, with an F1-score of 0.56.
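
The evaluation protocol described in the abstract (an 80/20 hold-out split, then three iterations of 3-fold cross-validation on the 80% portion) can be sketched as follows. This is only an illustration of the splitting logic, not the authors' code: the function names `holdout_split` and `repeated_kfold`, the seeds, and the use of plain random shuffling are assumptions, and the abstract does not say whether the original splits were stratified by cell subtype.

```python
import random


def holdout_split(n_items, test_fraction=0.2, seed=0):
    """Shuffle item indices and set aside a held-out test portion."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    n_test = round(n_items * test_fraction)
    # Return (development indices, unseen test indices).
    return indices[n_test:], indices[:n_test]


def repeated_kfold(indices, n_folds=3, n_repeats=3, seed=0):
    """Yield (repeat, fold, train, validation) for repeated k-fold CV."""
    for repeat in range(n_repeats):
        shuffled = list(indices)
        # A different shuffle per repeat gives different fold memberships.
        random.Random(seed + repeat).shuffle(shuffled)
        folds = [shuffled[k::n_folds] for k in range(n_folds)]
        for k in range(n_folds):
            validation = folds[k]
            train = [i for j, fold in enumerate(folds) if j != k for i in fold]
            yield repeat, k, train, validation


# Image counts reported in the abstract: 89,484 (80%) and 22,371 (20%).
N_IMAGES = 89_484 + 22_371

dev, test = holdout_split(N_IMAGES)
runs = list(repeated_kfold(dev))

print(len(dev), len(test), len(runs))
```

Each of the nine runs trains on two thirds of the development images and validates on the remaining third; the held-out 20% is never touched until the final test. In practice a stratified splitter would likely be preferable for a 21-class problem with uneven class sizes, but that is a design choice the abstract does not confirm.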
