癌症复发的形状:拓扑数据分析预测儿科急性淋巴细胞白血病的复发

Salvador Chulián, Bernadette J. Stolz, Álvaro Martínez-Rubio, Cristina Blázquez Goñi, Juan F. Rodríguez Gutiérrez, Teresa Caballero Velázquez, Águeda Molinos Quintana, Manuel Ramírez Orellana, Ana Castillo Robleda, José Luis Fuster Soler, Alfredo Minguela Puras, María V. Martínez Sánchez, María Rosa, Víctor M. Pérez-García, Helen M. Byrne
{"title":"癌症复发的形状:拓扑数据分析预测儿科急性淋巴细胞白血病的复发","authors":"Salvador Chulián, Bernadette J. Stolz, Álvaro Martínez-Rubio, Cristina Blázquez Goñi, Juan F. Rodríguez Gutiérrez, Teresa Caballero Velázquez, Águeda Molinos Quintana, Manuel Ramírez Orellana, Ana Castillo Robleda, José Luis Fuster Soler, Alfredo Minguela Puras, María V. Martínez Sánchez, María Rosa, Víctor M. Pérez-García, Helen M. Byrne","doi":"10.1101/2021.12.22.21268233","DOIUrl":null,"url":null,"abstract":"Although children and adolescents with acute lymphoblastic leukaemia (ALL) have high survival rates, approximately 15-20% of patients relapse. Risk of relapse is routinely estimated at diagnosis by biological factors, including flow cytometry data. This high-dimensional data is typically manually assessed by projecting it onto a subset of biomarkers. Cell density and “empty spaces” in 2D projections of the data, i.e. regions devoid of cells, are then used for qualitative assessment. Here, we use topological data analysis (TDA), which quantifies shapes, including empty spaces, in data, to analyse pre-treatment ALL datasets with known patient outcomes. We combine these fully unsupervised analyses with Machine Learning (ML) to identify significant shape characteristics and demonstrate that they accurately predict risk of relapse, particularly for patients previously classified as ‘low risk’. We independently confirm the predictive power of CD10, CD20, CD38, and CD45 as biomarkers for ALL diagnosis. Based on our analyses, we propose three increasingly detailed prognostic pipelines for analysing flow cytometry data from ALL patients depending on technical and technological availability: 1. Visual inspection of specific biological features in biparametric projections of the data; 2. Computation of quantitative topological descriptors of such projections; 3. A combined analysis, using TDA and ML, in the four-parameter space defined by CD10, CD20, CD38 and CD45. Our analyses readily extend to other haematological malignancies.","PeriodicalId":501203,"journal":{"name":"medRxiv - Hematology","volume":"1 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2023-04-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"The shape of cancer relapse: Topological data analysis predicts recurrence in paediatric acute lymphoblastic leukaemia\",\"authors\":\"Salvador Chulián, Bernadette J. Stolz, Álvaro Martínez-Rubio, Cristina Blázquez Goñi, Juan F. Rodríguez Gutiérrez, Teresa Caballero Velázquez, Águeda Molinos Quintana, Manuel Ramírez Orellana, Ana Castillo Robleda, José Luis Fuster Soler, Alfredo Minguela Puras, María V. Martínez Sánchez, María Rosa, Víctor M. Pérez-García, Helen M. Byrne\",\"doi\":\"10.1101/2021.12.22.21268233\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Although children and adolescents with acute lymphoblastic leukaemia (ALL) have high survival rates, approximately 15-20% of patients relapse. Risk of relapse is routinely estimated at diagnosis by biological factors, including flow cytometry data. This high-dimensional data is typically manually assessed by projecting it onto a subset of biomarkers. Cell density and “empty spaces” in 2D projections of the data, i.e. regions devoid of cells, are then used for qualitative assessment. Here, we use topological data analysis (TDA), which quantifies shapes, including empty spaces, in data, to analyse pre-treatment ALL datasets with known patient outcomes. We combine these fully unsupervised analyses with Machine Learning (ML) to identify significant shape characteristics and demonstrate that they accurately predict risk of relapse, particularly for patients previously classified as ‘low risk’. We independently confirm the predictive power of CD10, CD20, CD38, and CD45 as biomarkers for ALL diagnosis. Based on our analyses, we propose three increasingly detailed prognostic pipelines for analysing flow cytometry data from ALL patients depending on technical and technological availability: 1. Visual inspection of specific biological features in biparametric projections of the data; 2. Computation of quantitative topological descriptors of such projections; 3. A combined analysis, using TDA and ML, in the four-parameter space defined by CD10, CD20, CD38 and CD45. Our analyses readily extend to other haematological malignancies.\",\"PeriodicalId\":501203,\"journal\":{\"name\":\"medRxiv - Hematology\",\"volume\":\"1 1\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2023-04-20\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"medRxiv - Hematology\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1101/2021.12.22.21268233\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"medRxiv - Hematology","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1101/2021.12.22.21268233","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

摘要

虽然患有急性淋巴细胞白血病(ALL)的儿童和青少年生存率很高,但约有15-20%的患者复发。复发的风险通常在诊断时通过生物因素来估计,包括流式细胞术数据。这种高维数据通常是通过将其投射到生物标志物的子集来手动评估的。细胞密度和数据的二维投影中的“空白空间”,即没有细胞的区域,然后用于定性评估。在这里,我们使用拓扑数据分析(TDA),量化数据中的形状,包括空白空间,来分析具有已知患者结果的治疗前ALL数据集。我们将这些完全无监督的分析与机器学习(ML)相结合,以识别重要的形状特征,并证明它们可以准确预测复发风险,特别是对于以前被归类为“低风险”的患者。我们独立证实了CD10、CD20、CD38和CD45作为ALL诊断的生物标志物的预测能力。基于我们的分析,根据技术和技术的可用性,我们提出了三个越来越详细的预测管道来分析来自ALL患者的流式细胞术数据:在数据的双参数投影中目视检查特定的生物特征;2. 这类投影的定量拓扑描述符的计算;3. 在CD10, CD20, CD38和CD45定义的四参数空间中,使用TDA和ML进行组合分析。我们的分析很容易扩展到其他血液系统恶性肿瘤。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
The shape of cancer relapse: Topological data analysis predicts recurrence in paediatric acute lymphoblastic leukaemia
Although children and adolescents with acute lymphoblastic leukaemia (ALL) have high survival rates, approximately 15-20% of patients relapse. Risk of relapse is routinely estimated at diagnosis by biological factors, including flow cytometry data. This high-dimensional data is typically manually assessed by projecting it onto a subset of biomarkers. Cell density and “empty spaces” in 2D projections of the data, i.e. regions devoid of cells, are then used for qualitative assessment. Here, we use topological data analysis (TDA), which quantifies shapes, including empty spaces, in data, to analyse pre-treatment ALL datasets with known patient outcomes. We combine these fully unsupervised analyses with Machine Learning (ML) to identify significant shape characteristics and demonstrate that they accurately predict risk of relapse, particularly for patients previously classified as ‘low risk’. We independently confirm the predictive power of CD10, CD20, CD38, and CD45 as biomarkers for ALL diagnosis. Based on our analyses, we propose three increasingly detailed prognostic pipelines for analysing flow cytometry data from ALL patients depending on technical and technological availability: 1. Visual inspection of specific biological features in biparametric projections of the data; 2. Computation of quantitative topological descriptors of such projections; 3. A combined analysis, using TDA and ML, in the four-parameter space defined by CD10, CD20, CD38 and CD45. Our analyses readily extend to other haematological malignancies.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Machine Learning Insights into HLA Noncoding Sequence Mismatches and Their Impact on DPB1 Matching in Hematopoietic Cell Transplantation Comparison of haemoglobin concentration measurements using HemoCue-301 and Sysmex XN-Series 1500: a survey among anaemic Gambian infants aged 6-12 months Detection of Common Deletion Mutations (− α3.7 and − α4.2 kb) in HBA gene and Genotype-Phenotype Correlation Multi-omic and functional screening reveal targetable vulnerabilities in TP53 mutated multiple myeloma Evaluating the Therapeutic Effects of Amino Acid Treatment on Vaso-Occlusive Pain in Sickle Cell Disease: A Systematic Review and Meta-Analysis Protocol
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1