Hongru Shen, Meng Yang, Jilei Liu, Kexin Chen, Xiangchun Li
{"title":"Development of a deep learning model for cancer diagnosis by inspecting cell-free DNA end-motifs","authors":"Hongru Shen, Meng Yang, Jilei Liu, Kexin Chen, Xiangchun Li","doi":"10.1038/s41698-024-00635-5","DOIUrl":null,"url":null,"abstract":"Accurate discrimination between patients with and without cancer from cfDNA is crucial for early cancer diagnosis. Herein, we develop and validate a deep-learning-based model entitled end-motif inspection via transformer (EMIT) for discriminating individuals with and without cancer by learning feature representations from cfDNA end-motifs. EMIT is a self-supervised learning approach that models rankings of cfDNA end-motifs. We include 4606 samples subjected to different types of cfDNA sequencing to develop EIMIT, and subsequently evaluate classification performance of linear projections of EMIT on six datasets and an additional inhouse testing set encopassing whole-genome, whole-genome bisulfite and 5-hydroxymethylcytosine sequencing. The linear projection of representations from EMIT achieved area under the receiver operating curve (AUROC) values ranged from 0.895 (0.835–0.955) to 0.996 (0.994–0.997) across these six datasets, outperforming its baseline by significant margins. Additionally, we showed that linear projection of EMIT representations can achieve an AUROC of 0.962 (0.914–1.0) in identification of lung cancer on an independent testing set subjected to whole-exome sequencing. The findings of this study indicate that a transformer-based deep learning model can learn cancer-discrimative representations from cfDNA end-motifs. The representations of this deep learning model can be exploited for discriminating patients with and without cancer.","PeriodicalId":19433,"journal":{"name":"NPJ Precision Oncology","volume":null,"pages":null},"PeriodicalIF":6.8000,"publicationDate":"2024-07-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.nature.com/articles/s41698-024-00635-5.pdf","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"NPJ Precision Oncology","FirstCategoryId":"3","ListUrlMain":"https://www.nature.com/articles/s41698-024-00635-5","RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"ONCOLOGY","Score":null,"Total":0}
引用次数: 0
Abstract
Accurate discrimination between patients with and without cancer from cfDNA is crucial for early cancer diagnosis. Herein, we develop and validate a deep-learning-based model entitled end-motif inspection via transformer (EMIT) for discriminating individuals with and without cancer by learning feature representations from cfDNA end-motifs. EMIT is a self-supervised learning approach that models rankings of cfDNA end-motifs. We include 4606 samples subjected to different types of cfDNA sequencing to develop EIMIT, and subsequently evaluate classification performance of linear projections of EMIT on six datasets and an additional inhouse testing set encopassing whole-genome, whole-genome bisulfite and 5-hydroxymethylcytosine sequencing. The linear projection of representations from EMIT achieved area under the receiver operating curve (AUROC) values ranged from 0.895 (0.835–0.955) to 0.996 (0.994–0.997) across these six datasets, outperforming its baseline by significant margins. Additionally, we showed that linear projection of EMIT representations can achieve an AUROC of 0.962 (0.914–1.0) in identification of lung cancer on an independent testing set subjected to whole-exome sequencing. The findings of this study indicate that a transformer-based deep learning model can learn cancer-discrimative representations from cfDNA end-motifs. The representations of this deep learning model can be exploited for discriminating patients with and without cancer.
期刊介绍:
Online-only and open access, npj Precision Oncology is an international, peer-reviewed journal dedicated to showcasing cutting-edge scientific research in all facets of precision oncology, spanning from fundamental science to translational applications and clinical medicine.