Development of a deep learning model for cancer diagnosis by inspecting cell-free DNA end-motifs

IF 6.8; Zone 1, Medicine; Q1 Oncology; NPJ Precision Oncology; Pub Date: 2024-07-27; DOI: 10.1038/s41698-024-00635-5
Hongru Shen, Meng Yang, Jilei Liu, Kexin Chen, Xiangchun Li

Abstract

Accurate discrimination between patients with and without cancer from cell-free DNA (cfDNA) is crucial for early cancer diagnosis. Herein, we develop and validate a deep-learning-based model entitled end-motif inspection via transformer (EMIT) for discriminating individuals with and without cancer by learning feature representations from cfDNA end-motifs. EMIT is a self-supervised learning approach that models rankings of cfDNA end-motifs. We include 4606 samples subjected to different types of cfDNA sequencing to develop EMIT, and subsequently evaluate the classification performance of linear projections of EMIT on six datasets and an additional in-house testing set encompassing whole-genome, whole-genome bisulfite and 5-hydroxymethylcytosine sequencing. The linear projection of representations from EMIT achieved area under the receiver operating characteristic curve (AUROC) values ranging from 0.895 (0.835–0.955) to 0.996 (0.994–0.997) across these six datasets, outperforming its baseline by significant margins. Additionally, we showed that a linear projection of EMIT representations can achieve an AUROC of 0.962 (0.914–1.0) in identifying lung cancer on an independent testing set subjected to whole-exome sequencing. The findings of this study indicate that a transformer-based deep learning model can learn cancer-discriminative representations from cfDNA end-motifs. The representations of this deep learning model can be exploited for discriminating patients with and without cancer.
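To make the phrase "rankings of cfDNA end-motifs" concrete, the following is a minimal sketch of how a per-sample motif ranking could be derived from fragment sequences. It is not the authors' pipeline: the abstract does not state the motif length or the preprocessing steps, so the 4-mer length, the function names and the toy fragments below are all assumptions made for illustration.

```python
from collections import Counter
from itertools import product

MOTIF_LENGTH = 4  # assumption: 4-mer end-motifs; the abstract does not state the length


def all_motifs(k=MOTIF_LENGTH):
    """Enumerate every possible k-mer over A, C, G, T in lexicographic order."""
    return ["".join(p) for p in product("ACGT", repeat=k)]


def end_motif_frequencies(fragments, k=MOTIF_LENGTH):
    """Return the relative frequency of each k-mer found at fragment 5' ends."""
    counts = Counter()
    for seq in fragments:
        motif = seq[:k].upper()
        # Skip fragments that are too short or start with ambiguous bases such as N.
        if len(motif) == k and set(motif) <= set("ACGT"):
            counts[motif] += 1
    total = sum(counts.values())
    return {m: (counts[m] / total if total else 0.0) for m in all_motifs(k)}


def rank_motifs(frequencies):
    """Order motifs from most to least frequent; ties are broken alphabetically."""
    return sorted(frequencies, key=lambda m: (-frequencies[m], m))


# Hypothetical toy fragments, not real cfDNA reads.
toy_fragments = ["CCCAGGT", "CCCATTA", "CCTGAAC", "AAAATTT", "CCCAGNT", "NCCA", "GG"]
freqs = end_motif_frequencies(toy_fragments)
ranking = rank_motifs(freqs)
print(ranking[:3])
```

Under these assumptions, each sample is reduced to an ordered list of 256 motifs, which is the kind of sequence a transformer could take as input. How EMIT actually tokenizes and learns from the rankings is described in the full paper, not in this abstract.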

Source journal: NPJ Precision Oncology
CiteScore: 9.90
Self-citation rate: 1.30%
Articles published: 87
Review time: 18 weeks
Journal description: Online-only and open access, npj Precision Oncology is an international, peer-reviewed journal dedicated to showcasing cutting-edge scientific research in all facets of precision oncology, spanning from fundamental science to translational applications and clinical medicine.