Multi-cancer early detection based on serum surface-enhanced Raman spectroscopy with deep learning: a large-scale case-control study.

BMC Medicine | IF 8.3 | Zone 1 (Medicine) | JCR Q1 (Medicine, General & Internal) | Pub Date: 2025-02-21 | DOI: 10.1186/s12916-025-03887-5
Yuxiang Lin, Qiyi Zhang, Hanxi Chen, Shuhang Liu, Kaiming Peng, Xiaojie Wang, Liyong Zhang, Jun Huang, Xiuqing Yan, Xueliang Lin, Uddin M D Hasan, Mahabub Sarwara, Fangmeng Fu, Shangyuan Feng, Chuan Wang
BMC Medicine 23(1): 97. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11846373/pdf/ Link: https://doi.org/10.1186/s12916-025-03887-5
Citations: 0

Multi-cancer early detection based on serum surface-enhanced Raman spectroscopy with deep learning: a large-scale case-control study.

Background: Early detection of cancer enables more effective treatment and leads to a better prognosis. Unfortunately, established cancer screening technologies are of limited use, especially for multi-cancer early detection. In this study, we describe a serum-based platform that integrates surface-enhanced Raman spectroscopy (SERS) with a resampling strategy, feature dimensionality enhancement, deep learning, and interpretability analysis for sensitive and accurate pan-cancer screening.

Methods: In total, 1655 early-stage cancer patients with breast cancer (BC, n = 569), lung cancer (LC, n = 513), thyroid cancer (TC, n = 220), colorectal cancer (CC, n = 215), gastric cancer (GC, n = 100), or esophageal cancer (EC, n = 38), and 1896 healthy controls (HC) were enrolled. Serum SERS spectra were obtained from each participant. Data dimension enhancement was performed by heatmap transformation and continuous wavelet transform (CWT). The dimension-enhanced SERS spectral data were then analyzed with a residual neural network (ResNet), a type of convolutional neural network (CNN). Class activation mapping (CAM) was used to elucidate the potential biological significance of the spectral data classification.
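The dimension-enhancement step named in the Methods can be illustrated with a small sketch of a continuous wavelet transform that turns a 1-D spectrum into a 2-D array an image-style CNN could consume. The abstract does not state which wavelet or scales were used; the Ricker (Mexican-hat) wavelet, the widths, and the function names `ricker` and `cwt_scalogram` below are assumptions chosen for illustration only, and the spectrum is synthetic.

```python
import numpy as np


def ricker(points: int, width: float) -> np.ndarray:
    """Ricker (Mexican-hat) wavelet sampled at `points` positions (assumed wavelet)."""
    t = np.arange(points) - (points - 1) / 2.0
    norm = 2.0 / (np.sqrt(3.0 * width) * np.pi ** 0.25)
    return norm * (1.0 - (t / width) ** 2) * np.exp(-(t ** 2) / (2.0 * width ** 2))


def cwt_scalogram(spectrum, widths) -> np.ndarray:
    """Convolve a 1-D spectrum with wavelets of several widths.

    Returns an array of shape (len(widths), len(spectrum)): one row per scale.
    """
    spectrum = np.asarray(spectrum, dtype=float)
    rows = []
    for w in widths:
        # Keep the wavelet no longer than the signal so mode="same"
        # returns exactly len(spectrum) samples.
        n = max(1, min(int(10 * w), spectrum.size))
        rows.append(np.convolve(spectrum, ricker(n, w), mode="same"))
    return np.vstack(rows)


# Synthetic single-peak "spectrum" (not real SERS data).
x = np.arange(200)
toy_spectrum = np.exp(-((x - 100) ** 2) / (2 * 4.0 ** 2))
image = cwt_scalogram(toy_spectrum, widths=[2, 4, 8])
print(image.shape)
```

Each row of the result responds most strongly where the spectrum has features of roughly that width, which is why stacking the rows gives a 2-D representation of a 1-D signal.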

Results: All participants were divided into a training set and a test set at a ratio of 7:3. BorderlineSMOTE was selected as the most appropriate resampling strategy, and the deep neural network (DNN) model performed well across all groups (accuracy: 93.15%, precision: 88.46%, recall: 85.68%, F1-score: 86.98%), with AUC values of 0.991 for HC, 0.995 for BC, 0.979 for LC, 0.996 for TC, 0.994 for CC, 0.982 for GC, and 0.941 for EC. Furthermore, SERS spectral data in heatmap form combined with ResNet also distinguished the categories effectively and made accurate predictions (accuracy: 94.75%, precision: 89.02%, recall: 86.97%, F1-score: 87.88%), with AUC values of 0.996 for HC, 0.995 for BC, 0.988 for LC, 0.999 for TC, 0.993 for CC, 0.985 for GC, and 0.940 for EC. Additionally, the CAM analysis revealed wavenumber ranges of the spectral data with strong activation.
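To make the reported multi-class figures easier to interpret, the sketch below computes accuracy together with precision, recall, and F1-score averaged over classes. The abstract does not say how the per-class values were combined; macro-averaging (an unweighted mean over classes) is an assumption here, and the function name `macro_metrics` and the toy labels are hypothetical, not study data.

```python
from collections import Counter


def macro_metrics(y_true, y_pred):
    """Accuracy plus macro-averaged precision, recall and F1 (assumed averaging)."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative

    precisions, recalls, f1s = [], [], []
    for c in labels:
        prec = tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
        rec = tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Toy labels for three of the study's class names (illustrative only).
toy_true = ["HC", "HC", "BC", "BC", "LC", "LC"]
toy_pred = ["HC", "BC", "BC", "BC", "LC", "HC"]
print(macro_metrics(toy_true, toy_pred))
```

With imbalanced classes such as those in this cohort (38 EC cases against 1896 HC), macro-averaged scores can sit well below accuracy, because every class counts equally regardless of size.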

Conclusions: Our study offers a highly effective serum SERS-based approach for multi-cancer early detection, which may shed new light on cancer screening in clinical practice.

Source journal: BMC Medicine (Medicine: General & Internal)
CiteScore: 13.10
Self-citation rate: 1.10%
Articles published: 435
Review time: 4-8 weeks
About the journal: BMC Medicine is an open access, transparent peer-reviewed general medical journal. It is the flagship journal of the BMC series and publishes outstanding and influential research in various areas including clinical practice, translational medicine, medical and health advances, public health, global health, policy, and general topics of interest to the biomedical and sociomedical professional communities. In addition to research articles, the journal also publishes stimulating debates, reviews, unique forum articles, and concise tutorials. All articles published in BMC Medicine are included in various databases such as Biological Abstracts, BIOSIS, CAS, Citebase, Current Contents, DOAJ, Embase, MEDLINE, PubMed, Science Citation Index Expanded, OAIster, SCImago, Scopus, SOCOLAR, and Zetoc.