Multi-cancer early detection based on serum surface-enhanced Raman spectroscopy with deep learning: a large-scale case-control study
Yuxiang Lin, Qiyi Zhang, Hanxi Chen, Shuhang Liu, Kaiming Peng, Xiaojie Wang, Liyong Zhang, Jun Huang, Xiuqing Yan, Xueliang Lin, Uddin M D Hasan, Mahabub Sarwara, Fangmeng Fu, Shangyuan Feng, Chuan Wang
BMC Medicine 23(1):97, published 2025-02-21. DOI: 10.1186/s12916-025-03887-5 (https://doi.org/10.1186/s12916-025-03887-5). Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11846373/pdf/
Abstract
Background: Early detection of cancer gives patients access to more effective treatments and leads to a better prognosis. Unfortunately, established cancer screening technologies are of limited use, especially for multi-cancer early detection. In this study, we describe a serum-based platform that integrates surface-enhanced Raman spectroscopy (SERS) with a resampling strategy, feature dimensionality enhancement, deep learning, and interpretability analysis for sensitive and accurate pan-cancer screening.
Methods: In total, 1655 early-stage cancer patients and 1896 healthy controls (HC) were enrolled. The patients had breast cancer (BC, n = 569), lung cancer (LC, n = 513), thyroid cancer (TC, n = 220), colorectal cancer (CC, n = 215), gastric cancer (GC, n = 100), or esophageal cancer (EC, n = 38). Serum SERS spectra were obtained from each participant. Data dimension enhancement was performed by heatmap transformation and continuous wavelet transform (CWT). The dimension-enhanced SERS spectral data were then analyzed with a residual neural network (ResNet), a type of convolutional neural network (CNN). Class activation mapping (CAM) was applied to elucidate the potential biological significance of the spectral data classification.
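The CWT step in the Methods turns a one-dimensional spectrum into a two-dimensional scale-by-position map that an image-style CNN can consume. The following is a minimal sketch of that idea in plain Python, not the authors' pipeline: the Ricker (Mexican hat) wavelet, the scale list, the zero padding, and the toy single-peak "spectrum" are all assumptions chosen for illustration, since the abstract does not state which wavelet or scales were used.

```python
import math

def ricker(t, scale):
    """Ricker (Mexican hat) wavelet at offset t; an assumed wavelet choice."""
    x = t / scale
    norm = 2.0 / (math.sqrt(3.0 * scale) * math.pi ** 0.25)
    return norm * (1.0 - x * x) * math.exp(-0.5 * x * x)

def cwt(signal, scales):
    """Return a scalogram with one row per scale and one column per sample."""
    n = len(signal)
    rows = []
    for s in scales:
        half = int(math.ceil(5 * s))  # the wavelet is negligible beyond ~5 scales
        kernel = [ricker(k, s) for k in range(-half, half + 1)]
        row = []
        for i in range(n):
            acc = 0.0
            for j, w in enumerate(kernel):
                idx = i + j - half
                if 0 <= idx < n:  # zero padding outside the signal
                    acc += signal[idx] * w
            row.append(acc)
        rows.append(row)
    return rows

# Hypothetical toy spectrum: a single Gaussian peak centered at index 60.
spectrum = [math.exp(-0.5 * ((i - 60) / 4.0) ** 2) for i in range(128)]
scales = [1, 2, 4, 8, 16]  # illustrative scales only
scalogram = cwt(spectrum, scales)

# Each row responds most strongly where the spectrum has a peak of matching width.
peak_index = max(range(len(spectrum)), key=lambda i: scalogram[2][i])
print(len(scalogram), len(scalogram[0]), peak_index)
```

The resulting 2D array can be rendered as an image and passed to a CNN such as ResNet. A real pipeline would more likely use a vectorized library routine than these nested loops.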
Results: All participants were divided into a training set and a test set at a ratio of 7:3. BorderlineSMOTE was selected as the most appropriate resampling strategy, and the deep neural network (DNN) model performed well across all groups (accuracy: 93.15%, precision: 88.46%, recall: 85.68%, F1-score: 86.98%), with AUC values of 0.991 for HC, 0.995 for BC, 0.979 for LC, 0.996 for TC, 0.994 for CC, 0.982 for GC, and 0.941 for EC. Furthermore, combining SERS spectral data in heatmap form with ResNet also distinguished the categories effectively and produced accurate predictions (accuracy: 94.75%, precision: 89.02%, recall: 86.97%, F1-score: 87.88%), with AUC values of 0.996 for HC, 0.995 for BC, 0.988 for LC, 0.999 for TC, 0.993 for CC, 0.985 for GC, and 0.940 for EC. Additionally, the CAM analysis highlighted wavenumber ranges of the spectral data that contributed strongly to classification.
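For a seven-class problem with very uneven group sizes (38 EC patients versus 1896 controls), how precision, recall, and F1 are averaged across classes matters. The abstract does not say which averaging was used. The sketch below assumes macro averaging (an unweighted mean of per-class scores) and uses a hypothetical three-class confusion matrix, not study data, to show how a small, poorly classified class pulls the macro scores below overall accuracy.

```python
def macro_metrics(confusion):
    """Compute accuracy and macro-averaged precision, recall, and F1.

    confusion[i][j] is the number of samples whose true class is i
    and whose predicted class is j.
    """
    k = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[c][c] for c in range(k))
    precisions, recalls, f1s = [], [], []
    for c in range(k):
        tp = confusion[c][c]
        predicted = sum(confusion[i][c] for i in range(k))  # column sum
        actual = sum(confusion[c])                          # row sum
        p = tp / predicted if predicted else 0.0
        r = tp / actual if actual else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)
    return {
        "accuracy": correct / total,
        "precision": sum(precisions) / k,
        "recall": sum(recalls) / k,
        "f1": sum(f1s) / k,  # mean of per-class F1 scores
    }

# Hypothetical toy data: two large classes and one small, harder class.
toy_confusion = [
    [90, 5, 5],
    [10, 80, 10],
    [2, 3, 5],
]
metrics = macro_metrics(toy_confusion)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Under this convention, the macro F1 is the mean of per-class F1 scores, which generally differs from the harmonic mean of the macro precision and macro recall.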
Conclusions: Our study offers a highly effective serum SERS-based approach for multi-cancer early detection, which might shed new light on cancer screening in clinical practice.
About the journal:
BMC Medicine is an open access, transparent peer-reviewed general medical journal. It is the flagship journal of the BMC series and publishes outstanding and influential research in various areas including clinical practice, translational medicine, medical and health advances, public health, global health, policy, and general topics of interest to the biomedical and sociomedical professional communities. In addition to research articles, the journal also publishes stimulating debates, reviews, unique forum articles, and concise tutorials. All articles published in BMC Medicine are included in various databases such as Biological Abstracts, BIOSIS, CAS, Citebase, Current contents, DOAJ, Embase, MEDLINE, PubMed, Science Citation Index Expanded, OAIster, SCImago, Scopus, SOCOLAR, and Zetoc.