Early prediction and risk stratification of ovarian cancer based on clinical data using machine learning approaches
Authors: Ting Gui, Dongyan Cao, Jiaxin Yang, Zhenhao Wei, Jiatong Xie, Wei Wang, Yang Xiang, Peng Peng
Journal: Journal of Gynecologic Oncology (impact factor 3.4; JCR Q1, Obstetrics & Gynecology)
Published: 2024-12-17
DOI: 10.3802/jgo.2025.36.e53 (https://doi.org/10.3802/jgo.2025.36.e53)
Objective: This study aimed to build a machine learning predictive model that brings forward the diagnosis of ovarian cancer.
Methods: We retrospectively analyzed patients presenting with a pelvic, adnexal, or ovarian mass. As many features potentially related to ovarian cancer as possible were collected. The best-performing machine learning algorithm was selected from six candidates by 5-fold cross-validation. The 20 features with the greatest predictive contribution were ranked using the SHapley Additive exPlanations (SHAP) method. A clinical validation was then performed to test whether the model could bring forward the diagnosis of ovarian cancer.
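The selection step described in the Methods can be illustrated with a minimal sketch of 5-fold cross-validation over competing candidates. This is not the authors' pipeline: the two candidates here (a majority-class baseline and a one-feature threshold rule), the synthetic data, and all function names are hypothetical stand-ins chosen so the example runs with the Python standard library alone. The abstract does not name the six candidates other than LightGBM.

```python
import random

def k_fold_indices(n, k, seed=0):
    """Yield (train_idx, test_idx) pairs for k shuffled, roughly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        train = [j for f_i, fold in enumerate(folds) if f_i != i for j in fold]
        yield train, folds[i]

def majority_fit(X, y):
    """Hypothetical baseline: always predict the most frequent training label."""
    label = max(set(y), key=y.count)
    return lambda row: label

def stump_fit(X, y):
    """Hypothetical candidate: threshold feature 0 midway between class means."""
    pos = [x[0] for x, t in zip(X, y) if t == 1]
    neg = [x[0] for x, t in zip(X, y) if t == 0]
    threshold = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2
    return lambda row: 1 if row[0] > threshold else 0

def cv_accuracy(fit, X, y, k=5):
    """Mean held-out accuracy of one candidate across k folds."""
    scores = []
    for train, test in k_fold_indices(len(X), k):
        model = fit([X[i] for i in train], [y[i] for i in train])
        correct = sum(model(X[i]) == y[i] for i in test)
        scores.append(correct / len(test))
    return sum(scores) / k

# Synthetic, balanced toy data (NOT from the study): one noisy feature
# whose mean differs between the two classes.
rng = random.Random(42)
y = [i % 2 for i in range(200)]
X = [[rng.gauss(2.0 if label else 0.0, 1.0)] for label in y]

candidates = {"majority": majority_fit, "stump": stump_fit}
results = {name: cv_accuracy(fit, X, y) for name, fit in candidates.items()}
best = max(results, key=results.get)
print(results, "best:", best)
```

Every candidate is scored on the same folds, so the comparison reflects the models and not a lucky split. In practice a library routine such as a stratified k-fold splitter would normally replace the hand-written fold generator.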
Results: A total of 9,799 patients were included. Inclusion criteria were age >18 years, a first diagnosis of pelvic/adnexal/ovarian mass of undetermined significance, and an available pathology report. After filtering, 438 feature dimensions remained. LightGBM performed best, with an accuracy of 88%. Of the top 20 features, 55% came from laboratory test reports, 35% from imaging examination reports, and 10% from basic demographics and the main symptom. Age, CA125, and the Risk of Ovarian Malignancy Algorithm (ROMA) score were the top three. The model performed stably on the testing and clinical validation datasets and identified ovarian cancer about 17 days before the clinical pathological examination.
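To make the feature figures concrete, the sketch below does two things. It ranks features by mean absolute attribution, a common way of turning per-sample SHAP values into a global ranking (the abstract does not state the exact aggregation used), and it converts the reported category shares of the top 20 into feature counts. The attribution matrix, the `wbc` feature, and the function names are invented for illustration and are not data from the study. Only the 55%/35%/10% shares and the total of 20 come from the abstract.

```python
def rank_by_mean_abs(attributions, names):
    """Rank features by mean absolute attribution across samples (descending)."""
    n = len(attributions)
    scores = {
        name: sum(abs(row[j]) for row in attributions) / n
        for j, name in enumerate(names)
    }
    return sorted(scores, key=scores.get, reverse=True)

def share_to_count(percent, total):
    """Convert a whole-number percentage of `total` into an integer count."""
    return percent * total // 100

# Toy per-sample attributions (rows = patients, columns = features).
# Values are invented for illustration only.
names = ["roma", "wbc", "age", "ca125"]
attributions = [
    [0.10, 0.01, 0.30, -0.20],
    [-0.15, 0.02, -0.40, 0.25],
    [0.05, -0.01, 0.35, -0.30],
]
ranking = rank_by_mean_abs(attributions, names)

# Shares reported in the abstract for the top 20 features.
top_k = 20
reported_shares = {"laboratory": 55, "imaging": 35, "demographics_symptom": 10}
counts = {cat: share_to_count(p, top_k) for cat, p in reported_shares.items()}
print(ranking)
print(counts)
```

Taking the absolute value before averaging matters because a feature that pushes predictions strongly in both directions would otherwise cancel itself out. The reported shares correspond to 11 laboratory, 7 imaging, and 2 demographic or symptom features.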
Conclusion: LightGBM was the best-performing algorithm for the predictive model, with an accuracy of 88%. Laboratory tests and imaging examinations played essential roles in diagnosing ovarian cancer. The model could bring forward the diagnosis of ovarian cancer ahead of clinical pathological examination.
About the journal:
The Journal of Gynecologic Oncology (JGO) is an official publication of the Asian Society of Gynecologic Oncology. Its abbreviated title is 'J Gynecol Oncol'. It was launched in 1990. The JGO's aim is to publish the highest-quality manuscripts dedicated to advancing the care of patients with gynecologic cancer. It is an international peer-reviewed journal published bimonthly (January, March, May, July, September, and November). Supplements are published occasionally. The journal publishes editorials, original and review articles, correspondence, book reviews, and other content.