An online clustering algorithm predicting model for prostate cancer based on PHI-related variables and PI-RADS in different PSA populations
Authors: Jiyuan Hu, Qi Miao, Jiayi Ren, Hongbo Su, Xianlu Zhang, Jianbin Bi, Gejun Zhang
Journal: Cancer Cell International, volume 25, issue 1, article 44 (impact factor 5.3; JCR Q1, Oncology)
Publication date: 2025-02-13
DOI: https://doi.org/10.1186/s12935-025-03677-2
Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11827463/pdf/
Citations: 0
Abstract
Background and aim: Prostate cancer is the most common male malignancy. Current diagnostic methods that rely on TPSA or PHI alone lack specificity. Some studies have developed nomograms for predicting risk, but these are not easily visualized. Our study aims to find the best negative predictive value (NPV) for PHI, then build a clustering model that displays prostate cancer risk categories, is particularly useful for patients with PSA > 20, and can be applied in clinical practice.
Method: We included 708 patients in the training cohort and 143 in the validation cohort and divided them into three groups based on PSA level. We then determined optimal and customized PHI cut-off values, calculated NPV and positive predictive value (PPV), and selected logistic regression as the best-performing method among several machine-learning algorithms. We subsequently identified the significant variables and constructed a clustering algorithm. Finally, the model was validated and made available online for clinical use.
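As a rough illustration of the cut-off step, the sketch below computes NPV and PPV for a single PHI threshold. It is not the authors' code: the function name `npv_ppv`, the rule that PHI at or above the cut-off counts as test-positive, and the sample values are all assumptions made for the example.

```python
from typing import Sequence, Tuple


def npv_ppv(phi: Sequence[float], cancer: Sequence[bool], cutoff: float) -> Tuple[float, float]:
    """Return (NPV, PPV), treating PHI >= cutoff as test-positive (an assumed convention)."""
    tp = fp = tn = fn = 0
    for value, has_cancer in zip(phi, cancer):
        predicted_positive = value >= cutoff
        if predicted_positive and has_cancer:
            tp += 1
        elif predicted_positive:
            fp += 1
        elif has_cancer:
            fn += 1
        else:
            tn += 1
    # NPV = TN / (TN + FN); PPV = TP / (TP + FP). Return NaN when a denominator is zero.
    npv = tn / (tn + fn) if (tn + fn) else float("nan")
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    return npv, ppv


# Made-up illustrative values, not data from the study.
phi_values = [18.0, 22.5, 30.1, 45.0, 60.2, 80.7, 20.3, 150.0]
biopsy_positive = [False, False, False, True, True, True, True, True]
npv, ppv = npv_ppv(phi_values, biopsy_positive, cutoff=24.35)
print(f"NPV={npv:.3f}, PPV={ppv:.3f}")
```

Sweeping `cutoff` over a grid and keeping the value with the highest NPV (or PPV) is one plausible way to arrive at lower and upper limits, although the abstract does not say how the optimal values were chosen.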
Results: The optimal lower PHI cut-offs for the PSA > 4, PSA 4-20, and PSA > 20 subgroups were 23.85, 24.35, and 40.75, with upper cut-offs of 142.9, 143, and 135.6, respectively. For the PSA > 4 and PSA 4-20 subgroups, the clustering model built on the optimal cohort achieved higher Silhouette coefficients (0.433 and 0.526) than the model built on the customized PHI cohort (0.432 and 0.452). The PSA > 20 subgroup had the highest Silhouette coefficient, 0.572. In the validation cohort, the AUC values for the three subgroups were 0.761, 0.823, and 0.833, with accuracies of 88.81%, 90.38%, and 82.05%.
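For readers unfamiliar with the Silhouette coefficient reported above, here is a minimal pure-Python sketch of the standard definition: for each point, s = (b - a) / max(a, b), where a is the mean distance to its own cluster and b is the mean distance to the nearest other cluster. The function name `mean_silhouette`, the Euclidean metric, and the toy points are assumptions; the abstract does not state which distance or clustering method was used.

```python
import math
from typing import Dict, List, Sequence


def mean_silhouette(points: Sequence[Sequence[float]], labels: Sequence[int]) -> float:
    """Mean Silhouette coefficient over all points, using Euclidean distance."""
    clusters: Dict[int, List[int]] = {}
    for index, label in enumerate(labels):
        clusters.setdefault(label, []).append(index)
    if len(clusters) < 2:
        raise ValueError("the Silhouette coefficient needs at least two clusters")

    scores = []
    for i, point in enumerate(points):
        own = clusters[labels[i]]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster
        a = sum(math.dist(point, points[j]) for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(math.dist(point, points[j]) for j in members) / len(members)
            for label, members in clusters.items()
            if label != labels[i]
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


# Toy two-cluster example with made-up coordinates, not study data.
toy_points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
toy_labels = [0, 0, 1, 1]
score = mean_silhouette(toy_points, toy_labels)
print(f"mean Silhouette = {score:.3f}")
```

Values near 1 indicate compact, well-separated clusters and values near 0 indicate overlapping ones, which is the scale on which the reported 0.433 to 0.572 should be read.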
Conclusion: Our clustering model categorizes patients into distinct risk groups with clear visualization and showed stability and reliability in the validation cohort, potentially aiding early diagnosis of prostate cancer in clinical practice.
About the journal:
Cancer Cell International publishes articles on all aspects of cancer cell biology, originating largely from, but not limited to, work using cell culture techniques.
The journal focuses on novel cancer studies reporting data from biological experiments performed on cells grown in vitro, in two- or three-dimensional systems, and/or in vivo (animal experiments). These types of experiments have provided crucial data in many fields, from cell proliferation and transformation, to epithelial-mesenchymal interaction, to apoptosis, and host immune response to tumors.
Cancer Cell International also considers articles that focus on novel technologies or novel pathways in molecular analysis and on epidemiological studies that may affect patient care, as well as articles reporting translational cancer research studies where in vitro discoveries are bridged to the clinic. As such, the journal is interested in laboratory and animal studies reporting on novel biomarkers of tumor progression and response to therapy and on their applicability to human cancers.