Decision tree-based data mining approach for the evaluation of survival in primary malignant bone tumors: A surveillance, epidemiology and end results database study.
Dilek Yapar, Aliekber Yapar, Mehmet Ali Tokgöz, Uğur Bilge
Journal of Orthopaedic Surgery, volume 31, issue 2, article 10225536231189780. Published 2023-05-01. DOI: https://doi.org/10.1177/10225536231189780
Abstract
Purpose: This large-scale, population-based study aimed to describe the epidemiological characteristics of primary malignant bone tumors (PMBTs) and to determine their prognostic factors by using classical statistical methods and data mining methods side by side.
Methods: Patients were extracted from the National Cancer Institute's Surveillance, Epidemiology and End Results (SEER) database: "Incidence-SEER Research Data, 18 Registries, Nov 2020 Sub". Patients with unclassified or incomplete information were excluded. This selection procedure yielded a dataset of 6234 cases. Survival analyses were performed with Kaplan-Meier curves and the log-rank test. Multivariate Cox regression analysis was used to determine the independent prognostic factors of PMBT. A decision tree-based data mining technique was used to confirm the prognostic factors.
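As an illustration of the Kaplan-Meier step named in the Methods, the following is a minimal sketch of the product-limit estimator in plain Python. The function names `kaplan_meier` and `median_survival` and the toy follow-up data are assumptions made for this example; they are not the authors' code and not SEER records.

```python
from collections import Counter

def kaplan_meier(durations, events):
    """Return [(time, survival_probability)] at each distinct death time.

    durations: follow-up time per patient (e.g. months)
    events:    1 if death was observed, 0 if the patient was censored
    """
    deaths = Counter(t for t, e in zip(durations, events) if e == 1)
    exits = Counter(durations)  # everyone leaving the risk set at time t
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d > 0:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]  # remove deaths and censored patients
    return curve

def median_survival(curve):
    """First time at which estimated survival falls to 0.5 or below."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None  # median not reached within follow-up

# Synthetic toy data for illustration only (hypothetical, not from the study).
toy_months = [5, 8, 12, 17, 17, 20, 24, 30, 36, 40]
toy_events = [1, 1, 0, 1, 1, 1, 0, 1, 0, 0]

curve = kaplan_meier(toy_months, toy_events)
for t, s in curve:
    print(f"month {t}: S(t) = {s:.3f}")
print("median survival:", median_survival(curve))
```

Censored patients reduce the number at risk without contributing a death, which is why the estimator handles incomplete follow-up that a simple proportion of survivors would not.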
Results: The 5-year survival rate was 63.6% and the 10-year survival rate was 55.3% in patients with PMBT. Sex, age, median household income, histology, primary site, grade, stage, metastasis, and the total number of malignant tumors were identified as independent risk factors associated with overall survival (OS) in the multivariate Cox regression analysis. The decision tree (DT) had five terminal nodes, defined by stage, age, and grade. Stage was the most important determinant of vital status. The terminal node with the fewest surviving patients comprised 1102 patients with distant stage, of whom 801 (72.3%) died; the hazard ratio was 5.4 (95% CI: 4.9-5.9; p < .001). These patients had a median survival of only 17 months.
Conclusions: Rules extracted from DTs provide information about risk factors in specific patient groups and can support clinicians making decisions about individual patients. We recommend using DTs in combination with Cox regression analysis to determine risk factors and their effect on survival.
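To make the idea of a rule extracted from a decision tree concrete, here is a small sketch of how a single split can be chosen and read as a rule. The abstract does not state which tree algorithm or splitting criterion was used, so Gini impurity (as in CART) is an assumption here, as are the helper names `gini` and `best_binary_split` and the twelve synthetic patient rows.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels (1 = died, 0 = alive)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)

def best_binary_split(rows, feature, target):
    """Try each 'feature == value' vs. rest split.

    Returns (value, weighted_gini) for the split with the lowest
    weighted impurity across the two child nodes.
    """
    best = None
    for value in sorted({r[feature] for r in rows}):
        left = [r[target] for r in rows if r[feature] == value]
        right = [r[target] for r in rows if r[feature] != value]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
        if best is None or score < best[1]:
            best = (value, score)
    return best

# Synthetic toy cohort (hypothetical, not SEER data): stage and vital status.
toy_rows = (
    [{"stage": "distant", "died": d} for d in (1, 1, 1, 0)]
    + [{"stage": "regional", "died": d} for d in (1, 0, 0, 0)]
    + [{"stage": "localized", "died": d} for d in (0, 0, 0, 0)]
)

value, score = best_binary_split(toy_rows, "stage", "died")
node = [r["died"] for r in toy_rows if r["stage"] == value]
death_rate = sum(node) / len(node)
print(f"IF stage == {value!r} THEN death rate = {death_rate:.0%} "
      f"(n = {len(node)}, weighted Gini = {score:.3f})")
```

A full tree repeats this search inside each child node, over every candidate variable, until a stopping rule is met; each path from root to terminal node then reads as one such IF-THEN rule.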
About the journal:
Journal of Orthopaedic Surgery is an open access peer-reviewed journal publishing original reviews and research articles on all aspects of orthopaedic surgery. It is the official journal of the Asia Pacific Orthopaedic Association.
The journal welcomes and will publish materials of a diverse nature, from basic science research to clinical trials and surgical techniques. The journal encourages contributions from all parts of the world, but special emphasis is given to research of particular relevance to the Asia Pacific region.