Yuan Cao, Yixian Yang, Yunchao Chen, Mengqi Luan, Yan Hu, Lu Zhang, Weiwei Zhan, Wei Zhou
Frontiers in Endocrinology, volume 15, article 1366687. Published 2024-11-06. DOI: 10.3389/fendo.2024.1366687 (https://doi.org/10.3389/fendo.2024.1366687). PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11576180/pdf/
Optimizing thyroid AUS nodules malignancy prediction: a comprehensive study of logistic regression and machine learning models.
Background: The accurate diagnosis of thyroid nodules with indeterminate cytology, particularly in the atypia of undetermined significance (AUS) category, remains challenging. This study aims to predict the risk of malignancy in AUS nodules by comparing two machine learning (ML) and three conventional logistic regression (LR) models.
Methods: A retrospective study was conducted on 356 AUS nodules in 342 individuals, drawn from 6728 patients who underwent thyroid surgery in 2021. All clinical, ultrasonographic, and molecular data were collected and randomly divided into training and validation cohorts at a ratio of 7:3. ML (random forest and XGBoost) and LR (lasso regression, best subset selection, and backward stepwise regression) models were constructed and evaluated using area under the curve (AUC), calibration, and clinical utility metrics.
Results: Approximately 90% (321/356) of the AUS nodules were malignant, predominantly papillary thyroid carcinoma with 68.6% BRAF V600E mutations. The final LR prediction model, based on backward stepwise regression, exhibited superior discrimination, with AUC values of 0.83 (95% CI: 0.73-0.92) and 0.80 (95% CI: 0.67-0.94) in training and validation, respectively. Good calibration and clinical utility were also confirmed. The ML models showed moderate performance. A nomogram was developed based on the final LR model.
Conclusions: The LR model developed using backward stepwise regression outperformed the ML models in predicting malignancy in AUS thyroid nodules. The corresponding nomogram based on this model provides a valuable and practical tool for personalized risk assessment, potentially reducing unnecessary surgeries and enhancing clinical decision-making.
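The evaluation described in the Methods (a 7:3 split followed by an AUC with a 95% CI) can be illustrated with a minimal sketch. This is not the authors' code: the abstract says only that the data were "randomly separated", so the per-class (stratified) split, the percentile bootstrap for the interval, the function names, and the synthetic scores below are all assumptions made for illustration. Only the class counts (321 malignant, 35 benign) come from the abstract.

```python
import random


def stratified_split(labels, train_frac=0.7, seed=0):
    """Split indices into train/validation, keeping the class ratio (assumed design)."""
    rng = random.Random(seed)
    train, valid = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        cut = round(train_frac * len(idx))
        train.extend(idx[:cut])
        valid.extend(idx[cut:])
    return sorted(train), sorted(valid)


def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_boot=200, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (an assumed method, not from the abstract)."""
    if len(set(labels)) < 2:
        raise ValueError("need both classes to bootstrap an AUC")
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # resample contained a single class; draw again
        stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Synthetic stand-in data: class counts follow the abstract, scores are invented.
rng = random.Random(42)
labels = [1] * 321 + [0] * 35
scores = [rng.gauss(1.0 if y == 1 else 0.0, 1.0) for y in labels]

train_idx, valid_idx = stratified_split(labels)
valid_labels = [labels[i] for i in valid_idx]
valid_scores = [scores[i] for i in valid_idx]

point = auc(valid_labels, valid_scores)
lower, upper = bootstrap_auc_ci(valid_labels, valid_scores)
print(f"validation n={len(valid_idx)}, AUC={point:.2f} (95% CI: {lower:.2f}-{upper:.2f})")
```

With only 35 benign nodules, a 30% validation cohort holds roughly ten negative cases, which is one plausible reason the reported validation interval (0.67-0.94) is wide. Keeping the class ratio fixed in the split, as assumed here, at least guarantees that both classes appear in each cohort.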
About the journal:
Frontiers in Endocrinology is a field journal of the "Frontiers in" journal series.
In today’s world, endocrinology is becoming increasingly important as it underlies many of the challenges societies face - from obesity and diabetes to reproduction, population control and aging. Endocrinology covers a broad field from basic molecular and cellular communication through to clinical care and some of the most crucial public health issues. The journal, thus, welcomes outstanding contributions in any domain of endocrinology.
Frontiers in Endocrinology publishes articles on the most outstanding discoveries across a wide research spectrum of Endocrinology. The mission of Frontiers in Endocrinology is to bring all relevant Endocrinology areas together on a single platform.