Enhancing diagnostic accuracy in symptom-based health checkers: a comprehensive machine learning approach with clinical vignettes and benchmarking.

IF 3 Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Frontiers in Artificial Intelligence Pub Date : 2024-10-01 eCollection Date: 2024-01-01 DOI:10.3389/frai.2024.1397388

Leila Aissaoui Ferhi, Manel Ben Amar, Fethi Choubani, Ridha Bouallegue

{"title":"Enhancing diagnostic accuracy in symptom-based health checkers: a comprehensive machine learning approach with clinical vignettes and benchmarking.","authors":"Leila Aissaoui Ferhi, Manel Ben Amar, Fethi Choubani, Ridha Bouallegue","doi":"10.3389/frai.2024.1397388","DOIUrl":null,"url":null,"abstract":"Introduction: The development of machine learning models for symptom-based health checkers is a rapidly evolving area with significant implications for healthcare. Accurate and efficient diagnostic tools can enhance patient outcomes and optimize healthcare resources. This study focuses on evaluating and optimizing machine learning models using a dataset of 10 diseases and 9,572 samples.Methods: The dataset was divided into training and testing sets to facilitate model training and evaluation. The following models were selected and optimized: Decision Tree, Random Forest, Naive Bayes, Logistic Regression and K-Nearest Neighbors. Evaluation metrics included accuracy, F1 scores, and 10-fold cross-validation. ROC-AUC and precision-recall curves were also utilized to assess model performance, particularly in scenarios with imbalanced datasets. Clinical vignettes were employed to gauge the real-world applicability of the models.Results: The performance of the models was evaluated using accuracy, F1 scores, and 10-fold cross-validation. The use of ROC-AUC curves revealed that model performance improved with increasing complexity. Precision-recall curves were particularly useful in evaluating model sensitivity in imbalanced dataset scenarios. Clinical vignettes demonstrated the robustness of the models in providing accurate diagnoses.Discussion: The study underscores the importance of comprehensive model evaluation techniques. The use of clinical vignette testing and analysis of ROC-AUC and precision-recall curves are crucial in ensuring the reliability and sensitivity of symptom-based health checkers. These techniques provide a more nuanced understanding of model performance and highlight areas for further improvement.Conclusion: This study highlights the significance of employing diverse evaluation metrics and methods to ensure the robustness and accuracy of machine learning models in symptom-based health checkers. The integration of clinical vignettes and the analysis of ROC-AUC and precision-recall curves are essential steps in developing reliable and sensitive diagnostic tools.","PeriodicalId":33315,"journal":{"name":"Frontiers in Artificial Intelligence","volume":"7 ","pages":"1397388"},"PeriodicalIF":3.0000,"publicationDate":"2024-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11483353/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Frontiers in Artificial Intelligence","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.3389/frai.2024.1397388","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2024/1/1 0:00:00","PubModel":"eCollection","JCR":"Q2","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}

引用次数: 0

Abstract

Introduction: The development of machine learning models for symptom-based health checkers is a rapidly evolving area with significant implications for healthcare. Accurate and efficient diagnostic tools can enhance patient outcomes and optimize healthcare resources. This study focuses on evaluating and optimizing machine learning models using a dataset of 10 diseases and 9,572 samples.

Methods: The dataset was divided into training and testing sets to facilitate model training and evaluation. The following models were selected and optimized: Decision Tree, Random Forest, Naive Bayes, Logistic Regression and K-Nearest Neighbors. Evaluation metrics included accuracy, F1 scores, and 10-fold cross-validation. ROC-AUC and precision-recall curves were also utilized to assess model performance, particularly in scenarios with imbalanced datasets. Clinical vignettes were employed to gauge the real-world applicability of the models.

Results: The performance of the models was evaluated using accuracy, F1 scores, and 10-fold cross-validation. The use of ROC-AUC curves revealed that model performance improved with increasing complexity. Precision-recall curves were particularly useful in evaluating model sensitivity in imbalanced dataset scenarios. Clinical vignettes demonstrated the robustness of the models in providing accurate diagnoses.

Discussion: The study underscores the importance of comprehensive model evaluation techniques. The use of clinical vignette testing and analysis of ROC-AUC and precision-recall curves are crucial in ensuring the reliability and sensitivity of symptom-based health checkers. These techniques provide a more nuanced understanding of model performance and highlight areas for further improvement.

Conclusion: This study highlights the significance of employing diverse evaluation metrics and methods to ensure the robustness and accuracy of machine learning models in symptom-based health checkers. The integration of clinical vignettes and the analysis of ROC-AUC and precision-recall curves are essential steps in developing reliable and sensitive diagnostic tools.

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

提高基于症状的健康检查器的诊断准确性：利用临床案例和基准的综合机器学习方法。

简介为基于症状的健康检查器开发机器学习模型是一个快速发展的领域，对医疗保健具有重大意义。准确高效的诊断工具可以提高患者的治疗效果，优化医疗资源。本研究的重点是使用包含 10 种疾病和 9,572 个样本的数据集评估和优化机器学习模型：方法：将数据集分为训练集和测试集，以便于模型的训练和评估。选择并优化了以下模型：决策树、随机森林、奈夫贝叶斯、逻辑回归和 K-近邻。评估指标包括准确率、F1 分数和 10 倍交叉验证。此外，还利用 ROC-AUC 和精度-召回曲线来评估模型性能，尤其是在数据集不平衡的情况下。此外，还采用了临床案例来衡量模型在现实世界中的适用性：结果：使用准确率、F1 分数和 10 倍交叉验证评估了模型的性能。使用 ROC-AUC 曲线显示，模型性能随着复杂度的增加而提高。精确度-召回曲线对评估不平衡数据集情况下的模型灵敏度特别有用。临床案例证明了模型在提供准确诊断方面的稳健性：本研究强调了综合模型评估技术的重要性。讨论：该研究强调了综合模型评估技术的重要性，使用临床小样本测试以及 ROC-AUC 和精确度-召回曲线分析对于确保基于症状的健康检查器的可靠性和灵敏度至关重要。这些技术能更细致地了解模型的性能，并突出需要进一步改进的地方：本研究强调了采用不同的评估指标和方法来确保基于症状的健康检查器中机器学习模型的稳健性和准确性的重要性。在开发可靠、灵敏的诊断工具时，整合临床案例、分析 ROC-AUC 和精确度-召回曲线是必不可少的步骤。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文去求助

来源期刊