An evaluation of machine learning approaches for early diagnosis of autism spectrum disorder

Rownak Ara Rasul, Promy Saha, Diponkor Bala, S.M. Rakib Ul Karim, Md. Ibrahim Abdullah, Bishwajit Saha
Journal: Healthcare analytics (New York, N.Y.)
DOI: 10.1016/j.health.2023.100293
Publication date: 2024-01-04
Publication type: Journal Article
Source: https://www.sciencedirect.com/science/article/pii/S2772442523001600
Citations: 0

Abstract

Autism spectrum disorder (ASD) is a neurological disease characterized by difficulties with social interaction and communication and by repetitive activities. While its primary origin lies in genetics, early detection is crucial, and machine learning offers a promising avenue for faster and more cost-effective diagnosis. This study employs diverse machine learning methods to identify crucial ASD traits, aiming to enhance and automate the diagnostic process. We study eight state-of-the-art classification models to determine their effectiveness in ASD detection. We evaluate the models using accuracy, precision, recall, specificity, F1-score, area under the curve (AUC), kappa, and log loss to find the best classifier for these binary datasets. For the children dataset, the SVM and LR models achieve the highest accuracy, 100%; for the adult dataset, the LR model achieves the highest accuracy, 97.14%. Our proposed ANN model provides the highest accuracy, 94.24%, on the new combined dataset when hyperparameters are precisely tuned for each model. Because almost all classification models, which rely on true labels, achieve high accuracy, we also examine five popular clustering algorithms to understand model behavior in scenarios without true labels. We calculate Normalized Mutual Information (NMI), Adjusted Rand Index (ARI), and Silhouette Coefficient (SC) to select the best clustering models. Our evaluation finds that spectral clustering outperforms all other benchmark clustering models in terms of NMI and ARI while remaining comparable to the best SC, which is achieved by k-means. The implemented code is available on GitHub.
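
To make the classification metrics named in the abstract concrete, the following is a minimal sketch of how they can be computed for a binary task using only the Python standard library. It is not the authors' implementation (their code is on GitHub); the function name `binary_metrics`, the 0.5 decision threshold, and the toy labels and probabilities are all assumptions made for illustration.

```python
import math


def binary_metrics(y_true, y_prob, threshold=0.5):
    """Return common binary-classification metrics as a dict.

    y_true: non-empty list of 0/1 labels.
    y_prob: predicted probability of class 1 for each sample.
    threshold: hypothetical cut-off used to turn probabilities into labels.
    """
    y_pred = [1 if p >= threshold else 0 for p in y_prob]
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    n = len(y_true)

    def ratio(num, den):
        # Guard against empty denominators (e.g. no predicted positives).
        return num / den if den else 0.0

    accuracy = ratio(tp + tn, n)
    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)          # also called sensitivity
    specificity = ratio(tn, tn + fp)
    f1 = ratio(2 * precision * recall, precision + recall)

    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_e = ratio((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp), n * n)
    kappa = ratio(accuracy - p_e, 1 - p_e)

    # Log loss, with probabilities clipped away from 0 and 1.
    eps = 1e-15
    clipped = [min(max(p, eps), 1 - eps) for p in y_prob]
    log_loss = -sum(
        t * math.log(p) + (1 - t) * math.log(1 - p)
        for t, p in zip(y_true, clipped)
    ) / n

    # AUC as the fraction of (positive, negative) pairs ranked correctly,
    # counting ties as half a correct pair.
    pos = [p for t, p in zip(y_true, y_prob) if t == 1]
    neg = [p for t, p in zip(y_true, y_prob) if t == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0
               for a in pos for b in neg)
    auc = ratio(wins, len(pos) * len(neg))

    return {
        "accuracy": accuracy, "precision": precision, "recall": recall,
        "specificity": specificity, "f1": f1, "kappa": kappa,
        "log_loss": log_loss, "auc": auc,
    }


# Toy data, invented purely for illustration (not from the study).
toy_labels = [1, 1, 1, 0, 0, 0, 1, 0]
toy_probs = [0.9, 0.8, 0.4, 0.2, 0.1, 0.6, 0.7, 0.3]
toy_metrics = binary_metrics(toy_labels, toy_probs)
```

Accuracy alone can look strong on an imbalanced screening dataset, which is one reason a study like this reports specificity, kappa, and AUC alongside it.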

Source journal: Healthcare analytics (New York, N.Y.)
Subject areas: Applied Mathematics, Modelling and Simulation, Nursing and Health Professions (General)
CiteScore: 4.40
Self-citation rate: 0.00%
Articles published: 0
Review time: 79 days