Rownak Ara Rasul , Promy Saha , Diponkor Bala , S.M. Rakib Ul Karim , Md. Ibrahim Abdullah , Bishwajit Saha
{"title":"An evaluation of machine learning approaches for early diagnosis of autism spectrum disorder","authors":"Rownak Ara Rasul , Promy Saha , Diponkor Bala , S.M. Rakib Ul Karim , Md. Ibrahim Abdullah , Bishwajit Saha","doi":"10.1016/j.health.2023.100293","DOIUrl":null,"url":null,"abstract":"<div><p>Autistic Spectrum Disorder (ASD) is a neurological disease characterized by difficulties with social interaction, communication, and repetitive activities. While its primary origin lies in genetics, early detection is crucial, and leveraging machine learning offers a promising avenue for a faster and more cost-effective diagnosis. This study employs diverse machine learning methods to identify crucial ASD traits, aiming to enhance and automate the diagnostic process. We study eight state-of-the-art classification models to determine their effectiveness in ASD detection. We evaluate the models using accuracy, precision, recall, specificity, F1-score, area under the curve (AUC), kappa, and log loss metrics to find the best classifier for these binary datasets. Among all the classification models, for the children dataset, the SVM and LR models achieve the highest accuracy of 100% and for the adult dataset, the LR model produces the highest accuracy of 97.14%. Our proposed ANN model provides the highest accuracy of 94.24% for the new combined dataset when hyperparameters are precisely tuned for each model. As almost all classification models achieve high accuracy which utilize true labels, we become interested in delving into five popular clustering algorithms to understand model behavior in scenarios without true labels. We calculate Normalized Mutual Information (NMI), Adjusted Rand Index (ARI), and Silhouette Coefficient (SC) metrics to select the best clustering models. Our evaluation finds that spectral clustering outperforms all other benchmarking clustering models in terms of NMI and ARI metrics while demonstrating comparability to the optimal SC achieved by k-means. The implemented code is available at <span>GitHub</span><svg><path></path></svg>.</p></div>","PeriodicalId":73222,"journal":{"name":"Healthcare analytics (New York, N.Y.)","volume":"5 ","pages":"Article 100293"},"PeriodicalIF":0.0000,"publicationDate":"2024-01-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2772442523001600/pdfft?md5=e0fd6cd67baa47c33181f21a1d4a70e4&pid=1-s2.0-S2772442523001600-main.pdf","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Healthcare analytics (New York, N.Y.)","FirstCategoryId":"1085","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S2772442523001600","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0
Abstract
Autistic Spectrum Disorder (ASD) is a neurological disease characterized by difficulties with social interaction, communication, and repetitive activities. While its primary origin lies in genetics, early detection is crucial, and leveraging machine learning offers a promising avenue for a faster and more cost-effective diagnosis. This study employs diverse machine learning methods to identify crucial ASD traits, aiming to enhance and automate the diagnostic process. We study eight state-of-the-art classification models to determine their effectiveness in ASD detection. We evaluate the models using accuracy, precision, recall, specificity, F1-score, area under the curve (AUC), kappa, and log loss metrics to find the best classifier for these binary datasets. Among all the classification models, for the children dataset, the SVM and LR models achieve the highest accuracy of 100% and for the adult dataset, the LR model produces the highest accuracy of 97.14%. Our proposed ANN model provides the highest accuracy of 94.24% for the new combined dataset when hyperparameters are precisely tuned for each model. As almost all classification models achieve high accuracy which utilize true labels, we become interested in delving into five popular clustering algorithms to understand model behavior in scenarios without true labels. We calculate Normalized Mutual Information (NMI), Adjusted Rand Index (ARI), and Silhouette Coefficient (SC) metrics to select the best clustering models. Our evaluation finds that spectral clustering outperforms all other benchmarking clustering models in terms of NMI and ARI metrics while demonstrating comparability to the optimal SC achieved by k-means. The implemented code is available at GitHub.