Cross-Corpus Disparity of Parkinson's Voice Datasets Observed on Control Group Distribution
N. Pah, V. Indrawati, D. Kumar
2023 International Conference on Artificial Intelligence in Information and Communication (ICAIIC), published 2023-02-20
DOI: https://doi.org/10.1109/ICAIIC57133.2023.10066982
Citations: 0
Abstract
Parkinson's disease (PD) is one of the most common neurodegenerative disorders. It has shown the fastest growth in prevalence and has become the leading cause of disability. The severity or progression of PD can be reduced if it is diagnosed at an early stage. It is therefore necessary to develop rapid and simple screening methods or tools to diagnose PD. Speech impairment, commonly termed Parkinsonian hypokinetic dysarthria, is one of the early symptoms of PD. Many researchers have developed computerized methods for diagnosing PD based on voice features. However, the accuracy of the developed models has been inconsistent, especially when tested on different datasets. A possible cause is unwanted variability and bias between datasets. This study investigates possible inconsistencies between Parkinson's voice datasets by examining the statistical distribution of sustained-phoneme parameters extracted from the healthy-control (HC) group of five datasets, using ANOVA and the post-hoc Tukey-Kramer test. The results suggest that diversity in language and ethnicity did not contribute significantly to any bias between databases. They also confirm that noise in the recordings contributes to bias in the extracted voice features, especially the harmonic features.
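The comparison described in the abstract, a one-way ANOVA across the HC groups followed by pairwise Tukey-Kramer comparisons, can be illustrated with a minimal sketch. It is not the authors' code. The corpus names, the choice of harmonics-to-noise ratio (HNR) as the harmonic feature, and all values are synthetic assumptions made for illustration. The sketch computes only the F statistic and the pairwise studentized-range statistics q. Turning q into p-values requires the studentized range distribution, which a statistics library such as SciPy (`scipy.stats.tukey_hsd`) can supply.

```python
import math
import random
from itertools import combinations
from statistics import mean


def one_way_anova(groups):
    """One-way ANOVA for a dict of name -> samples.

    Returns (F, df_between, df_within, ms_within).
    """
    means = {name: mean(values) for name, values in groups.items()}
    all_values = [x for values in groups.values() for x in values]
    grand_mean = mean(all_values)

    # Between-group and within-group sums of squares.
    ss_between = sum(
        len(values) * (means[name] - grand_mean) ** 2
        for name, values in groups.items()
    )
    ss_within = sum(
        (x - means[name]) ** 2
        for name, values in groups.items()
        for x in values
    )

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    ms_within = ss_within / df_within
    f_stat = (ss_between / df_between) / ms_within
    return f_stat, df_between, df_within, ms_within


def tukey_kramer_q(groups, ms_within):
    """Pairwise studentized-range statistics q.

    Uses the Kramer standard error, which allows unequal group sizes.
    """
    q = {}
    for a, b in combinations(groups, 2):
        ga, gb = groups[a], groups[b]
        se = math.sqrt(ms_within / 2 * (1 / len(ga) + 1 / len(gb)))
        q[(a, b)] = abs(mean(ga) - mean(gb)) / se
    return q


# Synthetic HC groups (assumption): corpus name -> HNR values in dB.
# "corpus_E_noisy" mimics a corpus recorded under noisier conditions.
rng = random.Random(0)
hc_hnr = {
    "corpus_A": [rng.gauss(22.0, 2.0) for _ in range(40)],
    "corpus_B": [rng.gauss(22.3, 2.0) for _ in range(25)],
    "corpus_C": [rng.gauss(21.8, 2.0) for _ in range(50)],
    "corpus_D": [rng.gauss(22.1, 2.0) for _ in range(35)],
    "corpus_E_noisy": [rng.gauss(18.0, 2.0) for _ in range(30)],
}

f_stat, df_b, df_w, ms_w = one_way_anova(hc_hnr)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
for (a, b), q in tukey_kramer_q(hc_hnr, ms_w).items():
    print(f"{a} vs {b}: q = {q:.2f}")
```

In this synthetic setup, pairs involving the noisy corpus should show much larger q values than pairs among the other corpora. This mirrors the kind of pattern the abstract attributes to recording noise, though the numbers here carry no empirical meaning.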