Introduction: Huntington's disease (HD) is a progressive neurodegenerative disorder characterized by motor, cognitive, and psychiatric decline. The Unified Huntington's Disease Rating Scale Total Motor Score (UHDRS-TMS) is the standard for staging manifest disease but is relatively insensitive to subtle premanifest changes. Speech abnormalities are emerging as candidate digital biomarkers; however, reliably separating premanifest HD (preHD) from healthy controls remains challenging. Here, we assess whether speech alone can discriminate preHD from controls by training and comparing multiple classifiers across several feature sets and structured speech tasks.
Methods: Speech samples were collected from 94 individuals with HD (38 premanifest, 56 manifest) and 36 controls using a standardized six-task protocol administered via tablet. From these recordings, 188 lexical and prosodic features were automatically extracted. We trained four machine learning classifiers (random forest, support vector machine, XGBoost, and deep neural networks [DNNs]) under 10-fold cross-validation using three feature configurations: (1) all tasks (188 features), (2) the top 30 ANOVA-ranked features, and (3) 22 features from the Caterpillar passage alone.
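The evaluation setup described in the Methods can be sketched roughly as follows. This is a minimal illustration, not the study's implementation: it assumes scikit-learn, uses synthetic stand-in data with the same shape as the preHD-versus-control comparison (38 + 36 samples, 188 features), and shows only two of the four classifier families. The helper name `evaluate`, the scaling step, and all hyperparameters are hypothetical. One design choice worth noting is that the ANOVA ranking sits inside the pipeline, so the top 30 features are re-selected on each training fold rather than on the full dataset.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the speech features (NOT study data):
# 74 samples (38 + 36) with 188 features each.
X, y = make_classification(
    n_samples=74,
    n_features=188,
    n_informative=10,
    weights=[36 / 74, 38 / 74],
    random_state=0,
)


def evaluate(model, X, y, k_features=30, n_splits=10, seed=0):
    """Return per-fold balanced accuracy for one classifier (hypothetical helper)."""
    # Scaling and ANOVA F-test selection are fitted on each training fold only,
    # which keeps held-out samples from influencing the feature ranking.
    pipeline = make_pipeline(
        StandardScaler(),
        SelectKBest(score_func=f_classif, k=k_features),
        model,
    )
    cv = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    return cross_val_score(pipeline, X, y, cv=cv, scoring="balanced_accuracy")


# Assumed hyperparameters, chosen only for illustration.
models = {
    "random_forest": RandomForestClassifier(n_estimators=200, random_state=0),
    "svm": SVC(kernel="rbf", C=1.0),
}

scores = {name: evaluate(model, X, y) for name, model in models.items()}
for name, fold_scores in scores.items():
    print(f"{name}: mean balanced accuracy = {fold_scores.mean():.2f}")
```

Swapping in a different feature configuration would amount to changing `k_features` or passing a column subset of `X`; the cross-validation logic stays the same.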
Results: Traditional classifiers showed limited accuracy. A DNN using only the Caterpillar task achieved 81% unweighted accuracy for classifying preHD versus controls. Accuracy increased to 83% for prodromal HD and 87% when all HD participants were compared to controls. Adding features from additional tasks did not improve performance.
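For readers unfamiliar with the metric, "unweighted accuracy" in the speech-processing literature usually denotes the mean of per-class recalls (also called balanced accuracy), which is not inflated by the larger class. The sketch below assumes that definition, which the abstract does not state explicitly; the function name and the toy labels are hypothetical and are not study data.

```python
import numpy as np


def unweighted_accuracy(y_true, y_pred):
    """Mean of per-class recalls (assumed definition of unweighted accuracy)."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    # Recall for each class: fraction of that class's samples predicted correctly.
    recalls = [np.mean(y_pred[y_true == label] == label) for label in np.unique(y_true)]
    return float(np.mean(recalls))


# Toy example: four samples of class 1 and two of class 0.
y_true = [1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 0, 1]

ua = unweighted_accuracy(y_true, y_pred)  # (3/4 + 1/2) / 2 = 0.625
plain = float(np.mean(np.asarray(y_true) == np.asarray(y_pred)))  # 4/6
print(f"unweighted accuracy = {ua:.3f}, plain accuracy = {plain:.3f}")
```

In this toy case plain accuracy (4/6) exceeds unweighted accuracy (0.625) because the larger class is classified better, which is why the class-averaged metric is the more conservative choice for imbalanced groups.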
Conclusion: A brief, structured speech task combined with deep learning enabled accurate classification of preHD. These findings support speech analysis as a scalable, objective tool for early disease detection and monitoring.
