Objective: To develop a multiparametric index based on machine learning (ML) to predict and classify the overall degree of vocal deviation (GG).
Method: The sample consisted of 300 dysphonic and nondysphonic participants of both sexes. Two speech tasks were recorded: sustained vowel [a] and connected speech (counting from 1 to 10). Five speech-language pathologists performed auditory-perceptual judgment (APJ) of the GG and of the degrees of roughness (GR), breathiness (GB), instability (GI), and strain (GS). We extracted 47 acoustic measurements from these tasks. The APJ results and the acoustic measurements were used to develop the multiparametric index. We used the mean absolute error, root mean square error, and coefficient of determination (R²) to select the best ML model for predicting GG, and feature importance to select the best set of variables for the index. After classifying GG as nondysphonic, mild, moderate, or severe, the final model was validated using accuracy, sensitivity, specificity, predictive values, likelihood ratios, F1-score, and weighted kappa.
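The model-selection step described above can be illustrated with a minimal Python sketch, assuming a scikit-learn workflow; the candidate models, variable names (X, y_gg), and placeholder data are illustrative assumptions, not the authors' exact pipeline.

```python
# Sketch of ML model selection via MAE/RMSE/R², plus feature importance;
# illustrative only, not the study's actual pipeline or data.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_validate

# X: 47 acoustic measures plus APJ scores; y_gg: perceptual GG ratings.
# Placeholder random data stands in for the 300-participant sample.
X, y_gg = np.random.rand(300, 47), np.random.rand(300) * 100

candidates = {
    "linear": LinearRegression(),
    "random_forest": RandomForestRegressor(random_state=0),
    "gradient_boosting": GradientBoostingRegressor(random_state=0),
}

# The three regression criteria reported in the Method section.
scoring = {
    "mae": "neg_mean_absolute_error",
    "rmse": "neg_root_mean_squared_error",
    "r2": "r2",
}

for name, model in candidates.items():
    cv = cross_validate(model, X, y_gg, cv=5, scoring=scoring)
    print(
        f"{name}: MAE={-cv['test_mae'].mean():.2f} "
        f"RMSE={-cv['test_rmse'].mean():.2f} "
        f"R2={cv['test_r2'].mean():.3f}"
    )

# Feature importances from the winning model guide variable selection,
# e.g. keeping the eight most important features for the index.
best = GradientBoostingRegressor(random_state=0).fit(X, y_gg)
top8 = np.argsort(best.feature_importances_)[::-1][:8]
print("Top 8 feature indices:", top8)
```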
Results: The gradient boosting model showed the best performance among the ML models. Eight features were selected for the model: four acoustic measures (jitterLoc, smoothed cepstral peak prominence, mean harmonic-to-noise ratio (HNRmean), and correlation) and four APJ measures (GR, GB, GS, and GI). The final model correctly classified 93.75% of participants and obtained a weighted kappa of 0.9374, demonstrating excellent performance.
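The final validation metrics can likewise be computed with scikit-learn; the following is a minimal sketch in which y_true and y_pred are assumed four-class labels (0 = nondysphonic, 1 = mild, 2 = moderate, 3 = severe) and the quadratic weighting scheme for kappa is an assumption, since the abstract does not specify it.

```python
# Sketch of the classification validation step (accuracy, weighted kappa,
# per-class precision/recall/F1); labels are simulated, not study data.
import numpy as np
from sklearn.metrics import accuracy_score, classification_report, cohen_kappa_score

rng = np.random.default_rng(0)
y_true = rng.integers(0, 4, size=300)
# Simulate a classifier that agrees with the true grade ~94% of the time.
y_pred = np.where(rng.random(300) < 0.94, y_true, rng.integers(0, 4, size=300))

print(f"Accuracy: {accuracy_score(y_true, y_pred):.4f}")
# Weighted kappa penalizes errors farther from the true grade more heavily.
print(f"Weighted kappa: {cohen_kappa_score(y_true, y_pred, weights='quadratic'):.4f}")
# Recall per class corresponds to sensitivity; F1 combines it with precision.
print(classification_report(
    y_true, y_pred,
    target_names=["nondysphonic", "mild", "moderate", "severe"],
))
```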
Conclusion: The Integrated Vocal Deviation Index combines four acoustic measures and four auditory-perceptual measures and showed excellent performance in classifying voices according to GG.