{"title":"头颈部癌症语音清晰度感知判断的自动建模。","authors":"Sebastião Quintas, Mathieu Balaguer, Julie Mauclair, Virginie Woisard, Julien Pinquier","doi":"10.1111/1460-6984.13004","DOIUrl":null,"url":null,"abstract":"<div>\n \n \n <section>\n \n <h3> Background</h3>\n \n <p>Perceptual measures such as speech intelligibility are known to be biased, variant and subjective, to which an automatic approach has been seen as a more reliable alternative. On the other hand, automatic approaches tend to lack explainability, an aspect that can prevent the widespread usage of these technologies clinically.</p>\n </section>\n \n <section>\n \n <h3> Aims</h3>\n \n <p>In the present work, we aim to study the relationship between four perceptual parameters and speech intelligibility by automatically modelling the behaviour of six perceptual judges, in the context of head and neck cancer. From this evaluation we want to assess the different levels of relevance of each parameter as well as the different judge profiles that arise, both perceptually and automatically.</p>\n </section>\n \n <section>\n \n <h3> Methods and Procedures</h3>\n \n <p>Based on a passage reading task from the Carcinologic Speech Severity Index (C2SI) corpus, six expert listeners assessed the voice quality, resonance, prosody and phonemic distortions, as well as the speech intelligibility of patients treated for oral or oropharyngeal cancer. A statistical analysis and an ensemble of automatic systems, one per judge, were devised, where speech intelligibility is predicted as a function of the four aforementioned perceptual parameters of voice quality, resonance, prosody and phonemic distortions.</p>\n </section>\n \n <section>\n \n <h3> Outcomes and Results</h3>\n \n <p>The results suggest that we can automatically predict speech intelligibility as a function of the four aforementioned perceptual parameters, achieving a high correlation of 0.775 (Spearman's <i>ρ</i>). Furthermore, different judge profiles were found perceptually that were successfully modelled automatically.</p>\n </section>\n \n <section>\n \n <h3> Conclusions and Implications</h3>\n \n <p>The four investigated perceptual parameters influence the global rating of speech intelligibility, showing that different judge profiles emerge. The proposed automatic approach displayed a more uniform profile across all judges, displaying a more reliable, unbiased and objective prediction. The system also adds an extra layer of interpretability, since speech intelligibility is regressed as a direct function of the individual prediction of the four perceptual parameters, an improvement over more black box approaches.</p>\n </section>\n \n <section>\n \n <h3> WHAT THIS PAPER ADDS</h3>\n \n <section>\n \n <h3> What is already known on this subject</h3>\n \n <div>\n <ul>\n \n <li>Speech intelligibility is a clinical measure typically used in the post-treatment assessment of speech affecting disorders, such as head and neck cancer. Their perceptual assessment is currently the main method of evaluation; however, it is known to be quite subjective since intelligibility can be seen as a combination of other perceptual parameters (voice quality, resonance, etc.). Given this, automatic approaches have been seen as a more viable alternative to the traditionally used perceptual assessments.</li>\n </ul>\n </div>\n </section>\n \n <section>\n \n <h3> What this study adds to existing knowledge</h3>\n \n <div>\n <ul>\n \n <li>The present work introduces a study based on the relationship between four perceptual parameters (voice quality, resonance, prosody and phonemic distortions) and speech intelligibility, by automatically modelling the behaviour of six perceptual judges. The results suggest that different judge profiles arise, both in the perceptual case as well as in the automatic models. These different profiles found showcase the different schools of thought that perceptual judges have, in comparison to the automatic judges, that display more uniform levels of relevance across all the four perceptual parameters. This aspect shows that an automatic approach promotes unbiased, reliable and more objective predictions.</li>\n </ul>\n </div>\n </section>\n \n <section>\n \n <h3> What are the clinical implications of this work?</h3>\n \n <div>\n <ul>\n \n <li>The automatic prediction of speech intelligibility, using a combination of four perceptual parameters, show that these approaches can achieve high correlations with the reference scores while maintaining a certain degree of explainability. The more uniform judge profiles found on the automatic case also display less biased results towards the four perceptual parameters. This aspect facilitates the clinical implementation of this class of systems, as opposed to the more subjective and harder to reproduce perceptual assessments.</li>\n </ul>\n </div>\n </section>\n </section>\n </div>","PeriodicalId":49182,"journal":{"name":"International Journal of Language & Communication Disorders","volume":"59 4","pages":"1422-1435"},"PeriodicalIF":1.5000,"publicationDate":"2024-01-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1111/1460-6984.13004","citationCount":"0","resultStr":"{\"title\":\"Automatic modelling of perceptual judges in the context of head and neck cancer speech intelligibility\",\"authors\":\"Sebastião Quintas, Mathieu Balaguer, Julie Mauclair, Virginie Woisard, Julien Pinquier\",\"doi\":\"10.1111/1460-6984.13004\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div>\\n \\n \\n <section>\\n \\n <h3> Background</h3>\\n \\n <p>Perceptual measures such as speech intelligibility are known to be biased, variant and subjective, to which an automatic approach has been seen as a more reliable alternative. On the other hand, automatic approaches tend to lack explainability, an aspect that can prevent the widespread usage of these technologies clinically.</p>\\n </section>\\n \\n <section>\\n \\n <h3> Aims</h3>\\n \\n <p>In the present work, we aim to study the relationship between four perceptual parameters and speech intelligibility by automatically modelling the behaviour of six perceptual judges, in the context of head and neck cancer. From this evaluation we want to assess the different levels of relevance of each parameter as well as the different judge profiles that arise, both perceptually and automatically.</p>\\n </section>\\n \\n <section>\\n \\n <h3> Methods and Procedures</h3>\\n \\n <p>Based on a passage reading task from the Carcinologic Speech Severity Index (C2SI) corpus, six expert listeners assessed the voice quality, resonance, prosody and phonemic distortions, as well as the speech intelligibility of patients treated for oral or oropharyngeal cancer. A statistical analysis and an ensemble of automatic systems, one per judge, were devised, where speech intelligibility is predicted as a function of the four aforementioned perceptual parameters of voice quality, resonance, prosody and phonemic distortions.</p>\\n </section>\\n \\n <section>\\n \\n <h3> Outcomes and Results</h3>\\n \\n <p>The results suggest that we can automatically predict speech intelligibility as a function of the four aforementioned perceptual parameters, achieving a high correlation of 0.775 (Spearman's <i>ρ</i>). Furthermore, different judge profiles were found perceptually that were successfully modelled automatically.</p>\\n </section>\\n \\n <section>\\n \\n <h3> Conclusions and Implications</h3>\\n \\n <p>The four investigated perceptual parameters influence the global rating of speech intelligibility, showing that different judge profiles emerge. The proposed automatic approach displayed a more uniform profile across all judges, displaying a more reliable, unbiased and objective prediction. The system also adds an extra layer of interpretability, since speech intelligibility is regressed as a direct function of the individual prediction of the four perceptual parameters, an improvement over more black box approaches.</p>\\n </section>\\n \\n <section>\\n \\n <h3> WHAT THIS PAPER ADDS</h3>\\n \\n <section>\\n \\n <h3> What is already known on this subject</h3>\\n \\n <div>\\n <ul>\\n \\n <li>Speech intelligibility is a clinical measure typically used in the post-treatment assessment of speech affecting disorders, such as head and neck cancer. Their perceptual assessment is currently the main method of evaluation; however, it is known to be quite subjective since intelligibility can be seen as a combination of other perceptual parameters (voice quality, resonance, etc.). Given this, automatic approaches have been seen as a more viable alternative to the traditionally used perceptual assessments.</li>\\n </ul>\\n </div>\\n </section>\\n \\n <section>\\n \\n <h3> What this study adds to existing knowledge</h3>\\n \\n <div>\\n <ul>\\n \\n <li>The present work introduces a study based on the relationship between four perceptual parameters (voice quality, resonance, prosody and phonemic distortions) and speech intelligibility, by automatically modelling the behaviour of six perceptual judges. The results suggest that different judge profiles arise, both in the perceptual case as well as in the automatic models. These different profiles found showcase the different schools of thought that perceptual judges have, in comparison to the automatic judges, that display more uniform levels of relevance across all the four perceptual parameters. This aspect shows that an automatic approach promotes unbiased, reliable and more objective predictions.</li>\\n </ul>\\n </div>\\n </section>\\n \\n <section>\\n \\n <h3> What are the clinical implications of this work?</h3>\\n \\n <div>\\n <ul>\\n \\n <li>The automatic prediction of speech intelligibility, using a combination of four perceptual parameters, show that these approaches can achieve high correlations with the reference scores while maintaining a certain degree of explainability. The more uniform judge profiles found on the automatic case also display less biased results towards the four perceptual parameters. This aspect facilitates the clinical implementation of this class of systems, as opposed to the more subjective and harder to reproduce perceptual assessments.</li>\\n </ul>\\n </div>\\n </section>\\n </section>\\n </div>\",\"PeriodicalId\":49182,\"journal\":{\"name\":\"International Journal of Language & Communication Disorders\",\"volume\":\"59 4\",\"pages\":\"1422-1435\"},\"PeriodicalIF\":1.5000,\"publicationDate\":\"2024-01-18\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://onlinelibrary.wiley.com/doi/epdf/10.1111/1460-6984.13004\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"International Journal of Language & Communication Disorders\",\"FirstCategoryId\":\"3\",\"ListUrlMain\":\"https://onlinelibrary.wiley.com/doi/10.1111/1460-6984.13004\",\"RegionNum\":3,\"RegionCategory\":\"医学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"AUDIOLOGY & SPEECH-LANGUAGE PATHOLOGY\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Journal of Language & Communication Disorders","FirstCategoryId":"3","ListUrlMain":"https://onlinelibrary.wiley.com/doi/10.1111/1460-6984.13004","RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"AUDIOLOGY & SPEECH-LANGUAGE PATHOLOGY","Score":null,"Total":0}
Automatic modelling of perceptual judges in the context of head and neck cancer speech intelligibility
Background
Perceptual measures such as speech intelligibility are known to be biased, variant and subjective, to which an automatic approach has been seen as a more reliable alternative. On the other hand, automatic approaches tend to lack explainability, an aspect that can prevent the widespread usage of these technologies clinically.
Aims
In the present work, we aim to study the relationship between four perceptual parameters and speech intelligibility by automatically modelling the behaviour of six perceptual judges, in the context of head and neck cancer. From this evaluation we want to assess the different levels of relevance of each parameter as well as the different judge profiles that arise, both perceptually and automatically.
Methods and Procedures
Based on a passage reading task from the Carcinologic Speech Severity Index (C2SI) corpus, six expert listeners assessed the voice quality, resonance, prosody and phonemic distortions, as well as the speech intelligibility of patients treated for oral or oropharyngeal cancer. A statistical analysis and an ensemble of automatic systems, one per judge, were devised, where speech intelligibility is predicted as a function of the four aforementioned perceptual parameters of voice quality, resonance, prosody and phonemic distortions.
Outcomes and Results
The results suggest that we can automatically predict speech intelligibility as a function of the four aforementioned perceptual parameters, achieving a high correlation of 0.775 (Spearman's ρ). Furthermore, different judge profiles were found perceptually that were successfully modelled automatically.
Conclusions and Implications
The four investigated perceptual parameters influence the global rating of speech intelligibility, showing that different judge profiles emerge. The proposed automatic approach displayed a more uniform profile across all judges, displaying a more reliable, unbiased and objective prediction. The system also adds an extra layer of interpretability, since speech intelligibility is regressed as a direct function of the individual prediction of the four perceptual parameters, an improvement over more black box approaches.
WHAT THIS PAPER ADDS
What is already known on this subject
Speech intelligibility is a clinical measure typically used in the post-treatment assessment of speech affecting disorders, such as head and neck cancer. Their perceptual assessment is currently the main method of evaluation; however, it is known to be quite subjective since intelligibility can be seen as a combination of other perceptual parameters (voice quality, resonance, etc.). Given this, automatic approaches have been seen as a more viable alternative to the traditionally used perceptual assessments.
What this study adds to existing knowledge
The present work introduces a study based on the relationship between four perceptual parameters (voice quality, resonance, prosody and phonemic distortions) and speech intelligibility, by automatically modelling the behaviour of six perceptual judges. The results suggest that different judge profiles arise, both in the perceptual case as well as in the automatic models. These different profiles found showcase the different schools of thought that perceptual judges have, in comparison to the automatic judges, that display more uniform levels of relevance across all the four perceptual parameters. This aspect shows that an automatic approach promotes unbiased, reliable and more objective predictions.
What are the clinical implications of this work?
The automatic prediction of speech intelligibility, using a combination of four perceptual parameters, show that these approaches can achieve high correlations with the reference scores while maintaining a certain degree of explainability. The more uniform judge profiles found on the automatic case also display less biased results towards the four perceptual parameters. This aspect facilitates the clinical implementation of this class of systems, as opposed to the more subjective and harder to reproduce perceptual assessments.
期刊介绍:
The International Journal of Language & Communication Disorders (IJLCD) is the official journal of the Royal College of Speech & Language Therapists. The Journal welcomes submissions on all aspects of speech, language, communication disorders and speech and language therapy. It provides a forum for the exchange of information and discussion of issues of clinical or theoretical relevance in the above areas.