On the effects of obfuscating speaker attributes in privacy-aware depression detection
Authors: Nujud Aloshban, Anna Esposito, Alessandro Vinciarelli, Tanaya Guha
Journal: Pattern Recognition Letters, Volume 186, Pages 300-305 (publication date 2024-10-01)
DOI: 10.1016/j.patrec.2024.10.016
URL: https://www.sciencedirect.com/science/article/pii/S0167865524003040
Detection of depressive symptoms from spoken content has emerged as an efficient Artificial Intelligence (AI) tool for diagnosing this serious mental health condition. Since speech is a highly sensitive form of data, privacy-enhancing measures need to be in place for this technology to be useful. A common approach to enhancing speech privacy is adversarial learning, which conceals a speaker’s specific attributes or identity while maintaining performance on the primary task. Although this technique works well for applications such as speech recognition, it is often ineffective for depression detection because of the interplay between certain speaker attributes and depression detection performance. This paper examines this interplay through a systematic study of how obfuscating specific speaker attributes (age, education) through adversarial learning impacts the performance of a depression detection model. We highlight the relevance of two previously unexplored speaker attributes to depression detection, and consider a multimodal (audio-lexical) setting to expose the relative vulnerabilities of the modalities under obfuscation. Results on a publicly available, clinically validated depression detection dataset show that attempts to disentangle age/education attributes through adversarial learning result in a large drop in depression detection accuracy, especially for the text modality. This calls for revisiting how privacy mitigation should be achieved for depression detection and, for that matter, any human-centric application.
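The abstract does not state the exact training objective, so the following is only a minimal sketch of one common formulation of adversarial attribute obfuscation, assumed here for illustration: a shared encoder is trained to minimize the depression-detection loss while maximizing the loss of an adversary that tries to predict the protected attribute. The function names, the binary encoding of the attribute (for example, two age groups), and the trade-off weight `lam` are hypothetical and are not taken from the paper.

```python
import math


def binary_cross_entropy(p, y, eps=1e-12):
    """Binary cross-entropy for one predicted probability p and a label y in {0, 1}."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def encoder_objective(dep_prob, dep_label, attr_prob, attr_label, lam=1.0):
    """Loss seen by the shared encoder under the assumed adversarial setup.

    dep_prob   -- predicted probability of depression (primary task)
    attr_prob  -- adversary's predicted probability of the protected attribute
                  (assumption: the attribute is binarized, e.g. two age groups)
    lam        -- hypothetical weight trading off utility against obfuscation
    """
    task_loss = binary_cross_entropy(dep_prob, dep_label)
    adversary_loss = binary_cross_entropy(attr_prob, attr_label)
    # The encoder is rewarded when the adversary does badly, hence the minus sign.
    return task_loss - lam * adversary_loss


# An adversary reduced to chance (0.5) gives the encoder a lower loss than an
# adversary that recovers the attribute confidently (0.99).
obfuscated = encoder_objective(0.9, 1, attr_prob=0.5, attr_label=1)
leaky = encoder_objective(0.9, 1, attr_prob=0.99, attr_label=1)
print(obfuscated < leaky)
```

Under this assumed objective, any depression-relevant information that is correlated with age or education is also penalized, which is consistent with the accuracy drop the abstract reports, although the paper's actual architecture and loss may differ.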
About the journal:
Pattern Recognition Letters aims at rapid publication of concise articles of broad interest in pattern recognition.
Subject areas include all the current fields of interest represented by the Technical Committees of the International Association for Pattern Recognition, and other developing themes involving learning and recognition.