Speech Landmark Bigrams for Depression Detection from Naturalistic Smartphone Speech
Zhaocheng Huang, J. Epps, Dale Joachim
ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5856-5860, published 2019-06-06
https://doi.org/10.1109/ICASSP.2019.8682916
Detection of depression from speech has attracted significant research attention in recent years but remains a challenge, particularly for speech recorded on diverse smartphones in natural environments. This paper proposes two sets of novel features for depression detection from smartphone audio recordings, both based on bigrams of speech landmarks, which are associated with abrupt articulatory events in speech. Combined with techniques adapted from natural language text processing, the proposed features further exploit landmark bigrams by discovering latent articulatory events. Experimental results on a large, naturalistic corpus containing various spoken tasks recorded from diverse smartphones suggest that speech landmark bigram features provide a 30.1% relative improvement in F1 (depressed) over an acoustic feature baseline system. As might be expected, a key finding was the importance of tailoring the choice of landmark bigrams to each elicitation task: different tasks elicit different aspects of speech articulation, and the landmark approaches can capture these differences effectively.
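To make the idea of a landmark bigram concrete, the following is a minimal sketch of how adjacent landmark pairs might be counted and normalized into a feature vector for one recording. It is not the authors' implementation: the function name `landmark_bigram_features`, the landmark symbols in the example sequence, and the choice of relative-frequency normalization are all assumptions made for illustration, and the abstract does not specify how landmarks are extracted or how the bigrams are weighted.

```python
from collections import Counter


def landmark_bigram_features(landmarks):
    """Return the relative frequency of each adjacent landmark pair.

    `landmarks` is a time-ordered list of landmark symbols for one
    recording. The symbols themselves are placeholders (an assumption);
    any hashable labels work.
    """
    # Pair each landmark with its successor to form bigrams.
    pairs = list(zip(landmarks, landmarks[1:]))
    if not pairs:
        # Fewer than two landmarks: no bigrams can be formed.
        return {}
    counts = Counter(pairs)
    total = len(pairs)
    # Normalize so recordings of different lengths are comparable.
    return {pair: n / total for pair, n in counts.items()}


# Hypothetical landmark sequence for a single recording.
sequence = ["g+", "s+", "s-", "g-", "g+", "s+", "g-"]
features = landmark_bigram_features(sequence)

for pair, freq in sorted(features.items()):
    print(pair, round(freq, 3))
```

Treating each recording as a "document" of bigram tokens in this way is what makes text-processing techniques applicable to landmark sequences; which specific techniques the paper adapts is not stated in the abstract.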