{"title":"ClearSpeech","authors":"Dong Ma, Ting Dang, Ming Ding, Rajesh Balan","doi":"10.1145/3631409","DOIUrl":null,"url":null,"abstract":"Wireless earbuds have been gaining increasing popularity and using them to make phone calls or issue voice commands requires the earbud microphones to pick up human speech. When the speaker is in a noisy environment, speech quality degrades significantly and requires speech enhancement (SE). In this paper, we present ClearSpeech, a novel deep-learning-based SE system designed for wireless earbuds. Specifically, by jointly using the earbud's in-ear and out-ear microphones, we devised a suite of techniques to effectively fuse the two signals and enhance the magnitude and phase of the speech spectrogram. We built an earbud prototype to evaluate ClearSpeech under various settings with data collected from 20 subjects. Our results suggest that ClearSpeech can improve the SE performance significantly compared to conventional approaches using the out-ear microphone only. We also show that ClearSpeech can process user speech in real-time on smartphones.","PeriodicalId":20553,"journal":{"name":"Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies","volume":"3 6","pages":"1 - 25"},"PeriodicalIF":3.6000,"publicationDate":"2024-01-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"ClearSpeech\",\"authors\":\"Dong Ma, Ting Dang, Ming Ding, Rajesh Balan\",\"doi\":\"10.1145/3631409\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Wireless earbuds have been gaining increasing popularity and using them to make phone calls or issue voice commands requires the earbud microphones to pick up human speech. When the speaker is in a noisy environment, speech quality degrades significantly and requires speech enhancement (SE). In this paper, we present ClearSpeech, a novel deep-learning-based SE system designed for wireless earbuds. Specifically, by jointly using the earbud's in-ear and out-ear microphones, we devised a suite of techniques to effectively fuse the two signals and enhance the magnitude and phase of the speech spectrogram. We built an earbud prototype to evaluate ClearSpeech under various settings with data collected from 20 subjects. Our results suggest that ClearSpeech can improve the SE performance significantly compared to conventional approaches using the out-ear microphone only. We also show that ClearSpeech can process user speech in real-time on smartphones.\",\"PeriodicalId\":20553,\"journal\":{\"name\":\"Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies\",\"volume\":\"3 6\",\"pages\":\"1 - 25\"},\"PeriodicalIF\":3.6000,\"publicationDate\":\"2024-01-12\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3631409\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"COMPUTER SCIENCE, INFORMATION SYSTEMS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3631409","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
引用次数: 0
摘要
无线耳塞越来越受欢迎,使用它拨打电话或发出语音命令需要耳塞麦克风拾取人的语音。当说话者处于嘈杂环境中时,语音质量会明显下降,因此需要进行语音增强(SE)。在本文中,我们介绍了 ClearSpeech,这是一种基于深度学习的新型 SE 系统,专为无线耳塞设计。具体来说,通过联合使用耳塞的耳内和耳外麦克风,我们设计了一套技术来有效融合这两个信号,并增强语音频谱图的幅度和相位。我们制作了一个耳塞原型,利用从 20 名受试者那里收集的数据,对 ClearSpeech 在各种设置下的效果进行了评估。结果表明,与只使用耳外麦克风的传统方法相比,ClearSpeech 能显著提高 SE 性能。我们还证明 ClearSpeech 可以在智能手机上实时处理用户语音。
Wireless earbuds have been gaining increasing popularity and using them to make phone calls or issue voice commands requires the earbud microphones to pick up human speech. When the speaker is in a noisy environment, speech quality degrades significantly and requires speech enhancement (SE). In this paper, we present ClearSpeech, a novel deep-learning-based SE system designed for wireless earbuds. Specifically, by jointly using the earbud's in-ear and out-ear microphones, we devised a suite of techniques to effectively fuse the two signals and enhance the magnitude and phase of the speech spectrogram. We built an earbud prototype to evaluate ClearSpeech under various settings with data collected from 20 subjects. Our results suggest that ClearSpeech can improve the SE performance significantly compared to conventional approaches using the out-ear microphone only. We also show that ClearSpeech can process user speech in real-time on smartphones.