Significance of Prosody Modification in Privacy Preservation on speaker verification

2022 National Conference on Communications (NCC) Pub Date : 2022-05-24 DOI:10.1109/NCC55593.2022.9806769

Ayush Agarwal, Amitabh Swain, Jagabandhu Mishra, S. Prasanna

Citations: 1

Abstract

Privacy is a major concern for users before they share their data. Various methods have been proposed in the literature for protecting the privacy of speech data. Previous work on protecting speaker identity has targeted speech applications such as automatic speech recognition (ASR) and speech analysis, for which speaker identity is not essential during processing. The objective of this work is to provide privacy for a task in which speaker identity is essential at processing time. Specifically, the speaker identity information present in speech signals is protected while automatic speaker verification (ASV) is performed. To achieve this, the work proposes a prosody-modification-based approach. The proposed approach conceals the speaker identity from human perception by changing the pitch of the speech utterances with a pitch modification factor of $\alpha \geq 1$. At the same time, the ASV system provides consistent performance irrespective of the change in pitch (i.e., for $\alpha \geq 1$). This is demonstrated through experiments on the TIMIT and IITG-MV databases. A subjective study was also performed to verify the extent of speaker anonymization with respect to human listeners. The subjective study evaluates performance in terms of mean opinion score (MOS). The observed MOS indicates that the proposed approach is able to conceal the speaker's identity.
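To make the role of the pitch modification factor $\alpha$ concrete, the following is a minimal sketch of raising pitch by a factor $\alpha \geq 1$. It is not the authors' method: the abstract does not specify the algorithm, so the sketch assumes the simplest possible technique, plain resampling with linear interpolation, which also shortens the signal by the same factor. A practical prosody modification method would typically change pitch while preserving duration. The function names (`make_tone`, `scale_pitch_by_resampling`, `estimate_freq`) and the test-tone parameters are hypothetical and chosen only for illustration.

```python
import math


def make_tone(freq_hz, sample_rate, duration_s):
    """Generate a pure sine tone as a list of float samples."""
    n = int(sample_rate * duration_s)
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate) for i in range(n)]


def scale_pitch_by_resampling(signal, alpha):
    """Read the signal alpha times faster using linear interpolation.

    Assumption: naive resampling stands in for prosody modification.
    It multiplies every frequency by alpha and divides the duration
    by alpha, so it is only a rough illustration of the factor alpha.
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    if not signal:
        return []
    out_len = int((len(signal) - 1) / alpha) + 1
    out = []
    for k in range(out_len):
        pos = k * alpha              # fractional read position in the input
        i = int(pos)
        frac = pos - i
        nxt = signal[i + 1] if i + 1 < len(signal) else signal[i]
        out.append((1 - frac) * signal[i] + frac * nxt)
    return out


def estimate_freq(signal, sample_rate):
    """Rough frequency estimate from the count of upward zero crossings."""
    crossings = sum(1 for a, b in zip(signal, signal[1:]) if a < 0 <= b)
    return crossings * sample_rate / len(signal)


if __name__ == "__main__":
    sample_rate = 16000                       # assumed sampling rate
    tone = make_tone(120.0, sample_rate, 1.0)  # stand-in for a voiced sound
    shifted = scale_pitch_by_resampling(tone, alpha=1.25)
    print("original estimate (Hz):", estimate_freq(tone, sample_rate))
    print("shifted estimate (Hz): ", estimate_freq(shifted, sample_rate))
    print("length ratio:", len(shifted) / len(tone))
```

With $\alpha = 1$ the signal is returned unchanged, and larger values of $\alpha$ raise the pitch proportionally. Whether a given $\alpha$ hides identity from listeners while leaving ASV performance intact is an empirical question that the paper addresses through its experiments; this sketch does not evaluate it.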
