A Benchmark for Multi-speaker Anonymization

Xiaoxiao Miao, Ruijie Tao, Chang Zeng, Xin Wang
arXiv - CS - Sound, published 2024-07-08. DOI: arxiv-2407.05608 (https://doi.org/arxiv-2407.05608)

Abstract

Privacy-preserving voice protection approaches primarily suppress privacy-related information derived from paralinguistic attributes while preserving the linguistic content. Existing solutions focus on single-speaker scenarios and therefore lack practicality for real-world, multi-speaker applications. In this paper, we present an initial attempt to provide a multi-speaker anonymization benchmark by defining the task and evaluation protocol, proposing benchmarking solutions, and discussing the privacy leakage of overlapping conversations. Specifically, ideal multi-speaker anonymization should preserve the number of speakers and the turn-taking structure of the conversation, ensuring accurate context conveyance while maintaining privacy. To achieve this, a cascaded system uses speaker diarization to aggregate the speech of each speaker and speaker anonymization to conceal speaker identity while preserving speech content. Additionally, we propose two conversation-level speaker vector anonymization methods to further improve utility. Both methods aim to make the original and corresponding pseudo-speaker identities of each speaker unlinkable while preserving or even improving the distinguishability among pseudo-speakers in a conversation. The first method minimizes the differential similarity across speaker pairs in the original and anonymized conversations, so that the original speaker relationships are maintained in the anonymized version. The second method minimizes the aggregated similarity across anonymized speakers to achieve better differentiation between speakers. Experiments conducted on both non-overlapping simulated and real-world datasets demonstrate the effectiveness of the multi-speaker anonymization system with the proposed speaker anonymizers. Finally, we analyze overlapping speech with respect to privacy leakage and provide potential solutions.
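The two conversation-level objectives described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes speaker vectors are plain lists of floats, that similarity is cosine similarity, that "differential similarity" is the mean absolute gap between original and pseudo pairwise similarities, and that "aggregated similarity" is the sum of pairwise similarities among pseudo-speakers. The function names are hypothetical, and the unlinkability constraint between each original speaker and its pseudo-speaker is left out.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def differential_similarity(original, pseudo):
    """Mean absolute gap between original and pseudo pairwise similarities.

    Assumed reading of the first objective: a small value means the
    anonymized conversation keeps the original speaker relationships.
    Requires at least two speakers, in the same order in both lists.
    """
    pairs = list(combinations(range(len(original)), 2))
    gaps = [
        abs(cosine(original[i], original[j]) - cosine(pseudo[i], pseudo[j]))
        for i, j in pairs
    ]
    return sum(gaps) / len(gaps)


def aggregated_similarity(pseudo):
    """Sum of pairwise similarities among pseudo-speakers.

    Assumed reading of the second objective: a small value means the
    pseudo-speakers are easier to tell apart.
    """
    return sum(
        cosine(pseudo[i], pseudo[j])
        for i, j in combinations(range(len(pseudo)), 2)
    )


# Toy conversation with three mutually orthogonal original speakers.
original = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
# Pseudo-speakers that keep the same pairwise geometry.
rotated = [[0.0, 0.0, 1.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
# Pseudo-speakers that all point in the same direction.
collapsed = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]

print(differential_similarity(original, rotated))
print(aggregated_similarity(rotated))
print(aggregated_similarity(collapsed))
```

In this toy case the rotated pseudo-speakers score well on both objectives, while the collapsed ones score poorly on the second because every pair is maximally similar. An anonymizer built on these scores would still need a search or optimization step over candidate pseudo-speaker vectors, which is not shown here.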