Adversarial Perturbation Prediction for Real-Time Protection of Speech Privacy

IF 6.3 1区 计算机科学 Q1 COMPUTER SCIENCE, THEORY & METHODS IEEE Transactions on Information Forensics and Security Pub Date : 2024-09-23 DOI:10.1109/TIFS.2024.3463538
Zhaoyang Zhang;Shen Wang;Guopu Zhu;Dechen Zhan;Jiwu Huang
{"title":"Adversarial Perturbation Prediction for Real-Time Protection of Speech Privacy","authors":"Zhaoyang Zhang;Shen Wang;Guopu Zhu;Dechen Zhan;Jiwu Huang","doi":"10.1109/TIFS.2024.3463538","DOIUrl":null,"url":null,"abstract":"The widespread collection and analysis of private speech signals have become increasingly prevalent, raising significant privacy concerns. To protect speech signals from unauthorized analysis, adversarial attack methods for deceiving speaker recognition models have been proposed. While a few of these methods are specifically designed for real-time protection of speech signals, they introduce significant delays that can severely impact speech communication when applied to streaming speech data. In this paper, we present a novel approach that aims to offer real-time protection for speech signals without delays. By utilizing observed data only, we generate initial adversarial seed perturbations and refine them to obtain the necessary adversarial perturbations predicted for adjacent unobserved signals. This refinement process is conducted via a proposed model called PAPG. On the basis of perturbation prediction, we develop a streaming audio processing framework that generates perturbations in synchronization with the playback of the original signal, effectively eliminating delays. The experimental results demonstrate that under the proposed attack, the average Top-1 accuracy of various advanced speaker recognition methods is reduced by 89%, and the average equal error rate (EER) increases to 36%. Remarkably, these results are achieved without delays while maintaining superior perceptual quality.","PeriodicalId":13492,"journal":{"name":"IEEE Transactions on Information Forensics and Security","volume":"19 ","pages":"8701-8716"},"PeriodicalIF":6.3000,"publicationDate":"2024-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Information Forensics and Security","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10689457/","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, THEORY & METHODS","Score":null,"Total":0}
引用次数: 0

Abstract

The widespread collection and analysis of private speech signals have become increasingly prevalent, raising significant privacy concerns. To protect speech signals from unauthorized analysis, adversarial attack methods for deceiving speaker recognition models have been proposed. While a few of these methods are specifically designed for real-time protection of speech signals, they introduce significant delays that can severely impact speech communication when applied to streaming speech data. In this paper, we present a novel approach that aims to offer real-time protection for speech signals without delays. By utilizing observed data only, we generate initial adversarial seed perturbations and refine them to obtain the necessary adversarial perturbations predicted for adjacent unobserved signals. This refinement process is conducted via a proposed model called PAPG. On the basis of perturbation prediction, we develop a streaming audio processing framework that generates perturbations in synchronization with the playback of the original signal, effectively eliminating delays. The experimental results demonstrate that under the proposed attack, the average Top-1 accuracy of various advanced speaker recognition methods is reduced by 89%, and the average equal error rate (EER) increases to 36%. Remarkably, these results are achieved without delays while maintaining superior perceptual quality.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
用于实时保护语音隐私的对抗性干扰预测
对私人语音信号的广泛收集和分析已变得越来越普遍,这引起了人们对隐私的极大关注。为了保护语音信号免受未经授权的分析,人们提出了欺骗说话者识别模型的对抗性攻击方法。虽然其中有一些方法是专门为实时保护语音信号而设计的,但它们在应用于流式语音数据时会带来严重的延迟,从而严重影响语音通信。在本文中,我们提出了一种新方法,旨在为语音信号提供无延迟的实时保护。通过仅利用观测数据,我们生成了初始对抗种子扰动,并对其进行细化,以获得为相邻未观测信号预测的必要对抗扰动。这一细化过程是通过一个名为 PAPG 的模型进行的。在扰动预测的基础上,我们开发了一种流音频处理框架,它能在播放原始信号时同步生成扰动,从而有效消除延迟。实验结果表明,在所提出的攻击下,各种先进说话人识别方法的平均 Top-1 准确率降低了 89%,平均等错误率 (EER) 增加到 36%。值得注意的是,这些结果是在没有延迟的情况下实现的,同时还保持了卓越的感知质量。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
IEEE Transactions on Information Forensics and Security
IEEE Transactions on Information Forensics and Security 工程技术-工程:电子与电气
CiteScore
14.40
自引率
7.40%
发文量
234
审稿时长
6.5 months
期刊介绍: The IEEE Transactions on Information Forensics and Security covers the sciences, technologies, and applications relating to information forensics, information security, biometrics, surveillance and systems applications that incorporate these features
期刊最新文献
Attackers Are Not the Same! Unveiling the Impact of Feature Distribution on Label Inference Attacks Backdoor Online Tracing With Evolving Graphs LHADRO: A Robust Control Framework for Autonomous Vehicles Under Cyber-Physical Attacks Towards Mobile Palmprint Recognition via Multi-view Hierarchical Graph Learning Succinct Hash-based Arbitrary-Range Proofs
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1