Listening to Sounds of Silence for Audio replay attack detection

Mohammad Hajipour, M. Akhaee, Ramin Toosi
Published in: 2021 7th International Conference on Signal Processing and Intelligent Systems (ICSPIS), 2021-12-29
DOI: https://doi.org/10.1109/ICSPIS54653.2021.9729353
Citations: 2

Abstract

Automatic Speaker Verification (ASV) is a biometric authentication technology that identifies a person from the voice presented to the system. As these systems have become widespread, they have become targets of various attacks, which fall into four categories: impersonation, speech synthesis, voice conversion, and replay. Replay attacks are among the most common because they are simple to mount. This study proposes a countermeasure against replay attacks. We found that the noise introduced by different recording and playback devices leaves traces in spoofed samples that can serve as a criterion for attack detection. The study therefore analyzes the silent parts of the speech signal, which contain the noise of the recording and playback devices. Because deep convolutional neural networks perform well in classification tasks, we propose an ensemble classifier based on end-to-end neural network architectures and residual structures to accurately distinguish spoofed utterances from genuine ones. On the ASVspoof2019 database, the t-DCF metric decreases by almost 16% compared with similar models that process the full speech signal.
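The abstract does not say how the silent parts are isolated, so the following is only a minimal sketch of one common approach: a short-time energy threshold over non-overlapping frames. The function name `extract_silence`, the 20 ms frame length, and the -30 dB threshold relative to the loudest frame are all illustrative assumptions, not the authors' method.

```python
import numpy as np


def extract_silence(signal, sample_rate, frame_ms=20.0, threshold_db=-30.0):
    """Return the samples of all low-energy frames, concatenated.

    Assumption (not from the paper): a frame counts as "silent" when its
    mean energy is more than |threshold_db| dB below the loudest frame.
    """
    signal = np.asarray(signal, dtype=np.float64)
    frame_len = int(sample_rate * frame_ms / 1000)
    if frame_len == 0 or len(signal) < frame_len:
        return np.empty(0, dtype=np.float64)

    # Split into non-overlapping frames, dropping any trailing partial frame.
    n_frames = len(signal) // frame_len
    frames = signal[: n_frames * frame_len].reshape(n_frames, frame_len)

    # Short-time energy per frame in dB; the small offset avoids log(0).
    energy_db = 10.0 * np.log10(np.mean(frames ** 2, axis=1) + 1e-12)

    # Keep frames that are far quieter than the loudest frame.
    is_silent = energy_db < energy_db.max() + threshold_db
    return frames[is_silent].reshape(-1)


# Synthetic example: 0.5 s of faint device-like noise, then 0.5 s of a tone.
rng = np.random.default_rng(0)
sr = 16000
noise = 0.001 * rng.standard_normal(sr // 2)
tone = 0.5 * np.sin(2 * np.pi * 440 * np.arange(sr // 2) / sr)
silence = extract_silence(np.concatenate([noise, tone]), sr)
```

In this sketch, `silence` keeps only the faint-noise half of the synthetic signal, which is the kind of segment a replay detector would then pass to a classifier. A real system would likely need a more robust voice activity detector and a threshold tuned to the data.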