{"title":"基于深度学习的音频欺骗攻击检测DFWF模型","authors":"Kottilingam Kottursamy","doi":"10.36548/jaicn.2022.3.004","DOIUrl":null,"url":null,"abstract":"One of the biggest threats in the speaker verification system is that of fake audio attacks. Over the years several detection approaches have been introduced that were designed to provide efficient and spoof-proof data-specific scenarios. However, the speaker verification system is still exposed to fake audio threats. Hence to address this issue, several authors have proposed methodologies to retrain and finetune the input data. The drawback with retraining and fine-tuning is that retraining requires high computation resources and time while fine-tuning results in degradation of performance. Moreover, in certain situations, the previous data becomes unavailable and cannot be accessed immediately. In this paper, we have proposed a solution that detects fake without continual-learning based methods and fake detection without forgetting in order to develop a new model which is capable of detecting spoofing attacks in an incremental fashion. In order to retain original model memory, knowledge distillation loss is introduced. In several scenarios, the distribution of genuine voice is said to be very consistent. In several scenarios, there is consistency in distribution of genuine voice hence a similarity loss is embedded additionally to perform a positive sample alignment. The output of the proposed work indicates an error rate reduction of up to 80% as observed and recorded.","PeriodicalId":74231,"journal":{"name":"Multiscale multimodal medical imaging : Third International Workshop, MMMI 2022, held in conjunction with MICCAI 2022, Singapore, September 22, 2022, proceedings","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2022-09-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Deep Learning based DFWF Model for Audio Spoofing Attack Detection\",\"authors\":\"Kottilingam Kottursamy\",\"doi\":\"10.36548/jaicn.2022.3.004\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"One of the biggest threats in the speaker verification system is that of fake audio attacks. Over the years several detection approaches have been introduced that were designed to provide efficient and spoof-proof data-specific scenarios. However, the speaker verification system is still exposed to fake audio threats. Hence to address this issue, several authors have proposed methodologies to retrain and finetune the input data. The drawback with retraining and fine-tuning is that retraining requires high computation resources and time while fine-tuning results in degradation of performance. Moreover, in certain situations, the previous data becomes unavailable and cannot be accessed immediately. In this paper, we have proposed a solution that detects fake without continual-learning based methods and fake detection without forgetting in order to develop a new model which is capable of detecting spoofing attacks in an incremental fashion. In order to retain original model memory, knowledge distillation loss is introduced. In several scenarios, the distribution of genuine voice is said to be very consistent. In several scenarios, there is consistency in distribution of genuine voice hence a similarity loss is embedded additionally to perform a positive sample alignment. 
The output of the proposed work indicates an error rate reduction of up to 80% as observed and recorded.\",\"PeriodicalId\":74231,\"journal\":{\"name\":\"Multiscale multimodal medical imaging : Third International Workshop, MMMI 2022, held in conjunction with MICCAI 2022, Singapore, September 22, 2022, proceedings\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-09-15\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Multiscale multimodal medical imaging : Third International Workshop, MMMI 2022, held in conjunction with MICCAI 2022, Singapore, September 22, 2022, proceedings\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.36548/jaicn.2022.3.004\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Multiscale multimodal medical imaging : Third International Workshop, MMMI 2022, held in conjunction with MICCAI 2022, Singapore, September 22, 2022, proceedings","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.36548/jaicn.2022.3.004","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Deep Learning based DFWF Model for Audio Spoofing Attack Detection
Fake audio attacks are among the biggest threats to speaker verification systems. Over the years, several detection approaches have been introduced that perform efficiently in the specific data scenarios they were designed for; nevertheless, speaker verification systems remain exposed to fake audio threats. To address this issue, several authors have proposed retraining or fine-tuning the model on new data. The drawback is that retraining requires substantial computation resources and time, while fine-tuning degrades performance on previously learned attacks. Moreover, in certain situations the previous data becomes unavailable and cannot be accessed again. In this paper, a continual-learning-based solution, Detecting Fake Without Forgetting (DFWF), is proposed to develop a model capable of detecting spoofing attacks in an incremental fashion. To retain the memory of the original model, a knowledge distillation loss is introduced. In addition, since the distribution of genuine speech remains largely consistent across scenarios, a similarity loss is embedded to perform positive-sample alignment. The results of the proposed work indicate an error rate reduction of up to 80%.
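The abstract names two auxiliary objectives, a knowledge distillation loss that preserves the original model's memory and a similarity loss that aligns genuine-speech samples, which can be combined with the standard classification objective into a single training loss. The following is a minimal, illustrative PyTorch sketch of that idea; the function name dfwf_style_loss, the loss weights, the temperature, and the cosine-similarity form of the alignment term are assumptions made for illustration and are not taken from the paper's exact formulation.

import torch
import torch.nn.functional as F

def dfwf_style_loss(student_logits, teacher_logits,
                    genuine_emb_new, genuine_emb_old,
                    labels, temperature=2.0,
                    lambda_kd=1.0, lambda_sim=1.0):
    """Illustrative combined loss: classification + knowledge
    distillation + genuine-speech (positive-sample) alignment."""
    # Standard cross-entropy on the new spoofing data.
    ce = F.cross_entropy(student_logits, labels)

    # Knowledge distillation: keep the new model's softened predictions
    # close to those of the frozen original model, to retain old knowledge.
    kd = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=1),
        F.softmax(teacher_logits / temperature, dim=1),
        reduction="batchmean",
    ) * (temperature ** 2)

    # Similarity loss (assumed cosine form): since genuine speech is taken
    # to be consistently distributed, pull embeddings of genuine samples
    # from the new model toward the original model's embeddings.
    sim = 1.0 - F.cosine_similarity(genuine_emb_new, genuine_emb_old, dim=1).mean()

    return ce + lambda_kd * kd + lambda_sim * sim

In such a setup the teacher logits and the old genuine-speech embeddings would come from a frozen copy of the original detector, so only the new (student) model is updated when a new spoofing condition is added, which is what allows incremental detection without retraining on the previous data.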