Spectrogram Dataset of Korean Smartphone Audio Files Forged Using the “Mix Paste” Command

Data | IF 2.0 | JCR Q3 (Computer Science, Information Systems) | Pub Date: 2023-12-01 | DOI: 10.3390/data8120183

Yeongmin Son, Won Jun Kwak, Jae Wan Park


Citations: 0

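Abstract

This study addresses voice forgery detection, a field of growing importance owing to advanced voice editing technologies and the proliferation of smartphones. It introduces a dataset built specifically to identify forgeries created using the “Mix Paste” technique. This editing technique overlays audio segments from similar or different environments without creating a new timeframe, which makes such forgeries nearly infeasible to detect using traditional methods. The dataset consists of 4665 spectrogram images from 1555 original audio files and 45,672 spectrogram images from 15,224 forged audio files. The original audio was recorded using iPhone and Samsung Galaxy smartphones to ensure a realistic sampling environment. The forged files were created from these recordings and subsequently converted into spectrograms. The dataset also provides the metadata of the original voice files, offering additional context that can be used for analysis and detection. The dataset fills a gap in existing research and supports the development of more efficient deep learning models for voice forgery detection. By addressing the “Mix Paste” technique, it caters to a critical need in voice authentication and forensics, potentially contributing to enhanced security in society.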
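To make the phrase "without creating a new timeframe" concrete, the following is a minimal sketch of an overlay-style edit contrasted with an ordinary insert-style paste. It is not the authors' procedure or any editor's actual implementation: the function names `mix_paste` and `insert_paste`, the sample-wise sum, and the clipping to [-1.0, 1.0] are all assumptions made for illustration, and a real editing tool may apply gains or crossfades instead.

```python
def mix_paste(base, clip, offset):
    """Overlay `clip` onto `base` starting at sample index `offset`.

    Assumption: the overlay is a plain sample-wise sum, clipped to
    [-1.0, 1.0]. The output always has the same length as `base`,
    so no new time span is created.
    """
    if offset < 0 or offset > len(base):
        raise ValueError("offset must lie within the base recording")
    mixed = list(base)
    for i, sample in enumerate(clip):
        j = offset + i
        if j >= len(mixed):
            break  # drop the part of the clip that runs past the end
        mixed[j] = max(-1.0, min(1.0, mixed[j] + sample))
    return mixed


def insert_paste(base, clip, offset):
    """Ordinary paste for comparison: splices `clip` in and lengthens the audio."""
    return list(base[:offset]) + list(clip) + list(base[offset:])


# Toy signals (hypothetical values, not taken from the dataset).
base = [0.25] * 8
clip = [0.5, 0.5]

mixed = mix_paste(base, clip, 3)
spliced = insert_paste(base, clip, 3)

print(len(base), len(mixed), len(spliced))
```

Under these assumptions the mixed signal keeps the duration of the original recording, whereas the spliced signal grows by the length of the clip. A detector that relies on duration changes or on splice boundaries in the timeline would therefore have nothing to latch onto in the overlay case, which is consistent with the motivation the abstract gives for working with spectrograms.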


