SpMis: An Investigation of Synthetic Spoken Misinformation Detection

Peizhuo Liu, Li Wang, Renqiang He, Haorui He, Lei Wang, Huadi Zheng, Jie Shi, Tong Xiao, Zhizheng Wu
arXiv - CS - Computation and Language, published 2024-09-17. DOI: arxiv-2409.11308 (https://doi.org/arxiv-2409.11308)

Abstract

In recent years, speech generation technology has advanced rapidly, fueled by generative models and large-scale training techniques. While these developments have enabled the production of high-quality synthetic speech, they have also raised concerns about the misuse of this technology, particularly for generating synthetic misinformation. Current research primarily focuses on distinguishing machine-generated speech from human-produced speech, but the more urgent challenge is detecting misinformation within spoken content. This task requires a thorough analysis of factors such as speaker identity, topic, and synthesis. To address this need, we conduct an initial investigation into synthetic spoken misinformation detection by introducing an open-source dataset, SpMis. SpMis includes speech synthesized from over 1,000 speakers across five common topics, utilizing state-of-the-art text-to-speech systems. Although our results show promising detection capabilities, they also reveal substantial challenges for practical implementation, underscoring the importance of ongoing research in this critical area.