通过文本转换器和密集神经网络识别侵权内容网站的方法论

IF 2.8 Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS Future Internet Pub Date : 2023-12-09 DOI:10.3390/fi15120397
Aldo Hernandez-Suarez, G. Sánchez-Pérez, L. K. Toscano-Medina, Hector Perez-Meana, J. Portillo-Portillo, J. Olivares-Mercado
{"title":"通过文本转换器和密集神经网络识别侵权内容网站的方法论","authors":"Aldo Hernandez-Suarez, G. Sánchez-Pérez, L. K. Toscano-Medina, Hector Perez-Meana, J. Portillo-Portillo, J. Olivares-Mercado","doi":"10.3390/fi15120397","DOIUrl":null,"url":null,"abstract":"The rapid evolution of the Internet of Everything (IoE) has significantly enhanced global connectivity and multimedia content sharing, simultaneously escalating the unauthorized distribution of multimedia content, posing risks to intellectual property rights. In 2022 alone, about 130 billion accesses to potentially non-compliant websites were recorded, underscoring the challenges for industries reliant on copyright-protected assets. Amidst prevailing uncertainties and the need for technical and AI-integrated solutions, this study introduces two pivotal contributions. First, it establishes a novel taxonomy aimed at safeguarding and identifying IoE-based content infringements. Second, it proposes an innovative architecture combining IoE components with automated sensors to compile a dataset reflective of potential copyright breaches. This dataset is analyzed using a Bidirectional Encoder Representations from Transformers-based advanced Natural Language Processing (NLP) algorithm, further fine-tuned by a dense neural network (DNN), achieving a remarkable 98.71% accuracy in pinpointing websites that violate copyright.","PeriodicalId":37982,"journal":{"name":"Future Internet","volume":"579 ","pages":""},"PeriodicalIF":2.8000,"publicationDate":"2023-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Methodological Approach for Identifying Websites with Infringing Content via Text Transformers and Dense Neural Networks\",\"authors\":\"Aldo Hernandez-Suarez, G. Sánchez-Pérez, L. K. Toscano-Medina, Hector Perez-Meana, J. Portillo-Portillo, J. Olivares-Mercado\",\"doi\":\"10.3390/fi15120397\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The rapid evolution of the Internet of Everything (IoE) has significantly enhanced global connectivity and multimedia content sharing, simultaneously escalating the unauthorized distribution of multimedia content, posing risks to intellectual property rights. In 2022 alone, about 130 billion accesses to potentially non-compliant websites were recorded, underscoring the challenges for industries reliant on copyright-protected assets. Amidst prevailing uncertainties and the need for technical and AI-integrated solutions, this study introduces two pivotal contributions. First, it establishes a novel taxonomy aimed at safeguarding and identifying IoE-based content infringements. Second, it proposes an innovative architecture combining IoE components with automated sensors to compile a dataset reflective of potential copyright breaches. This dataset is analyzed using a Bidirectional Encoder Representations from Transformers-based advanced Natural Language Processing (NLP) algorithm, further fine-tuned by a dense neural network (DNN), achieving a remarkable 98.71% accuracy in pinpointing websites that violate copyright.\",\"PeriodicalId\":37982,\"journal\":{\"name\":\"Future Internet\",\"volume\":\"579 \",\"pages\":\"\"},\"PeriodicalIF\":2.8000,\"publicationDate\":\"2023-12-09\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Future Internet\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.3390/fi15120397\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"COMPUTER SCIENCE, INFORMATION SYSTEMS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Future Internet","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.3390/fi15120397","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
引用次数: 0

摘要

万物互联(IoE)的快速发展大大加强了全球互联和多媒体内容共享,同时也加剧了未经授权的多媒体内容传播,给知识产权带来了风险。仅在 2022 年,就记录了约 1300 亿次对可能不合规网站的访问,这凸显了依赖版权保护资产的行业所面临的挑战。面对普遍存在的不确定性以及对技术和人工智能集成解决方案的需求,本研究提出了两个关键贡献。首先,它建立了一种新颖的分类法,旨在保护和识别基于物联网的内容侵权。其次,它提出了一种创新的架构,将物联网组件与自动传感器相结合,以编制反映潜在版权侵犯行为的数据集。该数据集使用基于变换器的双向编码器表示的高级自然语言处理(NLP)算法进行分析,并通过密集神经网络(DNN)进行进一步微调,在精确定位侵犯版权的网站方面取得了令人瞩目的 98.71% 的准确率。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
Methodological Approach for Identifying Websites with Infringing Content via Text Transformers and Dense Neural Networks
The rapid evolution of the Internet of Everything (IoE) has significantly enhanced global connectivity and multimedia content sharing, simultaneously escalating the unauthorized distribution of multimedia content, posing risks to intellectual property rights. In 2022 alone, about 130 billion accesses to potentially non-compliant websites were recorded, underscoring the challenges for industries reliant on copyright-protected assets. Amidst prevailing uncertainties and the need for technical and AI-integrated solutions, this study introduces two pivotal contributions. First, it establishes a novel taxonomy aimed at safeguarding and identifying IoE-based content infringements. Second, it proposes an innovative architecture combining IoE components with automated sensors to compile a dataset reflective of potential copyright breaches. This dataset is analyzed using a Bidirectional Encoder Representations from Transformers-based advanced Natural Language Processing (NLP) algorithm, further fine-tuned by a dense neural network (DNN), achieving a remarkable 98.71% accuracy in pinpointing websites that violate copyright.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
Future Internet
Future Internet Computer Science-Computer Networks and Communications
CiteScore
7.10
自引率
5.90%
发文量
303
审稿时长
11 weeks
期刊介绍: Future Internet is a scholarly open access journal which provides an advanced forum for science and research concerned with evolution of Internet technologies and related smart systems for “Net-Living” development. The general reference subject is therefore the evolution towards the future internet ecosystem, which is feeding a continuous, intensive, artificial transformation of the lived environment, for a widespread and significant improvement of well-being in all spheres of human life (private, public, professional). Included topics are: • advanced communications network infrastructures • evolution of internet basic services • internet of things • netted peripheral sensors • industrial internet • centralized and distributed data centers • embedded computing • cloud computing • software defined network functions and network virtualization • cloud-let and fog-computing • big data, open data and analytical tools • cyber-physical systems • network and distributed operating systems • web services • semantic structures and related software tools • artificial and augmented intelligence • augmented reality • system interoperability and flexible service composition • smart mission-critical system architectures • smart terminals and applications • pro-sumer tools for application design and development • cyber security compliance • privacy compliance • reliability compliance • dependability compliance • accountability compliance • trust compliance • technical quality of basic services.
期刊最新文献
Controllable Queuing System with Elastic Traffic and Signals for Resource Capacity Planning in 5G Network Slicing Internet-of-Things Traffic Analysis and Device Identification Based on Two-Stage Clustering in Smart Home Environments Resource Indexing and Querying in Large Connected Environments An Analysis of Methods and Metrics for Task Scheduling in Fog Computing Evaluating Embeddings from Pre-Trained Language Models and Knowledge Graphs for Educational Content Recommendation
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1