
Journal of King Saud University-Computer and Information Sciences: Latest Articles

DeepExtract: Semantic-driven extractive text summarization framework using LLMs and hierarchical positional encoding
IF 5.2 | Zone 2 (Computer Science) | Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS | Pub Date: 2024-10-01 | Epub Date: 2024-08-30 | DOI: 10.1016/j.jksuci.2024.102178
Aytuğ Onan, Hesham A. Alhumyani

In the age of information overload, the ability to distill essential content from extensive texts is invaluable. DeepExtract introduces an advanced framework for extractive summarization, utilizing the groundbreaking capabilities of GPT-4 along with innovative hierarchical positional encoding to redefine information extraction. This manuscript details the development of DeepExtract, which integrates semantic-driven techniques to analyze and summarize complex documents effectively. The framework is structured around a novel hierarchical tree construction that categorizes sentences and sections not just by their physical placement within a text, but by their contextual and thematic significance, leveraging dynamic embeddings generated by GPT-4. We introduce a multi-faceted scoring system that evaluates sentences based on coherence, relevance, and novelty, ensuring that summaries are not only concise but rich with essential content. Further, DeepExtract employs optimized semantic clustering to group thematic elements, which enhances the representativeness of the summaries. This paper demonstrates through comprehensive evaluations that DeepExtract significantly outperforms existing extractive summarization models in terms of accuracy and efficiency, making it a potent tool for academic, professional, and general use. We conclude with a discussion on the practical applications of DeepExtract in various domains, highlighting its adaptability and potential in navigating the vast expanses of digital text.
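The multi-faceted scoring described in this abstract can be illustrated with a small sketch. It is not the authors' implementation: the function names, the weights, and the concrete definitions of relevance (similarity to the document centroid), coherence (similarity to neighbouring sentences), and novelty (dissimilarity to sentences already selected) are all assumptions made for illustration, and the toy vectors stand in for GPT-4 embeddings.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def select_sentences(embeddings, k, w_rel=0.4, w_coh=0.1, w_nov=0.5):
    """Greedily pick k sentence indices by a weighted score (hypothetical weights)."""
    n = len(embeddings)
    # Document centroid: the mean of all sentence vectors.
    doc = [sum(col) / n for col in zip(*embeddings)]
    selected = []
    while len(selected) < min(k, n):
        best, best_score = None, -math.inf
        for i, emb in enumerate(embeddings):
            if i in selected:
                continue
            relevance = cosine(emb, doc)
            neighbours = [embeddings[j] for j in (i - 1, i + 1) if 0 <= j < n]
            coherence = (
                sum(cosine(emb, nb) for nb in neighbours) / len(neighbours)
                if neighbours else 0.0
            )
            # Novelty drops as a candidate resembles an already selected sentence.
            novelty = 1.0 - max(
                (cosine(emb, embeddings[j]) for j in selected), default=0.0
            )
            score = w_rel * relevance + w_coh * coherence + w_nov * novelty
            if score > best_score:
                best, best_score = i, score
        selected.append(best)
    return sorted(selected)

# Sentences 0 and 1 are identical, so the second pick is the distinct sentence 2.
print(select_sentences([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]], k=2))
```

With these assumed weights the novelty term keeps the summary from selecting two near-identical sentences; different weights would trade that off against relevance.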

Volume 36, Issue 8, Article 102178.
Citations: 0
Establishing a multimodal dataset for Arabic Sign Language (ArSL) production
IF 5.2 | Zone 2 (Computer Science) | Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS | Pub Date: 2024-10-01 | Epub Date: 2024-08-30 | DOI: 10.1016/j.jksuci.2024.102165
Samah Abbas, Dimah Alahmadi, Hassanin Al-Barhamtoshy

This paper addresses the potential of Arabic Sign Language (ArSL) recognition systems to facilitate direct communication and enhance social engagement between deaf and non-deaf people. Specifically, we focus on the domain of religion to address the lack of accessible religious content for the deaf community. We propose a multimodal architecture framework and develop a novel dataset for ArSL production. The dataset comprises 1950 audio signals with 131 corresponding texts, including words and phrases, and 262 ArSL videos. These videos were recorded by two expert signers and annotated using ELAN based on gloss representation. To evaluate the ArSL videos, we employ cosine similarity and mode distances based on MobileNetV2 and Euclidean distance based on MediaPipe. Additionally, we use Jaccard similarity to evaluate the gloss representation, obtaining an overall similarity score of 85% between the glosses of the two ArSL videos. The evaluation highlights the complexity of creating an ArSL video corpus and reveals slight differences between the two videos. The findings emphasize the need for careful annotation and representation of ArSL videos to ensure accurate recognition and understanding. Overall, this work contributes to bridging the gap in accessible religious content for the deaf community by developing a multimodal framework and a comprehensive ArSL dataset.
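As a brief illustration of the gloss comparison mentioned above, Jaccard similarity is the size of the intersection of two sets divided by the size of their union. The sketch below assumes each annotation is a list of gloss labels; the function name and the example glosses are hypothetical and are not taken from the dataset.

```python
def jaccard_similarity(glosses_a, glosses_b):
    """Jaccard similarity |A ∩ B| / |A ∪ B| of two gloss annotations."""
    a, b = set(glosses_a), set(glosses_b)
    if not a and not b:
        return 1.0  # two empty annotations are treated as identical
    return len(a & b) / len(a | b)

# Hypothetical glosses from two signers for the same phrase.
signer_1 = ["PRAY", "GOD", "MERCY"]
signer_2 = ["PRAY", "GOD", "FORGIVE"]
print(jaccard_similarity(signer_1, signer_2))  # 2 shared / 4 distinct = 0.5
```

Because the comparison is set-based, it ignores gloss order and repetition; an order-sensitive measure would be needed to capture those differences.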

Volume 36, Issue 8, Article 102165.
Citations: 0
A formal specification language and automatic modeling method of asset securitization contract
IF 5.2 | Zone 2 (Computer Science) | Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS | Pub Date: 2024-10-01 | Epub Date: 2024-08-21 | DOI: 10.1016/j.jksuci.2024.102163
Yang Li, Kai Hu, Jie Li, Kaixiang Lu, Yuan Ai

Asset securitization is an important financial derivative involving complicated asset transfer operations. Digitizing traditional asset securitization contracts therefore improves efficiency and facilitates reliability verification. Furthermore, an accurate and verifiable requirement description is essential for collaborative development between financial professionals and software engineers. A domain-specific language for writing asset securitization contracts has been proposed; by simplifying the writing rules, it makes it easier for financial professionals to write smart contracts directly. However, because the existing design of the language focuses on a few simple scenarios, it is insufficient and too informal to describe the full range of detailed scenarios. Moreover, many reliability issues remain when smart contracts are generated and executed with the language, such as verifying the correctness of the contract's logical properties and ensuring consistency between the contract text and the contract code. To overcome these challenges, we extend, simplify and innovate the syntax subset of the domain-specific language and name it AS-SC (Asset Securitization – Smart Contract), which financial professionals can use to describe requirements accurately. In addition, because formal methods are math-based techniques that describe system properties and can generate programs in a more formal and reliable manner, we propose a semantically consistent code conversion method, named AS2EB, for converting AS-SC to Event-B, a widely used formal language. Software engineers can use the AS2EB method to verify requirements. The combination of AS-SC and AS2EB ensures the consistency and reliability of the requirements and reduces the cost of repeated communication and later testing. Taking a credit asset securitization contract as a case study, we validate the feasibility and rationality of AS-SC and AS2EB. In addition, experiments on three randomly selected real cases in different classic scenarios show the high efficiency and reliability of the AS2EB method.

Volume 36, Issue 8, Article 102163.
Citations: 0
Design and FPGA implementation of nested grid multi-scroll chaotic system
IF 5.2 | Zone 2 (Computer Science) | Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS | Pub Date: 2024-10-01 | Epub Date: 2024-09-10 | DOI: 10.1016/j.jksuci.2024.102186
Guofeng Yu, Chunlei Fan, Jiale Xi, Chengbin Xu

Conventional multi-scroll chaotic systems are often constrained by the number of attractors and the complexity of generation, making it challenging to meet the increasing demands of communication and computation. This paper builds on the modified Chua’s system. By modifying its differential equation and introducing traditional nonlinear functions, such as the step function sequence and the sawtooth function sequence, a nested grid multi-scroll chaotic system (NGMSCS) can be established that is capable of generating nested grid multi-scroll attractors. In contrast to conventional grid multi-scroll chaotic attractors, scroll-like phenomena can be initiated outside the grid structure, thereby revealing more complex dynamic behavior and topological features. Through the theoretical design and analysis of the system's equilibrium points and their stability, the number of saddle-focus equilibrium points of index 2 is further expanded, which can generate (2N+2) × M attractors, and the formation mechanism is elaborated and verified in detail. In addition, an arbitrary number of equilibrium points is generated in the y-direction by transforming the x and y variables, which can generate M × (2N+2) attractors and increases the complexity of the system. The system’s dynamical properties are discussed in depth via time series plots, Lyapunov exponents, Poincaré cross sections, 0–1 tests, bifurcation diagrams, and basins of attraction. The existence of the attractors is confirmed through numerical simulations and FPGA-based hardware experiments.
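To make the step function sequence more concrete, the sketch below builds a generic staircase nonlinearity by summing shifted sign functions and evaluates the attractor count (2N+2) × M quoted in the abstract. The staircase is an illustrative stand-in, not the paper's actual nonlinear function; the names `staircase` and `attractor_count`, the breakpoint spacing, and the `step` parameter are assumptions.

```python
def sign(x):
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)

def staircase(x, levels, step=1.0):
    """Generic staircase: sum of sign functions shifted to multiples of 2*step.

    With levels = N there are 2N+1 breakpoints, hence 2N+2 plateaus.
    This is an assumed, illustrative form, not the function used in the paper.
    """
    return step * sum(sign(x - 2.0 * step * n) for n in range(-levels, levels + 1))

def attractor_count(n, m):
    """Number of attractors (2N+2) x M as stated in the abstract."""
    return (2 * n + 2) * m

# For N = 1 the breakpoints are -2, 0, 2, giving four plateau values.
plateaus = sorted({staircase(x, levels=1) for x in (-3.0, -1.0, 1.0, 3.0)})
print(plateaus)               # [-3.0, -1.0, 1.0, 3.0]
print(attractor_count(1, 3))  # 12
```

In this construction the number of plateaus grows as 2N+2 with the number of summed terms, which is the kind of scaling a step function sequence provides; how the paper maps plateaus to scrolls is not specified in the abstract.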

Volume 36, Issue 8, Article 102186.
Citations: 0
Multi-objective optimization in order to allocate computing and telecommunication resources based on non-orthogonal access, participation of cloud server and edge server in 5G networks
IF 5.2 | Zone 2 (Computer Science) | Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS | Pub Date: 2024-10-01 | Epub Date: 2024-09-16 | DOI: 10.1016/j.jksuci.2024.102187
Liying Zhao, Chao Liu, Entie Qi, Sinan Shi
Mobile edge processing is a cutting-edge technique that addresses the limitations of mobile devices by enabling users to offload computational tasks to edge servers, rather than relying on distant cloud servers. This approach significantly reduces the latency associated with cloud processing, thereby enhancing the quality of service. In this paper, we propose a system in which a cellular network, comprising multiple users, interacts with both cloud and edge servers to process service requests. The system assumes non-orthogonal multiple access (NOMA) for user access to the radio spectrum. We model the interactions between users and servers using queuing theory, aiming to minimize the total energy consumption of users, service delivery time, and overall network operation costs. The problem is mathematically formulated as a multi-objective, bounded non-convex optimization problem. The Structural Correspondence Analysis (SCA) method is employed to obtain the global optimal solution. Simulation results demonstrate that the proposed model reduces energy consumption, delay, and network costs by approximately 50%, under the given assumptions.
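The abstract does not say which queueing model is used, so the following is only a sketch under an assumed M/M/1 queue, whose mean time in system is 1/(μ − λ) for arrival rate λ and service rate μ. The weighted sum is likewise just one common way to scalarize several objectives; the function names and weights are hypothetical.

```python
def mm1_mean_delay(arrival_rate, service_rate):
    """Mean time in system for an M/M/1 queue (assumed model): 1 / (mu - lambda)."""
    if arrival_rate < 0 or service_rate <= arrival_rate:
        raise ValueError("unstable queue: need 0 <= arrival_rate < service_rate")
    return 1.0 / (service_rate - arrival_rate)

def weighted_cost(energy, delay, cost, weights=(1.0, 1.0, 1.0)):
    """Scalarize energy, delay, and operating cost with hypothetical weights."""
    w_energy, w_delay, w_cost = weights
    return w_energy * energy + w_delay * delay + w_cost * cost

# Illustrative numbers only: 2 requests/s arriving at a server that serves 5/s.
delay = mm1_mean_delay(arrival_rate=2.0, service_rate=5.0)
print(delay)  # about 0.333 seconds
print(weighted_cost(energy=1.0, delay=2.0, cost=3.0, weights=(0.5, 0.25, 0.25)))  # 1.75
```

Sweeping the weights and re-solving is a simple way to trace trade-offs between the three objectives, although a weighted sum cannot reach every Pareto-optimal point of a non-convex problem.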
Volume 36, Issue 8, Article 102187.
Citations: 0
A low-time-consumption image encryption combining 2D parametric Pascal matrix chaotic system and elementary operation
IF 5.2 | Zone 2 (Computer Science) | Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS | Pub Date: 2024-10-01 | Epub Date: 2024-08-28 | DOI: 10.1016/j.jksuci.2024.102169
Jun Lu, Jiaxin Zhang, Dezhi An, Dawei Hao, Xiaokai Ren, Ruoyu Zhao

The rapid development of the big data era means that traditional image encryption algorithms spend more time handling huge amounts of data. Time cost needs to be reduced while the security of the encryption algorithm is preserved. With this in mind, the paper proposes a low-time-consumption image encryption (LTC-IE) scheme combining a 2D parametric Pascal matrix chaotic system (2D-PPMCS) and elementary operations. First, the 2D-PPMCS, which offers robustness and complex chaotic behavior, is adopted. Second, SHA-256 hash values are applied to the chaotic sequences generated by the 2D-PPMCS, which are then processed and used for image permutation and diffusion encryption. In the permutation stage, the pixel matrix is permuted according to a permutation matrix generated from the chaotic sequences. For diffusion encryption, the model is built from elementary operations such as exclusive OR (XOR), modulo, and arithmetic operations (addition, subtraction, multiplication, and division). Security experiments indicate that the LTC-IE algorithm maintains security and robustness while reducing time cost.

Volume 36, Issue 8, Article 102169.
Citations: 0
Towards the development of believable agents: Adopting neural architectures and adaptive neuro-fuzzy inference system via playback of human traces
IF 5.2 | Zone 2 (Computer Science) | Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS | Pub Date: 2024-10-01 | Epub Date: 2024-09-02 | DOI: 10.1016/j.jksuci.2024.102182
Naveed Anwer Butt, Mian Muhammad Awais, Samra Shahzadi, Tai-hoon Kim, Imran Ashraf

Artificial intelligence (AI) research on video games has primarily focused on the imitation of human-like behavior over the past few years. Moreover, to increase the perceived worth of amusement and gratification, demand has risen enormously for intelligent agents that can imitate human players and video game characters. However, agents developed using most current approaches are perceived as rather mechanical, which leads to frustration and, more importantly, to a failure of engagement. This study therefore proposes an imitation learning framework that generates human-like behavior for more precise and accurate reproduction. To build a computational model, two learning paradigms are explored: artificial neural networks (ANN) and adaptive neuro-fuzzy inference systems (ANFIS). The study uses several variations of ANN, including feed-forward, recurrent, extreme learning machines, and regressions, to simulate human player behavior. Furthermore, to find the ideal ANFIS, grid partitioning, subtractive clustering, and fuzzy c-means clustering are used for training. The results demonstrate that ANFIS hybrid intelligence systems trained with subtractive clustering perform best overall, with an average accuracy of 95%, followed by fuzzy c-means with an average accuracy of 87%. The believability of the resulting AI agents is also tested using two statistical methods, the Mann–Whitney U test and cosine similarity analysis. Both methods validate that the observed behavior has been reproduced with high accuracy.
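For readers unfamiliar with the Mann–Whitney U test mentioned above, the sketch below computes only the U statistic by direct pair counting. The sample values are made up, the function name is hypothetical, and no p-value is computed; a statistics library would normally be used for the full test.

```python
def mann_whitney_u(xs, ys):
    """U statistic for xs: number of (x, y) pairs with x > y, ties counted as 0.5."""
    u = 0.0
    for x in xs:
        for y in ys:
            if x > y:
                u += 1.0
            elif x == y:
                u += 0.5
    return u

# Hypothetical per-episode measurements for a human trace and an agent.
human = [1, 2, 3]
agent = [4, 5, 6]
print(mann_whitney_u(human, agent))  # 0.0: every human value is below every agent value
print(mann_whitney_u(agent, human))  # 9.0: the two statistics sum to len(xs) * len(ys)
```

A U value near half of len(xs) * len(ys) means neither sample tends to exceed the other, which is the outcome one would hope for when an agent reproduces human behavior well.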

在过去几年里,有关视频游戏的人工智能(AI)研究主要集中在模仿人类行为上。此外,为了提高娱乐和满足感的感知价值,对能够模仿人类玩家和视频游戏角色的智能代理的需求也大幅上升。然而,目前使用大多数方法开发的代理被认为是比较机械的,这会导致挫败感,更重要的是,会导致参与失败。有鉴于此,本研究提出了一种模仿学习框架,以生成类似人类的行为,从而实现更精确、更准确的再现。为了建立一个计算模型,我们探索了两种学习范式,即人工神经网络(ANN)和自适应神经模糊推理系统(ANFIS)。本研究利用了几种不同的人工神经网络,包括前馈、递归、极端学习机和回归,来模拟人类球员的行为。此外,为了找到理想的 ANFIS,还使用了网格划分、减法聚类和模糊 c-means 聚类来进行训练。结果表明,使用减法聚类训练的 ANFIS 混合智能系统总体最佳,平均准确率为 95%,其次是模糊 c-means,平均准确率为 87%。此外,还使用两种统计方法,即曼-惠特尼 U 检验和余弦相似性分析,对所获得的人工智能代理的可信度进行了测试。这两种方法都验证了观察到的行为得到了高精度的再现。
Citations: 0
SARD: Fake news detection based on CLIP contrastive learning and multimodal semantic alignment
IF 5.2 Zone 2 (Computer Science) Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2024-10-01 Epub Date: 2024-08-14 DOI: 10.1016/j.jksuci.2024.102160
Facheng Yan, Mingshu Zhang, Bin Wei, Kelan Ren, Wen Jiang

The automatic detection of multimodal fake news can be used to effectively identify potential risks in cyberspace. Most of the existing multimodal fake news detection methods focus on fully exploiting textual and visual features in news content, thus neglecting the full utilization of news social context features that play an important role in improving fake news detection. To this end, we propose a new fake news detection method based on CLIP contrastive learning and multimodal semantic alignment (SARD). SARD leverages cutting-edge multimodal learning techniques, such as CLIP, and robust cross-modal contrastive learning methods to integrate features of news-oriented heterogeneous information networks (N-HIN) with multi-level textual and visual features into a unified framework for the first time. This framework not only achieves cross-modal alignment between deep textual and visual features but also considers cross-modal associations and semantic alignments across different modalities. Furthermore, SARD enhances fake news detection by aligning semantic features between news content and N-HIN features, an aspect largely overlooked by existing methods. We test and evaluate SARD on three real-world datasets. Experimental results demonstrate that SARD significantly outperforms twelve state-of-the-art competitors in fake news detection, with an average improvement of 2.89% in Mac.F1 score and 2.13% in accuracy compared to the leading baseline models across three datasets.

Citations: 0
An efficient hybrid approach for forecasting real-time stock market indices
IF 5.2 Zone 2 (Computer Science) Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2024-10-01 Epub Date: 2024-08-29 DOI: 10.1016/j.jksuci.2024.102180
Riya Kalra, Tinku Singh, Suryanshi Mishra, Satakshi, Naveen Kumar, Taehong Kim, Manish Kumar

The stock market’s volatility, noise, and information overload necessitate efficient prediction methods. Forecasting index prices in this environment is complex due to the non-linear and non-stationary nature of time series data generated from the stock market. Machine learning and deep learning have emerged as powerful tools for identifying financial data patterns and generating predictions based on historical trends. However, updating these models in real time is crucial for accurate predictions. Deep learning models require extensive computational resources and careful hyperparameter optimization, while incremental learning models struggle to balance stability and adaptability. This paper proposes a novel hybrid bidirectional-LSTM (H.BLSTM) model that combines incremental learning and deep learning techniques for real-time index price prediction, addressing these scalability and memory challenges. The method utilizes both univariate time series derived from historical index prices and multivariate time series incorporating technical indicators. Extensive experiments within a real-time trading system demonstrate the method’s effectiveness in producing more accurate price forecasts for major global stock indices. The proposed model achieved an average mean absolute percentage error of 0.001 across nine stock indices, significantly outperforming traditional models. It has an average forecasting delay of 2 s, making it suitable for real-time trading applications.
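For readers unfamiliar with the metric reported above, here is a minimal sketch of how mean absolute percentage error (MAPE) is commonly computed. The price values are invented for illustration, and the abstract does not state whether its figure of 0.001 is expressed as a fraction or as a percentage, so the function below simply returns a fraction.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, returned as a fraction (0.01 == 1%)."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return total / len(actual)


# Hypothetical index prices (assumed for illustration, not from the paper).
actual_prices = [100.0, 200.0, 400.0]
predicted_prices = [101.0, 198.0, 400.0]

# Per-point errors are 1%, 1%, and 0%, so the mean is two thirds of a percent.
print(mape(actual_prices, predicted_prices))
```

Multiply the result by 100 to express it as a percentage.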

Citations: 0
Unsupervised selective labeling for semi-supervised industrial defect detection
IF 5.2 Zone 2 (Computer Science) Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2024-10-01 Epub Date: 2024-08-30 DOI: 10.1016/j.jksuci.2024.102179
Jian Ge, Qin Qin, Shaojing Song, Jinhua Jiang, Zhiwei Shen

In industrial detection scenarios, achieving high accuracy typically relies on extensive labeled datasets, which are costly and time-consuming to produce. This has motivated a shift towards semi-supervised learning (SSL), which leverages labeled and unlabeled data to improve learning efficiency and reduce annotation costs. This work proposes the unsupervised spectral clustering labeling (USCL) method to optimize SSL for industrial challenges like defect variability, rarity, and complex distributions. Integral to USCL, we employ the multi-task fusion self-supervised learning (MTSL) method to extract robust feature representations through multiple self-supervised tasks. Additionally, we introduce the Enhanced Spectral Clustering (ESC) method and a dynamic selecting function (DSF). ESC effectively integrates both local and global similarity matrices, improving clustering accuracy. The DSF maximally selects the most valuable instances for labeling, significantly enhancing the representativeness and diversity of the labeled data. USCL consistently improves various SSL methods compared to traditional instance selection methods. For example, it boosts Efficient Teacher by 5%, 6.6%, and 7.8% in mean Average Precision (mAP) on the Automotive Sealing Rings Defect Dataset, the Metallic Surface Defect Dataset, and the Printed Circuit Boards (PCB) Defect Dataset with 10% labeled data. Our work sets a new benchmark for SSL in industrial settings.

Citations: 0