Pub Date: 2024-09-11 | DOI: 10.1016/j.jksuci.2024.102188
Heemin Kim , Byeong-Chan Kim , Sumi Lee , Minjung Kang , Hyunjee Nam , Sunghwan Park , Il-Youp Kwak , Jaewoo Lee
Recently, adversarial patches have become a frequent vehicle for adversarial attacks in real-world settings, appearing in varied shapes and numbers. However, existing defense methods often exhibit limitations when addressing specific attacks, datasets, or conditions, underscoring the demand for versatile, robust defenses capable of operating across diverse scenarios. In this paper, we propose the RAPID (Robust multi-pAtch masker using channel-wise Pooled varIance with two-stage patch Detection) framework, a stable solution that restores detection efficacy in the presence of multiple patches. RAPID defends against attacks regardless of patch number or shape, offering a versatile defense adaptable to diverse adversarial scenarios. It employs a two-stage strategy to identify and mask coordinates associated with patch attacks: in the first stage, the proposed 'channel-wise pooled variance' detects candidate patch regions; in the second stage, dense areas within these regions are identified as patches and masked accordingly. Because of its independent structure, the framework integrates easily into the preprocessing stage of any object detection model, requiring no modifications to the model itself. Evaluation indicates that RAPID enhances robustness by up to 60% compared to other defenses, achieving mAP@50 and mAP@50-95 values of 0.696 and 0.479, respectively.
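The abstract names the 'channel-wise pooled variance' statistic without defining it. One plausible reading — variance across colour channels at each pixel, followed by spatial average pooling, with high values flagging candidate patch regions — can be sketched as follows (the function name and pooling size are illustrative, not the paper's code):

```python
import statistics

def channel_wise_pooled_variance(image, pool=2):
    """Variance across colour channels at each pixel, then average-pooled.

    `image` is an H x W x C nested list. A high pooled variance is used
    here as a crude indicator of candidate adversarial-patch regions.
    """
    H, W = len(image), len(image[0])
    # per-pixel variance over the channel axis
    var_map = [[statistics.pvariance(image[i][j]) for j in range(W)]
               for i in range(H)]
    # non-overlapping average pooling of the variance map
    pooled = []
    for i in range(0, H - pool + 1, pool):
        row = []
        for j in range(0, W - pool + 1, pool):
            window = [var_map[i + a][j + b]
                      for a in range(pool) for b in range(pool)]
            row.append(sum(window) / len(window))
        pooled.append(row)
    return pooled
```

Thresholding the pooled map would then yield the candidate regions passed to the second stage.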
Title: RAPID: Robust multi-pAtch masker using channel-wise Pooled varIance with two-stage patch Detection
Journal of King Saud University-Computer and Information Sciences, 36(8), Article 102188.
Pub Date: 2024-09-10 | DOI: 10.1016/j.jksuci.2024.102186
Guofeng Yu, Chunlei Fan, Jiale Xi, Chengbin Xu
Conventional multi-scroll chaotic systems are often constrained by the number of attractors and the complexity of their generation, making it challenging to meet the increasing demands of communication and computation. This paper revolves around a modified Chua's system: by modifying its differential equation and introducing traditional nonlinear functions, such as the step-function sequence and the sawtooth-function sequence, a nested grid multi-scroll chaotic system (NGMSCS) can be established that generates nested grid multi-scroll attractors. In contrast to conventional grid multi-scroll chaotic attractors, scroll-like phenomena can be initiated outside the grid structure, revealing more complex dynamic behavior and topological features. Through theoretical design and analysis of the system's equilibrium points and their stability, the number of index-2 saddle-focus equilibrium points is further expanded, allowing the system to generate (2N+2)×M attractors; the formation mechanism is elaborated and verified in detail. In addition, an arbitrary number of equilibrium points in the y-direction is achieved by transforming the x and y variables, which can generate M×(2N+2) attractors, increasing the complexity of the system. The system's dynamical properties are discussed in depth via time-series plots, Lyapunov exponents, Poincaré cross-sections, 0–1 tests, bifurcation diagrams, and basins of attraction. The existence of the attractors is confirmed through numerical simulations and FPGA-based hardware experiments.
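A step-function sequence in multi-scroll Chua variants is typically a staircase built from sign functions, each step contributing extra saddle-focus equilibria. The sketch below shows one common form of that staircase and a forward-Euler step of a Chua-like system using it; the exact equation, nonlinearity, and parameters of the paper's NGMSCS are assumptions here, not reproductions:

```python
def sgn(x):
    return (x > 0) - (x < 0)

def staircase(x, N=2):
    """Step-function sequence: a staircase assembled from sign functions.
    Each extra step adds equilibria, which is what lets multi-scroll
    variants of Chua's system grow additional scrolls (illustrative form)."""
    return sum(sgn(x + (2 * k - 1)) + sgn(x - (2 * k - 1))
               for k in range(1, N + 1)) / 2

def chua_step(state, a=9.0, b=14.0, dt=0.001, N=2):
    """One forward-Euler step of a Chua-like system with the staircase
    nonlinearity in place of the usual piecewise-linear diode term."""
    x, y, z = state
    dx = a * (y - staircase(x, N))
    dy = x - y + z
    dz = -b * y
    return (x + dt * dx, y + dt * dy, z + dt * dz)
```

Iterating `chua_step` from a nonzero initial state traces the trajectory whose scroll count depends on `N`.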
Title: Design and FPGA implementation of nested grid multi-scroll chaotic system
Journal of King Saud University-Computer and Information Sciences, 36(8), Article 102186.
Pub Date: 2024-09-02 | DOI: 10.1016/j.jksuci.2024.102182
Naveed Anwer Butt , Mian Muhammad Awais , Samra Shahzadi , Tai-hoon Kim , Imran Ashraf
Artificial intelligence (AI) research on video games has, over the past few years, focused primarily on the imitation of human-like behavior. Moreover, to increase the perceived worth of amusement and gratification, demand has risen enormously for intelligent agents that can imitate human players and video-game characters. However, agents developed with the majority of current approaches are perceived as rather mechanical, which leads to frustration and, more importantly, failure in engagement. On that account, this study proposes an imitation learning framework that generates human-like behavior for more precise and accurate reproduction. To build a computational model, two learning paradigms are explored: artificial neural networks (ANN) and adaptive neuro-fuzzy inference systems (ANFIS). The study utilizes several ANN variants, including feed-forward networks, recurrent networks, extreme learning machines, and regressions, to simulate human player behavior. Furthermore, to find the ideal ANFIS, grid partitioning, subtractive clustering, and fuzzy c-means clustering are used for training. The results demonstrate that ANFIS hybrid intelligence systems trained with subtractive clustering perform best overall, with an average accuracy of 95%, followed by fuzzy c-means with an average accuracy of 87%. The believability of the obtained AI agents is also tested using two statistical methods, the Mann–Whitney U test and cosine similarity analysis; both validate that the observed behavior has been reproduced with high accuracy.
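Of the three ANFIS initialisation schemes mentioned, fuzzy c-means is the most compact to illustrate. A minimal 1-D sketch of the standard algorithm follows — this is the textbook update rule, not the paper's training code, and the initialisation and parameters are illustrative:

```python
def fuzzy_c_means(points, c=2, m=2.0, iters=50):
    """Plain fuzzy c-means on 1-D data, as used to seed fuzzy rule
    structure in ANFIS-style systems."""
    pts = sorted(points)
    # spread the initial centers across the data range (deterministic)
    centers = [pts[i * (len(pts) - 1) // (c - 1)] for i in range(c)]
    for _ in range(iters):
        # membership update: u_ik = 1 / sum_j (d_ik / d_jk)^(2/(m-1))
        U = []
        for x in points:
            d = [abs(x - v) + 1e-12 for v in centers]
            U.append([1.0 / sum((d[i] / d[j]) ** (2 / (m - 1))
                                for j in range(c)) for i in range(c)])
        # center update: mean weighted by memberships raised to m
        centers = [sum(U[k][i] ** m * points[k] for k in range(len(points))) /
                   sum(U[k][i] ** m for k in range(len(points)))
                   for i in range(c)]
    return sorted(centers)
```

On well-separated data the centers converge to the cluster means, which then define the centres of the fuzzy membership functions.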
Title: Towards the development of believable agents: Adopting neural architectures and adaptive neuro-fuzzy inference system via playback of human traces
Journal of King Saud University-Computer and Information Sciences, 36(8), Article 102182.
Pub Date: 2024-09-02 | DOI: 10.1016/j.jksuci.2024.102183
Jianxin Tang , Jitao Qu , Shihui Song , Zhili Zhao , Qian Du
Exploring effective and efficient strategies for identifying influential nodes in social networks, to serve as seeds that promote the propagation of influence, remains a crucial challenge in the field of influence maximization (IM) and has attracted significant research effort. Deep learning-based approaches have been adopted as a promising alternative solution to the IM problem. However, a robust model that captures the associations between network information and node influence still needs to be investigated, while concurrently considering the effect of overlapped influence on training labels. To address these challenges, this paper introduces GCNT, a model that integrates Graph Convolutional Networks with Graph Transformers to capture the intricate relationships among network topology, node attributes, and node influence effectively. Furthermore, an innovative method called Greedy-LIE is proposed to generate labels that alleviate the issue of overlapped influence spread. Moreover, a Mask mechanism specially tailored to the IM problem is presented, along with an input-embedding balancing strategy. The effectiveness of GCNT is demonstrated through comprehensive experiments on six real-world networks, where the model shows competitive performance in both influence maximization and computational efficiency over state-of-the-art methods.
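For readers unfamiliar with IM label generation, the classic baseline is greedy seed selection by marginal influence gain. The sketch below uses deterministic 1-hop coverage as a stand-in for the Monte-Carlo spread estimates of classic greedy IM — it is not the paper's Greedy-LIE method, whose details the abstract does not give:

```python
def greedy_seed_set(adj, k):
    """Greedy seed selection maximising marginal 1-hop coverage.

    `adj` maps each node to its list of neighbours. At each round the
    node adding the most not-yet-covered nodes is chosen, which is the
    same marginal-gain loop classic greedy IM runs with simulated spread.
    """
    seeds, covered = [], set()
    for _ in range(k):
        best, best_gain = None, -1
        for v in adj:
            if v in seeds:
                continue
            gain = len(({v} | set(adj[v])) - covered)  # marginal coverage
            if gain > best_gain:
                best, best_gain = v, gain
        seeds.append(best)
        covered |= {best} | set(adj[best])
    return seeds
```

The overlapped-influence problem the paper targets is visible even here: once a hub is chosen, its neighbours' marginal gains collapse, so raw influence scores make poor training labels.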
Title: GCNT: Identify influential seed set effectively in social networks by integrating graph convolutional networks with graph transformers
Journal of King Saud University-Computer and Information Sciences, 36(8), Article 102183.
Pub Date: 2024-09-01 | DOI: 10.1016/j.jksuci.2024.102145
Praveen Kumar Donta , Chinmaya Kumar Dehury , Yu-Chen Hu
This special issue is a collection of emerging trends and challenges in applying learning-driven approaches to data fabric architectures within the cloud-to-thing continuum. As data generation and processing increasingly occur at the edge, there is a growing need for intelligent, adaptive data management solutions that seamlessly operate across distributed environments. In this special issue, we received research contributions from various groups around the world. We chose the eight most appropriate and novel contributions to include in this special issue. These eight contributions were further categorized into three themes: Data Handling approaches, resource optimization and management, and security and attacks. Additionally, this editorial suggests future research directions that will potentially lead to groundbreaking insights, which could pave the way for a new era of learning techniques in Data Fabric and the Cloud-to-Thing Continuum.
Title: Learning-driven Data Fabric Trends and Challenges for cloud-to-thing continuum
Journal of King Saud University-Computer and Information Sciences, 36(7), Article 102145.
Pub Date: 2024-08-31 | DOI: 10.1016/j.jksuci.2024.102177
Huanhuan Hou , Azlan Ismail
The huge energy consumption of data centers in cloud computing leads to increased operating costs and high carbon emissions. Deep Reinforcement Learning (DRL) combines deep learning and reinforcement learning and has a clear advantage in solving complex task-scheduling problems. Deep Q-Network (DQN)-based task scheduling has been employed for objective optimization. However, training the DQN algorithm may result in value overestimation, which can negatively impact learning effectiveness. The replay-buffer technique, while increasing sample utilization, does not distinguish samples by importance, resulting in limited utilization of valuable samples. This study proposes an enhanced task-scheduling algorithm based on the DQN framework that uses an optimized Dueling-network architecture together with a Double DQN strategy to alleviate the overestimation bias and address DQN's shortcomings. It also incorporates prioritized experience replay to achieve importance sampling of experience data, overcoming the low utilization caused by uniform sampling from replay memory. Based on these improved techniques, we developed an energy-efficient task-scheduling algorithm called EETS (Energy-Efficient Task Scheduling), which automatically learns the optimal scheduling policy from historical data while interacting with the environment. Experimental results demonstrate that EETS exhibits faster convergence and higher rewards than both DQN and DDQN. In scheduling performance, EETS outperforms other baseline algorithms in key metrics, including energy consumption, average task response time, and average machine working time; in particular, it has a significant advantage when handling large batches of tasks.
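Proportional prioritized experience replay, one of the standard components the abstract cites, can be sketched in a few lines. This is the generic technique (a flat-list version; production code uses a sum-tree for O(log n) sampling), not EETS's implementation:

```python
import random

class PrioritizedReplay:
    """Proportional prioritized experience replay, flat-list sketch.

    Transitions with larger TD errors get larger sampling priorities,
    so informative experiences are replayed more often than under the
    uniform sampling of a plain DQN replay buffer.
    """

    def __init__(self, capacity, alpha=0.6):
        self.capacity, self.alpha = capacity, alpha
        self.data, self.prios = [], []

    def add(self, transition, td_error=1.0):
        # evict the oldest transition once the buffer is full
        if len(self.data) >= self.capacity:
            self.data.pop(0)
            self.prios.pop(0)
        self.data.append(transition)
        # priority grows with |TD error|, tempered by the exponent alpha
        self.prios.append((abs(td_error) + 1e-6) ** self.alpha)

    def sample(self, batch_size, rng=random):
        # weights need not be normalised for random.choices
        return rng.choices(self.data, weights=self.prios, k=batch_size)
```

A full DQN would also apply importance-sampling weights to correct the bias this non-uniform sampling introduces, which is omitted here.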
Title: EETS: An energy-efficient task scheduler in cloud computing based on improved DQN algorithm
Journal of King Saud University-Computer and Information Sciences, 36(8), Article 102177.
Pub Date: 2024-08-30 | DOI: 10.1016/j.jksuci.2024.102165
Samah Abbas , Dimah Alahmadi , Hassanin Al-Barhamtoshy
This paper addresses the potential of Arabic Sign Language (ArSL) recognition systems to facilitate direct communication and enhance social engagement between deaf and non-deaf individuals. Specifically, we focus on the domain of religion, to address the lack of accessible religious content for the deaf community. We propose a multimodal architecture framework and develop a novel dataset for ArSL production. The dataset comprises 1950 audio signals corresponding to 131 texts, including words and phrases, and 262 ArSL videos. The videos were recorded by two expert signers and annotated in ELAN using gloss representation. To evaluate the ArSL videos, we employ cosine similarities and mode distances based on MobileNetV2, and Euclidean distance based on MediaPipe. Additionally, we use Jaccard similarity to evaluate the gloss representation, obtaining an overall similarity score of 85% between the glosses of the two ArSL videos. The evaluation highlights the complexity of creating an ArSL video corpus and reveals slight differences between the two videos. The findings emphasize the need for careful annotation and representation of ArSL videos to ensure accurate recognition and understanding. Overall, the work contributes to bridging the gap in accessible religious content for the deaf community by developing a multimodal framework and a comprehensive ArSL dataset.
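The two similarity measures named above have simple definitions, sketched here in generic form (the feature vectors would in practice come from a backbone such as MobileNetV2, and gloss tokens from the ELAN annotations; these helpers are illustrative, not the paper's evaluation code):

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity between two feature vectors, e.g. frame
    embeddings extracted by a CNN backbone."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def jaccard_similarity(glosses_a, glosses_b):
    """Jaccard similarity between two gloss annotations, treated as
    sets of gloss tokens: |A ∩ B| / |A ∪ B|."""
    a, b = set(glosses_a), set(glosses_b)
    return len(a & b) / len(a | b)
```

Jaccard over gloss sets is what yields a single percentage like the 85% reported for the two signers' videos.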
Title: Establishing a multimodal dataset for Arabic Sign Language (ArSL) production
Journal of King Saud University-Computer and Information Sciences, 36(8), Article 102165.
Pub Date: 2024-08-30 | DOI: 10.1016/j.jksuci.2024.102178
Aytuğ Onan , Hesham A. Alhumyani
In the age of information overload, the ability to distill essential content from extensive texts is invaluable. DeepExtract introduces an advanced framework for extractive summarization, utilizing the groundbreaking capabilities of GPT-4 along with innovative hierarchical positional encoding to redefine information extraction. This manuscript details the development of DeepExtract, which integrates semantic-driven techniques to analyze and summarize complex documents effectively. The framework is structured around a novel hierarchical tree construction that categorizes sentences and sections not just by their physical placement within a text, but by their contextual and thematic significance, leveraging dynamic embeddings generated by GPT-4. We introduce a multi-faceted scoring system that evaluates sentences based on coherence, relevance, and novelty, ensuring that summaries are not only concise but rich with essential content. Further, DeepExtract employs optimized semantic clustering to group thematic elements, which enhances the representativeness of the summaries. This paper demonstrates through comprehensive evaluations that DeepExtract significantly outperforms existing extractive summarization models in terms of accuracy and efficiency, making it a potent tool for academic, professional, and general use. We conclude with a discussion on the practical applications of DeepExtract in various domains, highlighting its adaptability and potential in navigating the vast expanses of digital text.
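The coherence/relevance/novelty scoring described above is in the family of maximal marginal relevance (MMR) selection. A minimal sketch follows, with bag-of-words vectors standing in for the GPT-4 embeddings DeepExtract actually uses; the function names and the trade-off weight `lam` are assumptions:

```python
import math
from collections import Counter

def _bow(text):
    return Counter(text.lower().split())

def _cos(c1, c2):
    dot = sum(c1[w] * c2[w] for w in c1)
    n1 = math.sqrt(sum(v * v for v in c1.values()))
    n2 = math.sqrt(sum(v * v for v in c2.values()))
    return dot / (n1 * n2) if n1 and n2 else 0.0

def mmr_summarize(sentences, k=2, lam=0.5):
    """Greedy relevance-plus-novelty sentence selection (MMR).

    Each round picks the sentence most similar to the whole document
    (relevance) and least similar to sentences already chosen (novelty),
    so summaries stay concise without repeating themselves.
    """
    doc = _bow(" ".join(sentences))
    vecs = [_bow(s) for s in sentences]
    chosen = []
    while len(chosen) < min(k, len(sentences)):
        best, best_score = None, float("-inf")
        for i, v in enumerate(vecs):
            if i in chosen:
                continue
            relevance = _cos(v, doc)
            redundancy = max((_cos(v, vecs[j]) for j in chosen), default=0.0)
            score = lam * relevance - (1 - lam) * redundancy
            if score > best_score:
                best, best_score = i, score
        chosen.append(best)
    return [sentences[i] for i in sorted(chosen)]
```

Swapping the bag-of-words vectors for contextual embeddings, and adding a coherence term over the hierarchical tree, gives the flavour of the full framework.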
Title: DeepExtract: Semantic-driven extractive text summarization framework using LLMs and hierarchical positional encoding
Journal of King Saud University-Computer and Information Sciences, 36(8), Article 102178.
Pub Date : 2024-08-30 | DOI: 10.1016/j.jksuci.2024.102179
Jian Ge , Qin Qin , Shaojing Song , Jinhua Jiang , Zhiwei Shen
In industrial detection scenarios, achieving high accuracy typically relies on extensive labeled datasets, which are costly and time-consuming to compile. This has motivated a shift towards semi-supervised learning (SSL), which leverages labeled and unlabeled data to improve learning efficiency and reduce annotation costs. This work proposes the unsupervised spectral clustering labeling (USCL) method to optimize SSL for industrial challenges like defect variability, rarity, and complex distributions. Integral to USCL, we employ the multi-task fusion self-supervised learning (MTSL) method to extract robust feature representations through multiple self-supervised tasks. Additionally, we introduce the Enhanced Spectral Clustering (ESC) method and a dynamic selecting function (DSF). ESC effectively integrates both local and global similarity matrices, improving clustering accuracy. The DSF selects the most valuable instances for labeling, significantly enhancing the representativeness and diversity of the labeled data. USCL consistently improves various SSL methods compared to traditional instance selection methods. For example, it boosts Efficient Teacher by 5%, 6.6%, and 7.8% in mean Average Precision (mAP) on the Automotive Sealing Rings Defect Dataset, the Metallic Surface Defect Dataset, and the Printed Circuit Boards (PCB) Defect Dataset with 10% labeled data. Our work sets a new benchmark for SSL in industrial settings.
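The integration of local and global similarity matrices that ESC performs can be illustrated with a minimal spectral-embedding sketch. The kNN masking, the RBF kernel, and the `alpha`/`gamma` knobs here are assumptions for illustration, not the paper's formulation:

```python
import numpy as np

def blended_similarity(X, k=2, alpha=0.5, gamma=1.0):
    """Blend a local (kNN-masked) and a global (full RBF) similarity matrix.
    alpha weights the two views; gamma is the RBF bandwidth. Both are
    illustrative knobs, not values from the paper."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    S_global = np.exp(-gamma * d2)
    # Local view: keep only each point's k nearest neighbours, symmetrised.
    idx = np.argsort(d2, axis=1)[:, 1:k + 1]  # column 0 is the point itself
    mask = np.zeros_like(S_global, dtype=bool)
    rows = np.repeat(np.arange(len(X)), k)
    mask[rows, idx.ravel()] = True
    mask |= mask.T
    S_local = np.where(mask, S_global, 0.0)
    return alpha * S_local + (1 - alpha) * S_global

def spectral_embed(S, n_clusters):
    """Embed points as rows of the n_clusters smallest eigenvectors of the
    normalised graph Laplacian; clustering these rows gives the partition."""
    d = S.sum(1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(np.maximum(d, 1e-12)))
    L = np.eye(len(S)) - D_inv_sqrt @ S @ D_inv_sqrt
    _, vecs = np.linalg.eigh(L)
    return vecs[:, :n_clusters]
```

A DSF-like step would then pick, say, the point nearest each cluster centroid as the instance to label; the paper's actual selection criterion is richer than this sketch.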
Unsupervised selective labeling for semi-supervised industrial defect detection. Journal of King Saud University-Computer and Information Sciences, 36(8), Article 102179. DOI: 10.1016/j.jksuci.2024.102179
Pub Date : 2024-08-29 | DOI: 10.1016/j.jksuci.2024.102180
Riya Kalra , Tinku Singh , Suryanshi Mishra , Satakshi , Naveen Kumar , Taehong Kim , Manish Kumar
The stock market’s volatility, noise, and information overload necessitate efficient prediction methods. Forecasting index prices in this environment is complex due to the non-linear and non-stationary nature of time series data generated from the stock market. Machine learning and deep learning have emerged as powerful tools for identifying financial data patterns and generating predictions based on historical trends. However, updating these models in real-time is crucial for accurate predictions. Deep learning models require extensive computational resources and careful hyperparameter optimization, while incremental learning models struggle to balance stability and adaptability. This paper proposes a novel hybrid bidirectional-LSTM (H.BLSTM) model that combines incremental learning and deep learning techniques for real-time index price prediction, addressing these scalability and memory challenges. The method utilizes both univariate time series derived from historical index prices and multivariate time series incorporating technical indicators. Implementation within a real-time trading system demonstrates the method’s effectiveness in achieving more accurate price forecasts for major stock indices globally through extensive experimentation. The proposed model achieved an average mean absolute percentage error of 0.001 across nine stock indices, significantly outperforming traditional models. It has an average forecasting delay of 2 s, making it suitable for real-time trading applications.
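The interplay of incremental updates and real-time forecasting described above can be sketched with a toy online model. This linear autoregressive stand-in, its window size, and its learning rate are illustrative assumptions only; the paper's H.BLSTM replaces the linear predictor with a bidirectional LSTM:

```python
from collections import deque

class IncrementalForecaster:
    """Walk-forward sketch: predict the next (normalised) index value from a
    sliding window, then update the weights online by one SGD step when the
    actual value arrives. Hyperparameters are illustrative, not tuned."""
    def __init__(self, window=5, lr=0.01):
        self.window = window
        self.lr = lr
        self.w = [0.0] * window
        self.b = 0.0
        self.hist = deque(maxlen=window)

    def predict(self):
        if len(self.hist) < self.window:
            return None  # not enough history yet
        return sum(w * x for w, x in zip(self.w, self.hist)) + self.b

    def update(self, actual):
        pred = self.predict()
        if pred is not None:
            # One SGD step on squared error: the incremental-learning part.
            err = pred - actual
            for i, x in enumerate(self.hist):
                self.w[i] -= self.lr * err * x
            self.b -= self.lr * err
        self.hist.append(actual)

def mape(actual, forecast):
    """Mean absolute percentage error, skipping warm-up steps with no forecast."""
    pairs = [(a, f) for a, f in zip(actual, forecast) if f is not None]
    return sum(abs((a - f) / a) for a, f in pairs) / len(pairs)
```

In a live loop the model would predict before each tick and update after it, which is what keeps the forecasting delay small: only one gradient step per new observation, rather than a full retrain.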
An efficient hybrid approach for forecasting real-time stock market indices. Journal of King Saud University-Computer and Information Sciences, 36(8), Article 102180. DOI: 10.1016/j.jksuci.2024.102180