
Latest publications from the Journal of King Saud University-Computer and Information Sciences

Online label aggregation with incomplete crowd responses.
IF 6.1 CAS Tier 2 (Computer Science) Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2026-01-01 Epub Date: 2026-01-20 DOI: 10.1007/s44443-025-00381-z
Yuyang Liu, Haoyu Liu, Runze Wu, Chengliang Chai, Minmin Lin, Renyu Zhu, Hui Liu, Tangjie Lv, Changjie Fan

Crowdsourcing delivers responses that are asynchronous and incomplete, making offline aggregators that assume complete response sets impractical. Prior online methods often either require per-step completeness or repeatedly reload historical responses, which is storage- and privacy-unfriendly and susceptible to forgetting. We present OLA-Incomplete, an online label-aggregation framework designed for incomplete response streams. It integrates a variational-inference aggregator with a generative replay module that preserves historical information without reloading prior responses and explicitly models unknown worker reliability. At each update step, the generator replays cumulative responses and side information for previously observed instances to mitigate catastrophic forgetting, while the aggregator infers current truths by maximizing the evidence lower bound over a mixture of replayed and newly received labels. Across three public datasets (Duck, RTE, and PostSent), OLA-Incomplete attains final accuracies of 90.74%, 92.50%, and 95.99%, respectively, delivering at least 7.79% relative improvement over the strongest baseline. The approach further exhibits strong instantaneous online accuracy and robustness across response-chunk sizes and arrival orders, underscoring its practical utility for real-world crowdsourcing workflows.
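The weighted-vote core of online label aggregation can be sketched as follows. This is a hypothetical simplification, not the paper's OLA-Incomplete: there is no variational inference, ELBO, or generative replay here, and the `OnlineAggregator` class, its learning rate, and its reliability update rule are all illustrative assumptions.

```python
from collections import defaultdict

class OnlineAggregator:
    """Toy online aggregator for incomplete crowd responses.

    Worker reliability starts at 0.5 and is nudged toward agreement
    with the current weighted-majority estimate; votes can arrive in
    any order and no per-item completeness is required.
    """
    def __init__(self, lr=0.1):
        self.reliability = defaultdict(lambda: 0.5)            # per-worker weight
        self.votes = defaultdict(lambda: defaultdict(float))   # item -> label -> score
        self.lr = lr

    def observe(self, item, worker, label):
        # Accumulate a reliability-weighted vote, then update the
        # worker's reliability toward (dis)agreement with the estimate.
        self.votes[item][label] += self.reliability[worker]
        agree = 1.0 if self.estimate(item) == label else 0.0
        r = self.reliability[worker]
        self.reliability[worker] = r + self.lr * (agree - r)

    def estimate(self, item):
        scores = self.votes[item]
        return max(scores, key=scores.get)

agg = OnlineAggregator()
for item, worker, label in [("q1", "w1", "A"), ("q1", "w2", "A"),
                            ("q1", "w3", "B"), ("q2", "w1", "B"),
                            ("q2", "w3", "B")]:
    agg.observe(item, worker, label)
print(agg.estimate("q1"))  # "A"
```

Because the update is a convex combination of the old reliability and the agreement indicator, weights stay in [0, 1] without any projection step.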

Citations: 0
Low-light image enhancement: A comprehensive review on methods, datasets and evaluation metrics
IF 5.2 CAS Tier 2 (Computer Science) Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2024-12-01 Epub Date: 2024-11-05 DOI: 10.1016/j.jksuci.2024.102234
Zhan Jingchun, Goh Eg Su, Mohd Shahrizal Sunar
Enhancing low-light images is a significant challenge in computer vision that requires innovative methods to improve robustness. Low-light image enhancement (LLIE) improves the quality of images affected by poor lighting conditions by implementing various loss functions such as reconstruction, perceptual, smoothness, adversarial, and exposure losses. This review analyses and compares different methods, ranging from traditional to cutting-edge deep learning approaches, showcasing the significant advancements in the field. Although similar reviews of LLIE exist, this paper not only updates the knowledge but also examines recent deep learning methods from various perspectives and interpretations. The methodology compares different methods from the literature and identifies potential research gaps. This paper highlights recent advancements in the field by classifying methods into three classes, demonstrating the continuous enhancements in LLIE. These improved methods use different loss functions and show higher efficacy on metrics such as Peak Signal-to-Noise Ratio, Structural Similarity Index Measure, and the Naturalness Image Quality Evaluator. The research emphasizes the significance of advanced deep learning techniques and comprehensively compares different LLIE methods on various benchmark image datasets. This research provides a foundation for identifying potential future research directions.
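As one concrete metric from the list above, Peak Signal-to-Noise Ratio is computed directly from the mean squared error between a reference and an enhanced image. A minimal sketch over flat pixel lists (real evaluations operate on image arrays via libraries such as scikit-image):

```python
import math

def psnr(ref, test, max_val=255.0):
    """Peak Signal-to-Noise Ratio between two equal-size images,
    given as flat lists of pixel values. Higher is better;
    identical images give infinity."""
    mse = sum((a - b) ** 2 for a, b in zip(ref, test)) / len(ref)
    if mse == 0:
        return float("inf")
    return 10.0 * math.log10(max_val ** 2 / mse)

# Four-pixel toy example: small per-pixel errors, MSE = 12.25
print(round(psnr([0, 128, 255, 64], [2, 126, 250, 60]), 2))  # 37.25
```

SSIM and NIQE are structurally more involved (local windows, natural-scene statistics) and are best taken from a library rather than reimplemented.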
Citations: 0
Enhancing foreign exchange reserve security for central banks using Blockchain, FHE, and AWS
IF 5.2 CAS Tier 2 (Computer Science) Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2024-12-01 Epub Date: 2024-11-20 DOI: 10.1016/j.jksuci.2024.102251
Khandakar Md Shafin, Saha Reno
Central banks are vital to the management of a nation's foreign exchange reserves, maintaining the value of the national currency and controlling foreign debt. These reserves, however, are vulnerable to a variety of hazards, including money laundering, fraud, theft, and cyberattacks, issues that traditional financial systems frequently face because of their vulnerabilities and inefficiencies. Using modern innovations in a blockchain-based solution can help tackle these serious issues. To protect data privacy, the Microsoft SEAL library is utilized for fully homomorphic encryption (FHE). For the development of smart contracts, Solidity is employed within the Ethereum blockchain ecosystem. Additionally, Amazon Web Services (AWS) is leveraged to provide a scalable and powerful infrastructure to support our solution. To guarantee safe and effective transaction validation, our method incorporates a hybrid consensus process that combines Proof of Authority (PoA) with Byzantine Fault Tolerance (BFT). This all-inclusive approach makes the administration of foreign exchange reserves by central banks more secure, transparent, and operationally efficient.
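The "compute on encrypted data" idea behind FHE can be illustrated with textbook RSA, which happens to be homomorphic under multiplication. This toy (tiny primes, no padding) is not the lattice-based scheme that Microsoft SEAL implements and is emphatically not secure; it only sketches the principle of operating on ciphertexts without decrypting them.

```python
# Textbook RSA is multiplicatively homomorphic: Enc(a) * Enc(b) mod n
# decrypts to a * b. Real FHE systems (e.g. the Microsoft SEAL library)
# use lattice-based schemes supporting both addition and multiplication.
p, q = 61, 53
n, phi = p * q, (p - 1) * (q - 1)
e = 17
d = pow(e, -1, phi)  # modular inverse (Python 3.8+)

enc = lambda m: pow(m, e, n)
dec = lambda c: pow(c, d, n)

a, b = 7, 6
c = (enc(a) * enc(b)) % n  # multiply ciphertexts only; plaintexts never meet
print(dec(c))  # 42
```

The holder of the private key recovers the product of two values that were only ever combined in encrypted form, which is the property that lets an untrusted party aggregate encrypted reserve figures.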
Citations: 0
Unmanned combat aerial vehicle path planning in complex environment using multi-strategy sparrow search algorithm with double-layer coding
IF 5.2 CAS Tier 2 (Computer Science) Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2024-12-01 Epub Date: 2024-11-23 DOI: 10.1016/j.jksuci.2024.102255
Liangdong Qu, Jingkun Fan
Unmanned combat aerial vehicle (UCAV) path planning in complex environments demands a substantial number of path points to determine feasible paths. Establishing an effective flight path for UCAVs requires numerous path points to account for fuel constraints, artillery threats, and radar avoidance. This increase in path points raises the dimensionality of the problem, which in turn degrades algorithm performance. To mitigate this issue, a double-layer coding (DLC) model is utilized to remove redundant path points, consequently lowering computational complexity and operational difficulty. Meanwhile, this paper introduces a novel multi-strategy enhanced sparrow search algorithm (MESSA) for UCAV path planning. MESSA incorporates a novel dynamic fitness regulation learning strategy (DFRL), a random differential learning strategy (RDL), an elite example equilibrium learning strategy (EEEL), a dynamic elimination and regeneration strategy based on the elite example (DERE), and quadratic interpolation (QI). Furthermore, MESSA is compared against 11 state-of-the-art algorithms, demonstrating exceptional optimization performance and robustness. Additionally, the combination of MESSA with the DLC model (DLC-MESSA) is applied to solve the UCAV path planning problem. The experimental results from five complex environments indicate that DLC-MESSA outperforms other algorithms in 80% of the cases by achieving the lowest average cost, thereby demonstrating its superior robustness and computational efficiency.
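The idea of removing redundant path points can be sketched very simply: a waypoint that lies on the straight segment between its neighbors adds dimensionality without changing the path. The following is a hypothetical, much simpler stand-in for the paper's double-layer coding model, using a 2-D collinearity test; the function name and tolerance are illustrative assumptions.

```python
def prune_collinear(path, eps=1e-9):
    """Drop waypoints that lie on the straight segment between their
    immediate neighbors in the original path, shrinking the number of
    decision variables the planner must optimize."""
    if len(path) <= 2:
        return list(path)
    kept = [path[0]]
    for prev, cur, nxt in zip(path, path[1:], path[2:]):
        # 2-D cross product of (cur - prev) and (nxt - prev);
        # zero means the three points are collinear.
        cross = ((cur[0] - prev[0]) * (nxt[1] - prev[1])
                 - (cur[1] - prev[1]) * (nxt[0] - prev[0]))
        if abs(cross) > eps:
            kept.append(cur)
    kept.append(path[-1])
    return kept

path = [(0, 0), (1, 1), (2, 2), (3, 2), (4, 2), (5, 0)]
print(prune_collinear(path))  # [(0, 0), (2, 2), (4, 2), (5, 0)]
```

Six waypoints collapse to four here; in a metaheuristic like MESSA that directly cuts the search-space dimension per candidate path.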
Citations: 0
Improving clustering-based and adaptive position-aware interpolation oversampling for imbalanced data classification
IF 5.2 CAS Tier 2 (Computer Science) Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2024-12-01 Epub Date: 2024-12-11 DOI: 10.1016/j.jksuci.2024.102253
Yujiang Wang, Marshima Mohd Rosli, Norzilah Musa, Lei Wang
Class imbalance is one of the most significant difficulties in modern machine learning, owing to the inherent bias of standard classifiers toward favoring majority instances while often ignoring minority instances. Interpolation-based oversampling techniques are among the most popular solutions for generating synthetic minority samples to correct imbalanced class distributions. However, synthetic minority samples risk overlapping with majority-class samples, and inappropriate interpolation of minority samples during oversampling can result in overgeneralization. To overcome these drawbacks, we propose a Clustering-based and Adaptive Position-aware Interpolation Oversampling algorithm (CAPAIO) for imbalanced binary dataset classification. CAPAIO first employs an improved density-based clustering algorithm to group minority instances into inland, borderline, and trapped samples. It then adaptively determines the size of each subcluster and allocates weights to minority samples, guiding the synthesis of minority samples based on these weights. Finally, distinct interpolation oversampling algorithms are individually applied to these three categories of minority samples. The experimental results demonstrate the effectiveness of the proposed CAPAIO on most datasets compared with eleven other oversampling algorithms.
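The interpolation step at the heart of such oversamplers can be sketched as below. This is the core move of SMOTE-style oversampling, not CAPAIO itself: the clustering into inland/borderline/trapped samples and the per-sample weighting that CAPAIO adds on top are omitted, and the function name and seed are illustrative assumptions.

```python
import random

def interpolate_oversample(minority, n_new, seed=0):
    """Generate synthetic minority samples by linear interpolation
    between random pairs of existing minority samples. Each synthetic
    point lies on the segment between two real minority points, so it
    stays inside the minority region's convex hull."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        a, b = rng.sample(minority, 2)
        t = rng.random()  # position along the segment a -> b
        synthetic.append(tuple(ai + t * (bi - ai) for ai, bi in zip(a, b)))
    return synthetic

minority = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.5)]
new_pts = interpolate_oversample(minority, 4)
print(len(new_pts))  # 4
```

The overlap problem the abstract mentions is visible here: if a majority point sits between two minority points, naive interpolation can place a synthetic sample on top of it, which is what position-aware variants try to avoid.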
Citations: 0
On the robustness of arabic aspect-based sentiment analysis: A comprehensive exploration of transformer-based models
IF 5.2 CAS Tier 2 (Computer Science) Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2024-12-01 Epub Date: 2024-12-11 DOI: 10.1016/j.jksuci.2024.102264
Alanod AlMasaud, Heyam H. Al-Baity
In the era of rapid technological advancement, users generate an overwhelming volume of data on social media networks and e-commerce platforms daily. This data, rich in opinions, sentiments, values, and habits, holds immense value for both consumers and businesses. Leveraging this unstructured data manually is error-prone and time-consuming. The field of Sentiment Analysis automates the process of analyzing human opinions from this data. Sentiment Analysis classifies text into positive, negative, or neutral sentiments. However, it confines text classification to a single sentiment polarity, providing a broad overview without accounting for specific aspects. With the growing demand for data analysis, this standard sentiment polarity classification is no longer sufficient. Aspect-Based Sentiment Analysis has emerged to dig deeper into the text, uncovering perspectives and points of view; it can identify multiple aspects in text with corresponding sentiment polarity. Interest in this field has therefore increased, and many recent research efforts have been devoted to tackling this problem for the English language. Unfortunately, Arabic research in this field is scarce. This study addresses that deficiency by investigating the potential of four transformer models, namely AraBERT v2.0, ArBERT, MARBERT, and Multilingual BERT, in enhancing the accuracy of Aspect-Based Sentiment Analysis for Arabic texts using two dedicated corpora (AraMA and AraMAMS). Extensive experiments revealed that the proposed approach achieved its expected effect, surpassing the results of previous studies in the field. The best results for the Aspect Category Detection and Aspect Sentiment Classification tasks on the AraMA corpus were obtained using AraBERT v2.0, with F1-Measure results of 95.75% and 92.83%, respectively. On the AraMAMS corpus, the best results for the same tasks were likewise achieved with AraBERT v2.0, with F1-Measure results of 95.54% and 89.52%, respectively.
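The F1-Measure reported above is the harmonic mean of precision and recall for a class; a minimal per-class implementation (real evaluations would use a library such as scikit-learn, and the toy labels below are illustrative, not from the corpora):

```python
def f1_score(y_true, y_pred, positive):
    """Binary F1 for one class: harmonic mean of precision and recall,
    computed from true positives, false positives, and false negatives."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

y_true = ["pos", "pos", "neg", "pos", "neg"]
y_pred = ["pos", "neg", "neg", "pos", "pos"]
print(round(f1_score(y_true, y_pred, "pos"), 3))  # 0.667
```

Unlike accuracy, F1 is insensitive to true negatives, which is why it is the standard choice when the aspect classes are imbalanced.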
Citations: 0
Improving cache-enabled D2D communications using actor–critic networks over licensed and unlicensed spectrum
IF 5.2 CAS Tier 2 (Computer Science) Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2024-12-01 Epub Date: 2024-11-19 DOI: 10.1016/j.jksuci.2024.102249
Muhammad Sheraz, Teong Chee Chuah, Kashif Sultan, Manzoor Ahmed, It Ee Lee, Saw Chin Tan
Cache-enabled Device-to-Device (D2D) communications is an effective way to improve data sharing. User Equipment (UE)-level caching holds the potential to reduce the data traffic burden on the core network. Licensed spectrum is utilized for D2D communications, but due to spectrum scarcity, exploring unlicensed spectrum is essential to enhance network capacity. In this paper, we propose caching at the UE level and exploit both licensed and unlicensed spectrum to optimize throughput. First, we propose a reinforcement learning-based data caching scheme leveraging an actor–critic network to improve cache-enabled D2D communications. In addition, licensed and unlicensed spectrum access is devised for D2D communications considering interference from existing cellular and Wi-Fi users. A duty cycle-based unlicensed spectrum access algorithm is employed, guaranteeing the Signal-to-Interference and Noise Ratio (SINR) required by the users. Because the unlicensed spectrum is prone to data packet collisions, the Request-to-Send/Clear-to-Send (RTS/CTS) mechanism is utilized in conjunction with Carrier Sense Multiple Access with Collision Avoidance (CSMA/CA) to alleviate both the interference and packet collision problems of the unlicensed spectrum. Extensive simulations are performed to analyze the performance gain of our proposed scheme compared to the benchmarks under different network scenarios. The obtained results demonstrate that our proposed scheme possesses the potential to optimize network performance.
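The duty-cycle idea can be sketched as a slot schedule: in each period, D2D users transmit on the unlicensed band only for a fixed fraction of slots and stay silent for the rest, leaving airtime to Wi-Fi. This toy schedule is an illustrative assumption, not the paper's SINR-aware algorithm, and the function name and parameters are hypothetical.

```python
def duty_cycle_schedule(slots, duty_ratio, period):
    """Return a boolean schedule over `slots` time slots: in each
    `period`, the first duty_ratio fraction of slots is D2D airtime
    (True) and the remainder is ceded to incumbent Wi-Fi users (False)."""
    on_slots = int(duty_ratio * period)
    return [(t % period) < on_slots for t in range(slots)]

# 40% duty cycle over 5-slot periods: D2D gets 2 of every 5 slots
sched = duty_cycle_schedule(slots=10, duty_ratio=0.4, period=5)
print(sched)
```

Mechanisms such as RTS/CTS and CSMA/CA then operate inside the "on" slots to resolve contention among the D2D transmitters themselves.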
Citations: 0
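The duty-cycle access rule described in the abstract above can be illustrated with a minimal sketch: a device transmits on unlicensed spectrum only during the ON portion of a fixed duty cycle, and only when its SINR meets the required threshold. The function names, the slot/period convention, and the ON-fraction parameter are all illustrative assumptions, not the paper's actual algorithm or parameters.

```python
import math

def sinr_db(signal_mw, interference_mw, noise_mw):
    """Signal-to-Interference-plus-Noise Ratio in dB."""
    return 10 * math.log10(signal_mw / (interference_mw + noise_mw))

def duty_cycle_grant(sinr, threshold_db, on_fraction, slot_index, period=10):
    """Grant unlicensed-spectrum access only during the ON portion of the
    duty cycle, and only if the user's required SINR is satisfied."""
    in_on_phase = (slot_index % period) < on_fraction * period
    return in_on_phase and sinr >= threshold_db
```

In a fuller model, collisions during the ON phase would additionally be mediated by CSMA/CA with RTS/CTS, as the abstract describes.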
Optimizing resource allocation for enhanced urban connectivity in LEO-UAV-RIS networks
IF 5.2 | CAS Tier 2 (Computer Science) | JCR Q1 (Computer Science, Information Systems) | Pub Date: 2024-12-01 | Epub: 2024-11-15 | DOI: 10.1016/j.jksuci.2024.102238
Abdulbasit A. Darem , Tareq M. Alkhaldi , Asma A. Alhashmi , Wahida Mansouri , Abed Saif Ahmed Alghawli , Tawfik Al-Hadhrami
Sixth-generation (6G) communication advancements target massive connectivity, ultra-reliable low-latency communication (URLLC), and high data rates, essential for IoT applications. Yet, in natural disasters, particularly in dense urban areas, 6G quality of service (QoS) can falter when terrestrial networks, such as base stations, become unavailable, unstable, or strained by high user density and dynamic environments. Additionally, high-rise buildings in smart cities contribute to signal blockages. To ensure reliable, high-quality connectivity, integrating low-Earth Orbit (LEO) satellites, unmanned aerial vehicles (UAVs), and reconfigurable intelligent surfaces (RIS) into a multilayer (ML) network offers a solution: LEO satellites provide broad coverage, UAVs reduce congestion with flexible positioning, and RIS enhances signal quality. Despite these benefits, this integration brings challenges in resource allocation, requiring path-loss models that account for both line-of-sight (LOS) and non-line-of-sight (NLOS) links. To address these, a joint optimization problem is formulated focusing on resource distribution fairness. Given its complexity, a framework is proposed to decouple the problem into subproblems using the block coordinate descent (BCD) method. These subproblems include UAV placement optimization, user association, subcarrier allocation via orthogonal frequency division multiple access (OFDMA), power allocation, and RIS phase shift control. OFDMA efficiently manages shared resources and mitigates interference. This iterative approach optimizes each subproblem, ensuring convergence to a locally optimal solution. Additionally, we propose a low-complexity solution for RIS phase shift control, proving its feasibility and efficiency mathematically. The numerical results demonstrate that the proposed scheme achieves up to 43.5% higher sum rates and 80% lower outage probabilities compared to schemes without RIS. The low-complexity solution for RIS optimization achieves a sum rate within 1.8% of the SDP approach. This model significantly improves network performance and reliability, achieving a 16.3% higher sum rate and a 44.4% reduction in outage probability compared to joint optimization of SAT-UAV resources. These findings highlight the robustness and efficiency of the ML network model, making it ideal for next-generation communication systems in high-density urban environments.
Journal of King Saud University-Computer and Information Sciences, vol. 36, no. 10, Article 102238 (December 2024).
Citations: 0
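The block coordinate descent (BCD) decomposition used in the abstract above can be sketched on a toy problem: fix all blocks but one, solve that block exactly, and cycle until the iterates stop changing. The coupled quadratic below is purely illustrative and is not the paper's LEO-UAV-RIS objective.

```python
def bcd_toy(iters=100):
    """Block coordinate descent on f(x, y) = (x-1)^2 + (y-2)^2 + 0.5*x*y.
    Each block update is the exact minimizer with the other block fixed:
        df/dx = 2(x-1) + 0.5y = 0  ->  x = 1 - 0.25*y
        df/dy = 2(y-2) + 0.5x = 0  ->  y = 2 - 0.25*x
    The fixed point is (x, y) = (8/15, 28/15)."""
    x, y = 0.0, 0.0
    for _ in range(iters):
        x = 1 - 0.25 * y  # argmin over x with y fixed
        y = 2 - 0.25 * x  # argmin over y with x fixed
    return x, y
```

In the paper's setting, each "block" is a subproblem (UAV placement, user association, OFDMA subcarrier allocation, power allocation, RIS phase shifts); convergence to a locally optimal point follows because the objective is non-increasing at every block update.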
Picking point identification and localization method based on Swin-Transformer for high-quality tea
IF 5.2 | CAS Tier 2 (Computer Science) | JCR Q1 (Computer Science, Information Systems) | Pub Date: 2024-12-01 | Epub: 2024-12-02 | DOI: 10.1016/j.jksuci.2024.102262
Zhiyao Pan, Jinan Gu, Wenbo Wang, Xinling Fang, Zilin Xia, Qihang Wang, Mengni Wang
In natural scenes, the high degree of similarity between the background and the tea buds, together with the varied growth postures of the buds, makes finding and precisely identifying the picking point challenging. To solve these issues, this paper proposes a precise way to find the best picking point for tea buds by combining traditional algorithms with Swin-Transformer-based target detection and semantic segmentation algorithms, namely SORC-SFT. First, an improved target detection algorithm, Swin-Oriented R-CNN (SORC), is used to recognize four types of high-quality tea. The mean Average Precision (mAP) over the four categories reached 82.3% after replacing the feature fusion network FPN with PAFPN and adding the Coordinate Attention (CA) mechanism. Second, the corresponding segmentation masks of the four recognized categories are obtained by adding Semask, a Feature Alignment Module (FAM), and a Feature Selection Module (FSM) to the improved semantic segmentation algorithm Semask-Fa-Transformer (SFT). The mean Intersection over Union (mIoU) of the semantic segmentation algorithm for the four categories is 89.83%, 91.97%, 88.85%, and 89.68%, respectively. Finally, the morphology of the different categories of tea buds is analyzed, and a traditional algorithm is used to accurately localize the identified buds. For the four tested categories, the proportion of correctly located picking points is 96.18%, 91.28%, 93.85%, and 90.58%, respectively. The experimental results show that, among all the compared algorithms, the proposed picking point identification and localization approach performs best and will contribute strongly to the accurate identification of tea leaves during intelligent picking.
Journal of King Saud University-Computer and Information Sciences, vol. 36, no. 10, Article 102262 (December 2024).
Citations: 0
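The mIoU figures quoted above average per-class intersection-over-union. A minimal sketch of the metric on flat label sequences follows; the function name and the decision to skip classes absent from both prediction and ground truth are choices made here for illustration, not the paper's evaluation code.

```python
def mean_iou(pred, gt, num_classes):
    """Mean Intersection over Union, averaged over classes that appear
    in either the prediction or the ground truth."""
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, g in zip(pred, gt) if p == c and g == c)
        union = sum(1 for p, g in zip(pred, gt) if p == c or g == c)
        if union > 0:
            ious.append(inter / union)
    return sum(ious) / len(ious)
```

For pixel-level segmentation, `pred` and `gt` would be the flattened label maps; the per-category figures in the abstract correspond to the individual class IoUs before averaging.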
Semi-supervised learning for skeleton behavior recognition: A multi-dimensional graph comparison approach
IF 5.2 | CAS Tier 2 (Computer Science) | JCR Q1 (Computer Science, Information Systems) | Pub Date: 2024-12-01 | Epub: 2024-12-07 | DOI: 10.1016/j.jksuci.2024.102266
Qiang Zhao , Moyan Zhang , Hongjuan Li , Baozhen Song , Yujun Li
Skeleton-based action recognition, as a crucial research direction in computer vision, confronts numerous issues and challenges. Most existing research methods rely heavily on extensive labeled data for training, which significantly constrains their training effectiveness and generalization capability when labeled data is scarce. Consequently, how to integrate labeled and unlabeled data to overcome the limitations imposed by label scarcity has emerged as a pivotal research focus in skeleton-based action recognition. Targeting this label scarcity problem, this paper introduces a semi-supervised skeleton-based action recognition approach leveraging multi-dimensional feature-based graph contrastive learning. First, three feature extractors are devised to thoroughly extract and exploit the informative cues available in limited data. The holistic feature extractor comprises five spatio-temporal graph convolutional blocks and a global average pooling layer. The detailed feature extractor is constructed by stacking the same spatio-temporal graph convolutional blocks, while the relational feature extractor primarily integrates stacked attention graph convolutional blocks and a global average pooling layer. Second, the sample relationship construction mechanism in graph contrastive learning is enhanced. A clustering process is employed to form soft positive/negative sample pairs based on sample similarity, and a sample connectivity matrix further weights the distances between these pairs, thereby enhancing classification accuracy. Furthermore, a novel loss function grounded in the information bottleneck theory is formulated to guide the model towards learning more robust and efficient skeleton action representations. Experimental evaluations demonstrate the superiority of our proposed method (MDKS) on two datasets, NTU60 and NW-UCLA. Specifically, on the NTU60 dataset, MDKS achieves classification accuracy improvements of 4.7% and 1.9% under the X-sub and X-view evaluation protocols, respectively, compared to the benchmark MAC-Learning method. On the NW-UCLA dataset, MDKS outperforms MAC-Learning by 1.4%, 1.2%, 1.9%, and 1.4% in classification accuracy under labeled data ratios ranging from 5% to 40%. This work offers novel insights and methodologies for advancing skeleton-based action recognition. Future research will delve into label imbalance, label noise, multi-modal information fusion, and cross-scene generalization capabilities.
Journal of King Saud University-Computer and Information Sciences, vol. 36, no. 10, Article 102266 (December 2024).
Citations: 0
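The positive/negative pair mechanism described above follows the standard contrastive template. A minimal InfoNCE-style loss for a single anchor is sketched below; the temperature value, function names, and the use of plain cosine similarity are illustrative assumptions, and this is not MDKS's connectivity-weighted soft-pair loss.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def info_nce(anchor, positive, negatives, tau=0.5):
    """Contrastive loss for one anchor: make the anchor more similar to
    its positive than to any negative, with temperature tau."""
    pos = math.exp(cosine(anchor, positive) / tau)
    neg = sum(math.exp(cosine(anchor, n) / tau) for n in negatives)
    return -math.log(pos / (pos + neg))
```

MDKS additionally weights pair distances with a sample connectivity matrix; in this template that could plausibly be folded in as per-pair multipliers on the exponent terms, though the paper's exact weighting is not reproduced here.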