Intelligent Systems with Applications: Latest Articles

Knowledge graph learning algorithm based on deep convolutional networks
Pub Date : 2024-05-07 DOI: 10.1016/j.iswa.2024.200386
Yuzhong Zhou, Zhengping Lin, Jie Lin, Yuliang Yang, Jiahao Shi

Knowledge graphs (KGs) serve as invaluable tools for organizing and representing structural information, enabling powerful data analysis and retrieval. In this paper, we propose a novel knowledge graph learning algorithm based on deep convolutional neural networks (KGLA-DCNN) to enhance the classification accuracy of KG nodes. Leveraging the hierarchical and relational nature of KGs, our algorithm utilizes deep convolutional neural networks to capture intricate patterns and dependencies within the graph. We evaluate the effectiveness of KGLA-DCNN on two benchmark datasets, Cora and Citeseer, renowned for their challenging node classification tasks. Through extensive experiments, we demonstrate that our proposed algorithm significantly improves classification accuracy compared to state-of-the-art methods, showcasing its capability to leverage the rich structural information inherent in KGs. The results highlight the potential of deep convolutional neural networks in enhancing the learning and representation capabilities of knowledge graphs, paving the way for more accurate and efficient knowledge discovery in diverse domains.

Citations: 0
MLMSign: Multi-lingual multi-modal illumination-invariant sign language recognition
Pub Date : 2024-05-06 DOI: 10.1016/j.iswa.2024.200384
Arezoo Sadeghzadeh, A.F.M. Shahen Shah, Md Baharul Islam

Sign language (SL) serves as a visual communication tool of great significance for deaf people, helping them interact with others and facilitating their daily life. The wide variety of SLs and the lack of interpretation knowledge necessitate developing automated sign language recognition (SLR) systems to narrow the communication gap between the deaf and hearing communities. Despite their number, existing advanced static SLR systems are not practical enough for real-life scenarios when assessed simultaneously from different critical aspects: accuracy in dealing with high intra- and slight inter-class variations, robustness, computational complexity, and generalization ability. To this end, we propose a novel multi-lingual multi-modal SLR system, namely MLMSign, which combines the strengths of hand-crafted features and deep learning models to enhance the performance and the robustness of the system against illumination changes while minimizing computational cost. The RGB sign images and 2D visualizations of their hand-crafted features, i.e., Histogram of Oriented Gradients (HOG) features and the a* channel of the L*a*b* color space, are employed as three input modalities to train a novel Convolutional Neural Network (CNN). The number of layers, filters, kernel size, learning rate, and optimization technique are carefully selected through an extensive parametric study to minimize the computational cost without compromising accuracy. The system’s performance and robustness are significantly enhanced by jointly deploying the models of these three modalities through ensemble learning. The impact of each modality is optimized through an impact coefficient determined by grid search. In addition to the comprehensive quantitative assessment, the capabilities of our proposed model and the effectiveness of ensembling over three modalities are evaluated qualitatively using the Grad-CAM visualization model. 
Experimental results on the test data with additional illumination changes verify the high robustness of our system in dealing with overexposed and underexposed lighting conditions. Achieving a high accuracy (>99.33%) on six benchmark datasets (i.e., Massey, Static ASL, NUS II, TSL Fingerspelling, BdSL36v1, and PSL) demonstrates that our system notably outperforms the recent state-of-the-art approaches with a minimum number of parameters and high generalization ability over complex datasets. Its promising performance for four different sign languages makes it a feasible system for multi-lingual applications.
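
The weighting step described in this abstract, one impact coefficient per modality chosen by grid search, can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the grid, the scoring criterion, or the fusion rule, so the weighted average of class probabilities, the `steps` resolution, and the function names `ensemble_predict` and `grid_search_weights` are all hypothetical.

```python
from itertools import product


def ensemble_predict(probs_per_model, weights):
    """Fuse per-model class probabilities with a weighted average.

    probs_per_model[m][i][k] is the probability that model m assigns
    to class k for sample i. Returns one predicted class per sample.
    """
    n_samples = len(probs_per_model[0])
    n_classes = len(probs_per_model[0][0])
    predictions = []
    for i in range(n_samples):
        scores = [
            sum(w * probs[i][k] for w, probs in zip(weights, probs_per_model))
            for k in range(n_classes)
        ]
        predictions.append(max(range(n_classes), key=scores.__getitem__))
    return predictions


def grid_search_weights(probs_per_model, labels, steps=10):
    """Try every weight vector on a simplex grid (assumed search space).

    Weights are multiples of 1/steps that sum to 1; the vector with the
    highest validation accuracy is returned together with that accuracy.
    """
    best_weights, best_acc = None, -1.0
    n_models = len(probs_per_model)
    for combo in product(range(steps + 1), repeat=n_models):
        if sum(combo) != steps:
            continue
        weights = [c / steps for c in combo]
        preds = ensemble_predict(probs_per_model, weights)
        acc = sum(p == y for p, y in zip(preds, labels)) / len(labels)
        if acc > best_acc:
            best_weights, best_acc = weights, acc
    return best_weights, best_acc


# Toy validation set: three modalities, three samples, two classes.
# Each single model misclassifies exactly one sample.
rgb = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]]
hog = [[0.7, 0.3], [0.55, 0.45], [0.1, 0.9]]
a_channel = [[0.45, 0.55], [0.3, 0.7], [0.3, 0.7]]
labels = [0, 1, 1]

weights, accuracy = grid_search_weights([rgb, hog, a_channel], labels)
```

In this toy setting the fused prediction can be correct on all three samples even though each individual model is wrong on one, which is the kind of gain an ensemble over modalities is meant to capture.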

Citations: 0
A graph-based cardiac arrhythmia classification methodology using one-lead ECG recordings
Pub Date : 2024-05-05 DOI: 10.1016/j.iswa.2024.200385
Dorsa EPMoghaddam, Ananya Muguli, Mehdi Razavi, Behnaam Aazhang

In this study, we present a novel graph-based methodology for an accurate classification of cardiac arrhythmia diseases using a single-lead electrocardiogram (ECG). The proposed approach employs the visibility graph technique to generate graphs from time signals. Subsequently, informative features are extracted from each graph and then fed into classifiers to match the input ECG signal with the appropriate target arrhythmia class. The six target classes in this study are normal (N), left bundle branch block (LBBB), right bundle branch block (RBBB), premature ventricular contraction (PVC), atrial premature contraction (A), and fusion (F) beats. Three classification models were explored, including graph convolutional neural network (GCN), multi-layer perceptron (MLP), and random forest (RF). ECG recordings from the MIT-BIH arrhythmia database were utilized to train and evaluate these classifiers. The results indicate that the multi-layer perceptron model attains the highest performance, showcasing an average accuracy of 99.02%. Following closely, the random forest achieves a strong performance as well, with an accuracy of 98.94% while providing critical intuitions.
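
To make the visibility graph step concrete, here is a minimal sketch of the natural visibility graph construction, in which two samples are linked when the straight line between them passes above every sample in between. It assumes uniformly spaced samples and uses a simple quadratic-pair loop. The abstract does not say which visibility variant or which graph features the authors use, so the degree sequence below is only one example of a feature, and the function names are hypothetical.

```python
def visibility_graph(series):
    """Return the edge set of the natural visibility graph of a series.

    Nodes are sample indices. Samples a < b are connected when every
    intermediate sample c lies strictly below the line joining
    (a, series[a]) and (b, series[b]).
    """
    n = len(series)
    edges = set()
    for a in range(n):
        for b in range(a + 1, n):
            visible = all(
                series[c]
                < series[b] + (series[a] - series[b]) * (b - c) / (b - a)
                for c in range(a + 1, b)
            )
            if visible:
                edges.add((a, b))
    return edges


def degree_sequence(n_nodes, edges):
    """Count the edges incident to each node (one simple graph feature)."""
    degrees = [0] * n_nodes
    for a, b in edges:
        degrees[a] += 1
        degrees[b] += 1
    return degrees


# Toy signal standing in for a short ECG segment.
signal = [1, 3, 2, 4]
edges = visibility_graph(signal)
degrees = degree_sequence(len(signal), edges)
```

Adjacent samples are always connected, so the graph is connected by construction; peaks tend to become high-degree hubs, which is why features of the graph can reflect the morphology of a beat.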

Citations: 0
Automated pneumothorax segmentation and quantification algorithm based on deep learning
Pub Date : 2024-05-04 DOI: 10.1016/j.iswa.2024.200383
Wannipa Sae-Lim, Wiphada Wettayaprasit, Ruedeekorn Suwannanon, Siripong Cheewatanakornkul, Pattara Aiyarak

A collapsed lung, also known as a pneumothorax, is a medical condition characterized by the presence of air in the chest cavity between the lung and chest wall. A chest radiograph is commonly used to diagnose pneumothorax; however, manual segmentation of the pneumothorax region can be difficult to achieve due to its complicated appearance and the variable quality of the image. To address this, we introduce a two-phase deep learning framework designed to enhance the accuracy of lung and pneumothorax segmentation from chest radiographs. Initially, a U-Net model with a ResNet34 backbone, trained on the Shenzhen and Montgomery datasets, is utilized to achieve precise lung region segmentation. Subsequently, for pneumothorax segmentation, we propose the PTXSeg-Net—a convolutional neural network model trained on the SIIM-ACR pneumothorax dataset. The PTXSeg-Net is an enhancement of the U-Net architecture, modified to incorporate attention gates and residual blocks to refine learning capabilities, further strengthened by deep supervision, allowing for more nuanced gradient utilization across all network layers. We employ transfer learning by pre-training an autoencoder to extract robust chest X-ray representations. Data refinement techniques are applied to the SIIM-ACR dataset to further improve training outcomes. Our results indicate that PTXSeg-Net outperforms other models in pneumothorax segmentation, achieving the highest Dice score of 0.9124 and Jaccard index of 0.8894 on the refined dataset with autoencoder pre-training. Moreover, leveraging the predicted lung and pneumothorax segmentation masks from the two-phase framework, we propose a quantification algorithm for estimating the pneumothorax size ratio. Its validity has been confirmed through expert assessments by a radiologist and a surgeon on a test set comprising 495 images. 
The high acceptance rates, averaging 96.97%, demonstrate substantial agreement between the proposed method and expert clinical assessments. The implications of these results are significant for clinical practice, offering a deep learning technology for more accurate and efficient pneumothorax identification and quantification. This improvement facilitates the timely determination of required management and treatment strategies, potentially leading to enhancements in patient outcomes.
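
The two overlap metrics reported in this abstract, the Dice score and the Jaccard index, can be computed from a pair of binary masks as sketched below. The masks are plain nested lists of 0/1 for illustration. The `area_ratio` helper is a hypothetical pixel-count ratio added only to show how two predicted masks could be combined; the abstract does not specify the authors' quantification algorithm, and this is not a reconstruction of it.

```python
def dice_and_jaccard(pred, truth):
    """Compute the Dice score and Jaccard index of two binary masks.

    pred and truth are equally sized 2D lists containing 0 or 1.
    Dice = 2|A n B| / (|A| + |B|), Jaccard = |A n B| / |A u B|.
    """
    intersection = sum(
        p & t for pred_row, truth_row in zip(pred, truth)
        for p, t in zip(pred_row, truth_row)
    )
    pred_area = sum(map(sum, pred))
    truth_area = sum(map(sum, truth))
    union = pred_area + truth_area - intersection
    if union == 0:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0, 1.0
    return 2 * intersection / (pred_area + truth_area), intersection / union


def area_ratio(region_mask, reference_mask):
    """Hypothetical size ratio: region pixels divided by reference pixels."""
    reference_area = sum(map(sum, reference_mask))
    if reference_area == 0:
        raise ValueError("reference mask is empty")
    return sum(map(sum, region_mask)) / reference_area


predicted = [[1, 1, 0],
             [0, 1, 0]]
annotated = [[1, 0, 0],
             [0, 1, 1]]
dice, jaccard = dice_and_jaccard(predicted, annotated)
```

For a single pair of masks the two metrics are tied by Jaccard = Dice / (2 - Dice), so they rank one prediction identically; they can diverge once averaged over a dataset.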

Citations: 0
Cluster-based wireless sensor network framework for denial-of-service attack detection based on variable selection ensemble machine learning algorithms
Pub Date : 2024-04-30 DOI: 10.1016/j.iswa.2024.200381
Ayuba John, Ismail Fauzi Bin Isnin, Syed Hamid Hussain Madni, Muhammed Faheem

A Cluster-Based Wireless Sensor Network (CBWSN) is a system designed to remotely control and monitor specific events or phenomena in areas such as smart grids, intelligent healthcare, circular economies in smart cities, and underwater surveillance. The wide range of applications of this technology in almost every field of human activity exposes it to various security threats from cybercriminals. One pressing concern that requires immediate attention is the risk of security breaches, such as intrusions in wireless sensor network traffic. Poor detection of denial-of-service (DoS) attacks, such as Grayhole, Blackhole, Flooding, and Scheduling attacks, can deplete the energy of sensor nodes. This can cause certain sensor nodes to fail, leading to a degradation in network coverage or lifetime. In related work, detecting such attacks has come with significant computational complexity. As new threats arise, security attacks become more sophisticated, focusing on the target system's vulnerabilities. This paper proposes Cluster-Based Wireless Sensor Network and Variable Selection Ensemble Machine Learning Algorithms (CBWSN_VSEMLA), a security threat detection framework for DoS attack detection. The CBWSN model is designed using a Fuzzy C-Means (FCM) clustering technique, whereas VSEMLA is a detection system comprising Principal Component Analysis (PCA) for feature selection and various ensemble machine learning algorithms (Bagging, LogitBoost, and RandomForest) for the detection of grayhole, blackhole, flooding, and scheduling attacks. The experimental comparison of model performance and complexity for DoS attack evaluation on the WSN-DS dataset shows that the PCA_RandomForest IDS model performs best, with 99.999% accuracy, followed by the PCA_Bagging IDS model with 99.78% accuracy and the PCA_LogitBoost model with 98.88% accuracy. 
However, the PCA_RandomForest model has a high computational cost, taking 231.64 s to train, followed by the PCA_LogitBoost model, which takes 57.44 s, while the PCA_Bagging model is the best in terms of computational cost, taking only 0.91 s to train. Overall, the models surpassed all baseline models in detection accuracy on flooding, scheduling, grayhole, and blackhole attacks.

Citations: 0
Real-time cyber/physical interplay in scheduling for peak load optimisation in Cyber–Physical Energy Systems
Pub Date : 2024-04-30 DOI: 10.1016/j.iswa.2024.200380
Daniele De Martini, Guido Benetti, Tullio Facchinetti

This paper is about reducing the power consumption of Cyber–Physical Energy Systems (CPESs) composed of many loads by using a scheduling technique inspired by the real-time-computing domain; the electric-load coordination guarantees a more efficient operation of the entire system by avoiding unnecessary concurrent activation of loads and thus limiting the peak load. We represent the power loads themselves as “physical” components and the computing devices that coordinate them as “cyber” components and formally derive the relationship between the operations in cyber and physical domains as the interplay between the schedules enforced on each component: indeed, the schedule of the loads (generated by combining a two-dimensional bin-packing and an optimal multi-processor real-time scheduling algorithm) influences the timing of the processing tasks that are dedicated to the activation/deactivation of the loads themselves. We also consider non-schedulable loads by introducing a policy to cope with the presence of such loads. Numerical simulations and experiments confirm the good performance of the proposed peak load reduction method. The usage of real-time scheduling in this context provides inherent resource optimisation by limiting the number of loads that are active at the same time, thus directly reducing the overall peak load; moreover, thanks to the limited computational complexity of the algorithms, it scales to large systems, overcoming the scalability issues of common optimisation methods.
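
The core idea of limiting the peak by avoiding concurrent activations can be illustrated with a toy example. The sketch below is a simple greedy staggering heuristic over discrete time slots, not the bin-packing and multi-processor real-time scheduling combination described in the abstract; the load names, power values, slot model, and the functions `place` and `stagger` are all assumptions made for illustration.

```python
def place(profile, power, start, duration):
    """Return a copy of the power profile with one load switched on."""
    updated = profile[:]
    for t in range(start, start + duration):
        updated[t] += power
    return updated


def stagger(loads, horizon):
    """Greedily pick a start slot for each load to keep the peak low.

    loads is a list of (name, power, duration) tuples. Loads are placed
    in order of decreasing power; each one takes the earliest start slot
    that minimises the resulting peak of the aggregate profile.
    """
    profile = [0.0] * horizon
    starts = {}
    for name, power, duration in sorted(loads, key=lambda load: -load[1]):
        best_start = min(
            range(horizon - duration + 1),
            key=lambda s: max(place(profile, power, s, duration)),
        )
        profile = place(profile, power, best_start, duration)
        starts[name] = best_start
    return starts, profile


# Hypothetical loads: (name, power in kW, on-time in slots).
loads = [("heater", 2.0, 2), ("pump", 1.0, 2), ("fan", 1.0, 2)]

# Baseline: every load switches on at slot 0.
concurrent = [0.0] * 4
for _, power, duration in loads:
    concurrent = place(concurrent, power, 0, duration)

starts, staggered = stagger(loads, horizon=4)
peak_concurrent = max(concurrent)
peak_staggered = max(staggered)
```

With these numbers the concurrent baseline peaks at 4.0 kW, while the staggered schedule delivers the same energy with a flat 2.0 kW profile. A greedy rule like this gives no optimality or timing guarantee, which is what the real-time scheduling formulation in the paper is meant to provide.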

引用次数: 0
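The peak-limiting idea in the abstract above can be illustrated with a much simpler stand-in than the paper's method. The sketch below is an assumption-laden toy: it uses one-dimensional first-fit-decreasing packing on utilisation (not the two-dimensional bin packing the authors describe), and the `Load` fields, load names, and numbers are all hypothetical. Each bin is treated as one virtual processor that runs at most one load at a time, which is feasible under preemptive earliest-deadline-first scheduling when the bin's total utilisation does not exceed 1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Load:
    """Hypothetical periodic load: must be active on_time units in every period."""
    name: str
    power_kw: float
    on_time: float
    period: float

    @property
    def utilisation(self) -> float:
        return self.on_time / self.period


def pack_first_fit(loads):
    """First-fit decreasing on utilisation; every bin's total stays <= 1."""
    bins = []
    for load in sorted(loads, key=lambda item: item.utilisation, reverse=True):
        for members in bins:
            used = sum(m.utilisation for m in members)
            if used + load.utilisation <= 1.0 + 1e-9:
                members.append(load)
                break
        else:
            # No existing bin has room, so open a new one.
            bins.append([load])
    return bins


def peak_bound(bins):
    """At most one load per bin is active at once, so the peak is bounded
    by the sum of the largest power rating in each bin."""
    return sum(max(m.power_kw for m in members) for members in bins)


loads = [
    Load("heater", 2.0, 30, 60),
    Load("pump", 1.5, 20, 60),
    Load("fridge", 0.5, 15, 60),
    Load("boiler", 3.0, 40, 60),
]
bins = pack_first_fit(loads)
uncoordinated_peak = sum(load.power_kw for load in loads)  # all loads on at once
coordinated_peak = peak_bound(bins)
print(uncoordinated_peak, coordinated_peak)
```

With these made-up loads the packing needs two bins, so the bound on the coordinated peak is the sum of the two largest per-bin ratings rather than the sum of all four ratings.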
A study of the role of new feature fusion based on multimedia analysis on accounting investment decision methods in economic models
Pub Date : 2024-04-27 DOI: 10.1016/j.iswa.2024.200375
Ping Wang

Accounting investment analysis has been expanding steadily, and some findings suggest that investors' bounded reasoning and psychological sentiment can allow negative emotions to influence their financial decisions, leading to losses. The growth of e-commerce and multimedia has made digital multimedia fusion and display possible, and numerous terminals can now communicate with one another seamlessly and in real time. This paper proposes a Multimedia Analysis (MA) feature fusion method for understanding the psychological emotions associated with accounting investments in the online retail environment, which can then guide the development of an investment strategy that is both appropriate and successful for the target demographic. The primary goal of the study is to show how multimedia information retrieval tasks can benefit from combining text pre-filtering with image sorting. The data come from the China Stock Market and Accounting Research (CSMAR) Database. Information fusion technology is used to analyse the experimental outcomes, examine the emotional effects on financial investment clients, and assess the paper's intended study topic. In experiments, the method achieved an accuracy of 97%.
Intelligent Systems with Applications, vol. 22, Article 200375. PDF: https://www.sciencedirect.com/science/article/pii/S2667305324000498/pdfft?md5=067aab60bf9936afd3df2576617e266b&pid=1-s2.0-S2667305324000498-main.pdf
Citations: 0
Wind turbine fault detection and identification using a two-tier machine learning framework
Pub Date : 2024-04-26 DOI: 10.1016/j.iswa.2024.200372
Zaid Allal, Hassan N. Noura, Flavien Vernier, Ola Salman, Khaled Chahine

A proactive approach is essential to optimize wind turbine maintenance and minimize downtime. Applying advanced data analysis techniques to existing Supervisory Control and Data Acquisition (SCADA) system data yields valuable insights into wind turbine performance without incurring high costs. This allows for early fault detection and predictive maintenance, so that unscheduled or reactive maintenance is minimized and revenue loss is mitigated. In this study, data from a wind turbine SCADA system in the southeast of Ireland were collected, preprocessed, and analyzed using statistical and visualization techniques to uncover hidden patterns related to five fault types within the system. The paper introduces a conditional function designed to test two scenarios. The first scenario employs a two-tier approach in which fault detection is followed by fault identification: faulty samples are detected in the first tier and then passed to the second tier, which is trained to diagnose the specific fault type of each sample. The second scenario is a simpler solution, referred to as naive, which treats fault types and normal cases together in the same dataset and trains a single model to distinguish between normal samples and those related to specific fault types. Machine learning models, particularly robust classifiers, were tested in both scenarios. Thirteen classifiers were included, ranging from tree-based and traditional classifiers to neural networks and ensemble learners. Additionally, an averaging feature importance technique was employed as a starting point to select the features with the most impact on the model decisions. A comparison of the results reveals that the proposed two-tier approach is more accurate and less time-consuming, achieving 95% accuracy in separating faulty from normal samples and approximately 91% in diagnosing each fault type. Furthermore, ensemble learners, particularly bagging and stacking, demonstrated superior fault detection and identification performance. The performance of the classifiers was validated using t-SNE and explainable AI techniques, confirming that the impactful features align with the findings and that the proposed two-tier solution outperforms the naive solution. These results strongly indicate that the proposed solution is accurate, independent, and less complex than existing solutions.
Intelligent Systems with Applications, vol. 22, Article 200372. PDF: https://www.sciencedirect.com/science/article/pii/S2667305324000486/pdfft?md5=4606e6eb1accac3bd5df946c38764e84&pid=1-s2.0-S2667305324000486-main.pdf
Citations: 0
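The two-tier scheme described in the abstract above (detect faulty samples first, then identify the fault type only for those samples) can be sketched in a few lines. This is a minimal illustration, not the authors' pipeline: a nearest-centroid rule stands in for the thirteen classifiers they tested, and the two features, the fault names `gearbox` and `generator`, and all sample values are hypothetical.

```python
from collections import defaultdict
from math import dist

NORMAL = "normal"
FAULTY = "faulty"


class NearestCentroid:
    """Tiny stand-in classifier: predicts the label of the closest class mean."""

    def fit(self, rows, labels):
        groups = defaultdict(list)
        for row, label in zip(rows, labels):
            groups[label].append(row)
        self.centroids = {
            label: tuple(sum(col) / len(members) for col in zip(*members))
            for label, members in groups.items()
        }
        return self

    def predict(self, row):
        return min(self.centroids, key=lambda label: dist(row, self.centroids[label]))


def fit_two_tier(rows, labels):
    # Tier 1: binary detector trained on normal vs. faulty.
    binary = [NORMAL if label == NORMAL else FAULTY for label in labels]
    detector = NearestCentroid().fit(rows, binary)
    # Tier 2: fault identifier trained on the faulty samples only.
    faulty = [(row, label) for row, label in zip(rows, labels) if label != NORMAL]
    identifier = NearestCentroid().fit(
        [row for row, _ in faulty], [label for _, label in faulty]
    )
    return detector, identifier


def predict_two_tier(detector, identifier, row):
    # Only samples flagged as faulty reach the second tier.
    if detector.predict(row) == NORMAL:
        return NORMAL
    return identifier.predict(row)


# Hypothetical two-feature samples (for example, scaled temperature and vibration).
rows = [(0.0, 0.0), (0.2, 0.1), (-0.1, 0.2),
        (5.0, 5.0), (5.2, 4.8), (5.0, -5.0), (4.8, -5.2)]
labels = [NORMAL, NORMAL, NORMAL, "gearbox", "gearbox", "generator", "generator"]
detector, identifier = fit_two_tier(rows, labels)
print(predict_two_tier(detector, identifier, (4.9, 5.1)))
```

The naive alternative from the abstract would instead fit a single classifier on all labels at once; the two-tier split lets each tier be trained and tuned for its own, narrower task.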
Detection of Arabic offensive language in social media using machine learning models
Pub Date : 2024-04-26 DOI: 10.1016/j.iswa.2024.200376
Aya Mousa, Ismail Shahin, Ali Bou Nassif, Ashraf Elnagar

This research aims to detect different types of Arabic offensive language on Twitter. It uses a multiclass classification system in which each tweet is categorized into one or more offensive language types based on the words used. Five types are classified: bullying, insult, racism, obscene, and non-offensive. To classify the abusive language, this work presents a cascaded model consisting of Bidirectional Encoder Representations from Transformers (BERT) models (AraBERT, ArabicBERT, XLMRoBERTa, GigaBERT, MBERT, and QARiB), deep learning models (1D-CNN, BiLSTM), and a Radial Basis Function (RBF). In addition, various types of machine learning models are utilized. The dataset is collected from Twitter, and each class has the same number of tweets (a balanced dataset). Each tweet is assigned to one or more of the selected offensive language types to build multiclass and multilabel systems. In addition, a binary dataset is constructed by assigning the tweets to offensive or non-offensive classes. The highest results are obtained from the cascaded model that starts with ArabicBERT, followed by BiLSTM and RBF, with an accuracy, precision, recall, and F1-score of 98.4%, 98.2%, 92.8%, and 98.4%, respectively. Among the traditional classifiers, RBF records the highest results, with accuracy, precision, recall, and F1-score of 60% each, while KNN records the lowest, with 45%, 46%, 45%, and 43% for accuracy, precision, recall, and F1-score, respectively.
Intelligent Systems with Applications, vol. 22, Article 200376. PDF: https://www.sciencedirect.com/science/article/pii/S2667305324000516/pdfft?md5=f5155135e406793f134b79e0164c3049&pid=1-s2.0-S2667305324000516-main.pdf
Citations: 0
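The four metrics reported in the abstract above can be computed for a multiclass problem as sketched below. The abstract does not say how precision, recall, and F1 were averaged across classes, so macro-averaging here is an assumption, and the tiny label lists are invented purely for illustration (only the class names come from the abstract).

```python
def macro_scores(y_true, y_pred):
    """Accuracy plus macro-averaged precision, recall, and F1."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    precisions, recalls, f1s = [], [], []
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    count = len(labels)
    return {
        "accuracy": sum(t == p for t, p in pairs) / len(pairs),
        "precision": sum(precisions) / count,
        "recall": sum(recalls) / count,
        "f1": sum(f1s) / count,
    }


# Invented labels for six tweets, using three of the five classes.
y_true = ["insult", "insult", "racism", "racism", "non-offensive", "non-offensive"]
y_pred = ["insult", "racism", "racism", "racism", "non-offensive", "insult"]
scores = macro_scores(y_true, y_pred)
print(scores)
```

Because the macro F1 is an average of per-class F1 values, it is not simply the harmonic mean of the macro precision and macro recall, which is worth keeping in mind when comparing the four reported figures.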
A learning system-based soft multiple linear regression model
Pub Date : 2024-04-26 DOI: 10.1016/j.iswa.2024.200378
Gholamreza Hesamian, Faezeh Torkian, Arne Johannssen, Nataliya Chukhrova

Machine learning applied to regression models offers powerful mathematical tools for predicting responses based on one or more predictor variables. This paper extends multiple linear regression by implementing a learning system and incorporating both fuzzy predictors and fuzzy responses. To estimate the unknown parameters of this soft regression model, the approach minimizes the absolute distance between two lines under three constraints related to the absolute error distance between the observed data and their respective predicted lines. A thorough comparative analysis showcases the practical applicability and superiority of the proposed soft multiple linear regression model. The effectiveness of the model is demonstrated through simulation studies and real-life application examples.
Intelligent Systems with Applications, vol. 22, Article 200378. PDF: https://www.sciencedirect.com/science/article/pii/S266730532400053X/pdfft?md5=8eb7bc998874d64b464285b08379fed3&pid=1-s2.0-S266730532400053X-main.pdf
Citations: 0
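To make the phrase "fuzzy predictors and fuzzy responses" in the abstract above concrete, the sketch below evaluates a linear model on triangular fuzzy numbers and scores it with an absolute distance. Everything here is an assumption for illustration: the abstract does not specify the fuzzy-number family, the distance, or the three constraints, so the `Tri` type, the component-wise `abs_distance`, and the crisp coefficients are hypothetical choices, and no estimation procedure from the paper is reproduced.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tri:
    """Triangular fuzzy number with left end, mode, and right end."""
    left: float
    mode: float
    right: float

    def __add__(self, other):
        return Tri(self.left + other.left,
                   self.mode + other.mode,
                   self.right + other.right)

    def scale(self, k):
        # A negative factor swaps the two ends of the support.
        a, b = k * self.left, k * self.right
        return Tri(min(a, b), k * self.mode, max(a, b))


def abs_distance(a, b):
    """Assumed distance: mean absolute gap between matching components."""
    return (abs(a.left - b.left) + abs(a.mode - b.mode)
            + abs(a.right - b.right)) / 3


def predict(intercept, coefs, xs):
    """Linear model with crisp coefficients applied to fuzzy predictors."""
    total = Tri(intercept, intercept, intercept)
    for k, x in zip(coefs, xs):
        total = total + x.scale(k)
    return total


def mean_abs_error(intercept, coefs, X, Y):
    errors = [abs_distance(predict(intercept, coefs, xs), y)
              for xs, y in zip(X, Y)]
    return sum(errors) / len(errors)


# Two hypothetical observations with one fuzzy predictor each.
X = [[Tri(1, 2, 3)], [Tri(2, 3, 4)]]
Y = [Tri(3, 5, 7), Tri(5, 7, 9)]
print(mean_abs_error(1, [2], X, Y), mean_abs_error(0, [2], X, Y))
```

A fitting routine would search for the intercept and coefficients that minimise such an absolute-error objective; least-absolute-deviation criteria of this kind are generally less sensitive to outliers than squared-error ones.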