Pub Date: 2024-11-13 | eCollection Date: 2024-01-01 | DOI: 10.7717/peerj-cs.2417
Ming Xu, Jinwei Cui, Xiaoyu Ma, Zhiyi Zou, Zhisheng Xin, Muhammad Bilal
Graphic design, as a product of the burgeoning new media era, has seen its users' requirements for images continuously evolve. However, external factors such as light and noise often cause graphic design images to become distorted during acquisition. To enhance the definition of these images, this paper introduces a novel image enhancement model based on visual features. Initially, a histogram equalization (HE) algorithm is applied to enhance the graphic design images. Subsequently, image feature extraction is performed using a dual-flow network comprising convolutional neural network (CNN) and Transformer architectures. The CNN employs a residual dense block (RDB) to embed spatial local structure information with varying receptive fields. An improved attention mechanism module, attention feature fusion (AFF), is then introduced to integrate the image features extracted from the dual-flow network. Finally, through image perception quality guided adversarial learning, the model adjusts the initial enhanced image's color and recovers more details. Experimental results demonstrate that the proposed model achieves enhancement effects exceeding 90% on two large image datasets, a 5%-10% improvement over other models. Furthermore, the algorithm exhibits superior performance on the peak signal-to-noise ratio (PSNR) and structural similarity index measure (SSIM) image quality evaluation metrics. Our findings indicate that the fusion model significantly enhances image quality, thereby advancing the field of graphic design and showcasing its potential in cultural and creative product design.
{"title":"Image enhancement with art design: a visual feature approach with a CNN-transformer fusion model.","authors":"Ming Xu, Jinwei Cui, Xiaoyu Ma, Zhiyi Zou, Zhisheng Xin, Muhammad Bilal","doi":"10.7717/peerj-cs.2417","DOIUrl":"10.7717/peerj-cs.2417","url":null,"abstract":"<p><p>Graphic design, as a product of the burgeoning new media era, has seen its users' requirements for images continuously evolve. However, external factors such as light and noise often cause graphic design images to become distorted during acquisition. To enhance the definition of these images, this paper introduces a novel image enhancement model based on visual features. Initially, a histogram equalization (HE) algorithm is applied to enhance the graphic design images. Subsequently, image feature extraction is performed using a dual-flow network comprising convolutional neural network (CNN) and Transformer architectures. The CNN employs a residual dense block (RDB) to embed spatial local structure information with varying receptive fields. An improved attention mechanism module, attention feature fusion (AFF), is then introduced to integrate the image features extracted from the dual-flow network. Finally, through image perception quality guided adversarial learning, the model adjusts the initial enhanced image's color and recovers more details. Experimental results demonstrate that the proposed algorithm model achieves enhancement effects exceeding 90% on two large image datasets, which represents a 5%-10% improvement over other models. Furthermore, the algorithm exhibits superior performance in terms of peak signal-to-noise ratio (PSNR) and structural similarity index measure (SSIM) image quality evaluation metrics. Our findings indicate that the fusion model significantly enhances image quality, thereby advancing the field of graphic design and showcasing its potential in cultural and creative product design.</p>","PeriodicalId":54224,"journal":{"name":"PeerJ Computer Science","volume":"10 ","pages":"e2417"},"PeriodicalIF":3.5,"publicationDate":"2024-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11623052/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142803313","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-11-13 | eCollection Date: 2024-01-01 | DOI: 10.7717/peerj-cs.2479
Xiuzhe Wang
Anomalies are the existential abnormalities in data, the identification of which is known as anomaly detection. The absence of timely detection of anomalies may affect the key processes of decision-making, fraud detection, and automated classification. Most existing anomaly detection models rely on traditional tokenization and are computationally costly, particularly when outliers must be extracted from a large script. This research work proposes an unsupervised, all-MiniLM-L6-v2-based system for the detection of outliers. The method makes use of centroid embeddings to extract outliers in high-variety, large-volume data. To avoid mistakenly treating novelty as an outlier, a Minimum Covariance Determinant (MCD) based approach is followed to assess the novelty of the input script. The proposed method is implemented in a Python project, App. for Anomalies Detection (AAD). The system is evaluated on two unrelated datasets: the 20 newsgroups text dataset and the SMS spam collection dataset. The robust accuracy (94%) and F1 score (0.95) revealed that the proposed method can effectively trace anomalies in a comparatively large script. The process is applicable to extracting meaning from textual data, particularly in the domains of human resource management and security.
{"title":"EAD: effortless anomalies detection, a deep learning based approach for detecting outliers in English textual data.","authors":"Xiuzhe Wang","doi":"10.7717/peerj-cs.2479","DOIUrl":"10.7717/peerj-cs.2479","url":null,"abstract":"<p><p>Anomalies are the existential abnormalities in data, the identification of which is known as anomaly detection. The absence of timely detection of anomalies may affect the key processes of decision-making, fraud detection, and automated classification. Most of the existing models of anomaly detection utilize the traditional way of tokenizing and are computationally costlier, mainly if the outliers are to be extracted from a large script. This research work intends to propose an unsupervised, all-MiniLM-L6-v2-based system for the detection of outliers. The method makes use of centroid embeddings to extract outliers in high-variety, large-volume data. To avoid mistakenly treating novelty as an outlier, the Minimum Covariance Determinant (MCD) based approach is followed to count the novelty of the input script. The proposed method is implemented in a Python project, App. for Anomalies Detection (AAD). The system is evaluated by two non-related datasets-the 20 newsgroups text dataset and the SMS spam collection dataset. The robust accuracy (94%) and F1 score (0.95) revealed that the proposed method could effectively trace anomalies in a comparatively large script. The process is applicable in extracting meanings from textual data, particularly in the domains of human resource management and security.</p>","PeriodicalId":54224,"journal":{"name":"PeerJ Computer Science","volume":"10 ","pages":"e2479"},"PeriodicalIF":3.5,"publicationDate":"2024-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11623099/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142803340","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-11-13 | eCollection Date: 2024-01-01 | DOI: 10.7717/peerj-cs.2467
Yogesh N, Purohit Shrinivasacharya, Nagaraj Naik
Chronic kidney disease (CKD) involves numerous variables, but only a few significantly impact the classification task. The statistically equivalent signature (SES) method, inspired by constraint-based learning of Bayesian networks, is employed to identify essential features in CKD. Unlike conventional feature selection methods, which typically focus on a single set of features with the highest predictive potential, the SES method can identify multiple predictive feature subsets with similar performance. However, most feature selection (FS) classifiers perform suboptimally with strongly correlated data. The FS approach faces challenges in identifying crucial features and selecting the most effective classifier, particularly in high-dimensional data. This study proposes using the Least Absolute Shrinkage and Selection Operator (LASSO) in conjunction with the SES method for feature selection in CKD identification. Following this, an ensemble deep-learning model combining long short-term memory (LSTM) and gated recurrent unit (GRU) networks is proposed for CKD classification. The features selected by the hybrid feature selection method are fed into the ensemble deep-learning model. The model's performance is evaluated using accuracy, precision, recall, and F1 score metrics. The experimental results are compared with individual classifiers, including decision tree (DT), Random Forest (RF), logistic regression (LR), and support vector machine (SVM). The findings indicate a 2% improvement in classification accuracy when using the proposed hybrid feature selection method combined with the LSTM and GRU ensemble deep-learning model. Further analysis reveals that certain features, such as HEMO, POT, bacteria, and coronary artery disease, contribute minimally to the classification task. Future research could explore additional feature selection methods, including dynamic feature selection that adapts to evolving datasets and incorporates clinical knowledge to enhance CKD classification accuracy further.
{"title":"Novel statistically equivalent signature-based hybrid feature selection and ensemble deep learning LSTM and GRU for chronic kidney disease classification.","authors":"Yogesh N, Purohit Shrinivasacharya, Nagaraj Naik","doi":"10.7717/peerj-cs.2467","DOIUrl":"10.7717/peerj-cs.2467","url":null,"abstract":"<p><p>Chronic kidney disease (CKD) involves numerous variables, but only a few significantly impact the classification task. The statistically equivalent signature (SES) method, inspired by constraint-based learning of Bayesian networks, is employed to identify essential features in CKD. Unlike conventional feature selection methods, which typically focus on a single set of features with the highest predictive potential, the SES method can identify multiple predictive feature subsets with similar performance. However, most feature selection (FS) classifiers perform suboptimally with strongly correlated data. The FS approach faces challenges in identifying crucial features and selecting the most effective classifier, particularly in high-dimensional data. This study proposes using the Least Absolute Shrinkage and Selection Operator (LASSO) in conjunction with the SES method for feature selection in CKD identification. Following this, an ensemble deep-learning model combining long short-term memory (LSTM) and gated recurrent unit (GRU) networks is proposed for CKD classification. The features selected by the hybrid feature selection method are fed into the ensemble deep-learning model. The model's performance is evaluated using accuracy, precision, recall, and F1 score metrics. The experimental results are compared with individual classifiers, including decision tree (DT), Random Forest (RF), logistic regression (LR), and support vector machine (SVM). The findings indicate a 2% improvement in classification accuracy when using the proposed hybrid feature selection method combined with the LSTM and GRU ensemble deep-learning model. Further analysis reveals that certain features, such as HEMO, POT, bacteria, and coronary artery disease, contribute minimally to the classification task. Future research could explore additional feature selection methods, including dynamic feature selection that adapts to evolving datasets and incorporates clinical knowledge to enhance CKD classification accuracy further.</p>","PeriodicalId":54224,"journal":{"name":"PeerJ Computer Science","volume":"10 ","pages":"e2467"},"PeriodicalIF":3.5,"publicationDate":"2024-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11639220/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142830834","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-11-13 | eCollection Date: 2024-01-01 | DOI: 10.7717/peerj-cs.2406
Ahmet Topal, Burcu Tunga, Erfan Babaee Tirkolaee
Plant diseases threaten agricultural sustainability by reducing crop yields. Rapid and accurate disease identification is crucial for effective management. Recent advancements in artificial intelligence (AI) have facilitated the development of automated systems for disease detection. This study focuses on enhancing the classification of diseases and estimating their severity in coffee leaf images. To do so, we propose a novel approach as the preprocessing step for the classification in which enhanced multivariance product representation (EMPR) is used to decompose the considered image into components, a new image is constructed using some of those components, and the contrast of the new image is enhanced by applying high-dimensional model representation (HDMR) to highlight the diseased parts of the leaves. Popular convolutional neural network (CNN) architectures, including AlexNet, VGG16, and ResNet50, are evaluated. Results show that VGG16 achieves the highest classification accuracy of approximately 96%, while all models perform well in predicting disease severity levels, with accuracies exceeding 85%. Notably, the ResNet50 model achieves accuracy levels surpassing 90%. This research contributes to the advancement of automated crop health management systems.
{"title":"DeepEMPR: coffee leaf disease detection with deep learning and enhanced multivariance product representation.","authors":"Ahmet Topal, Burcu Tunga, Erfan Babaee Tirkolaee","doi":"10.7717/peerj-cs.2406","DOIUrl":"10.7717/peerj-cs.2406","url":null,"abstract":"<p><p>Plant diseases threaten agricultural sustainability by reducing crop yields. Rapid and accurate disease identification is crucial for effective management. Recent advancements in artificial intelligence (AI) have facilitated the development of automated systems for disease detection. This study focuses on enhancing the classification of diseases and estimating their severity in coffee leaf images. To do so, we propose a novel approach as the preprocessing step for the classification in which enhanced multivariance product representation (EMPR) is used to decompose the considered image into components, a new image is constructed using some of those components, and the contrast of the new image is enhanced by applying high-dimensional model representation (HDMR) to highlight the diseased parts of the leaves. Popular convolutional neural network (CNN) architectures, including AlexNet, VGG16, and ResNet50, are evaluated. Results show that VGG16 achieves the highest classification accuracy of approximately 96%, while all models perform well in predicting disease severity levels, with accuracies exceeding 85%. Notably, the ResNet50 model achieves accuracy levels surpassing 90%. This research contributes to the advancement of automated crop health management systems.</p>","PeriodicalId":54224,"journal":{"name":"PeerJ Computer Science","volume":"10 ","pages":"e2406"},"PeriodicalIF":3.5,"publicationDate":"2024-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11623072/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142803300","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-11-13 | eCollection Date: 2024-01-01 | DOI: 10.7717/peerj-cs.2422
Atif Mahmood, Zati Hakim Azizul, Mohammed Zakariah, Samir Brahim Belhaouari, Ayman Altameem, Roziana Ramli, Abdulaziz S Almazyad, Miss Laiha Mat Kiah, Saaidal Razalli Azzuhri
Federated learning (FL) is a popular method in which edge devices work together to train machine learning models. This study introduces an efficient network for analyzing healthcare records that uses VPN technology and applies a federated learning approach over a wireless backhaul network. The study compares the effectiveness of different wireless backhaul channels, including terahertz (THz), E/V band (mmWave), and microwave. We closely examined the proposed FL network, which uses VPN technology over a wireless backhaul network, compared it with the standard method, and found that using the FedAvg algorithm with THz communication gave the best accuracy. Convergence time improved substantially, from 55 seconds to 38 seconds, underscoring how a faster communication link improves FL network performance. Furthermore, a three-step plan was executed to boost security, adopting a multi-layered method to safeguard the FL network and its confidential data. The first step integrates a private network into the existing telecom infrastructure, establishing an initial layer of security. To enhance security further, licensed frequency channels are introduced, providing an extra layer of protection. The highest level of security is achieved by combining a private network with licensed frequency channels, complemented by an additional layer of VPN-based protection. This comprehensive strategy ensures the application of strong security protocols.
{"title":"Implementing federated learning over VPN-based wireless backhaul networks for healthcare systems.","authors":"Atif Mahmood, Zati Hakim Azizul, Mohammed Zakariah, Samir Brahim Belhaouari, Ayman Altameem, Roziana Ramli, Abdulaziz S Almazyad, Miss Laiha Mat Kiah, Saaidal Razalli Azzuhri","doi":"10.7717/peerj-cs.2422","DOIUrl":"10.7717/peerj-cs.2422","url":null,"abstract":"<p><p>Federated learning (FL) is a popular method where edge devices work together to train machine learning models. This study introduces an efficient network for analyzing healthcare records. It uses VPN technology and applies a federated learning approach over a wireless backhaul network. The study compares different wireless backhaul channels, including terahertz (THz), E/V band (mmWave), and microwave, for their effectiveness. We looked closely at a suggested FL network that uses VPN technology over awireless backhaul network. We compared it with the standard method and found that using the FedAvg algorithm with Terahertz (THz) for communication gave the best accuracy. The time it took to reach a conclusion improved a lot, going from 55 seconds to an impressive 38 seconds. This emphasizes how having a faster communication link makes FL networks work much better. Furthermore, a three-step plan was executed to boost security, adopting a multi-layered method to safeguard the FL network and its confidential data. The first step involves integrating a private network into the current telecom infrastructure, establishing an initial layer of security. To enhance security further, licensed frequency channels are introduced, providing an extra layer of protection. The highest level of security is achieved by combining a private network with licensed frequency channels, complemented by an additional layer of security through VPN-based measures. This comprehensive strategy ensures the application of strong security protocols.</p>","PeriodicalId":54224,"journal":{"name":"PeerJ Computer Science","volume":"10 ","pages":"e2422"},"PeriodicalIF":3.5,"publicationDate":"2024-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11622844/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142803316","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The rise of the Internet of Things (IoT) and Industry 2.0 has spurred a growing need for extensive data computing, and Spark has emerged as a promising Big Data platform owing to its distributed in-memory computing capabilities. However, practical heavy workloads often lead to memory bottlenecks in the Spark platform. This results in resilient distributed dataset (RDD) eviction and, in extreme cases, severe memory contention, causing a significant degradation in Spark computational efficiency. To tackle this issue, we propose an adaptive memory reservation (AMR) strategy in this article, specifically designed for heavy workloads in the Spark environment. Specifically, we model optimal task parallelism by minimizing the disparity between the number of tasks completed without blocking and the number completed in regular rounds. Optimal memory for task parallelism is determined to establish an efficient execution memory space for computational parallelism. Subsequently, through adaptive execution memory reservation and dynamic adjustments, such as compression or expansion based on task progress, the strategy ensures dynamic task parallelism in the Spark parallel computing process. Considering the cost of RDD cache location and real-time memory space usage, we select suitable storage locations for different RDD types to alleviate execution memory pressure. Finally, we conduct extensive laboratory experiments to validate the effectiveness of AMR. Results indicate that, compared to existing memory management solutions, AMR reduces execution time by approximately 46.8%.
{"title":"Adaptive memory reservation strategy for heavy workloads in the Spark environment.","authors":"Bohan Li, Xin He, Junyang Yu, Guanghui Wang, Yixin Song, Shunjie Pan, Hangyu Gu","doi":"10.7717/peerj-cs.2460","DOIUrl":"10.7717/peerj-cs.2460","url":null,"abstract":"<p><p>The rise of the Internet of Things (IoT) and Industry 2.0 has spurred a growing need for extensive data computing, and Spark emerged as a promising Big Data platform, attributed to its distributed in-memory computing capabilities. However, practical heavy workloads often lead to memory bottleneck issues in the Spark platform. This results in resilient distributed datasets (RDD) eviction and, in extreme cases, violent memory contentions, causing a significant degradation in Spark computational efficiency. To tackle this issue, we propose an adaptive memory reservation (AMR) strategy in this article, specifically designed for heavy workloads in the Spark environment. Specifically, we model optimal task parallelism by minimizing the disparity between the number of tasks completed without blocking and the number completed in regular rounds. Optimal memory for task parallelism is determined to establish an efficient execution memory space for computational parallelism. Subsequently, through adaptive execution memory reservation and dynamic adjustments, such as compression or expansion based on task progress, the strategy ensures dynamic task parallelism in the Spark parallel computing process. Considering the cost of RDD cache location and real-time memory space usage, we select suitable storage locations for different RDD types to alleviate execution memory pressure. Finally, we conduct extensive laboratory experiments to validate the effectiveness of AMR. Results indicate that, compared to existing memory management solutions, AMR reduces the execution time by approximately 46.8%.</p>","PeriodicalId":54224,"journal":{"name":"PeerJ Computer Science","volume":"10 ","pages":"e2460"},"PeriodicalIF":3.5,"publicationDate":"2024-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11639302/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142830724","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-11-13 | eCollection Date: 2024-01-01 | DOI: 10.7717/peerj-cs.2441
Halim Wildan Awalurahman, Indra Budi
Background: Multiple-choice questions (MCQs) are one of the most used assessment formats. However, creating MCQs is a challenging task, particularly when formulating the distractor. Numerous studies have proposed automatic distractor generation. However, there has been no literature review to summarize and present the current state of research in this field. This study aims to perform a systematic literature review to identify trends and the state of the art of automatic distractor generation studies.
Methodology: We conducted a systematic literature review following the Kitchenham framework. The relevant literature was retrieved from the ACM Digital Library, IEEE Xplore, ScienceDirect, and Scopus databases.
Results: A total of 60 relevant studies from 2009 to 2024 were identified and extracted to answer three research questions regarding the data sources, methods, types of questions, evaluation, languages, and domains used in the automatic distractor generation research. The results of the study indicated that automatic distractor generation has been growing with improvement and expansion in many aspects. Furthermore, trends and the state of the art in this topic were observed.
Conclusions: Nevertheless, we identified potential research gaps, including the need to explore further data sources, methods, languages, and domains. This study can serve as a reference for future studies proposing research within the field of automatic distractor generation.
{"title":"Automatic distractor generation in multiple-choice questions: a systematic literature review.","authors":"Halim Wildan Awalurahman, Indra Budi","doi":"10.7717/peerj-cs.2441","DOIUrl":"10.7717/peerj-cs.2441","url":null,"abstract":"<p><strong>Background: </strong>Multiple-choice questions (MCQs) are one of the most used assessment formats. However, creating MCQs is a challenging task, particularly when formulating the distractor. Numerous studies have proposed automatic distractor generation. However, there has been no literature review to summarize and present the current state of research in this field. This study aims to perform a systematic literature review to identify trends and the state of the art of automatic distractor generation studies.</p><p><strong>Methodology: </strong>We conducted a systematic literature following the Kitchenham framework. The relevant literature was retrieved from the ACM Digital Library, IEEE Xplore, Science Direct, and Scopus databases.</p><p><strong>Results: </strong>A total of 60 relevant studies from 2009 to 2024 were identified and extracted to answer three research questions regarding the data sources, methods, types of questions, evaluation, languages, and domains used in the automatic distractor generation research. The results of the study indicated that automatic distractor generation has been growing with improvement and expansion in many aspects. Furthermore, trends and the state of the art in this topic were observed.</p><p><strong>Conclusions: </strong>Nevertheless, we identified potential research gaps, including the need to explore further data sources, methods, languages, and domains. This study can serve as a reference for future studies proposing research within the field of automatic distractor generation.</p>","PeriodicalId":54224,"journal":{"name":"PeerJ Computer Science","volume":"10 ","pages":"e2441"},"PeriodicalIF":3.5,"publicationDate":"2024-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11623049/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142803358","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-11-13 | eCollection Date: 2024-01-01 | DOI: 10.7717/peerj-cs.2354
Yu Zhu, Shifan Xie
The creation of 3D animation increasingly prioritizes the enhancement of character effects, narrative depth, and audience engagement to address the growing demands for visual stimulation, cultural enrichment, and interactive experiences. The advancement of virtual reality (VR) animation is anticipated to require sustained collaboration among researchers, animation experts, and hardware developers over an extended period to achieve full maturity. This article explores the use of Virtual Reality Modeling Language (VRML) in generating 3D stereoscopic forms and environments, applying texture mapping, optimizing lighting effects, and establishing interactive user responses, thereby enriching the 3D animation experience. VRML's functionality is further expanded through the integration of script programs in languages such as Java, JavaScript, and VRML Script via the Script node. The implementation of fuzzy model recognition within 3D animation simulations enhances the identification of textual, musical, and linguistic elements, resulting in improved frame rates. This study also analyzes the real-time correlation between the number of polygons and frame rates in a virtual museum animation scene. The findings demonstrate that the frame rate of the 3D animation within this virtual setting consistently exceeds 40 frames per second, thereby ensuring robust real-time performance, preserving the quality of 3D models, and optimizing rendering speed and visual effects without affecting the system's responsiveness to additional functions.
{"title":"Simulation methods realized by virtual reality modeling language for 3D animation considering fuzzy model recognition.","authors":"Yu Zhu, Shifan Xie","doi":"10.7717/peerj-cs.2354","DOIUrl":"10.7717/peerj-cs.2354","url":null,"abstract":"<p><p>The creation of 3D animation increasingly prioritizes the enhancement of character effects, narrative depth, and audience engagement to address the growing demands for visual stimulation, cultural enrichment, and interactive experiences. The advancement of virtual reality (VR) animation is anticipated to require sustained collaboration among researchers, animation experts, and hardware developers over an extended period to achieve full maturity. This article explores the use of Virtual Reality Modeling Language (VRML) in generating 3D stereoscopic forms and environments, applying texture mapping, optimizing lighting effects, and establishing interactive user responses, thereby enriching the 3D animation experience. VRML's functionality is further expanded through the integration of script programs in languages such as Java, JavaScript, and VRML Script <i>via</i> the Script node. The implementation of fuzzy model recognition within 3D animation simulations enhances the identification of textual, musical, and linguistic elements, resulting in improved frame rates. This study also analyzes the real-time correlation between the number of polygons and frame rates in a virtual museum animation scene. The findings demonstrate that the frame rate of the 3D animation within this virtual setting consistently exceeds 40 frames per second, thereby ensuring robust real-time performance, preserving the quality of 3D models, and optimizing rendering speed and visual effects without affecting the system's responsiveness to additional functions.</p>","PeriodicalId":54224,"journal":{"name":"PeerJ Computer Science","volume":"10 ","pages":"e2354"},"PeriodicalIF":3.5,"publicationDate":"2024-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11623235/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142803333","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-11-13 | eCollection Date: 2024-01-01 | DOI: 10.7717/peerj-cs.2485
Chao Li, Rugui Liu, Zhe Quan, Pengpeng Hu, Jun Sun
In recent years, deep learning models have become the predominant methods for computer vision tasks, but the large computation and storage requirements of many models make them challenging to deploy on devices with limited resources. Knowledge distillation (KD) is a widely used approach for model compression. However, when applied to object detection problems, existing KD methods either directly apply the feature map or simply separate the foreground from the background using a binary mask, aligning attention between the teacher and student models. Unfortunately, these methods either completely overlook noise or fail to eliminate it thoroughly, resulting in unsatisfactory accuracy for student models. To address this issue, we propose a foreground separation distillation (FSD) method in this paper. The FSD method enables student models to distinguish between foreground and background using Gaussian heatmaps, reducing irrelevant information in the learning process. Additionally, FSD extracts the channel feature by converting the spatial feature maps into probabilistic forms to fully utilize the knowledge in each channel of a well-trained teacher. Experimental results demonstrate that the YOLOX detector enhanced with our distillation method achieved superior performance on both the fall detection and the VOC2007 datasets. For example, YOLOX with FSD achieved 73.1% mean average precision (mAP) on the Fall Detection dataset, which is 1.6% higher than the baseline. The code of FSD is accessible via https://doi.org/10.5281/zenodo.13829676.
{"title":"Foreground separation knowledge distillation for object detection.","authors":"Chao Li, Rugui Liu, Zhe Quan, Pengpeng Hu, Jun Sun","doi":"10.7717/peerj-cs.2485","DOIUrl":"10.7717/peerj-cs.2485","url":null,"abstract":"<p><p>In recent years, deep learning models have become predominant methods for computer vision tasks, but the large computation and storage requirements of many models make them challenging to deploy on devices with limited resources. Knowledge distillation (KD) is a widely used approach for model compression. However, when applied in the object detection problems, the existing KD methods either directly applies the feature map or simply separate the foreground from the background by using a binary mask, aligning the attention between the teacher and the student models. Unfortunately, these methods either completely overlook or fail to thoroughly eliminate noise, resulting in unsatisfactory model accuracy for student models. To address this issue, we propose a foreground separation distillation (FSD) method in this paper. The FSD method enables student models to distinguish between foreground and background using Gaussian heatmaps, reducing irrelevant information in the learning process. Additionally, FSD also extracts the channel feature by converting the spatial feature maps into probabilistic forms to fully utilize the knowledge in each channel of a well-trained teacher. Experimental results demonstrate that the YOLOX detector enhanced with our distillation method achieved superior performance on both the fall detection and the VOC2007 datasets. For example, YOLOX with FSD achieved 73.1% mean average precision (mAP) on the Fall Detection dataset, which is 1.6% higher than the baseline. The code of FSD is accessible via https://doi.org/10.5281/zenodo.13829676.</p>","PeriodicalId":54224,"journal":{"name":"PeerJ Computer Science","volume":"10 ","pages":"e2485"},"PeriodicalIF":3.5,"publicationDate":"2024-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11623026/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142803341","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-11-12 | eCollection Date: 2024-01-01 | DOI: 10.7717/peerj-cs.2469
Quan Cheng, Jingyi Cheng, Jian Chen, Shaojun Liu
In the context of high-quality economic development, technological innovation has emerged as a fundamental driver of socio-economic progress. The consequent proliferation of science and technology news, which acts as a vital medium for disseminating technological advancements and policy changes, has attracted considerable attention from technology management agencies and innovation organizations. Nevertheless, online science and technology news has historically exhibited characteristics such as limited scale, disorderliness, and multi-dimensionality, which makes deeper use of it highly inconvenient. While single-label classification techniques can effectively categorize textual information, they face challenges in science and technology news classification due to the lack of a hierarchical knowledge framework and an insufficient capacity to reveal knowledge-integration features. This study proposes a hierarchical multi-label classification model for science and technology news, enhanced by heterogeneous graph semantics. The model captures multi-dimensional themes and hierarchical structural features within science and technology news through a hierarchical transmission module. It integrates graph convolutional networks to extract node information and hierarchical relationships from heterogeneous graphs, while also incorporating prior knowledge from domain knowledge graphs to address data scarcity. This approach enhances the understanding and classification of the semantics of science and technology news. Experimental results demonstrate that the model achieves precision, recall, and F1 scores of 84.21%, 88.89%, and 86.49%, respectively, significantly surpassing baseline models. This research presents an innovative solution for hierarchical multi-label classification tasks, demonstrating significant application potential in addressing data scarcity and complex thematic classification challenges.
{"title":"Hierarchical multi-label classification model for science and technology news based on heterogeneous graph semantic enhancement.","authors":"Quan Cheng, Jingyi Cheng, Jian Chen, Shaojun Liu","doi":"10.7717/peerj-cs.2469","DOIUrl":"10.7717/peerj-cs.2469","url":null,"abstract":"<p><p>In the context of high-quality economic development, technological innovation has emerged as a fundamental driver of socio-economic progress. The consequent proliferation of science and technology news, which acts as a vital medium for disseminating technological advancements and policy changes, has attracted considerable attention from technology management agencies and innovation organizations. Nevertheless, online science and technology news has historically exhibited characteristics such as limited scale, disorderliness, and multi-dimensionality, which is extremely inconvenient for users of deep application. While single-label classification techniques can effectively categorize textual information, they face challenges in leading science and technology news classification due to a lack of a hierarchical knowledge framework and insufficient capacity to reveal knowledge integration features. This study proposes a hierarchical multi-label classification model for science and technology news, enhanced by heterogeneous graph semantics. The model captures multi-dimensional themes and hierarchical structural features within science and technology news through a hierarchical transmission module. It integrates graph convolutional networks to extract node information and hierarchical relationships from heterogeneous graphs, while also incorporating prior knowledge from domain knowledge graphs to address data scarcity. This approach enhances the understanding and classification capabilities of the semantics of science and technology news. Experimental results demonstrate that the model achieves precision, recall, and F1 scores of 84.21%, 88.89%, and 86.49%, respectively, significantly surpassing baseline models. This research presents an innovative solution for hierarchical multi-label classification tasks, demonstrating significant application potential in addressing data scarcity and complex thematic classification challenges.</p>","PeriodicalId":54224,"journal":{"name":"PeerJ Computer Science","volume":"10 ","pages":"e2469"},"PeriodicalIF":3.5,"publicationDate":"2024-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11623068/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142803355","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}