With the rapid development and widespread adoption of intelligent vehicles and the Internet of Vehicles (IoV), vehicle security has become a growing concern. Modern vehicles manage key components via electronic control units (ECUs) connected over the controller area network (CAN). CAN bus intrusion techniques are the primary means of compromising the IoV, posing a significant threat to the normal operation of critical vehicle systems, such as the power system. However, existing attack detection methods still fall short in feature extraction and in the diversity of attack types they can detect. To address these challenges, we propose an intrusion detection framework named basic ensemble and pioneer class decision (BEPCD). The framework first constructs a 15-dimensional feature model to hierarchically characterize CAN bus messages. BEPCD then combines multi-model ensemble learning with a pioneer class selector and a confidence-driven voting mechanism, enabling precise classification of both conventional and emerging attack patterns. Additionally, we analyze the importance of different data features across four machine learning algorithms. Experimental results on public datasets demonstrate that the proposed framework effectively detects intrusions on the in-vehicle CAN bus. Compared with other intrusion detection frameworks, ours improves the overall F1-score by 1% to 5%. Notably, it achieves an approximately 77.5% performance improvement in detecting replay attacks.
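The confidence-driven voting the abstract describes can be sketched roughly as follows. The function name, the pioneer-class shortcut, and the 0.9 threshold are illustrative assumptions, not the paper's actual implementation, which also involves a 15-dimensional feature model not reproduced here.

```python
import numpy as np

def confidence_vote(prob_matrices, pioneer_class=None, threshold=0.9):
    """Combine per-model class-probability outputs by confidence-weighted voting.

    prob_matrices: list of (n_samples, n_classes) arrays, one per base model.
    pioneer_class: optional class index; if any model predicts it with high
    confidence, that decision wins outright (a rough stand-in for the paper's
    pioneer class selector).
    """
    probs = np.stack(prob_matrices)              # (n_models, n_samples, n_classes)
    if pioneer_class is not None:
        confident = probs[:, :, pioneer_class] >= threshold
        pioneer_hit = confident.any(axis=0)      # (n_samples,)
    else:
        pioneer_hit = np.zeros(probs.shape[1], dtype=bool)
    # Weight each model's vote by its own confidence (its max class probability).
    weights = probs.max(axis=2, keepdims=True)   # (n_models, n_samples, 1)
    combined = (probs * weights).sum(axis=0)     # (n_samples, n_classes)
    preds = combined.argmax(axis=1)
    if pioneer_class is not None:
        preds[pioneer_hit] = pioneer_class
    return preds
```

In this sketch a rare attack class can be promoted to "pioneer" status so that one confident base model suffices to flag it, while ordinary classes fall back to weighted voting.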
BEPCD: an ensemble learning-based intrusion detection framework for in-vehicle CAN bus. Bocheng Xu, Fei Cao, Xilong Li, Song Tian, Wenbo Deng, Shudan Yue. PeerJ Computer Science 11:e3108 (2025-08-19). DOI: 10.7717/peerj-cs.3108. Open access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12453696/pdf/
Pub Date: 2025-08-18. eCollection Date: 2025-01-01. DOI: 10.7717/peerj-cs.3109
Hongxia Wang, Teng Lv
With the increasing prevalence and diversity of imaging devices, palmprint recognition has emerged as a technology well suited to the demands of the modern era. However, traditional manual methods are limited in how effectively they extract palmprint principal line features. To address this, we introduce a novel data augmentation method. First, the wide line extraction (WLE) filter is used to extract the prominent principal lines of palmprints by leveraging their direction and width characteristics. Then, a Gabor filter is applied to the WLE output to purify the features and remove fine lines, since fine lines introduce noise and redundancy that interfere with accurate extraction of the principal line features crucial for palmprint recognition. Evaluated across four common Vision Transformer (ViT) classification models, this data augmentation improves recognition rates on all databases to varying degrees, with a remarkable 32.9% increase on the high-resolution XINHUA database. Building on the successful removal of fine lines by WLE, we propose a new Layer Visual Transformer (LViT) design paradigm. Its input adopts distinct blocking strategies, carefully designed to partition the data to capture different levels of spatial and feature information, using larger blocks for global structure and smaller ones for local details. The outputs of these blocking strategies are combined by "sum fusion" and "maximum fusion", exploiting local and global features through complementary information to improve recognition performance and achieve state-of-the-art results on multiple databases. Moreover, LViT requires fewer training iterations thanks to the synergistic effects of the blocking strategies, which streamline the learning process. Finally, by simulating real-world noise conditions, we comprehensively evaluate LViT and find that, compared with traditional methods, our approach exhibits excellent noise-resistant generalization, maintaining stable performance across the PolyU II, IIT Delhi, XINHUA, and NTU-CP-V1 databases.
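The Gabor purification step described above can be illustrated with a minimal NumPy sketch: an oriented sinusoid under a Gaussian envelope responds strongly to lines of a matching orientation and width. The kernel parameters and the naive convolution are illustrative, not taken from the paper.

```python
import numpy as np

def gabor_kernel(ksize=9, sigma=2.0, theta=0.0, lambd=4.0, gamma=0.5, psi=0.0):
    """Real Gabor kernel: a sinusoid at orientation `theta` under a Gaussian
    envelope. Tuned here (illustratively) to respond to wide oriented ridges
    such as palm principal lines."""
    half = ksize // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    env = np.exp(-(xr ** 2 + gamma ** 2 * yr ** 2) / (2 * sigma ** 2))
    return env * np.cos(2 * np.pi * xr / lambd + psi)

def filter_image(img, kernel):
    """Naive 'same'-size 2D correlation (loop form, fine for small kernels)."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(img, ((ph, ph), (pw, pw)), mode="edge")
    out = np.empty_like(img, dtype=float)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = (padded[i:i + kh, j:j + kw] * kernel).sum()
    return out
```

In practice one would filter at several orientations and keep the maximum response, so that principal lines of any direction survive while fine, narrow lines are suppressed by the envelope width.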
Palmprint recognition based on principal line features. Hongxia Wang, Teng Lv. PeerJ Computer Science 11:e3109 (2025-08-18). DOI: 10.7717/peerj-cs.3109. Open access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12453761/pdf/
Pub Date: 2025-08-18. eCollection Date: 2025-01-01. DOI: 10.7717/peerj-cs.3083
Chengming Rao, Zunhao Hu, QiMing Zhao, Min Shan, Li Mao
One of the main challenges in visual object detection is the multi-scale issue, and many approaches have been proposed to tackle it. In this article, we propose a novel neck that performs effective fusion of multi-scale features for a single-stage object detector. This neck, named the deformable convolution and path aggregation network (DePAN), integrates a path aggregation network with a deformable convolution block added to the feature fusion branch to improve the flexibility of feature point sampling. The deformable convolution block is implemented by repeatedly stacking a deformable convolution cell. The DePAN neck can be plugged in and easily applied to various object detection models. We apply the proposed neck to the YOLOv6-N and YOLOv6-T baseline models and test the improved models on the COCO2017 and PASCAL VOC2012 datasets, as well as a medical image dataset. The experimental results verify its effectiveness and applicability to real-world object detection.
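The core idea behind deformable convolution, sampling each kernel tap at a learned fractional offset via bilinear interpolation, can be sketched for a single channel and a single output position. This is a toy illustration of the mechanism, not the DePAN implementation (which in practice would use an optimized op such as torchvision's deform_conv2d).

```python
import numpy as np

def bilinear_sample(feat, y, x):
    """Bilinearly sample a 2D feature map at fractional coordinates (y, x)."""
    h, w = feat.shape
    y = np.clip(y, 0, h - 1)
    x = np.clip(x, 0, w - 1)
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    dy, dx = y - y0, x - x0
    return (feat[y0, x0] * (1 - dy) * (1 - dx) + feat[y0, x1] * (1 - dy) * dx
            + feat[y1, x0] * dy * (1 - dx) + feat[y1, x1] * dy * dx)

def deform_conv_point(feat, weights, offsets, cy, cx):
    """One output value of a 3x3 deformable convolution: each tap of the
    regular grid is shifted by its learned (dy, dx) offset before sampling."""
    out = 0.0
    k = 0
    for i in (-1, 0, 1):
        for j in (-1, 0, 1):
            dy, dx = offsets[k]
            out += weights[i + 1, j + 1] * bilinear_sample(feat, cy + i + dy, cx + j + dx)
            k += 1
    return out
```

With all offsets at zero this reduces to an ordinary 3x3 convolution; nonzero offsets let the sampling grid bend toward informative feature points.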
A path aggregation network with deformable convolution for visual object detection. Chengming Rao, Zunhao Hu, QiMing Zhao, Min Shan, Li Mao. PeerJ Computer Science 11:e3083 (2025-08-18). DOI: 10.7717/peerj-cs.3083. Open access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12453868/pdf/
Pub Date: 2025-08-18. eCollection Date: 2025-01-01. DOI: 10.7717/peerj-cs.3121
Jing Wang, Muhammad Asif
The rapid advancement of artificial intelligence (AI) has catalyzed transformative changes in education, particularly in mobile and online learning environments. Because existing deep learning models struggle to efficiently integrate the complexity of remote education data and to optimize model performance, this article proposes an intelligent evaluation method for students' learning states based on multimodal data. First, the joint characteristics of the pre-class mental status survey and the health big data of teachers and students in online teaching constitute the input data. Then, a multilayer perceptron (MLP) intelligently identifies each student's status and classifies their enthusiasm for the class. Finally, particle swarm optimization (PSO) is used to tune the model and improve the overall recognition rate. Compared with traditional methods, the PSO-MLP model with combined multimodal data performs well, achieving an accuracy of 0.891. It provides an operational technical solution for the education system, lays a new AI foundation for personalized teaching and student health management by accurately assessing students' learning status, and helps improve the effectiveness and efficiency of remote education.
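The PSO stage can be illustrated with a minimal particle swarm optimizer. Here it minimizes a toy quadratic standing in for the MLP's validation loss; the swarm size, inertia, and acceleration coefficients are illustrative defaults, not the paper's settings.

```python
import numpy as np

def pso(objective, dim, n_particles=20, iters=100, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimal particle swarm optimizer: each velocity blends inertia, a pull
    toward the particle's personal best, and a pull toward the global best."""
    rng = np.random.default_rng(seed)
    pos = rng.uniform(-1, 1, (n_particles, dim))
    vel = np.zeros((n_particles, dim))
    pbest = pos.copy()
    pbest_val = np.array([objective(p) for p in pos])
    gbest = pbest[pbest_val.argmin()].copy()
    for _ in range(iters):
        r1 = rng.random((n_particles, dim))
        r2 = rng.random((n_particles, dim))
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = pos + vel
        vals = np.array([objective(p) for p in pos])
        improved = vals < pbest_val
        pbest[improved] = pos[improved]
        pbest_val[improved] = vals[improved]
        gbest = pbest[pbest_val.argmin()].copy()
    return gbest, pbest_val.min()
```

In a PSO-MLP setup, `objective` would evaluate an MLP (its weights or hyperparameters encoded as the particle vector) on held-out data and return the loss.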
Leveraging PSO-MLP for intelligent assessment of student learning in remote environments: a multimodal approach. Jing Wang, Muhammad Asif. PeerJ Computer Science 11:e3121 (2025-08-18). DOI: 10.7717/peerj-cs.3121. Open access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12453797/pdf/
Pub Date: 2025-08-15. eCollection Date: 2025-01-01. DOI: 10.7717/peerj-cs.3098
Rong Zhu, Yong Wang, Junliang Shang, Ling-Yun Dai, Feng Li
Microorganisms play an important role in many complex diseases, influencing their onset, progression, and potential treatment outcomes. Exploring the associations between microbes and human diseases can deepen our understanding of disease mechanisms and assist in improving diagnosis and therapy. However, traditional biological experiments used to uncover such relationships often demand substantial time and resources. In response to these limitations, computational methods have gained traction as more practical tools for predicting microbe-disease associations. Despite their growing use, many of these models still face challenges in terms of accuracy, stability, and adaptability to noisy or sparse data. To overcome these limitations, we propose a novel predictive framework, HyperGraph Neural Network with Transformer for Microbe-Disease Associations (HGNNTMDA), designed to infer potential associations between human microbes and diseases. The framework begins by integrating microbe-disease association data with similarity-based features to construct node representations. Two graph construction strategies are employed: a K-nearest neighbor (KNN)-based adjacency matrix to build a standard graph, and a K-means clustering approach that groups similar nodes into clusters, which serve as hyperedges defining the incidence matrix of a hypergraph. Separate hypergraph neural networks (HGNNs) are then applied to the microbe and disease graphs to extract structured node-level features. An attention mechanism (AM) is subsequently introduced to emphasize informative signals, followed by a Transformer module to capture contextual dependencies and enhance global feature representation. A fully connected layer then projects these features into a unified space, where association scores between microbes and diseases are computed. For model optimization, we propose a hybrid loss strategy combining contrastive loss and Huber loss. The contrastive loss aids in learning discriminative embeddings, while the Huber loss enhances robustness against outliers and improves predictive stability. The effectiveness of HGNNTMDA is validated on two benchmark datasets, HMDAD and Disbiome, using five-fold cross-validation (5CV). Our model achieves an AUC of 0.9976 on HMDAD and 0.9423 on Disbiome, outperforming six existing state-of-the-art methods. Further case studies confirm its practical value in discovering novel microbe-disease associations.
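The two graph-construction strategies can be sketched in NumPy as follows: a symmetric KNN adjacency matrix for the standard graph, and a k-means clustering whose clusters become the hyperedges of a hypergraph incidence matrix. Parameter values and the from-scratch k-means are illustrative; this is not the paper's code.

```python
import numpy as np

def knn_adjacency(X, k=2):
    """Symmetric KNN adjacency matrix from pairwise Euclidean distances."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)            # a node is not its own neighbour
    A = np.zeros_like(d)
    idx = np.argsort(d, axis=1)[:, :k]
    for i, nbrs in enumerate(idx):
        A[i, nbrs] = 1
    return np.maximum(A, A.T)              # symmetrize

def kmeans_incidence(X, n_clusters=2, iters=20, seed=0):
    """Hypergraph incidence matrix H (nodes x hyperedges): each k-means
    cluster of similar nodes becomes one hyperedge."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), n_clusters, replace=False)]
    for _ in range(iters):
        labels = np.linalg.norm(X[:, None] - centers[None], axis=2).argmin(axis=1)
        for c in range(n_clusters):
            if (labels == c).any():
                centers[c] = X[labels == c].mean(axis=0)
    H = np.zeros((len(X), n_clusters))
    H[np.arange(len(X)), labels] = 1
    return H
```

The adjacency matrix A would feed the standard graph convolution, while H (with hyperedges spanning whole clusters) would feed the HGNN layers.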
Optimizing transformer-based prediction of human microbe-disease associations through integrated loss strategies. Rong Zhu, Yong Wang, Junliang Shang, Ling-Yun Dai, Feng Li. PeerJ Computer Science 11:e3098 (2025-08-15). DOI: 10.7717/peerj-cs.3098. Open access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12453706/pdf/
The rapid expansion of the Internet of Things (IoT) has significantly increased the volume and diversity of network traffic, making accurate IoT traffic classification crucial for maintaining network security and efficiency. However, existing traffic classification methods, including traditional machine learning and deep learning approaches, often exhibit critical limitations, such as insufficient generalization across diverse IoT environments, dependency on extensive labelled datasets, and susceptibility to overfitting in dynamic scenarios. While recent transformer-based models show promise in capturing contextual information, they typically rely on standard tokenization, which is ill-suited for the irregular nature of IoT traffic and often remains confined to single-purpose tasks. To address these challenges, this study introduces MIND-IoT, a novel and scalable framework for classifying generalized IoT traffic. MIND-IoT employs a hybrid architecture that combines Transformer-based models for capturing long-range dependencies and convolutional neural networks (CNNs) for efficient local feature extraction. A key innovation is IoT-Tokenize, a custom tokenization pipeline designed to preserve the structural semantics of network flows by converting statistical traffic features into semantically meaningful feature-value pairs. The framework operates in two phases: a pre-training phase utilizing masked language modeling (MLM) on large-scale IoT data (UNSW IoT Traces and MonIoTr) to learn robust representations and a fine-tuning phase that adapts the model to specific classification tasks, including binary IoT vs. non-IoT classification, IoT category classification, and device identification. Comprehensive evaluation across multiple diverse datasets (IoT Sentinel, YourThings, and IoT-FCSIT, in addition to the pre-training datasets) demonstrates MIND-IoT's superior performance, robustness, and adaptability compared to traditional methods. The model achieves an accuracy of up to 98.14% and a 97.85% F1-score, demonstrating its ability to classify new datasets and adapt to emerging tasks with minimal fine-tuning and remarkable efficiency. This research positions MIND-IoT as a highly effective and scalable solution for real-world IoT traffic classification challenges.
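A rough sketch of the feature-value tokenization idea, converting statistical flow features into discrete tokens a transformer can consume, might look like the following. The feature names, value ranges, and binning scheme are hypothetical stand-ins, not the actual IoT-Tokenize pipeline.

```python
import numpy as np

def tokenize_flow(features, bins=8, ranges=None):
    """Turn a dict of statistical flow features into 'name_bin' tokens, so a
    transformer sees semantically meaningful feature-value pairs instead of
    raw bytes. Bin edges and ranges here are illustrative."""
    ranges = ranges or {}
    tokens = []
    for name, value in features.items():
        lo, hi = ranges.get(name, (0.0, 1.0))
        # Quantize each value into one of `bins` buckets over its range.
        b = int(np.clip((value - lo) / (hi - lo + 1e-12) * bins, 0, bins - 1))
        tokens.append(f"{name}_{b}")
    return tokens
```

Each resulting token (e.g. a bucketed mean packet length) gets its own embedding, so masked language modeling over token sequences can learn which feature-value combinations co-occur in IoT flows.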
Transformer-based tokenization for IoT traffic classification across diverse network environments. Firdaus Afifi, Faiz Zaki, Hazim Hanif, Nik Aqil, Nor Badrul Anuar. PeerJ Computer Science 11:e3126 (2025-08-15). DOI: 10.7717/peerj-cs.3126. Open access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12453836/pdf/
Pub Date: 2025-08-14. eCollection Date: 2025-01-01. DOI: 10.7717/peerj-cs.3064
Zahraa Ahmed, Mesut Çevik
One of the most prominent neurodegenerative diseases globally is Alzheimer's disease (AD). Early diagnosis of AD is challenging due to a complex pathophysiology driven by the presence and accumulation of neurofibrillary tangles and amyloid plaques. Recent advancements in data mining, machine learning, and microarray technologies have, however, enabled a much richer understanding of the genetic underpinnings of AD. Yet the "curse of dimensionality" posed by high-dimensional microarray datasets hampers accurate prediction of the disease through overfitting, bias, and high computational demands. To alleviate these effects, this study proposes a gene selection approach based on the parameter-free, large-scale manta ray foraging optimization (MRFO) algorithm. Across the six investigated datasets, which differ in dimensionality and statistical relationship distributions, and the four evaluated machine learning classifiers, the proposed Sign Random Mutation and Best Rank enhancements substantially improved MRFO's exploration and exploitation, enabling efficient identification of relevant genes and improved machine learning prediction accuracy.
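As a loose, much-simplified stand-in for the described gene selection (the full manta ray foraging optimizer is not reproduced here), the sketch below keeps a binary gene mask, applies random sign-flip mutation loosely echoing the paper's Sign Random Mutation, and accepts masks that improve a cheap correlation-based fitness with a size penalty.

```python
import numpy as np

def select_genes(X, y, n_iter=200, flip_rate=0.25, seed=0):
    """Toy wrapper-style gene selection: flip a few bits of a binary mask each
    iteration and keep the mask if fitness (absolute class-correlation of the
    selected genes minus a per-gene penalty) improves. Illustrative only; the
    real method is MRFO-based."""
    rng = np.random.default_rng(seed)
    n_genes = X.shape[1]
    corr = np.array([abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(n_genes)])
    mask = rng.random(n_genes) < 0.5

    def fitness(m):
        return corr[m].sum() - 0.2 * m.sum() if m.any() else -np.inf

    best = fitness(mask)
    for _ in range(n_iter):
        cand = mask ^ (rng.random(n_genes) < flip_rate)   # sign-flip mutation
        f = fitness(cand)
        if f > best:
            mask, best = cand, f
    return mask
```

The per-gene penalty pushes the search toward small masks, so only genes whose class correlation outweighs the penalty survive; in the paper a classifier's accuracy, not a correlation score, would drive the fitness.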
Improving machine learning detection of Alzheimer disease using enhanced manta ray gene selection of Alzheimer gene expression datasets. Zahraa Ahmed, Mesut Çevik. PeerJ Computer Science 11:e3064 (2025-08-14). DOI: 10.7717/peerj-cs.3064. Open access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12453835/pdf/
Pub Date: 2025-08-14. eCollection Date: 2025-01-01. DOI: 10.7717/peerj-cs.3076
Siyu Yun, Xinsheng Wang
As the core engine of electronic design automation (EDA) tools, the efficiency of the Boolean satisfiability (SAT) solver largely determines the integrated circuit research and development cycle. With the dramatic growth in integrated circuit scale, SAT solver effectiveness has steadily become the key bottleneck of the circuit design cycle. The primary issue today is the divergence between SAT as used in industry and research focused on pure solution algorithms. We propose a strategy that partitions the SAT problem based on its structural information and then solves the parts. By effectively extracting structural information from the original SAT problem, the self-organizing map (SOM) neural network deployed in the partitioning stage speeds up the sub-thread solvers while avoiding cumbersome parameter tuning. The experimental results demonstrate the stability and scalability of our technique, which can drastically shorten the time required to solve industrial benchmarks from various sources.
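A minimal self-organizing map of the kind described, which would cluster clause (or variable) feature vectors into sub-problem partitions for parallel solving, can be sketched as follows. The grid size, decay schedules, and two-dimensional toy features are illustrative assumptions, not the paper's configuration.

```python
import numpy as np

def train_som(data, grid=(2, 2), iters=100, lr0=0.5, sigma0=1.0, seed=0):
    """Minimal self-organizing map: for each sample, find the best-matching
    unit (BMU) and pull nearby grid units toward the sample, with learning
    rate and neighborhood width decaying over time."""
    rng = np.random.default_rng(seed)
    gh, gw = grid
    weights = rng.random((gh * gw, data.shape[1]))
    coords = np.array([(i, j) for i in range(gh) for j in range(gw)], dtype=float)
    for t in range(iters):
        frac = t / iters
        lr = lr0 * (1 - frac)                       # decaying learning rate
        sigma = sigma0 * (1 - frac) + 1e-3          # shrinking neighborhood
        for x in data[rng.permutation(len(data))]:
            bmu = np.linalg.norm(weights - x, axis=1).argmin()
            g = np.exp(-np.linalg.norm(coords - coords[bmu], axis=1) ** 2
                       / (2 * sigma ** 2))
            weights += lr * g[:, None] * (x - weights)
    return weights

def assign(weights, data):
    """Partition: map each sample to its best-matching unit."""
    return np.array([np.linalg.norm(weights - x, axis=1).argmin() for x in data])
```

After training, clauses mapped to the same unit would form one sub-problem handed to a sub-thread solver; unlike k-means, the SOM needs no fine parameter tuning per instance, which matches the motivation stated above.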
{"title":"Multi-step partitioning combined with SOM neural network-based clustering technique effectively improves SAT solver performance.","authors":"Siyu Yun, Xinsheng Wang","doi":"10.7717/peerj-cs.3076","DOIUrl":"https://doi.org/10.7717/peerj-cs.3076","url":null,"abstract":"<p><p>As the core engine of electronic design automation (EDA) tools, the efficiency of Boolean Satisfiability Problem (SAT) solver largely determines the cycle of integrated circuit research and development. The effectiveness of SAT solvers has steadily turned into the key bottleneck of circuit design cycle due to the dramatically increased integrated circuit scale. The primary issue of SAT solver now is the divergence between SAT used in industry and research on pure solution algorithms. We propose a strategy for partitioning the SAT problem based on the structural information then solving it. By effectively extracting the structure information from the original SAT problem, the self-organizing map (SOM) neural network deployed in the division section can speed up the sub-thread solver's processing while avoiding cumbersome parameter adjustments. The experimental results demonstrate the stability and scalability of our technique, which can drastically shorten the time required to solve industrial benchmarks from various sources.</p>","PeriodicalId":54224,"journal":{"name":"PeerJ Computer Science","volume":"11 ","pages":"e3076"},"PeriodicalIF":2.5,"publicationDate":"2025-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12453816/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145132679","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2025-08-14eCollection Date: 2025-01-01DOI: 10.7717/peerj-cs.3115
Teng Li, Xiaodong Guo, Cun Ji
With the rapid development of the Internet of Things, time series classification (TSC) has gained significant attention from researchers due to its applications in various real-world fields, including electroencephalogram/electrocardiogram classification, emotion recognition, and error message detection. To improve classification performance, numerous TSC methods have been proposed in recent years. Among these, shapelet-based TSC methods are particularly notable for their intuitive interpretability. A critical task within these methods is evaluating the quality of candidate shapelets. This paper provides a comprehensive survey of the state-of-the-art measures for assessing shapelet quality. To present a structured overview, we begin by proposing a taxonomy of these measures, followed by a detailed description of each one. We then discuss these measures, highlighting the challenges faced by current research and offering suggestions for future directions. Finally, we summarize the findings of this survey. We hope that this work will serve as a valuable resource for researchers in the field.
{"title":"A literature survey of shapelet quality measures for time series classification.","authors":"Teng Li, Xiaodong Guo, Cun Ji","doi":"10.7717/peerj-cs.3115","DOIUrl":"10.7717/peerj-cs.3115","url":null,"abstract":"<p><p>With the rapid development of the Internet of Things, time series classification (TSC) has gained significant attention from researchers due to its applications in various real-world fields, including electroencephalogram/electrocardiogram classification, emotion recognition, and error message detection. To improve classification performance, numerous TSC methods have been proposed in recent years. Among these, shapelet-based TSC methods are particularly notable for their intuitive interpretability. A critical task within these methods is evaluating the quality of candidate shapelets. This paper provides a comprehensive survey of the state-of-the-art measures for assessing shapelet quality. To present a structured overview, we begin by proposing a taxonomy of these measures, followed by a detailed description of each one. We then discuss these measures, highlighting the challenges faced by current research and offering suggestions for future directions. Finally, we summarize the findings of this survey. We hope that this work will serve as a valuable resource for researchers in the field.</p>","PeriodicalId":54224,"journal":{"name":"PeerJ Computer Science","volume":"11 ","pages":"e3115"},"PeriodicalIF":2.5,"publicationDate":"2025-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12453792/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145132639","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Intensity-modulated radiation therapy (IMRT) is a prevalent approach for administering radiation therapy in cancer treatment. The primary objective of IMRT is to devise a treatment strategy that eradicates cancer cells from the tumour while minimising damage to the surrounding organs at risk. Conventional IMRT planning entails a sequential procedure: optimising beam intensity for a given set of angles, followed by sequencing. Unfortunately, treatment plans obtained in the optimisation stage are severely impaired after the sequencing stage due to physical and delivery constraints that are not considered during optimisation. One method that tackles these issues is the direct aperture optimisation (DAO) technique. The DAO problem seeks to generate a set of deliverable aperture configurations and a corresponding set of radiation intensities. This method accounts for physical and delivery-time limitations, facilitating the creation of clinically appropriate treatment plans. In this article, we propose and compare two variable neighbourhood search (VNS)-based algorithms, called variable neighbourhood descent (VND) and reduced variable neighbourhood search (rVNS). The VND algorithm is a deterministic variant of VNS that systematically explores different neighbourhood structures, allowing a more thorough exploration of the solution space while maintaining computational efficiency. The rVNS algorithm, unlike traditional VNS algorithms, does not require any transition rule, as it integrates a set of predefined neighbourhood moves at each iteration. We apply both algorithms to prostate cancer cases, achieving highly competitive results. In particular, the proposed rVNS requires 62.75% fewer apertures and achieves a 63.93% reduction in beam-on time compared to the sequential approach's best case, meaning treatment plans can be delivered in less time.
Additionally, we evaluate the clinical quality of the treatment plans using established dosimetric indicators, comparing our results against those produced by matRad's tool for DAO to assess target coverage and organ-at-risk sparing.
{"title":"Comparing variable neighbourhood search algorithms for the direct aperture optimisation in radiotherapy.","authors":"Mauricio Moyano, Keiny Meza-Vasquez, Gonzalo Tello-Valenzuela, Nicolle Ojeda-Ortega, Carolina Lagos, Guillermo Cabrera-Guerrero","doi":"10.7717/peerj-cs.3094","DOIUrl":"10.7717/peerj-cs.3094","url":null,"abstract":"<p><p>Intensity modulated radiation therapy (IMRT) is a prevalent approach for administering radiation therapy in cancer treatment. The primary objective of IMRT is to devise a treatment strategy that eradicates cancer cells from the tumour while minimising damage to the surrounding organs at risk. Conventional IMRT planning entails a sequential procedure: optimising beam intensity for a certain set of angles, followed by sequencing. Unfortunately, treatment plans obtained in the optimisation stage are severely impaired after the sequencing stage due to physical and delivery constraints that are not considered during the optimisation stage. One method that tackles the issues above is the direct aperture optimisation (DAO) technique. The DAO problem seeks to generate a set of deliverable aperture configurations and a corresponding set of radiation intensities. This method accounts for physical and delivery time limitations, facilitating the creation of clinically appropriate treatment programs. In this article, we propose and compare two variable neighbourhood search (VNS) based algorithms, called variable neighbourhood descent (VND) and reduced variable neighbourhood search (rVNS). The VND algorithm is a deterministic variant of VNS that systematically explores different neighbourhood structures. This approach allows for a more thorough solution for space exploration while maintaining computational efficiency. The rVNS, unlike traditional VNS algorithms, does not require any transition rule, as it integrates a set of predefined neighbourhood moves at each iteration. 
We apply our proposed algorithms to prostate cancer cases, achieving highly competitive results for both algorithms. In particular, the proposed rVNS requires 62.75% fewer apertures and achieved a 63.93% reduction in beam-on time compared to the sequential approach's best case, which means treatment plans that can be delivered in less time. Additionally, we evaluate the clinical quality of the treatment plans using established dosimetric indicators, comparing our results against those produced by matRad's tool for DAO to assess target coverage and organ-at-risk sparing.</p>","PeriodicalId":54224,"journal":{"name":"PeerJ Computer Science","volume":"11 ","pages":"e3094"},"PeriodicalIF":2.5,"publicationDate":"2025-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12453873/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145132379","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
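The VND scheme described in that abstract can be sketched generically: cycle through an ordered list of neighbourhood structures, restarting from the first whenever a move improves the incumbent, and stop when no neighbourhood yields an improvement. This is a textbook VND skeleton under illustrative assumptions (list-of-candidates neighbourhoods, a scalar cost function), not the authors' DAO-specific implementation.

```python
def vnd(x0, neighbourhoods, cost):
    """Variable neighbourhood descent.

    neighbourhoods: list of functions, each mapping a solution to a list of
    candidate neighbour solutions; cost: function to minimise.
    """
    x, k = x0, 0
    while k < len(neighbourhoods):
        best = min(neighbourhoods[k](x), key=cost, default=x)
        if cost(best) < cost(x):
            x, k = best, 0   # improvement found: restart from first neighbourhood
        else:
            k += 1           # local optimum for this neighbourhood: try the next
    return x
```

For instance, minimising a quadratic over the integers with step-1 and step-3 move neighbourhoods descends to the optimum; in DAO the solutions would instead encode aperture shapes and intensities, with moves such as shifting leaf positions.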