
PeerJ Computer Science: Latest Articles

Data trace as the scientific foundation for trusted metrological data: a review for future metrology direction.
IF 2.5 | Zone 4, Computer Science | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-08-14 | eCollection Date: 2025-01-01 | DOI: 10.7717/peerj-cs.3106
Zhanshuo Cao, Boyong Gao, Zilong Liu, Xingchuang Xiong, Bin Wang, Chenbo Pei

In the context of the digital transformation of metrology, ensuring the trustworthiness and integrity of measurement data during its generation, transmission, and storage (i.e., trustworthy detection of measurement data) has become a critical challenge. Data traces are residual marks left during data processing that help identify malicious activities targeting measurement data. These traces are especially important when the trust and integrity of potential data evidence are under threat. To this end, this article systematically reviews the relevant core techniques and analyzes detection methods across the stages of the data lifecycle, evaluating their applicability and limitations in identifying data tampering, unauthorized access, and anomalous operations. The findings suggest that trace detection technologies can enhance the traceability and transparency of metrological data, thereby providing technical support for building a trustworthy digital metrology system. This review lays the theoretical foundation for future research on developing automated anomaly detection models, improving forensic techniques for data tampering in measurement devices, and constructing multi-modal, full-lifecycle traceability frameworks for measurement data. Subsequent studies should focus on aligning these technologies with metrological standards and verifying their deployment in real-world measurement instruments.

Citations: 0
A comprehensive review of ball detection techniques in sports.
IF 2.5 | Zone 4, Computer Science | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-08-12 | eCollection Date: 2025-01-01 | DOI: 10.7717/peerj-cs.3079
Cristiano Moreira, Lino Ferreira, Paulo Jorge Coelho

Detecting balls in sports plays a pivotal role in enhancing game analysis, providing real-time data for spectators, and improving decision-making and strategic thinking for referees and coaches. The topic is widely debated and researched, but most works focus on a single sport; generalizing one method or algorithm effectively across different sports is much harder to achieve. This article reviews methodologies and advancements in object detection tailored to ball detection across various sports. It surveys traditional computer vision techniques and modern deep learning methods, emphasizing their strengths, limitations, and adaptability to diverse game scenarios. The challenges of occlusion, dynamic backgrounds, varying ball sizes, and high-speed movement are identified and discussed. This review aims to consolidate existing knowledge, compare state-of-the-art detection models, highlight pivotal challenges and possible solutions, and propose future research directions. The article underscores the importance of optimizations for accurate and efficient ball detection, setting the foundation for next-generation sports analytics systems.

Citations: 0
An enhanced BERT model with improved local feature extraction and long-range dependency capture in promoter prediction for hearing loss.
IF 2.5 | Zone 4, Computer Science | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-08-12 | eCollection Date: 2025-01-01 | DOI: 10.7717/peerj-cs.3104
Jing Sun, Yangfan Huang, Jiale Fu, Li Teng, Xiao Liu, Xiaohua Luo

Promoter prediction plays a key role in understanding gene regulation and in developing gene therapies for complex diseases such as hearing loss (HL). While traditional Bidirectional Encoder Representations from Transformers (BERT) models excel at capturing contextual information, they are often limited in simultaneously extracting the local sequence features and long-range dependencies inherent in genomic data. To address this challenge, we propose DNABERT-CBL (DNABERT-2_CNN_BiLSTM), an enhanced BERT-based architecture that fuses a convolutional neural network (CNN) and a bidirectional long short-term memory (BiLSTM) layer. The CNN module captures local regulatory features, while the BiLSTM module models long-distance dependencies, enabling efficient integration of global and local features of promoter sequences. The models are optimized using three strategies (individual learning, cross-disease training, and global training), and the contribution of each module is verified by constructing comparison models with different combinations. The experimental results show that DNABERT-CBL outperforms the baseline DNABERT-2_BASE model in hearing loss promoter prediction, with a 20% reduction in loss, a 3.3% improvement in the area under the receiver operating characteristic curve (AUC), and a 5.8% improvement in accuracy at a sequence length of 600 base pairs. In addition, DNABERT-CBL consistently outperforms other state-of-the-art BERT-based genome models on several evaluation metrics, highlighting its superior generalization ability. Overall, DNABERT-CBL provides an effective framework for accurate promoter prediction, offers valuable insights into gene regulatory mechanisms, and supports the development of gene therapies for hearing loss and related diseases.
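
For readers unfamiliar with the AUC figure reported above, the following is a minimal sketch of how the area under the receiver operating characteristic curve can be computed for a binary classifier. It is not the authors' evaluation code; the function name `roc_auc` and the toy labels and scores are assumptions made for illustration. It uses the pairwise interpretation of AUC: the probability that a randomly chosen positive example scores higher than a randomly chosen negative one, with ties counted as half.

```python
def roc_auc(labels, scores):
    """Pairwise AUC for binary labels (1 = positive, 0 = negative).

    Hypothetical helper for illustration only.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # a tie counts as half a win
    return wins / (len(positives) * len(negatives))


# Toy data (made up): 3 of the 4 positive/negative pairs are ordered correctly.
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

The double loop is quadratic in the number of examples, which is fine for a small illustration; a rank-based formulation would be the usual choice for large evaluation sets.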

Citations: 0
Design of artwork resource management system based on block classification coding and bit plane rearrangement.
IF 2.5 | Zone 4, Computer Science | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-08-12 | eCollection Date: 2025-01-01 | DOI: 10.7717/peerj-cs.3092
Xiaomeng Xia

With the rapid development of the art market, the management of art resources faces increasingly difficult challenges, such as copyright protection, authenticity verification, and efficient storage. Current digital watermarking and compression schemes for artworks struggle to balance robustness, image quality preservation, and watermark capacity. Moreover, they lack sufficient scalability when dealing with large-scale datasets. To address these issues, this article proposes an algorithm that integrates watermarking and compression for artwork images, namely the Block Classification Coding-Bit Plane Rearrangement-Integrated Compression and Watermark Embedding (BCC-BPR-ICWE) algorithm. By employing refined block classification coding (RS-BCC) and optimized bit plane rearrangement (BPR) techniques, the algorithm significantly enhances watermark embedding capacity and robustness while preserving image quality. Experimental results demonstrate that, compared to existing classical algorithms, the proposed method excels in terms of watermarked image quality (PSNR > 57 dB, SSIM = 0.9993), watermark capacity (0.5 bpp), and tampering recovery performance (PSNR = 41.17 dB, SSIM = 0.9993). These results support the practical application of the algorithm in large-scale art resource management systems. The proposed technique not only promotes the application of digital watermarking and compression technologies in the field of art management but also offers new ideas and directions for the future development of related technologies.
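
The PSNR values quoted above follow the standard definition, PSNR = 10 · log10(MAX² / MSE), where MSE is the mean squared error between the original and processed images and MAX is the largest possible pixel value. The sketch below illustrates that definition only; it is not the article's code, and it assumes 8-bit grayscale pixels supplied as flat lists. The function name `psnr` and the sample pixel values are hypothetical.

```python
import math


def psnr(original, distorted, max_value=255):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists.

    Assumes 8-bit pixels by default (max_value=255); illustrative helper only.
    """
    if len(original) != len(distorted) or not original:
        raise ValueError("inputs must be non-empty and the same length")

    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(original, distorted)) / len(original)
    if mse == 0:
        return math.inf  # identical images: PSNR is unbounded
    return 10 * math.log10(max_value ** 2 / mse)


# Made-up example: every pixel is off by exactly one gray level, so MSE = 1.
cover = [100, 100, 100, 100]
marked = [101, 99, 101, 99]
print(round(psnr(cover, marked), 2))  # 48.13
```

Under this definition, a PSNR above 57 dB on 8-bit images corresponds to an MSE below roughly 0.13, i.e. a distortion well under one gray level per pixel on average.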

Citations: 0
A novel deep learning approach for predicting stone-free rates post-ESWL on uncontrasted CT.
IF 2.5 | Zone 4, Computer Science | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-08-11 | eCollection Date: 2025-01-01 | DOI: 10.7717/peerj-cs.3111
Ozgur Efiloglu, Muhammed Yildirim, Kadir Yildirim, Harun Bingol, Mustafa Kaan Akalin, Meftun Culpan, Bilal Alatas, Asif Yildirim

Extracorporeal shock wave lithotripsy (ESWL) is one of the most frequently used treatments for kidney stones. In this work, we assess how well an artificial intelligence model built on non-contrast computed tomography (CT) images predicts stone-free rates after ESWL. The main difference between this study and others is that it proposes an artificial intelligence-based model that predicts the success of ESWL treatment. Data from 910 patients who underwent ESWL between January 2016 and June 2021 were analyzed retrospectively. Because the local binary pattern (LBP) and histogram of oriented gradients (HOG) feature extraction methods gave better results than other methods, the features they produced were combined, and a new feature map was obtained by applying the neighborhood component analysis (NCA) dimension reduction method. The reduced feature map was then passed to classifiers. In conclusion, we analyzed the effect of ESWL treatment using different artificial intelligence methods and found that the prediction accuracy was 94% on average. Results were obtained from seven different convolutional neural networks (CNNs) and two texture-based models. Because the texture-based models achieved the highest success among these, they were used as the basis of the proposed model. The proposed model achieved better results than the nine other models used in the study. The results obtained from the proposed hybrid model suggest that it can guide experts in ESWL treatment decisions.
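
To make the LBP step mentioned above more concrete, here is a minimal sketch of the basic 3×3 local binary pattern and the histogram feature built from it. It is an illustration of the general technique, not the authors' pipeline: the abstract does not say which LBP variant, neighbor ordering, or parameters were used, so the clockwise bit order, the `>=` comparison, and the names `lbp_code` and `lbp_histogram` are all assumptions.

```python
def lbp_code(patch):
    """8-bit LBP code for a 3x3 patch (list of three 3-element rows).

    Each neighbor contributes a 1 bit if it is >= the center pixel.
    Bit order (an assumed convention): clockwise, starting at the top-left.
    """
    center = patch[1][1]
    offsets = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    code = 0
    for bit, (r, c) in enumerate(offsets):
        if patch[r][c] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(image):
    """256-bin histogram of LBP codes over all interior pixels of a 2D image."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            patch = [row[c - 1:c + 2] for row in image[r - 1:r + 2]]
            hist[lbp_code(patch)] += 1
    return hist


# A bright center surrounded by darker pixels yields code 0;
# a perfectly flat patch yields 255 because every neighbor ties the center.
print(lbp_code([[0, 0, 0], [0, 9, 0], [0, 0, 0]]))  # 0
print(lbp_code([[5, 5, 5], [5, 5, 5], [5, 5, 5]]))  # 255
```

In a pipeline like the one described, a histogram of this kind would be concatenated with other descriptors (HOG in the abstract) before dimensionality reduction and classification.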

Citations: 0
Morphological and structural complexity analysis of low-resource English-Turkish language pair using neural machine translation models.
IF 2.5 | Zone 4, Computer Science | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-08-11 | eCollection Date: 2025-01-01 | DOI: 10.7717/peerj-cs.3072
Mehmet Acı, Nisa Vuran Sarı, Çiğdem İnan Acı

Neural machine translation (NMT) has achieved remarkable success in high-resource language pairs; however, its effectiveness for morphologically rich and low-resource languages like Turkish remains underexplored. As a highly agglutinative and morphologically complex language with limited high-quality parallel data, Turkish serves as a representative case for evaluating NMT systems in low-resource and linguistically challenging settings. Its structural divergence from English makes it a critical testbed for assessing tokenization strategies, attention mechanisms, and model generalizability in neural translation. This study compares the performance of two prominent NMT paradigms, the Transformer architecture and recurrent sequence-to-sequence (Seq2Seq) models with attention, for both English-to-Turkish and Turkish-to-English translation. The models are evaluated under various configurations, including different tokenization strategies (Byte Pair Encoding (BPE) vs. Word Tokenization), attention mechanisms (Bahdanau and an exploratory hybrid mechanism combining Bahdanau and Scaled Dot-Product attention), and architectural depths (layer count and attention head number). Extensive experiments using automatic metrics such as BiLingual Evaluation Understudy (BLEU), Metric for Evaluation of Translation with Explicit ORdering (METEOR), and Translation Error Rate (TER) reveal that the Transformer model with three layers, eight attention heads, and BPE tokenization achieved the best performance, obtaining a BLEU score of 47.85 and a METEOR score of 44.62 in the English-to-Turkish direction. Similar performance trends were observed in the reverse direction, indicating the model's generalizability. These findings highlight the potential of carefully optimized Transformer-based NMT systems in handling the complexities of morphologically rich, low-resource languages like Turkish in both translation directions.
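
As background for the attention mechanisms compared above, the sketch below shows plain scaled dot-product attention, softmax(QKᵀ / √d_k) · V, for a single head using only the Python standard library. It is a minimal illustration under stated assumptions, not the study's implementation: it omits masking, multiple heads, learned projections, and the hybrid Bahdanau variant, and the function names and toy vectors are hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(queries, keys, values):
    """Single-head attention over lists of vectors (lists of floats).

    For each query: score every key by dot product, scale by sqrt(d_k),
    normalize with softmax, and return the weighted sum of the values.
    """
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        weights = softmax(scores)
        outputs.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return outputs


# A zero query scores every key equally, so the output is the mean of the values.
print(scaled_dot_product_attention(
    [[0.0, 0.0]],
    [[1.0, 0.0], [0.0, 1.0]],
    [[1.0, 0.0], [0.0, 1.0]],
))  # [[0.5, 0.5]]
```

Bahdanau (additive) attention differs in how the scores are computed: it passes the query and key through a small feed-forward layer instead of taking a scaled dot product, while the softmax and weighted sum remain the same.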

Citations: 0
A review of deep learning methods in aquatic animal husbandry.
IF 2.5 | Zone 4, Computer Science | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-08-11 | eCollection Date: 2025-01-01 | DOI: 10.7717/peerj-cs.3105
Marzuraikah Mohd Stofa, Fatimah Az Zahra Azizan, Mohd Asyraf Zulkifley

Aquatic animal husbandry is crucial for global food security and supports millions of livelihoods around the world. With growing demand for seafood, the industry has become economically significant for many regions, contributing to local and global economies. As it grows, however, it faces major challenges that small-scale setups do not encounter. Traditional methods for classifying, detecting, and monitoring aquatic animals are often time-consuming, labor-intensive, and prone to inaccuracies, which has led many aquaculture operators to move towards automation. For an automation system to be deployed effectively, it needs an intelligent decision-making system, which is where deep learning techniques come into play. This article presents an extensive methodological review of machine learning methods, primarily deep learning methods, used in aquatic animal husbandry. It focuses on three key areas: classification, localization, and segmentation. Classification techniques distinguish between different species of aquatic organisms, while localization methods identify an animal's position within a video or an image. Segmentation techniques delineate organism boundaries precisely, which is essential information for accurate monitoring systems. Among these areas, segmentation techniques, particularly those based on the U-Net model, have shown the best results, reaching a segmentation performance of 94.44%. The article also highlights the potential of deep learning to enhance the precision, productivity, and sustainability of automated operations in aquatic animal husbandry. Looking ahead, deep learning offers considerable potential to transform the aquaculture industry in terms of cost and operations. Future research should focus on refining existing models to better address real-world challenges, such as sensor input quality and multi-modal data across various environments, for better automation in the aquaculture industry.
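The abstract reports a segmentation performance of 94.44% without naming the metric. As an illustration only, the sketch below computes two overlap scores that are commonly used for segmentation, intersection over union (IoU) and the Dice coefficient, on small binary masks. The function name `overlap_scores` and the toy masks are assumptions made for this example and do not come from the reviewed work.

```python
from typing import Sequence, Tuple

Mask = Sequence[Sequence[int]]  # 2-D binary mask: 1 = organism, 0 = background


def overlap_scores(pred: Mask, truth: Mask) -> Tuple[float, float]:
    """Return (IoU, Dice) for two binary masks of the same shape."""
    inter = pred_area = truth_area = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            inter += p & t        # pixel is foreground in both masks
            pred_area += p
            truth_area += t
    union = pred_area + truth_area - inter
    if union == 0:                # both masks empty: treat as perfect agreement
        return 1.0, 1.0
    return inter / union, 2 * inter / (pred_area + truth_area)


# Toy masks (hypothetical data): the prediction covers one extra pixel.
pred = [[0, 1, 1],
        [0, 1, 1],
        [0, 0, 0]]
truth = [[0, 1, 1],
         [0, 1, 0],
         [0, 0, 0]]
iou, dice = overlap_scores(pred, truth)
print(f"IoU={iou:.3f} Dice={dice:.3f}")
```

Dice is never lower than IoU on the same pair of masks, so a reported percentage is only comparable across studies when the metric is known.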

Citations: 0
Innovative multi objective optimization based automatic fake news detection.
IF 2.5 Zone 4, Computer Science Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2025-08-11 eCollection Date: 2025-01-01 DOI: 10.7717/peerj-cs.3016
Cebrail Barut, Suna Yildirim, Bilal Alatas, Gungor Yildirim

With the digital revolution, access to information expands day by day, and individuals can reach information quickly through the internet and social media platforms. In most cases, however, no mechanism exists to evaluate the accuracy of news that spreads rapidly on social media, which increases the potential for fake news to mislead both individuals and society. Minimizing these negative effects requires detecting fake news quickly and effectively. Metaheuristic methods can provide more effective solutions for fake news detection than traditional methods; on small datasets in particular, metaheuristics are known to produce faster and more effective solutions than artificial intelligence and machine learning based methods. Most fake news detection studies in the literature optimize a single criterion. This study instead develops a method that optimizes two criteria, precision and recall, simultaneously. The proposed approach replaces the Crowding Distance method of the standard Non-dominated Sorting Genetic Algorithm 2 (NSGA-2) with the Crowding Distance Level method. The method is tested on four different datasets, including Covid-19, Syrian war daily news, and FakeNewsNet (Gossipcop). The results show that it achieves high success, especially on small datasets.
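To make the two-criteria setting concrete, here is a minimal sketch of the standard NSGA-2 building blocks the abstract refers to: Pareto dominance over (precision, recall) pairs and the ordinary crowding distance. It does not implement the authors' Crowding Distance Level variant, which the abstract does not describe. The function names and candidate scores are assumptions chosen for illustration.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (precision, recall), both to be maximized


def dominates(a: Point, b: Point) -> bool:
    """a dominates b if it is no worse in every objective and better in at least one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(points: List[Point]) -> List[Point]:
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def crowding_distance(front: List[Point]) -> List[float]:
    """Standard NSGA-2 crowding distance; boundary points get infinity."""
    n = len(front)
    dist = [0.0] * n
    if n == 0:
        return dist
    for m in range(len(front[0])):                  # one pass per objective
        order = sorted(range(n), key=lambda i: front[i][m])
        lo, hi = front[order[0]][m], front[order[-1]][m]
        dist[order[0]] = dist[order[-1]] = float("inf")
        if hi == lo:                                # objective is constant on this front
            continue
        for k in range(1, n - 1):
            gap = front[order[k + 1]][m] - front[order[k - 1]][m]
            dist[order[k]] += gap / (hi - lo)       # normalized neighbor spacing
    return dist


# Hypothetical (precision, recall) scores of five candidate classifiers.
candidates = [(0.9, 0.5), (0.8, 0.7), (0.6, 0.9), (0.7, 0.6), (0.5, 0.5)]
front = pareto_front(candidates)
dist = crowding_distance(front)
print(front, dist)
```

In NSGA-2, solutions with a larger crowding distance are preferred within the same front, which keeps the retained precision/recall trade-offs spread out rather than clustered.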

Citations: 0
Detection of offensive content in the Kazakh language using machine learning and deep learning approaches.
IF 2.5 Zone 4, Computer Science Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2025-08-11 eCollection Date: 2025-01-01 DOI: 10.7717/peerj-cs.3027
Milana Bolatbek, Moldir Sagynay, Shynar Mussiraliyeva, Zhastay Yeltay

This article addresses the urgent need to detect destructive content, including religious extremism, racism, cyberbullying, and nation-oriented extremist messages, on social media platforms in the Kazakh language. Because Kazakh has an agglutinative structure and rich morphology, standard natural language processing (NLP) models require significant adaptation. The study employs a range of machine learning and deep learning techniques, such as logistic regression, support vector machines (SVM), and long short-term memory (LSTM) networks, to classify destructive content. It demonstrates that combining n-gram and stemming methods with machine learning algorithms is effective, achieving high accuracy in content classification. The findings underscore the importance of developing language-specific NLP tools tailored to Kazakh's linguistic complexities. This research not only contributes to online safety by detecting destructive content in Kazakh digital spaces, but also provides a framework for applying similar techniques to other lower-resourced languages.
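As a rough illustration of how stemming and n-grams can be combined into features for a classifier, the sketch below strips a suffix and then counts character trigrams. The abstract does not say whether the study used word or character n-grams, nor which stemmer; the toy stemmer here handles only the six Kazakh plural endings and is an assumption for this example, as are all function names.

```python
from collections import Counter
from typing import List

# Toy suffix list: the six Kazakh plural endings only.
# A usable stemmer would need many more rules (case, possessive, verbal suffixes).
PLURAL_SUFFIXES = ("лар", "лер", "дар", "дер", "тар", "тер")


def toy_stem(word: str, min_stem: int = 3) -> str:
    """Strip one plural suffix if at least `min_stem` characters remain."""
    for suffix in PLURAL_SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem:
            return word[: -len(suffix)]
    return word


def char_ngrams(token: str, n: int = 3) -> List[str]:
    """Character n-grams with '<' and '>' marking the token boundaries."""
    padded = f"<{token}>"
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


def featurize(text: str, n: int = 3) -> Counter:
    """Bag of character n-grams over the stemmed, lowercased tokens."""
    feats: Counter = Counter()
    for word in text.lower().split():
        feats.update(char_ngrams(toy_stem(word), n))
    return feats


# "Кітаптар балалар" ("books children") reduces to the stems "кітап" and "бала".
features = featurize("Кітаптар балалар")
print(features)
```

The resulting counts could be passed to any bag-of-features classifier, such as logistic regression or an SVM. Character n-grams are a common choice for agglutinative languages because inflected forms of one word still share most of their n-grams.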

Citations: 0
DDSUD: dynamically detecting subsequence uncertainty and diversity for active learning in imbalanced Chinese sentiment analysis.
IF 2.5 Zone 4, Computer Science Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2025-08-11 eCollection Date: 2025-01-01 DOI: 10.7717/peerj-cs.3091
Shufeng Xiong, Yibo Si, Guipei Zhang, Bingkun Wang, Guang Zheng, Haiping Si

Sentiment structure analysis in Chinese text typically relies on supervised deep-learning methods for sequence labeling. However, obtaining large-scale labeled datasets is both resource-intensive and time-consuming. To address these challenges, this study proposes Dynamically Detecting Subsequence Uncertainty and Diversity (DDSUD), an active learning framework based on Bidirectional Encoder Representations from Transformers (BERT) that is designed to handle subsequence uncertainty and enhance the diversity of samples selected from imbalanced datasets. DDSUD combines subsequence uncertainty detection, diversity-driven sample selection, and dynamic weighting, enabling an adaptive balance between these factors throughout the active learning iterations. Experimental results show that DDSUD achieves performance close to that of fully supervised training with only 50% of the data labeled, and outperforms other state-of-the-art active learning methods given the same amount of labeled data. Moreover, by dynamically adjusting the trade-off between subsequence uncertainty and diversity, DDSUD demonstrates strong adaptability and generalization capability in low-resource environments, especially on imbalanced datasets, where it significantly improves the recognition of minority-class samples.
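The abstract names three ingredients (subsequence uncertainty, diversity, and a dynamic weight between them) but gives no formulas. The sketch below is one possible reading, not the authors' method. Uncertainty is taken as the highest mean token entropy over a sliding window, diversity as one minus the maximum cosine similarity to already labeled embeddings, and the weight follows a hypothetical linear schedule. Every function name, the window size, and the schedule are assumptions made for illustration.

```python
import math
from typing import List, Sequence, Tuple

Sample = Tuple[List[List[float]], List[float]]  # (per-token label probabilities, embedding)


def norm_entropy(probs: Sequence[float]) -> float:
    """Shannon entropy scaled to [0, 1] by the number of classes."""
    if len(probs) <= 1:
        return 0.0
    h = -sum(p * math.log(p) for p in probs if p > 0)
    return h / math.log(len(probs))


def subsequence_uncertainty(token_probs: List[List[float]], window: int = 2) -> float:
    """Highest mean token entropy over any contiguous window (assumed definition)."""
    ents = [norm_entropy(p) for p in token_probs]
    if len(ents) <= window:
        return sum(ents) / max(len(ents), 1)
    return max(sum(ents[i:i + window]) / window for i in range(len(ents) - window + 1))


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na, nb = math.sqrt(sum(x * x for x in a)), math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def diversity(emb: Sequence[float], labeled: List[List[float]]) -> float:
    """1 minus the maximum cosine similarity to any labeled embedding."""
    return 1.0 if not labeled else 1.0 - max(cosine(emb, e) for e in labeled)


def select(pool: List[Sample], labeled: List[List[float]],
           round_idx: int, total_rounds: int, k: int) -> List[int]:
    """Return indices of the k pool samples with the highest combined score."""
    # Hypothetical schedule: favor diversity early, uncertainty in later rounds.
    w = round_idx / max(total_rounds - 1, 1)
    scored = [(w * subsequence_uncertainty(tp) + (1 - w) * diversity(emb, labeled), i)
              for i, (tp, emb) in enumerate(pool)]
    return [i for _, i in sorted(scored, reverse=True)[:k]]


# Toy pool (hypothetical data): sample 1 is uncertain, sample 2 is far from the labeled set.
pool = [
    ([[0.98, 0.01, 0.01], [0.90, 0.05, 0.05]], [1.0, 0.0]),
    ([[0.34, 0.33, 0.33], [0.40, 0.30, 0.30]], [1.0, 0.0]),
    ([[0.98, 0.01, 0.01], [0.98, 0.01, 0.01]], [0.0, 1.0]),
]
labeled = [[1.0, 0.0]]
first_pick = select(pool, labeled, round_idx=0, total_rounds=3, k=1)
last_pick = select(pool, labeled, round_idx=2, total_rounds=3, k=1)
print(first_pick, last_pick)
```

With this schedule the first round picks the sample that is most unlike the labeled set, and the final round picks the one whose token predictions are least certain, which shows how a changing weight shifts the selection criterion across iterations.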

Citations: 0