
Computers & Electrical Engineering: Latest Articles

Federated learning in healthcare: Recent progress and challenges
IF 4.9 | Zone 3, Computer Science | Q1 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE | Pub Date: 2026-01-07 | DOI: 10.1016/j.compeleceng.2025.110924
Amara Miloudi, Abdelkader Laouid, Ahcène Bounceur, Mostefa Kara, Mohammed Mounir Bouhamed, Mohammad Hamoudeh, Insaf Kraidia
Federated Learning (FL) has emerged as a transformative approach to collaborative model training in healthcare, enabling multiple institutions to develop robust machine learning models without compromising sensitive patient data. This review examines recent advances, applications, and challenges associated with FL in healthcare, focusing on its potential to enhance data security and privacy through the aggregation of decentralized models. A comprehensive literature review was conducted using databases including PubMed, Google Scholar, and Scopus, identifying 316 relevant publications, from which 23 were selected for detailed analysis. The findings highlight the applications of FL in critical healthcare areas, including oncology, infectious diseases, medical imaging, drug development, and personalized medicine. Although FL offers significant opportunities for precision medicine by managing fragmented and heterogeneous datasets, substantial challenges remain, particularly regarding data standardization, model convergence, and communication efficiency. This review also addresses crucial aspects such as privacy-preserving techniques, ethical compliance, and system scalability, emphasizing the need for interdisciplinary solutions. Ultimately, FL demonstrates significant potential to revolutionize healthcare by improving patient outcomes and accelerating medical research while maintaining strict regulatory compliance. Future research directions are discussed to overcome current barriers and advance the broader adoption of FL in healthcare applications.
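The abstract above refers to "the aggregation of decentralized models" without naming a specific rule. As a rough illustration only, the sketch below uses sample-size-weighted averaging in the style of FedAvg, a widely used aggregation scheme. The function name `fedavg`, the flat-list weight format, and the two-hospital numbers are assumptions made for this example, not details taken from the review.

```python
def fedavg(client_weights, client_sizes):
    """Average per-client parameter vectors, weighted by local sample count.

    client_weights: list of equal-length lists of floats (one per client).
    client_sizes:   number of local training samples at each client.
    Only these parameter vectors leave the clients; raw records stay local.
    """
    if len(client_weights) != len(client_sizes) or not client_weights:
        raise ValueError("need one sample count per client")
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


# Hypothetical example: two hospitals with different dataset sizes.
hospital_updates = [[1.0, 2.0], [3.0, 4.0]]
hospital_sizes = [100, 300]
global_weights = fedavg(hospital_updates, hospital_sizes)
```

The larger site pulls the global parameters toward its own update, which is one reason heterogeneous datasets make convergence harder, as the abstract notes.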
Citations: 0
A secure expert system framework for private function evaluation using functional encryption and multi-party computation
IF 4.9 | Zone 3, Computer Science | Q1 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE | Pub Date: 2026-01-07 | DOI: 10.1016/j.compeleceng.2025.110930
Rahat Naz, Jaydeep Howlader, Shahnawaz Ahmad
Cloud systems and edge-based systems have an increasing appetite for privacy-preserving computation over distributed sensitive data. Most existing cryptographic solutions don't perform well when executing complex inference tasks while hiding both the input data and the logic of the functions. This can be a serious shortcoming in particular areas, such as healthcare analytics and financial modeling, where data privacy and model protections are paramount. Although secure multiparty computation (SMPC) and functional encryption (FE) hold promise individually, current implementations are often either not scalable or not auditable from end to end in adversarial models. This work presents a hybrid framework that fuses FE with SMPC to enable private function evaluation (PFE) in decentralized environments. The architecture supports encrypted expert inference, leveraging a trust-weighted cryptographic consensus layer, dynamic key management, and function-specific policy enforcement. An adaptive fusion of secure execution and traceable audit logging ensures both privacy and compliance without sacrificing computational tractability. Experimental validation demonstrates that our system reduces decision latency by up to 18 % over standard FE baselines and improves leakage resistance under semi-honest and collusion-based attacks by 23 %, with auditability scores reaching 87 % in real-world simulation settings. By enabling the execution of confidential functions with built-in explainability and regulatory transparency, the proposed system lays the foundation for secure AI-as-a-service platforms. Its compatibility with edge deployments and extensibility toward zero-knowledge and post-quantum cryptography position it as a robust candidate for the next generation of trust-aware decentralized computation.
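To make the secure multiparty computation idea in the abstract above more concrete, here is a minimal sketch of additive secret sharing, one common SMPC building block. It is not the authors' framework: functional encryption, the consensus layer, and audit logging are not shown, and the modulus `PRIME` and all function names are assumptions chosen for illustration.

```python
import secrets

PRIME = 2**61 - 1  # illustrative modulus; any sufficiently large prime works


def share(value, n_parties):
    """Split an integer into n additive shares that sum to value mod PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % PRIME)
    return shares


def reconstruct(shares):
    """Recombine all shares; any strict subset reveals nothing about the value."""
    return sum(shares) % PRIME


def add_shared(a_shares, b_shares):
    """Each party adds its two local shares; no party sees a or b in the clear."""
    return [(a + b) % PRIME for a, b in zip(a_shares, b_shares)]


# Hypothetical use: three parties jointly compute 20 + 22.
sum_shares = add_shared(share(20, 3), share(22, 3))
result = reconstruct(sum_shares)
```

Addition is free in this scheme because it is purely local; multiplication and general function evaluation need extra protocol rounds, which is where scalability concerns of the kind the abstract mentions tend to arise.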
Citations: 0
Enhanced ECG arrhythmia detection with deep learning and multi-head attention mechanism
IF 4.9 | Zone 3, Computer Science | Q1 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE | Pub Date: 2026-01-07 | DOI: 10.1016/j.compeleceng.2026.110957
Saoueb Kerdoudi, Larbi Guezouli, Tahar Dilekh
Detecting arrhythmias via electrocardiograms (ECGs) is vital for healthcare. While deep learning has advanced classification, capturing critical patterns in complex data remains challenging. We propose Res_Bi-LSTM_MHA, a novel model integrating a multi-head self-attention (MHA) mechanism to selectively focus on relevant signal segments. This enhances the capture of subtle features often missed by conventional methods. By combining Residual Networks (ResNet) for robust feature extraction with Bidirectional Long Short-Term Memory (Bi-LSTM) for temporal dependencies, our approach significantly improves accuracy. We evaluated the model at subject and record levels using the China Physiological Signal Challenge (CPSC 2018), St. Petersburg Institute of Cardiological Technics (INCART), and Massachusetts Institute of Technology-Beth Israel Hospital (MIT-BIH) databases. The model achieved an F1 score of 98.01% and 99.42% accuracy on the MIT-BIH dataset. Our results demonstrate that effectively utilizing attention mechanisms offers a substantial improvement in arrhythmia classification.
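The multi-head self-attention mechanism named in the abstract above can be sketched in a few lines of plain Python. This is a simplified illustration, not the Res_Bi-LSTM_MHA model: the learned query, key, value, and output projections are omitted (an assumption made to keep the example short), so each head simply attends over its own slice of the feature vector.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]


def attention(q, k, v):
    """Scaled dot-product attention over sequences of feature vectors."""
    d = len(q[0])
    out = []
    for qi in q:
        scores = [sum(a * b for a, b in zip(qi, kj)) / math.sqrt(d) for kj in k]
        w = softmax(scores)  # how strongly this time step attends to each other step
        out.append([sum(wj * vj[c] for wj, vj in zip(w, v)) for c in range(len(v[0]))])
    return out


def multi_head_self_attention(x, n_heads):
    """Split features into n_heads slices, attend per slice, then concatenate."""
    d_model = len(x[0])
    if d_model % n_heads != 0:
        raise ValueError("feature size must be divisible by n_heads")
    d_head = d_model // n_heads
    heads = []
    for h in range(n_heads):
        sl = [row[h * d_head:(h + 1) * d_head] for row in x]
        heads.append(attention(sl, sl, sl))
    return [sum((heads[h][t] for h in range(n_heads)), []) for t in range(len(x))]


# Hypothetical input: 3 time steps of a 4-dimensional feature sequence.
segment = [[1.0, 2.0, 3.0, 4.0], [0.5, 0.1, 0.2, 0.9], [2.0, 2.0, 1.0, 0.0]]
attended = multi_head_self_attention(segment, n_heads=2)
```

Each output row is a weighted average of the input rows, with weights that depend on similarity; that weighting is what lets a model emphasize the signal segments most relevant to a given beat.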
Citations: 0
Single-phase switched-capacitor based common ground five-level inverter for grid-tied PV systems with double gain
IF 4.9 | Zone 3, Computer Science | Q1 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE | Pub Date: 2026-01-06 | DOI: 10.1016/j.compeleceng.2025.110928
Katroth Kalyan Singh, Kirubakaran Annamalai
This article proposes a single-phase transformerless inverter for grid-tied PV installations. At the output stage, the proposed inverter can produce five voltage levels. It features two electrolytic switched capacitors (SCs), six power switches, and two power diodes. This architecture is lighter and less expensive because it uses fewer power electronic components. Because the negative DC line of the proposed inverter is directly connected to the grid neutral in PV applications, leakage current is minimized. Another advantage of this design is that it can double the output voltage without a transformer or inductor. Self-balancing is achieved by symmetrically charging and discharging the SCs in parallel and in series with the input voltage over time. Therefore, the proposed inverter no longer needs a complex control technique to balance the SCs. The design specifications of the proposed inverter are provided. To illustrate the benefits of the proposed inverter, including the reduction of total standing voltage and cost function, a quantitative comparison with similar five-level topologies is also presented. An experimental prototype of a 1 kW grid-tied system is used to validate the topology and demonstrate the capabilities of the proposed inverter with a closed-loop PR controller. Moreover, the system dynamics are tested under different loading conditions and input voltage variations.
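As a small numerical aid for the abstract above, the sketch below enumerates the output levels implied by "five levels" combined with a voltage gain of two, and maps a sinusoidal reference to the nearest level. The level set 0, ±V_dc, ±2·V_dc and the nearest-level mapping are assumptions for illustration; the abstract does not state the modulation scheme, and the 200 V input is a made-up figure.

```python
import math


def five_level_outputs(v_dc):
    """Assumed level set for a five-level inverter with double gain."""
    return [-2 * v_dc, -v_dc, 0.0, v_dc, 2 * v_dc]


def nearest_level(v_ref, v_dc):
    """Pick the available output level closest to the reference voltage."""
    return min(five_level_outputs(v_dc), key=lambda level: abs(level - v_ref))


# Hypothetical 200 V DC input: the staircase can reach a 400 V peak.
V_DC = 200.0
staircase = [
    nearest_level(2 * V_DC * math.sin(2 * math.pi * k / 20), V_DC)
    for k in range(20)
]
```

With a boost factor of two, the peak of the staircase equals twice the DC input, which is the property that lets the design skip a separate boosting stage.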
Citations: 0
Role of SSL models: Finetuning and feature optimization for dysarthric speech recognition and keyword spotting
IF 4.9 | Zone 3, Computer Science | Q1 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE | Pub Date: 2026-01-03 | DOI: 10.1016/j.compeleceng.2025.110921
Paban Sapkota, Hemant Kumar Kathania, Subham Kutum
Self-supervised learning (SSL) models are increasingly used in speech processing tasks, where they provide powerful pretrained representations of speech. Most existing methods utilize these models by either fine-tuning them on domain-specific data or using their output representations as input features in conventional ASR systems. However, the relationship between SSL layer representations and the severity level of dysarthric speech remains poorly understood, despite the potential for different layers to capture features that vary in relevance across severity levels. Furthermore, the high dimensionality of these representations, often reaching up to 1024 dimensions, imposes a heavy computational load, highlighting the need for optimized feature representations in downstream ASR and keyword spotting (KWS) tasks. This study proposes a severity-independent approach for dysarthric speech processing using SSL features, investigating three state-of-the-art pretrained models: Wav2Vec2, HuBERT, and Data2Vec. We propose: (1) selecting SSL layers based on severity level to extract the most useful features; (2) a Kaldi-based ASR system that uses an autoencoder to reduce the size of SSL features; and (3) validating the proposed SSL feature optimization in a KWS task. We evaluate the proposed method using a DNN–HMM model in Kaldi on two standard dysarthric speech datasets: TORGO and UAspeech. Our approach shows that selecting severity-specific SSL layers, combined with autoencoder (AE)-based feature optimization, leads to significant improvements over both zero-shot and fine-tuned SSL baselines. On TORGO, our method achieved a WER of 23.12%, outperforming the zero-shot (60.35%) and fine-tuned (40.48%) SSL models. On UAspeech, it reached 50.33% WER, surpassing both the fine-tuned (51.04%) and MFCC-based systems (58.67%). Layer-wise analysis revealed consistent trends: lower layers were more effective for very high-severity speech, while mid-to-upper layers performed better for low/medium-severity cases. Further, in the KWS task, later SSL layers showed the best performance, with our proposed system outperforming the MFCC baseline. These findings highlight the generalizability of our proposed method, which combines layer-specific selection and autoencoder-based optimization of SSL features, for dysarthric speech processing tasks.
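The results above are reported as word error rate (WER). For readers unfamiliar with the metric, the sketch below computes it in the standard way, as the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. The function name `wer` and the sample utterances are illustrative assumptions; this is not the Kaldi scoring pipeline used in the paper.

```python
def wer(reference, hypothesis):
    """Word error rate: word-level Levenshtein distance / reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the processed reference prefix
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical utterance: one substituted word out of four.
example_wer = wer("turn on the light", "turn off the light")
```

Because insertions count as errors, WER can exceed 100% when the hypothesis is much longer than the reference.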
Citations: 0
A hybrid reinforcement learning framework for adaptive multi-horizon electricity load forecasting: The DWRNet approach
IF 4.9 | Zone 3, Computer Science | Q1 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE | Pub Date: 2026-01-02 | DOI: 10.1016/j.compeleceng.2025.110926
Muhammad Farhan Khan, Sile Hu, Yuan Gao, Yu Guo, Yuan Wang, Maryam Saeed, Yucan Zhao, Jiaqiang Yang
Accurate and adaptive multi-horizon electricity load forecasting is essential for secure operation of modern power systems and for the integration of variable renewable generation. This paper proposes DWRNet, a Dynamic Weighted Residual Network that combines statistical decomposition, deep residual learning, and reinforcement learning (RL)-based adaptive fusion. A Fruit Fly Optimization-tuned Holt-Winters model first extracts the dominant seasonal-trend component, while a Long Short-Term Memory (LSTM) network learns the nonlinear residual structure. A continuous-action policy-gradient controller then produces horizon-dependent convex weights that balance the statistical and neural forecasts, enabling the ensemble to adapt to changing load regimes while remaining lightweight enough for EMS/SCADA deployment. DWRNet is evaluated on four years of hourly load data from two structurally different power systems (Inner Mongolia, China and Germany) over 24 h, 168 h, and 720 h horizons, and compared against strong baselines including SVR, LSTM, GRU, CNN, CNN-LSTM, and recent Transformer-based models (Informer, FEDformer) under a common rolling-origin protocol. Across both regions and all horizons, DWRNet consistently achieves the best or near-best MAE, RMSE, sMAPE and R² values, with particularly notable gains on weekly and monthly forecasts. Robustness is assessed through cross-validation with varying training fractions, bootstrap-based confidence intervals, ablation studies, and residual diagnostics, which collectively indicate that the improvements are stable and not attributable to overfitting. A complexity analysis and runtime benchmarks further show that the RL-based blending stage adds only modest offline training cost and negligible inference overhead. DWRNet offers a practical and scalable solution for real-time energy forecasting, with strong potential for use in energy management systems, dispatch operations, and smart grid planning.
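The "horizon-dependent convex weights" described above can be illustrated with a short sketch. It assumes the weights come from a softmax over two scores per forecast step, which guarantees they are non-negative and sum to one; in the paper those scores would come from the RL controller, whereas here they are fixed, made-up inputs. All names (`convex_weights`, `blend`) and numbers are hypothetical.

```python
import math


def convex_weights(logits):
    """Softmax: maps arbitrary scores to non-negative weights summing to 1."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]


def blend(stat_forecast, neural_forecast, logits_per_step):
    """Combine two forecasts step by step with horizon-dependent convex weights."""
    blended = []
    for s, n, logits in zip(stat_forecast, neural_forecast, logits_per_step):
        w_stat, w_neural = convex_weights(logits)
        blended.append(w_stat * s + w_neural * n)
    return blended


# Hypothetical 3-step horizon (MW): statistical vs. neural forecast.
stat = [100.0, 110.0, 120.0]
neural = [200.0, 130.0, 100.0]
scores = [(0.0, 0.0), (2.0, 0.0), (0.0, 3.0)]  # stand-in for controller output
forecast = blend(stat, neural, scores)
```

Because the weights are convex, every blended value lies between the two component forecasts, so the ensemble cannot produce a value more extreme than either model at that step.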
Citations: 0
Multiagent deep reinforcement learning-based distributed control strategy for energy management in DC Microgrid
IF 4.9 | Zone 3, Computer Science | Q1 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE | Pub Date: 2025-12-31 | DOI: 10.1016/j.compeleceng.2025.110927
Alankrita, Avadh Pati, Nabanita Adhikary
This paper presents a Multi-Agent Deep Reinforcement Learning (MARL) framework for distributed energy management in a DC Microgrid (DC MG) comprising Photovoltaic, Wind Turbine, and Energy Storage Systems, with the primary objective of maintaining DC link voltage stability. The decentralized control architecture employs local voltage measurements as agent state inputs and uses Deep Q-Networks to estimate individual action-value functions. Three algorithmic approaches are investigated: Independent DQN (IDQN), Value Decomposition Networks (VDN), and QMIX, each evaluated with Multilayer Perceptron (MLP) and Recurrent Neural Network (RNN) architectures. The custom reward function integrates voltage deviation penalties, power balance constraints, and battery cycling costs to achieve high renewable penetration and efficient storage dispatch. Case studies validate framework performance under diverse conditions, including variable generation and demand, network delays, false data injection attacks, ground faults, and plug-and-play topology changes. Results reveal scenario-dependent performance characteristics: RNN-based VDN achieves superior voltage regulation under normal operation, IDQN demonstrates robust reward optimization during cyber-attacks, while RNN-based QMIX excels in adversarial scenarios involving false data injection and achieves the fastest transient response during plug-and-play events. Computational analysis identifies architecture-dependent scaling trade-offs, with QMIX requiring more compute and centralized coordination overhead, while IDQN's distributed architecture and lower resource consumption suggest better scalability for multi-agent expansion. The framework demonstrates the practical viability of MARL-based distributed control for resilient energy management in DC MG with scenario-appropriate algorithm selection.
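Of the three algorithms compared above, Value Decomposition Networks (VDN) has the simplest structure: the joint action value is modeled as the sum of per-agent action values, so each agent can act greedily on its own values and still maximize the joint value. The sketch below illustrates only that property. The agent labels, action counts, and Q-values are invented for the example, and the neural networks that would produce the values are not shown.

```python
from itertools import product


def vdn_joint_value(agent_q_values, joint_action):
    """VDN decomposition: Q_tot is the sum of each agent's chosen Q-value."""
    return sum(q[a] for q, a in zip(agent_q_values, joint_action))


def greedy_joint_action(agent_q_values):
    """Each agent picks its own argmax using only its local Q-values."""
    return [max(range(len(q)), key=q.__getitem__) for q in agent_q_values]


# Hypothetical per-agent Q-values (e.g. PV, wind, and storage controllers).
q_values = [
    [0.1, 0.7, 0.2],  # agent 0: three candidate actions
    [0.5, 0.4],       # agent 1: two candidate actions
    [0.0, 0.3],       # agent 2: two candidate actions
]
decentralized = greedy_joint_action(q_values)
q_tot = vdn_joint_value(q_values, decentralized)

# Exhaustive check over all joint actions, feasible only for tiny examples.
best_joint = max(
    product(*(range(len(q)) for q in q_values)),
    key=lambda joint: vdn_joint_value(q_values, joint),
)
```

QMIX relaxes the plain sum to a learned monotonic mixing function, which needs an additional mixing network during training; that is consistent with the higher compute cost the abstract attributes to it.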
Computers & Electrical Engineering, Volume 131, Article 110927.
Citations: 0
Hybrid Brown-Bear and Hippopotamus Optimization with Quasi-Opposition-Based Learning for Optimal Power Flow with Renewable Energy Integration
IF 4.9 Zone 3, Computer Science Q1 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE Pub Date: 2025-12-29 DOI: 10.1016/j.compeleceng.2025.110922
Mohamed Lahdeb, Ali Hennache, Bachir Bentouati, M.M.R. Ahmed, Ragab A. El-Sehiemy, M. Elzalik
The optimal power flow (OPF) problem is a highly nonlinear, complex, multidimensional optimization problem, especially with the increasing penetration of uncertain renewable energy sources (RES). To address it, this paper presents the Hybrid Brown-Bear and Hippopotamus Optimization Algorithm with Quasi-Opposition-Based Learning (HBOA-QOBL) to improve multidimensional OPF solutions. The algorithm combines the strengths of the Brown-Bear optimizer, which excels in exploration and adaptive search, and the Hippopotamus optimizer, known for its social behavior modeling and localized search strategies. By integrating QOBL, HBOA-QOBL improves exploration by generating quasi-opposite solutions, which widens the search of the solution space and reduces the risk of premature convergence. Adaptive search mechanisms embedded in HBOA-QOBL enhance exploitation by dynamically adjusting search behavior during iterative power dispatch tuning, enabling finer tuning of generation schedules and voltage profiles. The proposed method is evaluated on the IEEE 30-bus, 57-bus, and 118-bus test systems for multiple OPF objectives: fuel cost minimization, emission reduction, power loss reduction, voltage deviation minimization, reactive power loss reduction, and the voltage stability indicator (L-index). Simulation results indicate faster convergence than conventional techniques, reaching near-optimal solutions within 200 iterations, with a standard deviation of 63.8%, demonstrating superior technical and economic performance relative to previous research. Key convergence parameters such as population size, maximum iterations, and learning factor are explicitly tuned to enhance both exploration and exploitation. Overall, the simulation results confirm that HBOA-QOBL outperforms conventional optimization techniques in solution quality, convergence speed, and stability, yielding significant technical and economic improvements.
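The quasi-opposite solutions mentioned in this abstract can be sketched as follows, using the common textbook definition of quasi-opposition-based learning: for a variable x bounded by [lower, upper], the opposite point is lower + upper - x, and the quasi-opposite point is drawn uniformly between the interval centre and that opposite point. The abstract does not state which variant HBOA-QOBL uses, so the function names and this exact formulation are assumptions.

```python
import random


def quasi_opposite(x, lower, upper, rng=random):
    """Return a quasi-opposite point for x within [lower, upper].

    Assumes the standard definition: sample uniformly between the
    centre of the interval and the opposite point lower + upper - x.
    """
    center = (lower + upper) / 2.0
    opposite = lower + upper - x
    lo, hi = min(center, opposite), max(center, opposite)
    return rng.uniform(lo, hi)


def qobl_candidates(solution, bounds, rng=random):
    """Apply quasi_opposite to every decision variable of one solution."""
    return [quasi_opposite(x, lo, hi, rng) for x, (lo, hi) in zip(solution, bounds)]


if __name__ == "__main__":
    rng = random.Random(42)
    # Hypothetical two-variable solution, e.g. two generator set-points.
    solution = [0.9, -3.0]
    bounds = [(0.0, 1.0), (-5.0, 5.0)]
    print(qobl_candidates(solution, bounds, rng))
```

In a population-based optimizer, a typical use is to evaluate both the original and the quasi-opposite candidate and keep whichever has the better objective value.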
Computers & Electrical Engineering, Volume 131, Article 110922.
Citations: 0
Digital image watermarking using histogram based pixel sorting and pixel value search techniques
IF 4.9 Zone 3, Computer Science Q1 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE Pub Date: 2025-12-29 DOI: 10.1016/j.compeleceng.2025.110918
Ande Bhargav, Mohamed Asan Basiri M.
Reversible digital image watermarking methods are crucial for embedding authentication information in medical imaging, military communication, and similar applications. Reversible data hiding (RDH) techniques typically embed auxiliary data or require separate transmission of location maps to recover the data. These practices reduce the imperceptibility of the stego image and demand higher bandwidth. To overcome these limitations, this paper proposes histogram-based pixel sorting (HBPS) in Algorithm-I, which embeds data directly into the least significant bits (LSBs), improving the Peak Signal-to-Noise Ratio (PSNR) by 22.29%. The experimental results confirm the high visual quality of the recovered cover image, with average PSNR exceeding 50 dB. Algorithm-II and Algorithm-III preprocess the cover image using a Laplacian kernel and the proposed triplet linear pixel transformation (TLPT), respectively, to preserve its visual integrity. The PSNR and latency gains over existing methods are statistically significant at the 95% confidence level using t-tests with Bonferroni correction. The preprocessing technique in Algorithm-IV refines the pixel value search algorithm (PVSA) with a sharpening filter, reducing latency by 52.82%. Algorithm-V presents a multi-core implementation of PVSA to reduce latency.
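To make the LSB and PSNR terminology concrete, here is a minimal sketch of plain LSB replacement on a flat list of 8-bit pixel values, together with the standard PSNR formula. It is not the paper's HBPS or PVSA algorithm (the abstract does not describe their steps), and plain LSB replacement is not reversible on its own; the function names are hypothetical.

```python
import math


def embed_lsb(pixels, bits):
    """Overwrite the least significant bit of the first len(bits) pixels."""
    if len(bits) > len(pixels):
        raise ValueError("payload is longer than the cover image")
    stego = list(pixels)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & ~1) | bit  # clear the LSB, then set it to the payload bit
    return stego


def extract_lsb(pixels, n_bits):
    """Read the payload back from the first n_bits pixels."""
    return [p & 1 for p in pixels[:n_bits]]


def psnr(original, modified, peak=255):
    """Peak Signal-to-Noise Ratio in dB: 10 * log10(peak^2 / MSE)."""
    mse = sum((a - b) ** 2 for a, b in zip(original, modified)) / len(original)
    return float("inf") if mse == 0 else 10 * math.log10(peak ** 2 / mse)


if __name__ == "__main__":
    cover = [100, 101, 102, 103]  # hypothetical grayscale pixel values
    payload = [1, 0, 1, 1]
    stego = embed_lsb(cover, payload)
    print(stego, extract_lsb(stego, len(payload)))
    print("PSNR (dB):", round(psnr(cover, stego), 2))
```

Since an LSB change alters a pixel by at most 1, the mean squared error is at most 1, so for 8-bit images the PSNR of any LSB-only embedding is at least 10·log10(255²) ≈ 48.13 dB.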
Computers & Electrical Engineering, Volume 131, Article 110918.
Citations: 0
Mizo hand sign language detection using a multi-scale transformer-based hybrid feature extractor and fusion network
IF 4.9 Zone 3, Computer Science Q1 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE Pub Date: 2025-12-27 DOI: 10.1016/j.compeleceng.2025.110916
Barrister R, Ambeth Kumar V. D, Ashok Kumar V. D
Sign language is the primary means of communication for people with hearing or speech impairments, who rely on visual signals in daily life to express their thoughts and emotions. Sign language is most commonly communicated through hand gestures, which are the focus of this research, but gesture recognition suffers from inaccurate pose detection when features are poorly extracted. Moreover, studies on the detection of Mizo sign language are rare in the literature. This study therefore presents a novel hybrid model that combines machine learning and deep learning to detect Mizo hand sign language. In the first phase, Mizo hand sign language datasets are used to evaluate the system's effectiveness. Next, pre-processing removes extraneous background from the images. Hybrid feature extraction is then carried out using a depth-wise convolutional network (DCN) and a spatial-frequency multi-scale dilated transformer (SF-MSDT) to extract the significant features. The outputs of the hybrid feature extractor are fed independently to the feature fusion module to generate a one-dimensional feature vector. Finally, classification is performed using three classifiers: a support vector machine (SVM), a random forest classifier, and a residual network (ResNet). The experimental analysis identifies ResNet as the most suitable classifier, with an accuracy of 98.23%, precision of 92.36%, recall of 88.52%, and F1-score of 85.77%. With the ResNet classifier, the proposed model improves accuracy by 1.25% over recurrent networks and by 4.3% over convolutional networks.
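For readers comparing the reported metrics, the sketch below shows how F1 relates to precision and recall. For a single class, F1 is the harmonic mean of the two, which for the reported 92.36% and 88.52% would be about 90.4%. The reported 85.77% is lower, which is possible when metrics are averaged per class (macro-averaging); the abstract does not say which averaging was used, so that reading is an assumption, and the helper names and the two-class example are hypothetical.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall for one class."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def macro_f1(per_class):
    """Average of per-class F1 scores; per_class holds (precision, recall) pairs."""
    return sum(f1_score(p, r) for p, r in per_class) / len(per_class)


if __name__ == "__main__":
    # Harmonic mean of the precision and recall quoted in the abstract.
    print(round(f1_score(0.9236, 0.8852), 4))

    # Hypothetical two-class case: macro precision and macro recall are
    # both 0.75, yet macro F1 is lower than either of them.
    per_class = [(1.0, 0.5), (0.5, 1.0)]
    print(round(macro_f1(per_class), 4))
```

The second example shows that a macro-averaged F1 can fall below both the macro precision and the macro recall, so the three reported figures are not necessarily inconsistent.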
Computers & Electrical Engineering, Volume 131, Article 110916.
Citations: 0