
IET Signal Processing: Latest Articles

Compressive TDOA Estimation Method Based on Amplitude Phase Clustering in Time-Frequency Domain
IF 1.4; Zone 4 (Engineering and Technology); Q4 (ENGINEERING, ELECTRICAL & ELECTRONIC); Pub Date: 2026-01-20; DOI: 10.1049/sil2/3642027
Yang Jiao, Changxiong Xia, Jingyu Zhai, Tianyi Xing, Qun Wan

In the field of signal processing, modulation signals, including phase shift keying (PSK) and quadrature amplitude modulation (QAM), can significantly enhance the signal-to-noise ratio (SNR) through aliasing transmission following clustering and sorting. This article presents two novel approaches to compressed time difference of arrival (TDOA) estimation, leveraging amplitude-phase clustering signals. A carefully designed compression matrix is constructed based on the unique amplitude and phase characteristics of the signals. The study then analyzes the Cramér–Rao lower bound (CRLB) under full-sampling conditions. Finally, TDOA estimation is performed using the approximate maximum likelihood (AML) method. Simulation results demonstrate that the proposed compressed sampling TDOA estimation methods, based on amplitude-phase clustering, achieve accuracy within an order of magnitude of full-sampling performance. Additionally, this article explores the application of OFDM-QAM signals, which exhibit amplitude-phase convergence in the frequency domain, for time difference estimation in compressed sampling. A novel frequency-domain aliasing time difference estimation algorithm based on amplitude-phase convergence is proposed. Experimental results indicate that under high SNR conditions, the algorithm incurs only a minor SNR degradation of ~4 dB compared to time difference estimation in uncompressed transmission.
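As a point of reference for what a TDOA estimate is, the sketch below uses the plain full-sampling cross-correlation estimator. It is not the compressed AML method described in the abstract; the function name `estimate_tdoa_samples`, the synthetic signals, and the sampling rate `fs` are all illustrative assumptions.

```python
import random


def estimate_tdoa_samples(x, y, max_lag):
    """Return the lag (in samples) at which y best matches a delayed copy of x."""
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        # Correlate x[k] with y[k + lag] over the overlapping region.
        score = 0.0
        for k in range(len(x)):
            j = k + lag
            if 0 <= j < len(y):
                score += x[k] * y[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


# Synthetic demo (assumed values): y is x delayed by 5 samples plus mild noise.
rng = random.Random(0)
x = [rng.gauss(0.0, 1.0) for _ in range(200)]
true_delay = 5
y = [
    (x[k - true_delay] if k >= true_delay else 0.0) + 0.1 * rng.gauss(0.0, 1.0)
    for k in range(len(x))
]

fs = 1000.0  # assumed sampling rate in Hz
lag = estimate_tdoa_samples(x, y, max_lag=20)
print(lag, lag / fs)  # delay in samples and in seconds
```

The brute-force loop costs O(N × max_lag) and is only meant to make the definition concrete; a practical receiver would typically use an FFT-based correlation.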

Citations: 0
Joint MobileViT and Knowledge Distillation Network for Hand Gesture Recognition via mmWave Radar
IF 1.4; Zone 4 (Engineering and Technology); Q4 (ENGINEERING, ELECTRICAL & ELECTRONIC); Pub Date: 2026-01-17; DOI: 10.1049/sil2/9971257
Xiangqun Zhang, Zhizhou Ge, Kai Lu, Genyuan Du, Jiawen Shen, Xiangqian Gao

Hand gesture recognition using mmWave radar has emerged as a promising technology for human–computer interaction (HCI), smart home systems, and the Internet of Things (IoT). However, the practical application of this technology is often constrained by the high computational complexity and significant storage demands of contemporary deep neural networks, which impede their deployment on resource-limited embedded devices. To address this limitation, we present a novel approach that combines an improved MobileViT model with a knowledge distillation (KD) framework. The proposed method consists of three main stages. First, raw radar signals are captured and restructured into a three-dimensional tensor (Chirps × Samples × Frames) and processed to generate range-time maps (RTMs) and Doppler-time maps (DTMs). Second, an improved MobileViT network is designed, incorporating fewer redundant blocks, a lower input resolution, and a dual-branch input structure to effectively fuse features from the RTM and DTM. This enhanced architecture serves as a robust teacher model, excelling at extracting both local and global spatiotemporal features for accurate gesture recognition. Finally, KD is applied to transfer knowledge from the teacher model to a compact student network, thereby achieving model compression. Experimental results demonstrate that the final distilled student model, evaluated on the test set, has only 0.018 M parameters (about 10% of the teacher model’s size) while still achieving a high recognition accuracy of 99.16%. Consequently, the resulting model is highly compact and accurate, demonstrating its suitability for real-world embedded deployment.
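The abstract does not state which distillation loss is used. As a minimal sketch, the code below shows the commonly used temperature-softened formulation: a KL term between teacher and student soft outputs, mixed with the usual cross-entropy on the true label. The function names and the values of `temperature` and `alpha` are assumptions for illustration only.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax of logits divided by a temperature."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=4.0, alpha=0.5):
    """Weighted sum of a soft-target KL term and hard-label cross-entropy."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    # KL(teacher || student) on the temperature-softened distributions.
    kl = sum(pt * math.log(pt / ps)
             for pt, ps in zip(p_teacher, p_student) if pt > 0.0)
    # Cross-entropy of the student's unsoftened prediction on the true class.
    ce = -math.log(softmax(student_logits)[label])
    # The T^2 factor keeps the soft-term gradients on a comparable scale.
    return alpha * temperature ** 2 * kl + (1.0 - alpha) * ce


teacher = [2.0, 1.0, 0.1]   # hypothetical teacher logits for three gestures
student = [1.5, 1.2, 0.3]   # hypothetical student logits
print(distillation_loss(student, teacher, label=0))
```

When the student reproduces the teacher's logits exactly, the KL term vanishes and only the weighted cross-entropy remains.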

Citations: 0
Imagined Chinese Speech Decoding Based on Initials and Finals From EEG Activity
IF 1.4; Zone 4 (Engineering and Technology); Q4 (ENGINEERING, ELECTRICAL & ELECTRONIC); Pub Date: 2026-01-15; DOI: 10.1049/sil2/5451362
Jingyu Gu, Jiuchuan Jiang, Qian Cai, Haixian Wang

Brain-computer interfaces (BCIs) play an important role in various fields, such as neuroscience, rehabilitation, and machine learning. Silent BCIs, which can reconstruct inner speech from neural activity, hold great promise for patients with aphasia. In this paper, we design an imagined Chinese speech experimental paradigm based on initials and finals and collect raw signals from eight healthy participants using 64-channel scalp electroencephalography (EEG). Linear predictive coding (LPC) and mel frequency cepstral coefficients (MFCC), which are classical algorithms in the field of speech recognition, are used to extract distinguishing features for speech classification and reconstruction. In addition, the phase-locking value (PLV) is introduced to enrich the feature information. We choose support vector machine (SVM), linear discriminant analysis (LDA), decision tree (DT), and LogitBoost (LB) for binary classification in several different cases. Two channel selection (CS) schemes, based on Broca’s area and Wernicke’s area of the brain, are also introduced. The highest imagined speech decoding accuracy reaches 84.38%, which demonstrates the effectiveness of the feature engineering. A comparative analysis is also conducted with deep learning methods specifically designed for small-sample scenarios. This study offers a novel systematic approach for research on imagined Chinese speech BCIs.
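For readers unfamiliar with the PLV feature mentioned above, a minimal sketch of its standard definition follows: the magnitude of the average unit phasor of the phase difference between two channels. The instantaneous phases are assumed to be already available (in practice they are usually obtained from band-passed EEG via a Hilbert transform); the function name and test signals are illustrative.

```python
import cmath
import math
import random


def phase_locking_value(phase_a, phase_b):
    """PLV = |mean(exp(i * (phase_a - phase_b)))|, a value in [0, 1]."""
    if len(phase_a) != len(phase_b) or not phase_a:
        raise ValueError("phase series must be non-empty and of equal length")
    total = sum(cmath.exp(1j * (a - b)) for a, b in zip(phase_a, phase_b))
    return abs(total) / len(phase_a)


n = 2000
rng = random.Random(0)

# Two channels with a constant phase offset: perfectly locked.
base = [2.0 * math.pi * 10.0 * k / 250.0 for k in range(n)]  # 10 Hz at 250 Hz
shifted = [p + 0.7 for p in base]
plv_locked = phase_locking_value(base, shifted)

# Independent, uniformly random phases: essentially no locking.
noise_a = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n)]
noise_b = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n)]
plv_random = phase_locking_value(noise_a, noise_b)

print(plv_locked, plv_random)
```

A constant phase offset gives a PLV of 1 regardless of the offset's size, which is why PLV captures synchrony rather than zero-lag similarity.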

Citations: 0
Revolutionizing Fetal QRS Complex Detection: A Cutting-Edge Algorithm Employing New Adaptive Filters and Peak Thresholds
IF 1.4; Zone 4 (Engineering and Technology); Q4 (ENGINEERING, ELECTRICAL & ELECTRONIC); Pub Date: 2026-01-11; DOI: 10.1049/sil2/8873906
Elias Mazrooei Rad, Seyyed Ali Zendehbad, Vahideh Hosseinzadeh
Purpose: This study proposes an innovative algorithm based on the morphological parameters of noninvasive fetal electrocardiography (NI-FECG) for the comprehensive analysis of all electrocardiography (ECG) signal components, including the P wave, PR interval, QRS complex, ST segment, T wave, U wave, and QT interval. Accurate identification of these components is critical for a holistic evaluation of fetal heart health. While the QRS complex is crucial for detecting arrhythmias and guiding clinical interventions, a complete diagnostic evaluation requires analyzing all waveform components. The computational efficiency and robustness of the proposed algorithm make it highly suitable for real-time clinical applications, distinguishing it from conventional methods.

Materials and Methods: The proposed method utilizes a dataset of 55 multichannel abdominal NI-FECG recordings collected between gestational weeks 21 and 40. This method focuses on enhancing detection accuracy across various ECG signal components by incorporating adaptive filtering techniques and dynamic peak thresholding to minimize noise interference and improve signal clarity. The study provides a detailed evaluation of the variability in QRS morphology and the influence of arrhythmias on the ECG waveform.

Results: Experimental results demonstrate the proposed algorithm’s robustness, achieving an average detection error of 1.78% and a standard deviation of 0.46% across all participants for all ECG components, with specific emphasis on the QRS complex. Additionally, the computational complexity is significantly reduced compared to existing approaches, ensuring feasibility for real-time deployment in clinical settings.

Conclusion: This study presents a groundbreaking algorithm for fetal ECG analysis using NI-FECG signals, yielding high precision for the detection of all morphological features, such as the PR, RR, ST, and QT intervals. The holistic approach ensures a more reliable assessment of fetal heart health. With a swift 0.22-s execution time, the algorithm is practical for real-time applications. Our findings highlight the significance of tailored biological signal processing over generic artificial intelligence (AI)-based models in enhancing accuracy and noise resilience in fetal ECG analysis. Further research is needed to optimize performance in diverse clinical scenarios. Integration of this approach into routine clinical practice could significantly improve fetal cardiovascular health monitoring and overall pregnancy outcomes.
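The abstract does not specify the thresholding rule. The sketch below only illustrates the general idea of dynamic peak thresholding: the detection threshold follows a running estimate of recent peak heights, and a refractory period suppresses double detections. The function name, the 0.5 threshold ratio, the smoothing `weight`, and the synthetic signal are all assumptions, not the authors' algorithm.

```python
def detect_peaks(signal, fs, init_threshold, refractory_s=0.2, weight=0.125):
    """Return indices of local maxima that exceed an adaptive threshold."""
    peaks = []
    peak_level = 2.0 * init_threshold  # running estimate of typical peak height
    threshold = init_threshold
    refractory = round(refractory_s * fs)  # minimum spacing in samples
    last_peak = -refractory
    for i in range(1, len(signal) - 1):
        is_local_max = signal[i - 1] < signal[i] >= signal[i + 1]
        if is_local_max and signal[i] > threshold and i - last_peak >= refractory:
            peaks.append(i)
            last_peak = i
            # Track slow amplitude drift, then set the threshold at half of it.
            peak_level = (1.0 - weight) * peak_level + weight * signal[i]
            threshold = 0.5 * peak_level
    return peaks


# Synthetic test signal: beats every 0.4 s with slowly decaying amplitude,
# plus a small interfering bump halfway between beats.
fs = 100
signal = [0.0] * 500
beat_positions = list(range(20, 500, 40))
for k, p in enumerate(beat_positions):
    amp = 1.0 - 0.03 * k
    signal[p - 1] = signal[p + 1] = 0.5 * amp
    signal[p] = amp
    signal[p + 20] = 0.1

print(detect_peaks(signal, fs, init_threshold=0.5))
```

Because the threshold decays together with the beat amplitude, the later, weaker beats are still detected while the small bumps stay below it.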
Citations: 0
A Multi-Branch Attention-Enhanced Architecture for OSA Detection Using ECG Signals
IF 1.4; Zone 4 (Engineering and Technology); Q4 (ENGINEERING, ELECTRICAL & ELECTRONIC); Pub Date: 2026-01-08; DOI: 10.1049/sil2/5631289
Alireza Cheshmberah, Majid Ziaratban

Obstructive sleep apnea (OSA) is a prevalent and underdiagnosed sleep disorder that can lead to serious cardiovascular and cognitive complications if left untreated. This study presents a novel deep learning architecture based on convolution, LSTM, short-time Fourier transform (STFT), attention, and transformer modules for automated OSA detection using single-lead electrocardiogram (ECG) signals, aiming to improve diagnostic accuracy. The proposed model integrates six parallel branches for feature extraction, combining convolutional layers, recurrent units, STFT, and residual connections to capture multiscale temporal and frequency-domain patterns. A three-path feature refinement stage that combines sequential, convolutional, and transformer encoders forms the second part of the model. Channel attention-based feature fusion modules are employed in both parts to enhance feature relevance and suppress noise. Experimental evaluations on the PhysioNet Apnea-ECG dataset demonstrate that the proposed model achieves superior segment-level classification performance with 91.97% accuracy, 91.28% sensitivity, and 92.41% specificity. These findings suggest that the proposed method offers a robust and scalable solution. Given its small number of parameters, the model could potentially be considered for real-time and wearable-based OSA monitoring applications. All code and the trained model are released at https://github.com/mziaratban/OSA.
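The three reported figures are standard confusion-matrix metrics. As a reminder of how they relate, the sketch below computes them from per-segment labels; the label encoding (1 = apnea segment, 0 = normal segment), the function name, and the toy data are assumptions for illustration.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity (recall on positives), specificity (recall on negatives)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Toy example: 4 apnea segments and 6 normal segments.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Reporting sensitivity and specificity alongside accuracy matters here because apnea and normal segments are rarely balanced in overnight recordings.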

Citations: 0
Anomaly Detection in Gas Turbines Based on Semisupervised Learning Framework
IF 1.4; Zone 4 (Engineering and Technology); Q4 (ENGINEERING, ELECTRICAL & ELECTRONIC); Pub Date: 2026-01-08; DOI: 10.1049/sil2/4814061
Ying Wang, Yunpeng Cao, Shuying Li, Kun Yao

Gas turbines are increasingly widely used. To ensure their safe and stable operation, condition monitoring and anomaly detection are crucial. However, traditional anomaly detection methods for gas turbines often rely solely on a single learning approach. To address this limitation, this paper proposes a novel semisupervised learning framework that combines the large-scale automatic labeling capability of unsupervised learning with the precise classification power of supervised learning. First, an unsupervised learning algorithm is employed to hierarchically label anomalies in real operational data as suspicious anomalies, high-probability anomalies, and actual anomalies. Next, oversampling techniques are applied to address class imbalance by augmenting underrepresented classes in the dataset. Finally, supervised learning methods are used to train models on the labeled samples, and their performance is compared against other machine learning (ML) approaches. Through comparative analysis of multiclass classification evaluation metrics, the feasibility of the proposed semisupervised learning framework is demonstrated, and the optimal monitoring model is identified. The core contribution is a semisupervised learning framework that categorizes operational data into a multitier hierarchy to enable a nuanced early warning mechanism.
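The abstract does not say which oversampling technique is applied. As a minimal sketch of the idea, the code below uses plain random oversampling, duplicating minority-class samples until every class matches the largest one. The function name, class names, and toy data are illustrative assumptions; a synthetic-sample method such as SMOTE would be an alternative.

```python
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen minority samples until all classes are equal in size."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_samples, out_labels = list(samples), list(labels)
    for cls, count in counts.items():
        pool = [s for s, l in zip(samples, labels) if l == cls]
        for _ in range(target - count):
            out_samples.append(rng.choice(pool))
            out_labels.append(cls)
    return out_samples, out_labels


# Toy data: 8 normal points, 3 suspicious anomalies, 1 actual anomaly.
labels = ["normal"] * 8 + ["suspicious"] * 3 + ["actual"] * 1
samples = list(range(len(labels)))  # stand-ins for feature vectors
new_samples, new_labels = random_oversample(samples, labels)
print(Counter(new_labels))
```

Oversampling should be applied to the training split only; duplicating samples before the train/test split would leak copies into the test set.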

Citations: 0
Affine-Image-Driven and Finite-Time Stable Formation Control for Multi-AUV Systems With Unknown Complex Disturbances: A Geometric Perspective
IF 1.4; Zone 4 (Engineering and Technology); Q4 (ENGINEERING, ELECTRICAL & ELECTRONIC); Pub Date: 2026-01-07; DOI: 10.1049/sil2/6243643
Fan Luan, Ping Zhang, Xiasheng Shi

Formation control of multiple autonomous underwater vehicle (multi-AUV) systems is essential for collaborative ocean tasks such as environmental monitoring and resource exploration, but it is hindered by challenges such as limited flexibility in formation transformation, model uncertainties, and unknown ocean current disturbances. To address these issues, this paper presents a finite-time flexible formation control strategy based on the affine image framework for second-order multi-AUV systems with model uncertainties and external disturbances. The proposed strategy incorporates a finite-time disturbance observer to accurately estimate composite disturbances (including model uncertainties and ocean currents) and a sliding mode control (SMC) approach to ensure robust finite-time convergence of the formation. A key innovation is the extension of the affine image framework to second-order dynamics, enabling flexible transformations between affine and rigid formations while requiring only d + 1 leaders (in d-dimensional space) to know the target shape, thus minimizing global information needs. The proposed method achieves finite-time stability and strong robustness against uncertainties, providing a unified solution for flexible formation control in uncertain underwater environments. Numerical simulations validate that the strategy effectively guides AUVs to converge to desired formations within finite time.
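The d + 1 leader requirement has a simple geometric reading: an affine map p = A r + b in d dimensions is fixed by the images of d + 1 points in general position. The sketch below illustrates this for d = 2, recovering A and b from three leaders and applying them to a follower's nominal position. It shows the geometry only, not the paper's observer or sliding mode controller; all names and coordinates are hypothetical.

```python
def affine_from_leaders(nominal, target):
    """Solve p_i = A r_i + b for a 2-D affine map from three leader positions."""
    (r0, r1, r2), (p0, p1, p2) = nominal, target
    # Columns of R and P are leader offsets relative to leader 0.
    r11, r21 = r1[0] - r0[0], r1[1] - r0[1]
    r12, r22 = r2[0] - r0[0], r2[1] - r0[1]
    det = r11 * r22 - r12 * r21
    if abs(det) < 1e-12:
        raise ValueError("leaders must not be collinear")
    p11, p21 = p1[0] - p0[0], p1[1] - p0[1]
    p12, p22 = p2[0] - p0[0], p2[1] - p0[1]
    # A = P * inv(R), with inv(R) = (1/det) * [[r22, -r12], [-r21, r11]].
    a11 = (p11 * r22 - p12 * r21) / det
    a12 = (-p11 * r12 + p12 * r11) / det
    a21 = (p21 * r22 - p22 * r21) / det
    a22 = (-p21 * r12 + p22 * r11) / det
    b = (p0[0] - a11 * r0[0] - a12 * r0[1],
         p0[1] - a21 * r0[0] - a22 * r0[1])
    return ((a11, a12), (a21, a22)), b


def apply_affine(A, b, r):
    """Map a nominal position r to its target position A r + b."""
    return (A[0][0] * r[0] + A[0][1] * r[1] + b[0],
            A[1][0] * r[0] + A[1][1] * r[1] + b[1])


# Nominal formation leaders and their commanded targets (scale by 2, shift by (3, 1)).
leaders_nominal = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
leaders_target = [(3.0, 1.0), (5.0, 1.0), (3.0, 3.0)]
A, b = affine_from_leaders(leaders_nominal, leaders_target)

follower_target = apply_affine(A, b, (1.0, 1.0))
print(A, b, follower_target)
```

Rotation, scaling, and shearing of the whole formation are all affine maps, so the followers' targets follow from the leaders alone without broadcasting the full target shape.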

Citations: 0
Scene-Driven Semantic Alignment for Generative Communication in Underwater Images
IF 1.4 Zone 4 (Engineering & Technology) Q4 ENGINEERING, ELECTRICAL & ELECTRONIC Pub Date: 2026-01-03 DOI: 10.1049/sil2/9480527
Yuetong Li, Zhenyu Jia, Yuyang Peng, Yi Zhu, Fei Yuan

Underwater images serve as one of the most intuitive media for human perception of underwater environments. However, the limited bandwidth and susceptibility to noise in underwater acoustic (UWA) channels pose significant challenges for traditional image encoding and transmission methods, thereby hindering high-quality image reconstruction. Semantic communication aims to shift from pixel-level to semantic-level transmission, enhancing both reliability and efficiency. This paper proposes a scene-guided generative communication method for underwater images with semantic alignment. We decouple and transmit only the essential layout information of underwater scenes. This enables highly efficient compression. At the receiver, we employ a graph convolutional network (GCN) to correct layout distortions and a context-aware diffusion model to generate realistic underwater images that preserve high semantic fidelity to the original. Experimental results demonstrate that, compared to pixel-fidelity communication and other generative communication approaches, our method consistently achieves superior image reconstruction quality even under adverse channel conditions, such as extremely low signal-to-noise ratios (SNRs), and exhibits significant advantages in downstream tasks.

Citations: 0
Autonomous Online Self-Adaptive Stereo Network
IF 1.4 Zone 4 (Engineering & Technology) Q4 ENGINEERING, ELECTRICAL & ELECTRONIC Pub Date: 2025-12-22 DOI: 10.1049/sil2/8808531
Zihan Jia, Xiao Yang, Zheng Zhang, Yige Hu

Existing end-to-end stereo matching networks face significant deployment challenges due to their reliance on large training datasets and the limited ability of synthetic data to represent real-world scenarios, leading to a severe domain shift. To address these challenges, this paper proposes a novel stereo matching network along with an efficient fine-tuning strategy. First, a lightweight modular stereo matching network is proposed, which incorporates domain knowledge and optimizes specific modules to enhance generalization. Second, a confidence estimation network is developed to generate occlusion masks, which filter out erroneous self-supervised loss terms. Then, the Mann–Kendall (MK) trend detection method is applied to the loss values of the most recent few frames during online adaptation, both to measure how well the model has adapted to the scene and to control the model's online operation mode. Finally, online and offline experiments on multiple datasets demonstrate the competitive performance of the method.
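
The Mann–Kendall step can be sketched as follows. This is the standard MK statistic with the normal approximation and no tie correction, applied to a short window of loss values. How the paper maps the test result to an operation mode is not stated in the abstract, so the `loss_still_decreasing` rule and the 1.96 threshold below are assumptions for illustration only.

```python
import math


def mann_kendall(values):
    """Return (S, Z) for the Mann-Kendall trend test (no tie correction)."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least 3 samples")
    # S counts later-minus-earlier sign agreements over all ordered pairs.
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = values[j] - values[i]
            s += (diff > 0) - (diff < 0)
    # Variance of S under the no-trend hypothesis, assuming no ties.
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    # Continuity-corrected standard normal score.
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    return s, z


def loss_still_decreasing(recent_losses, z_crit=1.96):
    """Hypothetical rule: keep adapting while the loss trends significantly down."""
    _, z = mann_kendall(recent_losses)
    return z < -z_crit


# Example windows of per-frame losses (made-up numbers).
adapting = loss_still_decreasing([5.0, 4.0, 3.0, 2.0, 1.0, 0.9, 0.8, 0.7])
settled = loss_still_decreasing([1.0, 1.0, 1.0, 1.0, 1.0])
```

A rank-based test like this depends only on the ordering of the losses, not their scale, which makes it a reasonable fit for comparing trends across scenes with very different loss magnitudes. The normal approximation is rough for very short windows.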

Citations: 0
6G-Enabled AI–Driven Technology for Real-Time Ecosystem Monitoring and Analysis in Mining Regions
IF 1.4 Zone 4 (Engineering & Technology) Q4 ENGINEERING, ELECTRICAL & ELECTRONIC Pub Date: 2025-12-21 DOI: 10.1049/sil2/7901466
Meijing Zhang, Venkateshwaran Bakthavachalam, Sekar Kidambi Raju, Raj Anand Sundaramoorthy, Ganesh Karthikeyan Varadarajan

Real-time environmental monitoring of mining activities is crucial for managing the ecological balance of natural resources effectively. This research presents five hybrid machine learning models for an integrated framework for real-time ecosystem monitoring in mining regions: XGBC + neural networks (NNs), autoencoder + isolation forest, SVM + long short-term memory (LSTM), graph NN + random forest (GNN-RF), and transformer + CatBoost. The GNN-RF model performed best, with a training accuracy of 99.12% and a testing accuracy of 93.81%. Five optimizers were tested on the GNN-RF model: momentumized, adaptive, dual-averaged gradient (Madgrad), AdaHessian, layer-wise adaptive rate scaling (LARS), sharpness-aware minimization (SAM), and LION. With Madgrad, the GNN-RF model achieved the best training accuracy of 99.34% while maintaining a testing accuracy of 93.81%, showcasing high generalization capability and surpassing the other configurations. This Madgrad-optimized GNN-RF configuration combines the strengths of graph NNs in recognizing spatial and relational data patterns with the ability of RFs to model complex feature interactions, improving performance on streaming monitoring tasks. GNN-RF with Madgrad is therefore the proposed model for scalable, precision-centric anomaly detection in mining regions, intended to help maintain the balance of ecosystem elements in a rapidly changing ecological environment. This research highlights the promise of combining advanced optimization methods with a hybrid machine learning framework for mitigating environmental issues in developed industrial sectors, forming a basis for progress in sustainable mining techniques.

Citations: 0