
Latest Publications in Digital Signal Processing

ATPL-VIO: Adaptive point and line feature fusion for visual-inertial SLAM in real-world environments
IF 3.0 | CAS Zone 3 (Engineering & Technology) | JCR Q2 (Engineering, Electrical & Electronic) | Pub Date: 2025-12-03 | DOI: 10.1016/j.dsp.2025.105818
Peichao Cong , Yangang Zhu , Murong Deng , Yixuan Xiao , Xianquan Wan , Xin Zhang
In the intelligent development of mobile platforms such as drones and service robots, simultaneous localization and mapping (SLAM) in real-world environments provides core technical support. Although existing point-line fusion visual-inertial SLAM can improve localization performance in low-texture environments, the high computational cost of feature extraction limits its real-time performance. To address this issue, this paper proposes an Adaptive Point-Line Feature Fusion Visual-Inertial SLAM algorithm (ATPL-VIO) that aims to enhance real-time performance and localization accuracy simultaneously. First, the scene is partitioned using the number of feature points and the inter-frame velocity; the front-end line-feature fusion method is dynamically adjusted and the frequency of line-feature extraction is controlled to strengthen data association, system robustness, and real-time performance. Second, based on the characteristics of the scene, high-quality line features are selected by segment length and direction to participate in nonlinear optimization, achieving high-precision localization with a small number of effective features while reducing computation time. Finally, time-dimension data association is enhanced by combining visual and IMU data within a sliding window, further reducing localization errors. Experimental results show that, compared to PL-VIO, the proposed algorithm reduces the maximum error by 28.26%, reduces the root mean square error by 20.49%, and improves real-time performance by 12.60% while using only 23.85% of the line-segment features. Additionally, both indoor and outdoor experiments demonstrate the advantages of the proposed algorithm in localization accuracy and mapping performance.
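The two front-end ideas in the abstract, gating line extraction by point count and inter-frame velocity, and filtering line segments by length and direction, can be illustrated with a minimal sketch. All thresholds and function names below are hypothetical, not taken from the paper:

```python
import numpy as np

def should_extract_lines(num_points, inter_frame_velocity,
                         point_thresh=80, vel_thresh=0.5):
    """Decide whether to run line-feature extraction on this frame.

    Hypothetical rule in the spirit of ATPL-VIO: when point features are
    scarce (low texture) or motion is fast (weak data association),
    line features are worth their extraction cost.
    """
    low_texture = num_points < point_thresh
    fast_motion = inter_frame_velocity > vel_thresh
    return low_texture or fast_motion

def select_quality_lines(segments, min_length=30.0, max_angle_dev=np.pi / 6):
    """Keep only long segments with near-horizontal/vertical direction.

    segments: (N, 4) array of endpoints (x1, y1, x2, y2).
    """
    d = segments[:, 2:] - segments[:, :2]
    lengths = np.hypot(d[:, 0], d[:, 1])
    angles = np.abs(np.arctan2(d[:, 1], d[:, 0]))        # in [0, pi]
    dev = np.minimum.reduce([angles,                      # deviation from
                             np.abs(angles - np.pi / 2),  # 0, 90, 180 deg
                             np.abs(angles - np.pi)])
    return segments[(lengths >= min_length) & (dev <= max_angle_dev)]
```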
Citations: 0
The inertial navigation attitude estimation algorithm based on interactive fusion
IF 3.0 | CAS Zone 3 (Engineering & Technology) | JCR Q2 (Engineering, Electrical & Electronic) | Pub Date: 2025-12-03 | DOI: 10.1016/j.dsp.2025.105817
Meiying Qiao , Kefei Gao , Haotian Han , Yunqiang Qiu
This paper proposes an interactively fused cubature Kalman filter for attitude estimation, addressing the accuracy degradation of inertial navigation systems under simultaneous process- and measurement-model mismatch. The algorithm designs two parallel sub-filters: an adaptive sub-filter and a robust sub-filter. The adaptive sub-filter incorporates strong-tracking theory to adaptively inflate the predicted state covariance, enhancing the tracking of abrupt state changes; it also uses a variational Bayesian approach to estimate the process noise covariance in real time, mitigating the estimation bias caused by process-model mismatch. The robust sub-filter tackles measurement-model mismatch with a minimum Cauchy kernel loss criterion, which suppresses the influence of outliers by inflating the measurement noise covariance, thus improving system robustness. The two sub-filters are fused within an interactive framework, and a method based on likelihood functions and model probabilities updates the Markov state-transition probability matrix, keeping it aligned with the better-matched sub-filter and improving estimation accuracy when both models mismatch. Simulations and elevator experiments verify the proposed algorithm's superior performance and solution accuracy under model-mismatch conditions.
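The two inflation mechanisms the abstract describes, a strong-tracking fading factor on the predicted covariance and a Cauchy-kernel reweighting of the measurement noise, can be sketched in a single Kalman-style update. The fading factor and Cauchy weight below are standard textbook forms, not the paper's exact equations, and the filter is a plain linear update rather than the cubature filter itself:

```python
import numpy as np

def robust_adaptive_update(x_pred, P_pred, z, H, R, gamma=2.0, lam=1.2):
    """One measurement update with two hedged safeguards.

    lam   : strong-tracking-style fading factor, inflates predicted covariance
    gamma : Cauchy kernel scale; large residuals inflate R (robust idea)
    Illustrative sketch only, not the paper's interactively fused filter.
    """
    P_pred = lam * P_pred                        # inflate prediction covariance
    y = z - H @ x_pred                           # innovation
    S = H @ P_pred @ H.T + R
    # Cauchy-kernel weight in (0, 1]: small for outliers -> inflate R by 1/w
    m2 = float(y @ np.linalg.solve(S, y))        # squared Mahalanobis distance
    w = 1.0 / (1.0 + m2 / gamma**2)
    S = H @ P_pred @ H.T + R / w                 # robustified innovation cov.
    K = P_pred @ H.T @ np.linalg.inv(S)
    x_new = x_pred + K @ y
    P_new = (np.eye(len(x_pred)) - K @ H) @ P_pred
    return x_new, P_new
```

In an interacting-multiple-model arrangement, two such sub-filters would run in parallel and their outputs be mixed by model probabilities, which is the fusion step the abstract refers to.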
Citations: 0
Efficient learned image compression with dual-space aggregation transformer
IF 3.0 | CAS Zone 3 (Engineering & Technology) | JCR Q2 (Engineering, Electrical & Electronic) | Pub Date: 2025-12-02 | DOI: 10.1016/j.dsp.2025.105797
Kai Hu , Peiguang Jing , Renhe Liu , Bo Wei , Yu Liu
In recent years, learned image compression (LIC) techniques have achieved significant success by leveraging Convolutional Neural Networks (CNNs) and Transformer architectures. Because CNNs and Transformers have complementary strengths in feature representation, the two have been widely combined in LIC using either cascaded or parallel collaborative structures. However, these collaborative approaches often lead to suboptimal rate-distortion performance or increased model complexity. To address this issue, this paper introduces a lightweight Dual-Space Aggregation Transformer (DSAT) module, which integrates a multi-scale convolutional block with an 11 × 11 large-kernel convolution into the Transformer block to adaptively aggregate local and global context features. Specifically, a gate perceptron (GP) block replaces the multi-layer perceptrons (MLPs), enabling adaptive focus on important features. Building on the DSAT module, we design an efficient learned image compression method that achieves superior rate-distortion performance at lower model complexity. To enhance compression efficiency, we propose a Mixed Channel-Spatial Context (MCSC) entropy model, which accurately predicts latent symbols to reduce redundancy in both spatial and channel dimensions. Furthermore, to mitigate the impact of missing frequency information on coding performance, we introduce a novel non-local frequency loss into the LIC task and jointly optimize the method in both the frequency and pixel domains. Experimental results on prevalent datasets demonstrate that our approach delivers optimal rate-distortion performance. In particular, experiments on the Kodak dataset show a BD-rate gain of 9.52% over VVC, with lower complexity.
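The gate-perceptron idea, replacing a plain two-layer MLP with an elementwise gated product, can be sketched as below. The hidden width, the GELU gate, and the weight names are assumptions; the abstract does not specify the GP block's internals:

```python
import numpy as np

def gelu(x):
    # tanh approximation of GELU
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2 / np.pi) * (x + 0.044715 * x**3)))

def gate_perceptron(x, W_v, W_g, W_o):
    """Gated replacement for an MLP: out = W_o @ (value * gate).

    x: (d,) token feature; W_v, W_g: (h, d); W_o: (d, h).
    The elementwise gate lets the block adaptively emphasise features,
    which is the role the abstract assigns to the GP block.
    """
    value = W_v @ x                 # linear "value" path
    gate = gelu(W_g @ x)            # nonlinear gate path
    return W_o @ (value * gate)     # fuse and project back to d dims

# toy usage with random weights
rng = np.random.default_rng(0)
d, h = 8, 16
out = gate_perceptron(rng.standard_normal(d),
                      rng.standard_normal((h, d)),
                      rng.standard_normal((h, d)),
                      rng.standard_normal((d, h)))
assert out.shape == (d,)
```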
Citations: 0
FP-CLIP: Foreground-panorama prompt learning for zero-shot anomaly detection
IF 3.0 | CAS Zone 3 (Engineering & Technology) | JCR Q2 (Engineering, Electrical & Electronic) | Pub Date: 2025-12-02 | DOI: 10.1016/j.dsp.2025.105798
Ao Lu , Jincun Liu , Yaoguang Wei , Yan Meng , Dong An
Anomaly detection and segmentation play pivotal roles in fields such as industrial inspection, medical imaging, and agricultural monitoring. These technologies are essential for detailed analysis of complex systems, such as pinpointing flaws in production processes, recognizing irregularities in medical images, or tracking dynamic changes in environmental conditions. A significant challenge for existing zero-shot anomaly segmentation approaches is their high false-positive rate, often caused by confusing anomalies in the background with those relevant to the primary object of interest. To tackle this issue, we propose an advanced zero-shot anomaly segmentation framework that efficiently differentiates foreground (subject-specific) features from background features, minimizing false alarms caused by irrelevant background anomalies. Furthermore, our method integrates a multi-modal large language model to better understand the relationship between the foreground and the panorama, enhancing the precision of anomaly identification. The approach uses a learnable prompting mechanism that incorporates subject, panorama, normal, and abnormal prompts, encoded through a CLIP-based text encoder to improve anomaly discernment. Experiments across several benchmark datasets show state-of-the-art performance in both anomaly detection and segmentation, highlighting the framework's effectiveness and adaptability in zero-shot scenarios. This research contributes a potent tool for applications spanning agriculture, healthcare, industrial quality control, and beyond.
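A common zero-shot recipe behind such CLIP-based methods scores each patch by its similarity to "normal" versus "abnormal" prompt embeddings; the foreground-background separation then simply masks out background patches. The sketch below follows that recipe under stated assumptions (a two-class softmax, a precomputed foreground mask); it is not necessarily FP-CLIP's exact formulation:

```python
import numpy as np

def anomaly_map(patch_emb, e_normal, e_abnormal, fg_mask, tau=0.07):
    """Per-patch anomaly probability, suppressed outside the foreground.

    patch_emb            : (N, d) L2-normalised patch embeddings
    e_normal, e_abnormal : (d,) L2-normalised text-prompt embeddings
    fg_mask              : (N,) 1 for subject patches, 0 for background
    Mirrors the abstract's idea of cutting background-induced false
    positives by separating foreground from panorama/background.
    """
    s_n = patch_emb @ e_normal / tau           # similarity to "normal" prompt
    s_a = patch_emb @ e_abnormal / tau         # similarity to "abnormal" prompt
    p_abnormal = np.exp(s_a) / (np.exp(s_a) + np.exp(s_n))  # 2-way softmax
    return p_abnormal * fg_mask                # background anomalies ignored
```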
Citations: 0
MDANet: Apple leaf disease detection network based on multi-scale dynamic adaptation and feature enhancement
IF 3.0 | CAS Zone 3 (Engineering & Technology) | JCR Q2 (Engineering, Electrical & Electronic) | Pub Date: 2025-12-02 | DOI: 10.1016/j.dsp.2025.105804
Chunman Yan, Jiuli Wang
Accurate detection of apple leaf diseases is crucial for ensuring the yield and quality of the apple industry. However, detection in natural environments still faces challenges such as variable lesion scales and irregular morphologies, difficulty distinguishing similar diseases, and information loss under complex background interference. To address these challenges, this paper proposes an improved Multi-scale Dynamic Adaptive Detection Network (MDANet). First, an Adaptive Multi-scale Feature Enhancement (AMFE) module is introduced, which constructs progressive receptive fields through asymmetric convolution chains and effectively captures lesion features at different scales by combining adaptive feature-fusion mechanisms. Second, a Dynamic Feature Convolution Module (DFCM) is integrated, which dynamically generates convolution-kernel parameters for each input sample through a routing network and adaptively adjusts feature-extraction strategies based on subtle texture differences among diseases, significantly enhancing the discriminative capability for similar diseases. Finally, an Adaptive Feature Fusion Downsampling (AFFD) module is introduced, which combines the advantages of average pooling and max pooling to highlight edge texture details while preserving the overall statistical characteristics of lesions, effectively mitigating information loss under complex backgrounds. Experimental results demonstrate that on the Apple Leaf Disease Dataset (ALDD), MDANet achieves 85.4% precision, 77.3% recall, and 83.6% mAP@0.5 with 3.1 M parameters, 6.2 G FLOPs, and 104 FPS, showcasing its strong generalization and suitability for complex apple leaf disease detection. Cross-dataset experiments further validate the model's robustness and generalization capability, providing an effective solution for disease control in smart agriculture.
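The AFFD idea of blending average pooling (global statistics) with max pooling (edge/texture peaks) can be shown in a few lines. A fixed blend weight `alpha` stands in for whatever fusion the real module learns; the function name and the 2x stride are assumptions:

```python
import numpy as np

def affd_downsample(x, alpha=0.5):
    """Hedged sketch of fused average/max pooling for 2x downsampling.

    x: (H, W) feature map with even H, W. Average pooling keeps overall
    lesion statistics; max pooling keeps edge and texture peaks; alpha
    blends the two. The real AFFD module learns this fusion adaptively.
    """
    h, w = x.shape
    blocks = x.reshape(h // 2, 2, w // 2, 2)   # 2x2 non-overlapping blocks
    avg = blocks.mean(axis=(1, 3))
    mx = blocks.max(axis=(1, 3))
    return alpha * avg + (1 - alpha) * mx
```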
Citations: 0
Signing EEG-based biometric authentication system using multivariate Fourier-Bessel series expansion-based entropies
IF 3.0 | CAS Zone 3 (Engineering & Technology) | JCR Q2 (Engineering, Electrical & Electronic) | Pub Date: 2025-12-01 | DOI: 10.1016/j.dsp.2025.105803
Sai Pranavi Kamaraju, Kritiprasanna Das, Ram Bilas Pachori
Data privacy and security are serious concerns today. A broad range of biometric identification systems relies on physiological traits, including fingerprints, iris scans, and facial recognition, for authentication, while conventional methods such as signatures and passwords can easily be spoofed by an unauthorized person. We propose a framework for biometric identification using electroencephalogram (EEG) signals recorded during signing, since one individual cannot replicate another individual's signals. Multivariate variational mode decomposition (MVMD) is applied to multi-channel EEG signals to extract properly aligned oscillatory modes, and features are extracted using Fourier-Bessel series expansion (FBSE)-based entropies. We extend the univariate entropy to multi-channel signals, yielding a multivariate FBSE-based entropy (M-FBSE-E). The M-FBSE-E features are classified using machine learning classifiers for biometric identification. We collected a database from 35 participants to validate the proposed model, and the features are evaluated with machine learning classifiers to distinguish genuine from forged signatures. The proposed method achieves 93.4 ± 7.0% accuracy in subject-wise and 89.4 ± 1.9% in subject-independent settings. Experimental results show the effectiveness of the proposed framework for EEG-based biometric identification.
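A univariate FBSE-based spectral entropy, the building block the paper extends to multiple channels, can be sketched as follows. The zero-order FBSE coefficient formula is the standard discrete form; the coefficient count and the use of a plain Shannon entropy over normalised coefficient energies are assumptions, as implementations vary:

```python
import numpy as np
from scipy.special import jn_zeros, jv

def fbse_entropy(x, num_coeffs=None):
    """Hedged sketch of an FBSE-based spectral entropy for a 1-D signal.

    Uses the standard zero-order FBSE coefficients
        C_m = (2 / (N^2 J_1(l_m)^2)) * sum_n n * x[n] * J_0(l_m * n / N),
    where l_m are the positive roots of J_0, then the Shannon entropy of
    the normalised coefficient energies.
    """
    x = np.asarray(x, dtype=float)
    N = len(x)
    M = num_coeffs or N
    roots = jn_zeros(0, M)                        # first M roots of J_0
    n = np.arange(1, N + 1)
    basis = jv(0, np.outer(roots, n) / N)         # (M, N) Bessel basis
    coeffs = 2.0 * (basis @ (n * x)) / (N**2 * jv(1, roots) ** 2)
    energy = coeffs**2
    p = energy / energy.sum()
    p = p[p > 0]
    return -np.sum(p * np.log(p))
```

A multivariate extension along the abstract's lines would aggregate such entropies over the MVMD-aligned modes of all EEG channels.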
Citations: 0
PCT-ViT: A vision transformer incorporating fine-grained perception enhancement and counterfactual token selection
IF 3.0 | CAS Zone 3 (Engineering & Technology) | JCR Q2 (Engineering, Electrical & Electronic) | Pub Date: 2025-11-30 | DOI: 10.1016/j.dsp.2025.105796
Zeng Gao , Jikai Lu , Le Zhao , Qian Li , Nanhua Chen , Peng Li
Fine-Grained Visual Classification (FGVC) is characterized by small inter-class variance and large intra-class variance, placing high demands on key-region localization, discriminative feature representation, and spatial-structure perception. Although Transformers exhibit strong global modeling capabilities, they still face challenges in FGVC tasks, such as inaccurate focus on key regions, rigid positional encoding, and interference from redundant tokens. To address these issues, we propose PCT-ViT, a Vision Transformer framework that integrates fine-grained perception enhancement with counterfactual discriminative token re-estimation, systematically improving the model's awareness of discriminative regions and its structural adaptability. PCT-ViT consists of three modules: a Dual-path Semantic Perception Module (DPM), Dynamic Position Encoding (DPE), and Counterfactual Token Selection (CTS). DPM integrates channel context semantics and spatial responses to guide the model toward potential key regions in the early stages of feature extraction. DPE introduces adaptive positional priors via content-aware transformations to enhance robustness against geometric deformations and positional perturbations. CTS leverages hierarchical attention fusion and perturbation-sensitivity analysis to dynamically select the most discriminative tokens, effectively suppressing redundant attention diffusion. Experimental results on the CUB-200-2011, Stanford Dogs, NABirds, and Food-101 benchmarks demonstrate that PCT-ViT achieves superior classification accuracy compared to various state-of-the-art CNN- and Transformer-based methods, exhibiting strong competitiveness.
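One way to make the CTS idea concrete is to score each token by fused attention times a counterfactual term: how much the pooled class evidence drops when that token is removed. The sketch below is a loose illustration under that assumption; the pooling, the evidence direction `w`, and the score combination are all hypothetical, not the paper's design:

```python
import numpy as np

def counterfactual_select(tokens, attn_maps, w, k):
    """Hedged sketch of counterfactual token selection (CTS).

    tokens    : (N, d) patch tokens
    attn_maps : (L, N) CLS-to-patch attention vectors from L layers
    w         : (d,) a class-evidence direction (e.g. classifier weights)
    Returns the k tokens whose removal most changes the pooled evidence,
    weighted by hierarchically fused attention.
    """
    attn = np.asarray(attn_maps).mean(axis=0)      # hierarchical fusion
    evidence_full = tokens.mean(axis=0) @ w        # pooled class evidence
    N = len(tokens)
    drop = np.empty(N)
    for i in range(N):
        pooled = (tokens.sum(axis=0) - tokens[i]) / (N - 1)
        drop[i] = evidence_full - pooled @ w       # counterfactual effect
    score = attn * np.abs(drop)
    keep = np.argsort(score)[-k:]                  # top-k discriminative tokens
    return tokens[keep], keep
```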
Citations: 0
A lightweight full-resolution cascade network for vessel segmentation
IF 3.0 | CAS Zone 3 (Engineering & Technology) | JCR Q2 (Engineering, Electrical & Electronic) | Pub Date: 2025-11-30 | DOI: 10.1016/j.dsp.2025.105765
Shangdong Liu , Mingjie Yin , Ruyang Liu , Lincen Jiang , Jianwei Liu , Yimu Ji , Chen Wang , Chenxi Zhu , Zeng Chen , Ziyi Wang
Vessel segmentation is crucial for medical assessment and operative planning in imaging. In recent years, U-Net architectures have been widely used for image segmentation tasks; however, they often struggle to accurately segment small and low-visibility vessels due to the inherent loss of spatial information within their encoder-decoder framework. To address this issue, we propose LFRC-Net, a lightweight full-resolution cascade network designed for precise vessel segmentation. LFRC-Net includes two innovative components: an improved ConvNeXt convolution and a cascade feature enhancement module (CFEM). The enhanced ConvNeXt convolution incorporates a recurrent mechanism with Efficient Localization Attention (ELA), significantly improving feature extraction and enabling accurate detection of small-caliber, low-contrast vessels. Additionally, the CFEM integrates multiple feature maps to greatly enhance feature representation, leading to better segmentation accuracy. Evaluations on the CHASE_DB1, DCA1, ROSSA, and OCTA-3M datasets indicate that LFRC-Net consistently outperforms existing mainstream methods. Specifically, the Dice coefficients achieved on the CHASE_DB1 and DCA1 datasets are 80.99% and 79.72%, respectively, with corresponding accuracies of 97.52% and 97.93%. The proposed model performs well across different datasets, demonstrating its robustness.
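For reference, the Dice coefficient reported above is the standard overlap metric for segmentation masks; a minimal implementation:

```python
import numpy as np

def dice_coefficient(pred, target, eps=1e-7):
    """Dice coefficient between two binary masks of the same shape.

    Dice = 2|P & T| / (|P| + |T|); 1.0 means perfect overlap. The eps
    term keeps the ratio defined when both masks are empty.
    """
    pred = np.asarray(pred).astype(bool)
    target = np.asarray(target).astype(bool)
    inter = np.logical_and(pred, target).sum()
    return (2.0 * inter + eps) / (pred.sum() + target.sum() + eps)
```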
Citations: 0
Finding ground-based RFI sources based on single-pass SAR raw data: Extraction, estimation and localization
IF 3.0 | CAS Zone 3 (Engineering & Technology) | JCR Q2 (Engineering, Electrical & Electronic) | Pub Date: 2025-11-30 | DOI: 10.1016/j.dsp.2025.105802
Zewen Fu, Yufeng Sun, Zhengwei Guo, Ning Li
Ground-based radio frequency interference (RFI) sources significantly degrade the quality of synthetic aperture radar (SAR) satellite imagery. Given the increasingly complex electromagnetic environment, the need for electromagnetic situational awareness has intensified. This article proposes an approach for finding ground-based RFI sources using single-pass SAR raw data, which can conveniently and effectively acquire the location information of these sources. Specifically, the RFI signal is extracted from the contaminated SAR raw data based on its differences from the SAR signal. Subsequently, parameters of the RFI signal, including the center frequency and modulation frequency, are estimated using an improved Chinese Remainder Theorem (ICRT). Finally, the location of the RFI source is determined by combining the parameter-estimation results with a localization index (LI). The localization accuracy of the proposed approach is validated through simulation experiments. Based on single-pass Sentinel-1 raw data measured over Henan, China, the localization results place the RFI source near Xinzheng International Airport. Compared to traditional methods, the proposed approach significantly improves localization accuracy and adaptability, enabling the SAR system to achieve high-precision situational awareness.
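The parameter estimation rests on the Chinese Remainder Theorem, which recovers a large unknown (such as an aliased frequency) from its residues modulo several coprime values. The classical reconstruction the ICRT builds on is shown below; the paper's improved version additionally tolerates noisy remainders, which this textbook form does not:

```python
def crt(remainders, moduli):
    """Classical Chinese Remainder Theorem for pairwise-coprime moduli.

    Recovers x mod prod(moduli) from the residues x mod m_i. In
    undersampled frequency estimation the m_i play the role of sampling
    rates and the r_i of aliased frequency measurements.
    """
    from math import prod
    M = prod(moduli)
    x = 0
    for r, m in zip(remainders, moduli):
        Mi = M // m
        x += r * Mi * pow(Mi, -1, m)   # pow(., -1, m): modular inverse
    return x % M

# toy usage: recover 23 from its residues mod 5 and 7
assert crt([23 % 5, 23 % 7], [5, 7]) == 23
```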
Citations: 0
Group sparse and super resolution time-frequency-based method for emotion recognition
IF 3.0 | CAS Zone 3 (Engineering & Technology) | JCR Q2 (Engineering, Electrical & Electronic) | Pub Date: 2025-11-29 | DOI: 10.1016/j.dsp.2025.105761
Amit Kumar Dwivedi, Om Prakash Verma, Sachin Taran
Emotion identification in human-computer interaction and psychological evaluation depends on accurate electroencephalogram (EEG) data measurement and interpretation. However, EEG data frequently suffer from noise contamination, which degrades measurement accuracy and signal integrity. This paper proposes a novel framework for improving EEG signals that uses Group Sparse Mode Decomposition in combination with the Bhattacharyya distance to efficiently remove noise. The denoised signals are then transformed using the Superlet Transform (SLT) and Adaptive Superlet Transform (ASLT) to create detailed time-frequency images that support precise EEG measurement and the identification of significant features. The proposed framework improves signal clarity, reduces measurement uncertainty, and enhances the reliability of EEG-based emotion assessment. Experimental results validate the method's effectiveness, demonstrating improved signal-to-noise ratio (SNR), feature separability, and measurement consistency across multiple datasets. The study also proposes a state-of-the-art super-resolution neural network (SRNET), trained on time-frequency images generated from the Superlet and Adaptive Superlet Transform representations of EEG signals, which effectively captures subtle spectral and temporal patterns for emotion identification. SRNET surpasses conventional transfer-learning models, including VGG-16, GoogleNet, and AlexNet, achieving an accuracy of 99.63% while significantly reducing training time.
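The Bhattacharyya distance used for noise screening has a simple closed form for univariate Gaussians; applying it to rank decomposed modes (keep modes whose distribution separates well from a noise reference) is an assumed usage consistent with the abstract. The formula itself is standard:

```python
import numpy as np

def bhattacharyya_gaussian(mu1, var1, mu2, var2):
    """Bhattacharyya distance between two univariate Gaussians.

    Closed form:
        BD = 0.25*ln(0.25*(v1/v2 + v2/v1 + 2)) + 0.25*(m1-m2)^2/(v1+v2)
    Larger BD means the two distributions are easier to separate, e.g.
    a decomposed EEG mode versus a noise reference.
    """
    term_var = 0.25 * np.log(0.25 * (var1 / var2 + var2 / var1 + 2.0))
    term_mean = 0.25 * (mu1 - mu2) ** 2 / (var1 + var2)
    return term_var + term_mean
```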
Citations: 0