Digital Signal Processing: Latest Articles, Page 8

A multi-scale dual-decoder autoencoder model for domain-shift machine sound anomaly detection

IF 2.9 | Zone 3 (Engineering & Technology) | Q2 ENGINEERING, ELECTRICAL & ELECTRONIC

Digital Signal Processing

Pub Date : 2024-10-11 DOI: 10.1016/j.dsp.2024.104813

Shengbing Chen, Yong Sun, Junjie Wang, Mengyuan Wan, Mengyuan Liu, Xiaofan Li

Anomaly detection through machine sounds plays a crucial role in industrial automation because of its flexibility and real-time response. In real-world scenarios, however, machine anomalies occur rarely, which makes it difficult to collect anomalous sound data under varied operating conditions. Moreover, operating conditions and environmental noise can cause distribution differences in the collected sound data, leading to domain shift. To address these problems, we propose an unsupervised multi-scale dual-decoder autoencoder (MS-D2AE) network for anomalous sound detection. The MS-D2AE model consists of residual layers, an encoder, and two decoders. The model fuses fine-grained information from sound features through the Multi-scale Feature Fusion Module (MTSFFM), enabling it to learn features at multiple scales. A residual layer composed of a single MTSFFM connects the encoder's input directly to the intermediate results, further enhancing information transmission. In addition to the reconstruction error, the dual-decoder autoencoder structure uses a similarity error computed between the outputs of the two decoders, which encourages the model to reconstruct the feature data more accurately during training and thus to learn a more comprehensive representation of normal data. Additionally, to mitigate the impact of domain shift on model performance, we design a feature domain mixing method that blends sound features from the source and target domains to increase feature diversity and improve generalization. Finally, we verify the effectiveness of the method on the DCASE 2023 Challenge Task 2 and DCASE 2022 Challenge Task 2 datasets.
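The two ideas in this abstract that lend themselves to a short illustration are the dual-decoder objective and the feature domain mixing. The sketch below is only a guess at their general shape: the abstract does not give the loss or the mixing rule, so the use of mean squared error, the `sim_weight` parameter, and the convex combination controlled by `lam` are all assumptions, and the function names are hypothetical.

```python
def mse(a, b):
    """Mean squared error between two equal-length feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def dual_decoder_loss(x, out1, out2, sim_weight=1.0):
    """Reconstruction error of both decoders plus a decoder-agreement term.

    Assumption: both terms are MSE and are combined by a simple weighted sum.
    """
    reconstruction = mse(x, out1) + mse(x, out2)
    similarity = mse(out1, out2)
    return reconstruction + sim_weight * similarity


def mix_domains(source_feat, target_feat, lam):
    """Blend a source-domain and a target-domain feature vector.

    Assumption: a mixup-style convex combination with weight lam in [0, 1].
    """
    return [lam * s + (1.0 - lam) * t for s, t in zip(source_feat, target_feat)]
```

At test time, a score such as `dual_decoder_loss(x, out1, out2)` could serve as the anomaly score, with larger values indicating inputs the model reconstructs poorly; whether the paper scores anomalies this way is not stated in the abstract.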

Volume 156, Article 104813

Citations: 0

TAG-fusion: Two-stage attention guided multi-modal fusion network for semantic segmentation

IF 2.9 | Zone 3 (Engineering & Technology) | Q2 ENGINEERING, ELECTRICAL & ELECTRONIC

Digital Signal Processing

Pub Date : 2024-10-11 DOI: 10.1016/j.dsp.2024.104807

Zhizhou Zhang, Wenwu Wang, Lei Zhu, Zhibin Tang

In current research, leveraging auxiliary modalities such as depth or point cloud information to improve RGB semantic segmentation has shown significant potential. However, existing methods mainly use convolutional modules to aggregate features from auxiliary modalities and therefore do not sufficiently exploit long-range dependencies. Moreover, fusion strategies are typically limited to a single approach. In this paper, we propose a transformer-based multimodal fusion framework that makes better use of auxiliary modalities to enhance semantic segmentation results. Specifically, we employ a dual-stream architecture that extracts features from the RGB and auxiliary modalities separately, and we incorporate both early fusion and deep feature fusion. At each layer, we introduce mixed attention mechanisms that use features from the other modality to guide and enhance the current modality's features before propagating them to the next stage of feature extraction. After features have been extracted from the different modalities, we apply an enhanced cross-attention mechanism for feature interaction, followed by channel fusion to obtain the final semantic features. We then supervise the RGB stream, the auxiliary stream, and the fusion stream separately to facilitate the learning of representations for the different modalities. The experimental results demonstrate that our framework performs well across diverse modalities. Specifically, our approach achieves state-of-the-art results on the NYU Depth V2, SUN-RGBD, DELIVER and MFNet datasets.
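The feature interaction step can be pictured with plain scaled dot-product cross-attention, in which queries come from one modality and keys and values from the other. This is a minimal sketch of the standard mechanism only: the abstract does not describe what makes the paper's cross-attention "enhanced", and the learned projection matrices, multiple heads, and channel fusion of a real transformer are omitted here.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Each query (e.g. an RGB token) attends over tokens of the other modality.

    queries: list of d-dim vectors; keys: list of d-dim vectors;
    values: list of vectors, one per key. Returns one output vector per query.
    """
    d = len(keys[0])
    outputs = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        outputs.append([
            sum(w * v[c] for w, v in zip(weights, values))
            for c in range(len(values[0]))
        ])
    return outputs
```

For example, `cross_attention(rgb_tokens, depth_tokens, depth_tokens)` would let every RGB token gather information from all depth tokens, which is the long-range dependency that convolutional aggregation lacks; the variable names are illustrative.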

Volume 156, Article 104807

Citations: 0

Large-scale multi-view spectral clustering based on two-stage well-distributed anchor selection

IF 2.9 | Zone 3 (Engineering & Technology) | Q2 ENGINEERING, ELECTRICAL & ELECTRONIC

Digital Signal Processing

Pub Date : 2024-10-11 DOI: 10.1016/j.dsp.2024.104815

Xinran Cheng, Ziyue Tang, Xinmu Qi, Xinyi Qiang, Huamei Xi, Xia Ji

Spectral clustering has attracted much attention because of its good clustering performance, but its high computational cost makes it difficult to apply to large-scale multi-view clustering. In response, a simple and efficient large-scale multi-view spectral clustering algorithm is proposed, based on a Two-stage Well-distributed Anchor Selection strategy (TWAS). First, the data set is divided into several disjoint sample blocks to obtain globally well-distributed anchor candidates. The algorithm then selects anchor points within each local candidate anchor set. This two-stage anchor selection strategy identifies highly representative anchors at reduced computational cost and thereby captures the intrinsic data structure. Second, an adaptive near-neighbor graph learning approach is devised to construct an anchor-based intra-view similarity matrix. Finally, the multiple views are fused to obtain a consistent inter-view similarity matrix, from which the clustering results are obtained. Extensive experiments demonstrate the effectiveness, efficiency, and stability of the TWAS algorithm.
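To make the block-then-select idea concrete, here is a loose illustration of a two-stage anchor selection. It is not the TWAS procedure: the abstract does not say how candidates or anchors are chosen, so greedy farthest-point sampling is swapped in as a stand-in for "well-distributed" selection, the second stage pools the candidates rather than working per local set, and the function names and parameters (`n_blocks`, `per_block`, `n_anchors`) are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_point_sample(points, k):
    """Greedy farthest-point sampling: start at the first point, then
    repeatedly add the point farthest from everything chosen so far."""
    chosen = [points[0]]
    while len(chosen) < min(k, len(points)):
        best = max(points, key=lambda p: min(sq_dist(p, c) for c in chosen))
        chosen.append(best)
    return chosen


def two_stage_anchors(points, n_blocks, per_block, n_anchors):
    """Stage 1: pick candidates inside each disjoint block.
    Stage 2: pick the final anchors from the pooled candidates."""
    size = -(-len(points) // n_blocks)  # ceiling division
    blocks = [points[i:i + size] for i in range(0, len(points), size)]
    candidates = [p for block in blocks
                  for p in farthest_point_sample(block, per_block)]
    return farthest_point_sample(candidates, n_anchors)
```

The cost saving comes from never running the quadratic selection step on the full data set: each call sees only one block or the small candidate pool.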

Volume 156, Article 104815

Citations: 0

Physical information-guided multidirectional gated recurrent unit network fusing attention to solve the Black-Scholes equation

IF 2.9 | Zone 3 (Engineering & Technology) | Q2 ENGINEERING, ELECTRICAL & ELECTRONIC

Digital Signal Processing

Pub Date : 2024-10-11 DOI: 10.1016/j.dsp.2024.104766

Zhaoyang Zhang, Qingwang Wang, Yinxing Zhang, Tao Shen

Reasonable option pricing is crucial in the financial derivatives market. Finding analytical solutions to the Black-Scholes (BS) equation is challenging, particularly for American options or when volatility and interest rates fluctuate. BS equations exhibit strong time-series characteristics, with asset prices typically following geometric Brownian motion. To solve BS equations, we propose a sequence-to-sequence model guided by physical information (PI), called PiMGA. PiMGA fuses a multidirectional gated recurrent unit (GRU) network with an attention module: the multidirectional GRU improves the encoding of the input sequences, and the attention module balances the feature weights of the hidden variables. Prior physical knowledge contained in the BS equations is used jointly as a constraint, forming the penalty function for objective optimization. This allows PiMGA to serve as an efficient approximation function within the physics-informed machine learning paradigm for solving BS equations. Experiments on BS equations of varying complexity illustrate the accuracy and feasibility of PiMGA for numerical solutions. Furthermore, the out-of-distribution generalization ability of PiMGA is verified by predicting the Nasdaq 100 index.
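The physical constraint referred to here is the BS partial differential equation, V_t + 0.5 σ² S² V_SS + r S V_S − r V = 0. The sketch below is not PiMGA; it only shows the kind of residual a physics-informed penalty would drive toward zero, evaluated on the well-known closed-form European call price, for which the residual should vanish up to finite-difference error. The finite-difference step sizes and all function names are assumptions made for illustration.

```python
import math


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call(s, k, tau, r, sigma):
    """Closed-form European call price; tau is the time to maturity in years."""
    d1 = (math.log(s / k) + (r + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    return s * norm_cdf(d1) - k * math.exp(-r * tau) * norm_cdf(d2)


def pde_residual(price, s, t, r, sigma, ds=1e-2, dt=1e-4):
    """Central-difference estimate of V_t + 0.5*sigma^2*S^2*V_SS + r*S*V_S - r*V
    for a pricing function price(s, t)."""
    v = price(s, t)
    v_t = (price(s, t + dt) - price(s, t - dt)) / (2 * dt)
    v_s = (price(s + ds, t) - price(s - ds, t)) / (2 * ds)
    v_ss = (price(s + ds, t) - 2 * v + price(s - ds, t)) / ds ** 2
    return v_t + 0.5 * sigma ** 2 * s ** 2 * v_ss + r * s * v_s - r * v


# Example: a call with strike 100 maturing at t = 1, evaluated at t = 0.5.
maturity, strike, rate, vol = 1.0, 100.0, 0.05, 0.2
call = lambda s, t: bs_call(s, strike, maturity - t, rate, vol)
residual = pde_residual(call, 100.0, 0.5, rate, vol)
```

In a physics-informed setup, `price` would be the network's output and the squared residual, averaged over sampled (S, t) points, would be added to the training loss alongside boundary and terminal conditions.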

Volume 156, Article 104766

Citations: 0

Enhancing RODNet detection in complex road environments based on ESM and ISM methods

IF 2.9 | Zone 3 (Engineering & Technology) | Q2 ENGINEERING, ELECTRICAL & ELECTRONIC

Digital Signal Processing

Pub Date : 2024-10-10 DOI: 10.1016/j.dsp.2024.104816

Yu Guo, Yaxin Xiao, Yan Zhou, Yanyan Li, Siyu Yang, Chuangrui Meng

In autonomous driving, accurately identifying traffic targets is crucial for the safe and reliable operation of autonomous vehicles. Millimeter-wave radar is known for its low cost, long detection range, and excellent performance under various weather conditions. Deep learning algorithms, particularly the radar object detection network (RODNet), have been effectively applied to radar target detection by analyzing range-azimuth (RA) heatmaps that capture complex target features. However, the low angular resolution of RA heatmaps, combined with the high sensitivity of millimeter-wave radar to metal objects, makes adjacent targets prone to misdetection and increases the likelihood that target types are misclassified because of metal reflections from road obstacles. To address these issues, this paper proposes an extension suppression method (ESM) that enhances RA heatmaps, reducing interference between adjacent targets and significantly improving target resolution. In addition, the paper combines Gaussian filtering, peak detection, and amplitude suppression into an interference suppression method (ISM) that identifies and mitigates strong reflections from non-target regions, thereby improving detection efficiency in complex environments. Compared with the latest methods, average precision (AP) improves by 18% in overlapping scenarios, 2% in metal obstacle scenarios, and around 10% in high-speed scenarios.
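The three ingredients named for the interference suppression method (Gaussian filtering, peak detection, amplitude suppression) can be chained as in the toy below. It works on a one-dimensional amplitude profile instead of a full RA heatmap, and everything beyond the three named steps is an assumption: how the paper decides which regions are non-target, the attenuation `factor`, the `halfwidth` of the suppressed neighbourhood, and the function names are all hypothetical.

```python
import math


def gaussian_kernel(sigma, radius):
    """Normalized 1-D Gaussian kernel with 2*radius + 1 taps."""
    taps = [math.exp(-(i * i) / (2.0 * sigma * sigma))
            for i in range(-radius, radius + 1)]
    total = sum(taps)
    return [t / total for t in taps]


def smooth(signal, kernel):
    """Filter the signal with the kernel, replicating the edge samples."""
    r, n = len(kernel) // 2, len(signal)
    return [
        sum(w * signal[min(max(i + j - r, 0), n - 1)] for j, w in enumerate(kernel))
        for i in range(n)
    ]


def find_peaks(signal, threshold):
    """Indices of local maxima whose amplitude exceeds the threshold."""
    return [i for i in range(1, len(signal) - 1)
            if signal[i] > threshold
            and signal[i] > signal[i - 1] and signal[i] >= signal[i + 1]]


def suppress(signal, peaks, target_bins, factor=0.2, halfwidth=1):
    """Attenuate the neighbourhood of every peak that is not a known target."""
    to_scale = set()
    for p in peaks:
        if p not in target_bins:
            to_scale.update(range(max(0, p - halfwidth),
                                  min(len(signal), p + halfwidth + 1)))
    return [v * factor if i in to_scale else v for i, v in enumerate(signal)]
```

A typical call sequence would be `smoothed = smooth(profile, gaussian_kernel(1.0, 2))`, then `peaks = find_peaks(smoothed, threshold)`, then `suppress(smoothed, peaks, target_bins)`, leaving target peaks intact while damping strong reflections elsewhere.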

Volume 156, Article 104816

Citations: 0

FusionNGFPE: An image fusion approach driven by non-global fuzzy pre-enhancement framework

IF 2.9 | Zone 3 (Engineering & Technology) | Q2 ENGINEERING, ELECTRICAL & ELECTRONIC

Digital Signal Processing

Pub Date : 2024-10-10 DOI: 10.1016/j.dsp.2024.104801

Xiangbo Zhang, Gang Liu, Mingyi Li, Qin Ren, Haojie Tang, Durga Prasad Bavirisetti

The majority of prevailing image fusion methods employ a global strategy, often resulting in a reduction of contrast. This study addresses this issue by proposing a novel image fusion approach called FusionNGFPE, specifically designed for the structural characteristics of infrared (IR) imagery. The approach introduces a contrast equalization algorithm based on the Fourth-order Partial Differential Equation (FPDE) to enhance background regions effectively. Considering the inherent differences between IR and visible (VIS) images, we developed a hybrid fusion strategy that combines the Expectation Maximization (EM) algorithm and Principal Component Analysis (PCA). Comparative analysis with state-of-the-art fusion methods shows that our proposed algorithm achieves superior performance in both qualitative and quantitative evaluations. To further demonstrate the practical significance of FusionNGFPE, we integrated this fusion framework into the RGBT target tracking task using the VOT-RGBT and OTCBVS datasets. Extensive comparative experiments confirm that the FusionNGFPE framework integrates seamlessly with the tracking task, significantly improving tracking accuracy across diverse scenarios.
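Of the two components in the hybrid fusion strategy, the PCA part has a compact classical form: treat the two source images as two variables, take the principal eigenvector of their 2×2 covariance matrix, and use its normalized components as fusion weights. The sketch below shows only that classical rule on flattened pixel lists. How the paper combines it with the EM algorithm and the FPDE-based pre-enhancement is not described in the abstract, so none of that is represented, and the function names are hypothetical.

```python
import math


def pca_fusion_weights(a, b):
    """Weights from the principal eigenvector of the 2x2 covariance of a and b.

    a, b: flattened pixel intensities of two registered images (equal length).
    """
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    caa = sum((x - mean_a) ** 2 for x in a) / n
    cbb = sum((y - mean_b) ** 2 for y in b) / n
    cab = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / n
    if abs(cab) < 1e-12:
        # Uncorrelated inputs: the principal axis is the higher-variance image.
        v = (1.0, 0.0) if caa >= cbb else (0.0, 1.0)
    else:
        # Largest eigenvalue of [[caa, cab], [cab, cbb]] and its eigenvector.
        lam = 0.5 * (caa + cbb) + math.sqrt(0.25 * (caa - cbb) ** 2 + cab ** 2)
        v = (cab, lam - caa)
    total = abs(v[0]) + abs(v[1])
    return abs(v[0]) / total, abs(v[1]) / total


def fuse(a, b, weight_a, weight_b):
    """Pixel-wise weighted sum of the two images."""
    return [weight_a * x + weight_b * y for x, y in zip(a, b)]
```

Because the weights are computed once for the whole image pair, this rule is itself a global strategy; a non-global variant would apply it per region, which is one reason a pre-enhancement or region-aware step might be paired with it.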

Volume 156, Article 104801

Citations: 0

Enhancing Hopfield network performance for pattern retrieval using sparse recovery algorithm and Parzen estimator

IF 2.9 | Zone 3 (Engineering & Technology) | Q2 ENGINEERING, ELECTRICAL & ELECTRONIC

Digital Signal Processing

Pub Date : 2024-10-10 DOI: 10.1016/j.dsp.2024.104814

Djordje Stanković, Andjela Draganić, Cornel Ioana, Irena Orović

An improved pattern recovery approach is proposed that integrates the Hopfield neural network (HNN) with iterative signal reconstruction and Parzen window-based classification. The HNN is regarded as a form of associative memory network and is used for various pattern recognition and optimization tasks. However, when the input pattern is heavily damaged and only a very limited set of samples is available, the Hopfield network fails to perform the retrieval. A convex optimization-based gradient descent algorithm is used to recover damaged input patterns, providing an improved pattern approximation for further processing within the HNN and enabling successful network performance. Additionally, for grayscale images, the Parzen window approach is used to classify the probability density functions (pdfs) of the training set and to choose those comparable to the pdf of the input pattern, which refines the selection of patterns and improves convergence to the exact retrieval. The theoretical considerations are verified experimentally, showing the high performance of the proposed approach when only 10% of the pixels are available for binary patterns and 40% of the pixels for grayscale patterns.
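For readers unfamiliar with the retrieval step being improved, the following is a minimal classical Hopfield network with Hebbian weights and asynchronous sign updates on ±1 patterns. It illustrates the baseline associative memory only; the sparse-recovery pre-processing and the Parzen window selection proposed in the paper are not shown, and the function names are hypothetical.

```python
def train_hopfield(patterns):
    """Hebbian weight matrix for a list of equal-length +1/-1 patterns."""
    n = len(patterns[0])
    w = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:  # no self-connections
                    w[i][j] += p[i] * p[j] / n
    return w


def recall(w, state, max_sweeps=10):
    """Asynchronous updates in index order until the state stops changing."""
    s = list(state)
    for _ in range(max_sweeps):
        changed = False
        for i in range(len(s)):
            field = sum(w[i][j] * s[j] for j in range(len(s)))
            new = s[i] if field == 0 else (1 if field > 0 else -1)
            if new != s[i]:
                s[i], changed = new, True
        if not changed:
            break
    return s
```

With only a few corrupted entries the stored pattern is recovered, but when most entries are missing the starting state is too far from any stored pattern, which is the failure mode the abstract addresses by first reconstructing a better approximation of the input.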

Volume 156, Article 104814

Citations: 0

IRS aided visible light positioning with a single LED transmitter

IF 2.9 | Zone 3 (Engineering & Technology) | Q2 ENGINEERING, ELECTRICAL & ELECTRONIC

Digital Signal Processing

Pub Date : 2024-10-10 DOI: 10.1016/j.dsp.2024.104799

Efe Tarhan, Furkan Kokdogan, Sinan Gezici

We propose a visible light positioning (VLP) system with a single light emitting diode (LED) transmitter and an intelligent reflecting surface (IRS) for estimating the position of a receiver equipped with a single photo-detector. By performing a number of transmissions from the LED transmitter and optimizing the orientation vectors of the IRS elements for each transmission, position information is extracted by the receiver based on power measurements of the signals reflecting from the IRS. The theoretical limit and the maximum likelihood (ML) estimator are presented for the proposed setting. In addition, an algorithm, named IRS focusing, is proposed for determining the orientations of the IRS elements during the localization process. The effectiveness of the proposed localization approach is demonstrated through simulations. Furthermore, extensions are provided to apply the proposed approach in the presence of partial prior information about the receiver position and when the IRS is located at the LED transmitter.


Citations: 0

Drift suppression method based on signal stability detection and adaptive Kalman filter for NMR sensor

IF 2.9 Zone 3 (Engineering & Technology) Q2 ENGINEERING, ELECTRICAL & ELECTRONIC

Digital Signal Processing

Pub Date : 2024-10-09 DOI: 10.1016/j.dsp.2024.104812

Qipeng Wang, Zhanchao Liu, Zekun Wu, Jingsong Wang, Chunyu Qu, Jianli Li

The small volume, high precision, and low cost of Nuclear Magnetic Resonance (NMR) sensors make them one of the best choices for future miniaturized and chip-scale Inertial Navigation Systems (INS). Due to technical and process limitations, NMR sensors inevitably exhibit random drift. To suppress this drift, a method based on Signal Stability Detection and an Adaptive Kalman Filter (SSD-AKF) is proposed for NMR sensors. First, a state space model for the Kalman filter is established based on an Autoregressive Moving Average (ARMA) sequence model. Second, to address the loss of filtering accuracy that unstable signal noise causes in innovation-based AKF, an adaptive filtering method aided by signal stability detection is proposed. The method uses the standard deviation of prior information to assess the stability of the signal. Based on this assessment, the adaptive filter adjusts the gain matrix, ultimately enhancing the stability of the filter. Dynamic experimental results show that the proposed method can effectively improve filter performance and reduce sensor drift.
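The general idea of gating a Kalman gain on a stability check can be sketched as follows. This is an illustrative simplification and not the SSD-AKF itself: it assumes a scalar random-walk state model in place of the ARMA-based model, uses the standard deviation of a sliding window of past measurements as the stability indicator, and reduces the gain by inflating the measurement noise variance. The function name, window length, threshold, and inflation factor are all hypothetical choices.

```python
import statistics
from collections import deque


def stability_gated_kalman(measurements, q=1e-5, r=0.04, window=10,
                           std_threshold=0.5, inflate=100.0):
    """Scalar random-walk Kalman filter with a stability-gated gain (sketch).

    When the standard deviation of the last `window` measurements exceeds
    `std_threshold`, the measurement noise variance is multiplied by
    `inflate`, which lowers the Kalman gain for that update.
    """
    x, p = measurements[0], 1.0          # initial state estimate and variance
    recent = deque(maxlen=window)        # sliding window of past measurements
    estimates = []
    for z in measurements:
        # Stability check uses prior data only; the current sample is excluded.
        unstable = (len(recent) == window
                    and statistics.stdev(recent) > std_threshold)
        r_eff = r * inflate if unstable else r
        # Predict: the state is unchanged, its variance grows by q.
        p = p + q
        # Update with the (possibly inflated) measurement noise variance.
        k = p / (p + r_eff)
        x = x + k * (z - x)
        p = (1.0 - k) * p
        estimates.append(x)
        recent.append(z)
    return estimates


# A quiet signal followed by a sustained disturbance.
data = [0.0] * 20 + [10.0] * 10
gated = stability_gated_kalman(data)
ungated = stability_gated_kalman(data, inflate=1.0)   # gating disabled
print(f"gated: {gated[-1]:.3f}, ungated: {ungated[-1]:.3f}")
```

In this sketch the first disturbed sample still passes with the normal gain, because the window only becomes unstable after that sample enters it; later samples are down-weighted, so the gated estimate moves less than the ungated one.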



Citations: 0

Error Performance of DF Cooperative SMTs with I/Q Imbalance over Beckmann Fading Channels

IF 2.9 Zone 3 (Engineering & Technology) Q2 ENGINEERING, ELECTRICAL & ELECTRONIC

Digital Signal Processing

Pub Date : 2024-10-09 DOI: 10.1016/j.dsp.2024.104804

Ayse Elif Canbilen, Ibrahim Develi, Seyfettin Sinan Gültekin

The direct down-conversion principle, which has generally been used in the design of multiple-input multiple-output (MIMO) schemes, including space modulation techniques (SMTs), is attractive to researchers because of its low cost, low power consumption, reduced component count, and flexible, simple structure. In practice, however, hardware imperfections such as in-phase (I) and quadrature-phase (Q) imbalance (IQI) degrade the performance of systems with direct down-conversion. Cooperative communication, in turn, is a promising technology for the design of future wireless networks because it increases system reliability, extends network coverage, reduces channel degradation, and provides high quality of service. In this study, SMT-based methods are integrated into cooperative systems, and a flexible and comprehensive model is presented that is applicable to many channel structures. Specifically, the error performance of space shift keying (SSK), spatial modulation (SM), and quadrature SM (QSM) systems in the presence of IQI in decode-and-forward (DF) cooperative communication is analyzed through analytical derivations and computer simulations over generalized Beckmann fading channels. The results show that SMT-based DF cooperative systems outperform the conventional schemes, and that the effects of receiver IQI can be eliminated by optimal detector designs.
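The effect of receiver IQI on an SSK link can be illustrated with a short sketch. It assumes a widely used baseband model in which the impaired signal is G1·y + G2·y*, with G1 = (1 + ξe^{-jφ})/2 and G2 = (1 − ξe^{jφ})/2 for amplitude mismatch ξ and phase mismatch φ; whether the paper uses exactly this parameterization is not stated in the abstract. The detector below is a simple minimum-distance rule over IQI-distorted channel candidates in a noise-free, single-hop, single-receive-antenna setting. It is not the optimal detector from the paper, which would also need to account for the way IQI changes the noise statistics. The channel values and function names are hypothetical.

```python
import cmath


def rx_iqi_coeffs(xi, phi):
    """Receiver IQI coefficients for amplitude mismatch xi and phase mismatch phi (rad)."""
    g1 = 0.5 * (1 + xi * cmath.exp(-1j * phi))
    g2 = 0.5 * (1 - xi * cmath.exp(1j * phi))
    return g1, g2


def apply_rx_iqi(y, xi, phi):
    """Distort a complex baseband sample: g1 * y + g2 * conj(y)."""
    g1, g2 = rx_iqi_coeffs(xi, phi)
    return g1 * y + g2 * y.conjugate()


def detect_ssk(y, channels, xi=1.0, phi=0.0):
    """Return the index of the transmit antenna whose distorted channel is closest to y.

    With the defaults (xi=1, phi=0) the detector assumes an ideal receiver.
    """
    candidates = [apply_rx_iqi(h, xi, phi) for h in channels]
    return min(range(len(channels)), key=lambda i: abs(y - candidates[i]) ** 2)


# Hypothetical channel coefficients from four transmit antennas to one receive antenna.
channels = [0.9 + 0.2j, -0.4 + 0.7j, 0.1 - 0.8j, -0.6 - 0.5j]
xi, phi = 0.8, 0.2

# Noise-free check: an IQI-aware detector recovers every active antenna index.
detected = [detect_ssk(apply_rx_iqi(h, xi, phi), channels, xi, phi) for h in channels]
print(detected)
```

In SSK only the index of the active antenna carries information, so the received sample is the channel coefficient of that antenna (plus noise in a real link), which is why the candidates are simply the distorted channel coefficients.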



Citations: 0