Convolutional neural networks (CNNs) are an important means of detecting microdefects on aluminum surfaces, but as detection accuracy improves, the growing complexity and computing demands of CNN models make them difficult to deploy on edge computing platforms. We study a lightweight acceleration method for detecting aluminum surface microdefects on the Zynq-7000 All Programmable SoC (ZYNQ) platform. A lightweight aluminum surface defect detection network (LADFastDet) and high-performance ZYNQ-based accelerators are designed to meet precision and speed requirements under limited resources. In LADFastDet, a lightweight inverted residual block is designed by combining depthwise convolution, the inverted residual block, and the inverted bottleneck. A multiscale feature fusion structure improves the detection accuracy of LADFastDet, especially for small-target defects. The accelerators on ZYNQ are optimized through loop optimization, ping-pong buffering, and multichannel, multi-interface data reading and writing, which reduce data access latency and thus increase computing speed. Experimental results show that the LADFastDet model achieves a mAP of 97.51%; the accelerators have an inference time of 42.57 ms for a single image and a power consumption of 2.15 W, yielding a throughput of 24.9 GOPS and an energy efficiency of 11.58 GOPS/W.
Title: Research on ZYNQ neural network acceleration method for aluminum surface microdefects. Authors: Dongxue Zhao, Shenbo Liu, Zhigang Zhang, Zhao Zhang, Lijun Tang. Digital Signal Processing, vol. 157, Article 104900. Pub Date: 2024-11-28; DOI: 10.1016/j.dsp.2024.104900
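The figures reported in the LADFastDet abstract above can be cross-checked with a few lines of arithmetic. This is a minimal sketch that uses only the numbers quoted there; the derived frame rate, energy per image, and operation count per image are not stated in the abstract and assume that the quoted throughput and power are sustained over the whole 42.57 ms inference.

```python
# Figures quoted in the abstract.
throughput_gops = 24.9   # GOPS
power_w = 2.15           # W
latency_s = 42.57e-3     # seconds per image

# Energy efficiency is throughput divided by power.
energy_efficiency = throughput_gops / power_w      # GOPS/W

# Derived quantities (assumption: figures hold for the full inference).
frames_per_second = 1.0 / latency_s                # images per second
energy_per_image_j = power_w * latency_s           # joules per image
ops_per_image_gop = throughput_gops * latency_s    # giga-operations per image

print(f"energy efficiency: {energy_efficiency:.2f} GOPS/W")
print(f"frame rate:        {frames_per_second:.2f} fps")
print(f"energy per image:  {energy_per_image_j * 1e3:.1f} mJ")
print(f"work per image:    {ops_per_image_gop:.2f} GOP")
```

The first line printed reproduces the 11.58 GOPS/W stated in the abstract, which confirms that the reported throughput, power, and efficiency are mutually consistent.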
Pub Date: 2024-11-28; DOI: 10.1016/j.dsp.2024.104883
Fuxian Sui, Hua Wang, Fan Zhang
Accurate segmentation of medical images is of great significance for computer-aided diagnosis. Transformers show great promise in medical image segmentation, where they can complement local convolutions by capturing long-range dependencies via self-attention. Recent methods perform well in global context modeling. However, they handle problems such as boundary blurring poorly because they ignore the edge prior and its complementarity with the global context. To address this challenge, we propose a segmentation network based on informative priors across scales. The encoder uses the self-attention mechanism to capture long-range dependencies. The proposed cross-scale prior decoder makes full use of the multi-scale features in the hierarchical vision transformer: a prior perceptron captures boundary information, and a pattern perceptron enhances both long-range and local context information by suppressing background information. By combining the two, the edge prior and the global context complement each other, and inaccurate boundary segmentation is reduced. Extensive experiments on multiple segmentation datasets validate the performance of the model.
Title: Cross-scale informative priors network for medical image segmentation. Digital Signal Processing, vol. 157, Article 104883.
Pub Date: 2024-11-28; DOI: 10.1016/j.dsp.2024.104874
Linshan Zhao, Kai Ying, Disheng Xiao, Jian Pang, Kai Kang
In modern wireless communication systems, wide signal bandwidth is the most straightforward way to accommodate high data rates. Wide signal bandwidth, however, poses severe challenges for power amplifier (PA) and digital predistortion (DPD) design in both performance and cost. Conventional DPD systems usually ignore the impact of the transmit low-pass filter (Tx LPF) bandwidth and assume that the transmit bandwidth is sufficiently large. In wideband signal transmission, the Tx LPF bandwidth can become the system bottleneck, limiting the DPD's compensation effect. Existing DPD studies mostly investigate DPD with reduced feedback bandwidth. In this paper, we study the impact of the Tx LPF bandwidth on DPD performance. A full-band error minimization DPD based on the direct learning structure is proposed. The DPD coefficients are estimated by minimizing the full-band error between the input signal and the PA output signal in the frequency domain. Furthermore, we propose a weighted DPD with improved performance by introducing a diagonal weighting matrix into the error function. Compared to existing solutions, the weighted DPD achieves a good trade-off between in-band distortion compensation and out-of-band spectral regrowth suppression. Simulations and experiments validate the effectiveness of the proposed DPD schemes.
Title: An improved digital predistortion scheme for nonlinear transmitters with limited bandwidth. Digital Signal Processing, vol. 157, Article 104874.
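The idea of adding a diagonal weighting matrix to a frequency-domain error function, as in the DPD abstract above, can be illustrated with a generic weighted least-squares solve. This is only a sketch under stated assumptions: the basis matrix `A`, the target spectrum `d`, the per-bin weights `w`, and the function name `weighted_ls` are all hypothetical, and the abstract does not specify the authors' basis functions, weights, or iteration scheme.

```python
import numpy as np

def weighted_ls(A, d, w):
    """Minimise sum_k w[k] * |d[k] - (A @ c)[k]|**2 over the coefficients c.

    A: (bins, coefs) complex basis matrix, one row per frequency bin.
    d: (bins,) complex target spectrum.
    w: (bins,) non-negative weights, i.e. the diagonal of the weighting matrix.
    """
    sw = np.sqrt(w)
    # Scaling rows by sqrt(w) turns the weighted problem into ordinary least squares.
    c, *_ = np.linalg.lstsq(A * sw[:, None], d * sw, rcond=None)
    return c

rng = np.random.default_rng(0)
n_bins, n_coef = 64, 3
A = rng.standard_normal((n_bins, n_coef)) + 1j * rng.standard_normal((n_bins, n_coef))
c_true = np.array([1.0 + 0.2j, -0.1j, 0.05])
d = A @ c_true                      # noiseless synthetic target

w = np.ones(n_bins)
w[n_bins // 2:] = 0.1               # hypothetical: de-emphasise the upper half of the bins
c_hat = weighted_ls(A, d, w)
print(np.allclose(c_hat, c_true))
```

With noiseless synthetic data any positive weighting recovers the same coefficients; the weights only change the solution when the model cannot fit every bin, which is where a trade-off between in-band and out-of-band error would arise.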
Pub Date: 2024-11-26; DOI: 10.1016/j.dsp.2024.104879
Youyang Tao, Hangjun Che, Chenglu Li, Baicheng Pan, Man-Fai Leung
In the era of information explosion, clustering analysis of multi-view data plays a crucial role in revealing the intrinsic structure of data. Despite advances in multi-view clustering methods for processing complex data, existing methods often overlook the weight differences among views and the diversity between clusters. To address these issues, this paper introduces a novel multi-view clustering approach termed weight consistency and cluster diversity based concept factorization for multi-view clustering (MVCF-WD). Specifically, the proposed method automatically learns the weights of the views and incorporates a cluster diversity term to enhance the discriminability of clusters. To solve the formulated optimization model, an iterative algorithm based on multiplicative update rules is developed and its convergence is analyzed. Extensive experiments on seven datasets, with comparisons against ten state-of-the-art clustering algorithms, demonstrate the superior clustering performance of the proposed method.
Title: Weight consistency and cluster diversity based concept factorization for multi-view clustering. Digital Signal Processing, vol. 157, Article 104879.
Pub Date: 2024-11-26; DOI: 10.1016/j.dsp.2024.104893
Wenjing Zhou, Mingwei Shen, Di Wu, Daiyin Zhu, Guodong Han
In this paper, we propose a new wideband sparse array synthesis method based on gridless compressed sensing to solve the basis mismatch problem of discrete grids. Considering the tapped delay-line (TDL) structure for space-time domain processing, and using successive frequency-varying atoms for the sparse representation of wideband signals, an arbitrary sampling-atomic norm minimization is introduced to model group sparsity-constrained wideband arrays, in which the positions and excitation values of the array elements are obtained with a high degree of freedom. This nonconvex problem is then transformed into a convex relaxation, which is solved using Prolate Spheroidal Wave Functions (PSWFs). Experimental results show that the proposed sparse array design achieves higher matching accuracy and sparsity than the discretized wideband sparse array design, which verifies the effectiveness and efficiency of the method.
Title: Efficient gridless wideband sparse array synthesis with tapped delay-lines. Digital Signal Processing, vol. 157, Article 104893.
Pub Date: 2024-11-26; DOI: 10.1016/j.dsp.2024.104891
Linhui Sun, Xiaolong Zhou, Aifei Gong, Lei Ye, Pingan Li, Eng Siong Chng
Recently, significant progress has been made in end-to-end single-channel speech separation in clean environments. For noisy speech separation, existing research mainly uses deep neural networks to process the noise in speech signals implicitly, which does not fully utilize the impact of noise reconstruction errors on network training. We propose a lightweight noise-aware network with a shared channel-attention encoder and joint constraint, named NSCJnet, which aims to improve speech separation performance in noisy environments. Firstly, to reduce network parameters, the model uses a parameter-sharing channel-attention encoder to convert noisy speech signals into a feature space. In addition, the channel attention layer (CAlayer) in the encoder enhances the network's representational capacity and separation performance in noisy environments by calculating different weights for the filters in the convolution. It automatically strengthens features that are important for separation and suppresses unimportant ones during learning. Secondly, to make the network converge quickly, we regard noise as an estimation target of equal significance to speech, which compels the network to separate residual noise from the estimated speech and effectively suppresses lingering noise within the speech signal. Furthermore, by integrating a multi-resolution frequency constraint into the time-domain loss, we introduce a weighted time-frequency joint loss constraint that lets the network acquire information across both dimensions, which is conducive to separating mixed speech with noise. Results on the noisy WHAM! dataset and the noisy Libri2Mix dataset show that our method has lower computational complexity and outperforms some advanced methods on various speech quality and intelligibility metrics.
Title: Noise-aware network with shared channel-attention encoder and joint constraint for noisy speech separation. Digital Signal Processing, vol. 157, Article 104891.
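A weighted time-frequency joint loss of the kind described in the NSCJnet abstract above can be sketched as a time-domain term plus a magnitude-spectrogram term averaged over several STFT resolutions. This is an illustrative sketch only: the mean-squared time term, the L1 spectral term, the weight `alpha`, the chosen resolutions, and the names `stft_mag` and `joint_loss` are all assumptions, since the abstract does not give the authors' exact formulation.

```python
import numpy as np

def stft_mag(x, n_fft, hop):
    """Magnitude STFT with a Hann window (trailing samples that do not fill a frame are dropped)."""
    win = np.hanning(n_fft)
    starts = range(0, len(x) - n_fft + 1, hop)
    frames = np.stack([x[s:s + n_fft] * win for s in starts])
    return np.abs(np.fft.rfft(frames, axis=1))

def joint_loss(est, ref, alpha=0.5,
               resolutions=((256, 64), (512, 128), (1024, 256))):
    """alpha * time-domain MSE + (1 - alpha) * mean L1 spectral distance over resolutions."""
    time_term = np.mean((est - ref) ** 2)
    freq_term = np.mean([
        np.mean(np.abs(stft_mag(est, n, h) - stft_mag(ref, n, h)))
        for n, h in resolutions
    ])
    return alpha * time_term + (1.0 - alpha) * freq_term

rng = np.random.default_rng(0)
ref = rng.standard_normal(4000)              # stand-in for a clean source signal
est = ref + 0.1 * rng.standard_normal(4000)  # stand-in for a network estimate

loss_same = joint_loss(ref, ref)
loss_noisy = joint_loss(est, ref)
print(loss_same, loss_noisy)
```

Using several window lengths exposes the loss to both fine time resolution (short windows) and fine frequency resolution (long windows), which is the usual motivation for a multi-resolution spectral constraint.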
Pub Date: 2024-11-26; DOI: 10.1016/j.dsp.2024.104882
Jian Xue, Zhen Fan, Shuwen Xu, Meiyan Pan
This paper investigates the adaptive detection of radar targets in non-Gaussian clutter, where the target to be detected is assumed to be spread in both the Doppler frequency dimension and the range dimension. The clutter is assumed to follow the compound Gaussian model with lognormal texture and an unknown covariance matrix structure. The multi-rank linear subspace model and the range-spread model are employed to describe the Doppler-spread and range-spread characteristics of target echoes. A range-Doppler dual-spread adaptive radar target detector for lognormal-texture clutter is then proposed using the two-step generalized likelihood ratio criterion, in which the true values of the unknown parameters are replaced with their maximum likelihood and maximum a posteriori estimates. Experimental results on simulated and measured data demonstrate that the proposed detector outperforms its competitors under different clutter and target parameters.
Title: Adaptive detection of radar range-Doppler dual-spread targets in lognormal-texture clutter. Digital Signal Processing, vol. 157, Article 104882.
This paper presents a unified framework for linear scale invariant signals, systems, and transforms from a system-theoretic perspective. The work is the scale counterpart of the theory of linear shift invariant systems and transforms. Just as the Fourier and Laplace transforms are used to study linear shift (or time) invariant systems, the Mellin transform is used to study scale invariant systems. However, unlike the shift invariant theory, the theory of scale invariant systems and transforms has so far not been presented with a unified approach. In this work, we present this theory from a signal processing viewpoint, developing the scale invariant transform as a systematic progression from the scale series for scale periodic signals to the scale invariant transform for scale aperiodic signals. We also present a few examples to illustrate the utility of the presented theory.
Title: Unified framework for linear scale invariant signals, systems, and transforms: A tutorial. Authors: Anubha Gupta, Pushpendra Singh, Priya Aggarwal, Shiv Dutt Joshi. Digital Signal Processing, vol. 157, Article 104880. Pub Date: 2024-11-26; DOI: 10.1016/j.dsp.2024.104880
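The parallel between shift invariance and scale invariance drawn in the tutorial abstract above can be made concrete with the standard scaling property of the Mellin transform. The derivation below uses the common textbook definition; the notation ($F$, $s$, $a$, $\omega$) is an assumption here and may differ from the conventions the tutorial itself adopts.

```latex
% Mellin transform of f(t), t > 0 (standard definition):
\mathcal{M}\{f\}(s) = F(s) = \int_{0}^{\infty} f(t)\, t^{s-1}\, \mathrm{d}t .

% Scaling the signal by a > 0 and substituting u = a t:
\mathcal{M}\{f(at)\}(s)
  = \int_{0}^{\infty} f(at)\, t^{s-1}\, \mathrm{d}t
  = a^{-s} \int_{0}^{\infty} f(u)\, u^{s-1}\, \mathrm{d}u
  = a^{-s} F(s) .

% On the imaginary axis s = j\omega the factor has unit modulus:
\left| a^{-j\omega} F(j\omega) \right| = \left| F(j\omega) \right| .
```

A scale change therefore alters only the phase of the transform on the imaginary axis, in the same way that a time shift $\tau$ multiplies the Fourier transform by the unit-modulus factor $e^{-j\omega\tau}$.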
Pub Date: 2024-11-24; DOI: 10.1016/j.dsp.2024.104885
Junran Qian, Xudong Xiang, Haiyan Li, Shuhua Ye, Hongsong Li
As a critical component of computer-aided diagnosis systems, medical image segmentation plays a vital role in helping clinicians make rapid, accurate decisions and formulate treatment plans. Nevertheless, precise medical image segmentation still faces a number of challenges, including insufficient feature extraction capability when sample sizes are limited, blurred segmentation boundaries, and information loss between the encoder and decoder. To address these issues, we propose a Multi-Scale Boundary-Aware Aggregation Network with Bidirectional Information Exchange and Feature Refinement (MBF-Net) for medical image segmentation. First, we design a Multi-Scale Boundary-Aware Aggregation Encoder (MBAE) that aggregates features from different scales and pixel levels within the input images, capturing fine-grained boundary information in deep features and establishing comprehensive global and local multi-scale contextual dependencies. This design significantly enhances the model's understanding of the overall image structure and its ability to discern subtle differences between lesions and background. Next, a Multi-Scale Bidirectional Information Transmission (MBIT) module integrates bidirectional information flow between low-level and high-level features, enabling multi-scale features to flow in both directions across layers. The MBIT module preserves crucial boundary details during cross-layer information transmission, bridging the semantic gap between the encoder and decoder and improving the clarity of the segmentation boundaries. Finally, we develop a Feature Refinement and Aggregation Fusion (FRAF) module that integrates feature information from various semantic levels, alleviating discrepancies between features at different scales and thus enhancing the segmentation accuracy of the network. The generalisation and effectiveness of MBF-Net are validated through comprehensive experiments on a range of tasks, including nuclear segmentation, breast cancer segmentation, polyp segmentation and skin lesion segmentation. Both subjective and objective evaluations demonstrate that MBF-Net significantly outperforms current state-of-the-art methods, achieving average Dice Similarity Coefficient (DSC) and Intersection over Union (IoU) scores of 86.34 % and 78.37 %, respectively. The superior segmentation accuracy and quality of MBF-Net are demonstrated across five public datasets.
Title: MBF-Net: Multi-scale boundary-aware aggregation for bi-directional information exchange and feature reshaping for medical image segmentation. Digital Signal Processing, vol. 157, Article 104885.
Pub Date : 2024-11-22 DOI: 10.1016/j.dsp.2024.104859
Xiuling Li, Bo Zhang, Haijian Wei, Qiang Wang, Zhengdong Li
Compressed sensing (CS) performs compression and encryption simultaneously, which makes it well suited to resource-constrained Internet of Things (IoT) applications. However, traditional CS-based cryptosystems cannot efficiently resist known-plaintext attacks (KPA) in the multi-time-sampling (MTS) scenario. This work proposes PCS-CAW (parallel CS injected with controllable artificial watermark), a novel CS-based privacy-preserving cryptosystem for simultaneous compression-encryption applications. First, the original plaintext is scrambled by a global random permutation (GRP) operation. Then, a novel watermark-injected parallel CS (PCS) scheme re-encrypts and compresses the intermediate ciphertext. Because a controllable artificial random watermark is injected into the PCS sampling process, the proposed PCS-CAW cryptosystem efficiently resists KPA in the MTS scenario. In the decoding stage, a distinctive watermark-removal strategy is developed. Experiments demonstrate that the proposed cryptosystem achieves better security and compression performance than previous CS-based ones.
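The data flow described above (permute, sample, inject a watermark, remove it at the decoder) can be sketched as follows. This is an illustration under strong assumptions, not the PCS-CAW construction: the abstract does not say how the watermark is generated or injected, so an additive key-derived watermark in the measurement domain is assumed. All function names and key parameters are hypothetical, only a single column is sampled rather than parallel columns, and sparse reconstruction is omitted.

```python
import random
from typing import List

# NOTE: random.Random is not a cryptographic generator; it only makes the
# sketch reproducible from integer keys.


def permutation(n: int, key: int) -> List[int]:
    """Key-seeded permutation of range(n), standing in for the GRP step."""
    idx = list(range(n))
    random.Random(key).shuffle(idx)
    return idx


def measurement_matrix(m: int, n: int, key: int) -> List[List[float]]:
    """Key-seeded m x n Gaussian measurement matrix (m < n compresses)."""
    rng = random.Random(key)
    return [[rng.gauss(0.0, 1.0) / m ** 0.5 for _ in range(n)] for _ in range(m)]


def watermark(m: int, key: int, scale: float) -> List[float]:
    """Key-seeded random watermark; `scale` is the assumed 'controllable' knob."""
    rng = random.Random(key)
    return [scale * rng.gauss(0.0, 1.0) for _ in range(m)]


def matvec(a: List[List[float]], x: List[float]) -> List[float]:
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]


def encrypt(x: List[float], perm_key: int, phi_key: int, wm_key: int,
            m: int, scale: float = 5.0) -> List[float]:
    """Scramble x, take m compressed measurements, then add the watermark."""
    scrambled = [x[i] for i in permutation(len(x), perm_key)]
    y = matvec(measurement_matrix(m, len(x), phi_key), scrambled)
    w = watermark(m, wm_key, scale)
    return [yi + wi for yi, wi in zip(y, w)]


def remove_watermark(y: List[float], wm_key: int, scale: float = 5.0) -> List[float]:
    """Decoder side: regenerate the watermark from its key and subtract it."""
    w = watermark(len(y), wm_key, scale)
    return [yi - wi for yi, wi in zip(y, w)]


def unscramble(scrambled: List[float], perm_key: int) -> List[float]:
    """Invert the key-seeded permutation."""
    perm = permutation(len(scrambled), perm_key)
    x = [0.0] * len(scrambled)
    for pos, src in enumerate(perm):
        x[src] = scrambled[pos]
    return x


# Toy run: a sparse length-8 signal compressed to 4 measurements.
x = [0.0, 3.0, 0.0, 0.0, -2.0, 0.0, 0.0, 1.0]
cipher = encrypt(x, perm_key=11, phi_key=22, wm_key=33, m=4)
clean = remove_watermark(cipher, wm_key=33)
```

After `remove_watermark`, `clean` equals the plain measurements of the scrambled signal up to floating-point rounding. A real decoder would then run a sparse-recovery algorithm on `clean` and apply `unscramble` to the result.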
{"title":"Controllable artificial watermark injected parallel compressive sensing for simultaneous compression-encryption applications","authors":"Xiuling Li , Bo Zhang , Haijian Wei , Qiang Wang , Zhengdong Li","doi":"10.1016/j.dsp.2024.104859","DOIUrl":"10.1016/j.dsp.2024.104859","url":null,"abstract":"<div><div>The emerging compressed sensing (CS) enables compression and encryption simultaneously, which is very suitable for the resource-constraint Internet of things (IoT) applications. However, traditional CS-based cryptosystem can not provide efficient resistance to known-plaintext attack (KPA) under the multi-time-sampling (MTS) scenario. A novel CS-based privacy-preserving cryptosystem, called PCS-CAW (parallel CS injected with controllable artificial watermark), for simultaneous compression-encryption applications is proposed. Firstly, the original plaintext is scrambled by the global random permutation (GRP) operation. Then, the novel watermark injected parallel CS (PCS) is developed to re-encrypt and compress the intermediate ciphertext. Since a controllable artificial random watermark is injected into PCS sampling processing, the proposed PCS-CAW cryptosystem provides efficient resistance to KPA under the MTS scenario. In the decoding stage, a distinctive watermark removing strategy is developed. 
Experiments demonstrate that the proposed cryptosystem can achieve superior security and compression performance than previous CS-based ones.</div></div>","PeriodicalId":51011,"journal":{"name":"Digital Signal Processing","volume":"157 ","pages":"Article 104859"},"PeriodicalIF":2.9,"publicationDate":"2024-11-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142719617","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}