Zexin Ji, Beiji Zou, Xiaoyan Kui, Hua Li, Pierre Vera, Su Ruan
{"title":"基于边缘感知约束的自先验引导Mamba网络生成超分辨率医学图像","authors":"Zexin Ji , Beiji Zou , Xiaoyan Kui , Hua Li , Pierre Vera , Su Ruan","doi":"10.1016/j.patrec.2024.11.020","DOIUrl":null,"url":null,"abstract":"<div><div>Existing deep learning-based super-resolution generation approaches usually depend on the backbone of convolutional neural networks (CNNs) or Transformers. CNN-based approaches are unable to model long-range dependencies, whereas Transformer-based approaches encounter significant computational burdens due to quadratic complexity in calculations. Moreover, high-frequency texture details in images generated by existing approaches still remain indistinct, posing a major challenge in super-resolution tasks. To overcome these problems, we propose a self-prior guided Mamba network with edge-aware constraint (SEMambaSR) for medical image super-resolution. Recently, State Space Models (SSMs), notably Mamba, have gained prominence for the ability to efficiently model long-range dependencies with low complexity. In this paper, we propose to integrate Mamba into the Unet network allowing to extract multi-scale local and global features to generate high-quality super-resolution images. Additionally, we introduce perturbations by randomly adding a brightness window to the input image, enabling the network to mine the self-prior information of the image. We also design an improved 2D-Selective-Scan (ISS2D) module to learn and adaptively fuse multi-directional long-range dependencies in image features to enhance feature representation. An edge-aware constraint is exploited to learn the multi-scale edge information from encoder features for better synthesis of texture boundaries. Our qualitative and quantitative experimental findings indicate superior super-resolution performance over current methods on IXI and BraTS2021 medical datasets. Specifically, our approach achieved a PSNR of 33.44 dB and an SSIM of 0.9371 on IXI, and a PSNR of 41.99 dB and an SSIM of 0.9846 on BraTS2021, both for 2<span><math><mo>×</mo></math></span> upsampling. The downstream vision task on brain tumor segmentation, using a U-Net network, also reveals the effectiveness of our approach, with a mean Dice Score of 57.06% on the BraTS2021 dataset.</div></div>","PeriodicalId":54638,"journal":{"name":"Pattern Recognition Letters","volume":"187 ","pages":"Pages 93-99"},"PeriodicalIF":3.9000,"publicationDate":"2024-11-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Generation of super-resolution for medical image via a self-prior guided Mamba network with edge-aware constraint\",\"authors\":\"Zexin Ji , Beiji Zou , Xiaoyan Kui , Hua Li , Pierre Vera , Su Ruan\",\"doi\":\"10.1016/j.patrec.2024.11.020\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>Existing deep learning-based super-resolution generation approaches usually depend on the backbone of convolutional neural networks (CNNs) or Transformers. CNN-based approaches are unable to model long-range dependencies, whereas Transformer-based approaches encounter significant computational burdens due to quadratic complexity in calculations. Moreover, high-frequency texture details in images generated by existing approaches still remain indistinct, posing a major challenge in super-resolution tasks. To overcome these problems, we propose a self-prior guided Mamba network with edge-aware constraint (SEMambaSR) for medical image super-resolution. 
Recently, State Space Models (SSMs), notably Mamba, have gained prominence for the ability to efficiently model long-range dependencies with low complexity. In this paper, we propose to integrate Mamba into the Unet network allowing to extract multi-scale local and global features to generate high-quality super-resolution images. Additionally, we introduce perturbations by randomly adding a brightness window to the input image, enabling the network to mine the self-prior information of the image. We also design an improved 2D-Selective-Scan (ISS2D) module to learn and adaptively fuse multi-directional long-range dependencies in image features to enhance feature representation. An edge-aware constraint is exploited to learn the multi-scale edge information from encoder features for better synthesis of texture boundaries. Our qualitative and quantitative experimental findings indicate superior super-resolution performance over current methods on IXI and BraTS2021 medical datasets. Specifically, our approach achieved a PSNR of 33.44 dB and an SSIM of 0.9371 on IXI, and a PSNR of 41.99 dB and an SSIM of 0.9846 on BraTS2021, both for 2<span><math><mo>×</mo></math></span> upsampling. The downstream vision task on brain tumor segmentation, using a U-Net network, also reveals the effectiveness of our approach, with a mean Dice Score of 57.06% on the BraTS2021 dataset.</div></div>\",\"PeriodicalId\":54638,\"journal\":{\"name\":\"Pattern Recognition Letters\",\"volume\":\"187 \",\"pages\":\"Pages 93-99\"},\"PeriodicalIF\":3.9000,\"publicationDate\":\"2024-11-22\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Pattern Recognition Letters\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S016786552400326X\",\"RegionNum\":3,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Pattern Recognition Letters","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S016786552400326X","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Abstract
Existing deep learning-based super-resolution approaches usually depend on convolutional neural network (CNN) or Transformer backbones. CNN-based approaches cannot model long-range dependencies, whereas Transformer-based approaches incur a significant computational burden due to their quadratic complexity. Moreover, high-frequency texture details in images generated by existing approaches remain indistinct, posing a major challenge in super-resolution tasks. To overcome these problems, we propose a self-prior guided Mamba network with an edge-aware constraint (SEMambaSR) for medical image super-resolution. Recently, State Space Models (SSMs), notably Mamba, have gained prominence for their ability to efficiently model long-range dependencies with low complexity. In this paper, we propose to integrate Mamba into a U-Net, allowing the network to extract multi-scale local and global features for generating high-quality super-resolution images. Additionally, we introduce perturbations by randomly adding a brightness window to the input image, enabling the network to mine the self-prior information of the image. We also design an improved 2D-Selective-Scan (ISS2D) module to learn and adaptively fuse multi-directional long-range dependencies in image features to enhance feature representation. An edge-aware constraint is exploited to learn multi-scale edge information from encoder features for better synthesis of texture boundaries. Our qualitative and quantitative experimental findings indicate superior super-resolution performance over current methods on the IXI and BraTS2021 medical datasets. Specifically, our approach achieves a PSNR of 33.44 dB and an SSIM of 0.9371 on IXI, and a PSNR of 41.99 dB and an SSIM of 0.9846 on BraTS2021, both for 2x upsampling. A downstream brain tumor segmentation task using a U-Net also shows the effectiveness of our approach, with a mean Dice score of 57.06% on the BraTS2021 dataset.
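
The abstract only names these components and does not give their implementation. As a rough illustration of the two ideas that are easiest to isolate, the PyTorch sketch below perturbs an input batch with one randomly placed brightness window (the self-prior perturbation) and computes an edge-aware L1 loss, using a Sobel operator as a stand-in edge extractor. The window size, gain, edge operator, and function names are assumptions chosen for demonstration, not the paper's actual code.

# Minimal sketch, assuming PyTorch; hyperparameters and the Sobel edge
# extractor are illustrative assumptions, not taken from the paper.
import torch
import torch.nn.functional as F

def add_brightness_window(img: torch.Tensor, window: int = 32, gain: float = 1.3) -> torch.Tensor:
    """Brighten one randomly placed square window in each (B, C, H, W) image."""
    b, c, h, w = img.shape
    out = img.clone()
    for i in range(b):
        top = torch.randint(0, max(h - window, 1), (1,)).item()
        left = torch.randint(0, max(w - window, 1), (1,)).item()
        out[i, :, top:top + window, left:left + window] *= gain
    return out.clamp(0.0, 1.0)

def sobel_edges(img: torch.Tensor) -> torch.Tensor:
    """Per-channel gradient magnitude via depthwise Sobel filtering."""
    kx = torch.tensor([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]],
                      device=img.device, dtype=img.dtype)
    ky = kx.t()
    c = img.shape[1]
    weight = torch.stack([kx, ky]).unsqueeze(1).repeat(c, 1, 1, 1)  # (2*C, 1, 3, 3)
    grad = F.conv2d(img, weight, padding=1, groups=c)               # (B, 2*C, H, W)
    gx, gy = grad[:, 0::2], grad[:, 1::2]
    return torch.sqrt(gx ** 2 + gy ** 2 + 1e-6)

def edge_aware_loss(sr: torch.Tensor, hr: torch.Tensor) -> torch.Tensor:
    """L1 distance between edge maps of the super-resolved and ground-truth images."""
    return F.l1_loss(sobel_edges(sr), sobel_edges(hr))

In a training loop of this kind, edge_aware_loss would typically be added to a pixel-wise reconstruction loss with a small weight, while add_brightness_window is applied to the low-resolution input before it enters the network.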
About the journal:
Pattern Recognition Letters aims at rapid publication of concise articles of a broad interest in pattern recognition.
Subject areas include all the current fields of interest represented by the Technical Committees of the International Association of Pattern Recognition, and other developing themes involving learning and recognition.