In recent years, significant progress has been made in tumor segmentation within the field of digital pathology. However, variations in organs, tissue preparation methods, and image acquisition processes can lead to domain discrepancies among digital pathology images. To address this problem, in this paper we use Rein, a parameter-efficient fine-tuning method, to fine-tune various vision foundation models (VFMs) for the MICCAI 2024 Cross-Organ and Cross-Scanner Adenocarcinoma Segmentation challenge (COSAS2024). The core of Rein is a set of learnable tokens, which are directly linked to instances and improve functionality at the instance level in each layer. In the data environment of the COSAS2024 challenge, extensive experiments demonstrate that Rein fine-tunes the VFMs to achieve satisfactory results. Specifically, we used Rein to fine-tune ConvNeXt and DINOv2. Our team used the former to achieve scores of 0.7719 and 0.7557 in the preliminary and final test phases of Task 1, respectively, while the latter achieved scores of 0.8848 and 0.8192 in the preliminary and final test phases of Task 2. Code is available at GitHub.
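The token mechanism described above can be illustrated with a minimal numpy sketch. This is an illustrative reading of the idea, not the authors' implementation: frozen backbone features at one layer attend to a small set of learnable tokens, and the token read-out (projected by a learnable matrix `W`, an assumed name) is added back as a residual refinement.

```python
import numpy as np

def rein_refine(feats, tokens, W):
    """Refine frozen-backbone features with learnable tokens (sketch).

    feats:  (n_patches, d) features from one frozen VFM layer
    tokens: (m, d) learnable tokens, the main trained state
    W:      (d, d) learnable projection applied to the token read-out
    """
    # similarity between every patch feature and every token
    attn = feats @ tokens.T / np.sqrt(feats.shape[1])    # (n, m)
    attn = np.exp(attn - attn.max(axis=1, keepdims=True))
    attn /= attn.sum(axis=1, keepdims=True)              # softmax over tokens
    delta = (attn @ tokens) @ W                          # per-patch refinement
    return feats + delta                                 # residual update

rng = np.random.default_rng(0)
f = rng.normal(size=(16, 8))
out = rein_refine(f, rng.normal(size=(4, 8)), 0.1 * rng.normal(size=(8, 8)))
print(out.shape)  # (16, 8)
```

Because only the tokens and projections are trained while the VFM stays frozen, the number of updated parameters is small, which is what makes the approach parameter-efficient.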
"Cross-Organ and Cross-Scanner Adenocarcinoma Segmentation using Rein to Fine-tune Vision Foundation Models" — Pengzhou Cai, Xueyuan Zhang, Ze Zhao. arXiv:2409.11752 (arXiv - EE - Image and Video Processing, 2024-09-18).
Yang Liu, Yahui Li, Rui Li, Liming Zhou, Lanxue Dang, Huiyu Mu, Qiang Ge
Convolutional neural networks (CNNs) perform well in hyperspectral image (HSI) classification tasks, but their high energy consumption and complex network structure make them difficult to apply directly on edge computing devices. Spiking neural networks (SNNs) have recently developed rapidly in HSI classification tasks thanks to their low energy consumption and event-driven characteristics, but they usually require a long time step to achieve optimal accuracy. To address these problems, this paper builds a spiking neural network (SNN-SWMR) based on the leaky integrate-and-fire (LIF) neuron model for HSI classification. The network uses the spiking width mixed residual (SWMR) module as its basic unit for feature extraction. The SWMR module is composed of spiking mixed convolution (SMC), which can effectively extract spatial-spectral features. Secondly, this paper designs a simple and efficient arcsine approximate derivative (AAD), which solves the non-differentiability of spike firing by fitting the Dirac function; through AAD, we can directly train supervised spiking neural networks. Finally, this paper conducts comparative experiments with multiple advanced SNN-based HSI classification algorithms on six public hyperspectral datasets. Experimental results show that the AAD function is robust and provides a good fit. Meanwhile, compared with other algorithms, SNN-SWMR reduces the time step by about 84% and the training and testing time by about 63% and 70%, respectively, at the same accuracy. This study addresses a key problem of SNN-based HSI classification algorithms and has important practical significance for promoting their application on edge devices such as spaceborne and airborne platforms.
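The two core ingredients above — the LIF neuron and a surrogate derivative for the non-differentiable spike — can be sketched as follows. The surrogate shown is one plausible arcsine-based form (the derivative of the smooth step `(2/pi)*arcsin(tanh(alpha*(v - v_th)))`, which peaks at the threshold like a smeared Dirac delta); the paper's exact AAD expression may differ.

```python
import numpy as np

def lif_step(v, i_in, beta=0.9, v_th=1.0):
    """One leaky integrate-and-fire step with hard reset after a spike."""
    v = beta * v + i_in                  # leak, then integrate input current
    s = (v >= v_th).astype(float)        # Heaviside spike (forward pass)
    return v * (1.0 - s), s              # reset membrane where a spike fired

def aad_surrogate(v, v_th=1.0, alpha=2.0):
    """Arcsine-style surrogate gradient for the spike (assumed form, not
    the paper's exact formula): derivative of
    (2/pi) * arcsin(tanh(alpha * (v - v_th))) = (2*alpha/pi) * sech(alpha*(v - v_th)).
    Used in the backward pass in place of the Dirac delta."""
    return (2.0 * alpha / np.pi) / np.cosh(alpha * (v - v_th))
```

During training, the forward pass uses the hard threshold in `lif_step` while backpropagation substitutes `aad_surrogate` for the spike's derivative, which is what makes direct supervised training of the SNN possible.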
"Hyperspectral Image Classification Based on Faster Residual Multi-branch Spiking Neural Network" — Yang Liu, Yahui Li, Rui Li, Liming Zhou, Lanxue Dang, Huiyu Mu, Qiang Ge. arXiv:2409.11619 (arXiv - EE - Image and Video Processing, 2024-09-18).
Seongmin Hong, Jaehyeok Bae, Jongho Lee, Se Young Chun
Compressed sensing (CS) has emerged to overcome the inefficiency of Nyquist sampling. However, traditional optimization-based reconstruction is slow and cannot yield an exact image in practice. Deep learning-based reconstruction has been a promising alternative, outperforming optimization-based reconstruction in accuracy and computation speed. Finding an efficient sampling method for deep learning-based reconstruction, especially for Fourier CS, remains a challenge. Existing joint optimization of sampling-reconstruction works (H1) optimize the sampling mask but have low potential, as the mask is not adaptive to each data point. Adaptive sampling (H2) also has the disadvantages of difficult optimization and Pareto sub-optimality. Here, we propose a novel adaptive selection of sampling-reconstruction (H1.5) framework that selects the best sampling mask and reconstruction network for each input. We provide theorems showing that our method has a higher potential than H1 and effectively solves the Pareto sub-optimality problem of sampling-reconstruction by using separate reconstruction networks for different sampling masks. To select the best sampling mask, we propose to quantify the high-frequency Bayesian uncertainty of the input using a super-resolution space generation model. Our method outperforms joint optimization of sampling-reconstruction (H1) and adaptive sampling (H2), achieving significant improvements on several Fourier CS problems.
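The H1.5 selection step can be sketched with a simple stand-in for the uncertainty estimate: variance across super-resolution samples of the input, scored on the high-frequency part of its spectrum, then used to route the input to a (mask, network) pair. The scoring function and routing interface are assumptions for illustration, not the paper's estimator.

```python
import numpy as np

def hf_uncertainty(sr_samples):
    """Stand-in for the high-frequency Bayesian uncertainty: pixel-wise
    variance over super-resolution samples (shape (S, H, W)), then the
    mean spectral magnitude of that variance map outside the central
    low-frequency quarter."""
    var = sr_samples.var(axis=0)                          # (H, W)
    spec = np.abs(np.fft.fftshift(np.fft.fft2(var)))
    h, w = spec.shape
    low = np.zeros_like(spec, dtype=bool)
    low[h // 4: 3 * h // 4, w // 4: 3 * w // 4] = True    # center = low freq
    return spec[~low].mean()

def select_pair(sr_samples, pairs, thresholds):
    """Route the input to the (mask, network) pair trained for its
    uncertainty band; `pairs`/`thresholds` are hypothetical, with
    len(thresholds) == len(pairs) - 1."""
    u = hf_uncertainty(sr_samples)
    return pairs[int(np.searchsorted(thresholds, u))]
```

Because each reconstruction network only ever sees inputs whose mask it was trained for, the Pareto trade-off of one shared network (H1) is avoided, which is the point of the separate-networks design.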
"Adaptive Selection of Sampling-Reconstruction in Fourier Compressed Sensing" — Seongmin Hong, Jaehyeok Bae, Jongho Lee, Se Young Chun. arXiv:2409.11738 (arXiv - EE - Image and Video Processing, 2024-09-18).
Jue Jiang, Chloe Min Seo Choi, Maria Thor, Joseph O. Deasy, Harini Veeraraghavan
Background: Voxel-based analysis (VBA) for population-level radiotherapy (RT) outcomes modeling requires topology-preserving inter-patient deformable image registration (DIR) that preserves tumors on moving images while avoiding unrealistic deformations due to tumors occurring on fixed images. Purpose: We developed a tumor-aware recurrent registration (TRACER) deep learning (DL) method and evaluated its suitability for VBA. Methods: TRACER consists of encoder layers implemented with a stacked 3D convolutional long short-term memory network (3D-CLSTM), followed by decoder and spatial transform layers that compute a dense deformation vector field (DVF). Multiple CLSTM steps are used to compute a progressive sequence of deformations. Input conditioning was applied by including tumor segmentations with the 3D image pairs as input channels. Bidirectional tumor rigidity, image similarity, and deformation smoothness losses were used to optimize the network in an unsupervised manner. TRACER and multiple DL methods were trained with 204 3D CT image pairs from patients with lung cancers (LC) and evaluated using (a) Dataset I (N = 308 pairs) with DL-segmented LCs, (b) Dataset II (N = 765 pairs) with manually delineated LCs, and (c) Dataset III with 42 LC patients treated with RT. Results: TRACER accurately aligned normal tissues. It best preserved tumors, indicated by the smallest tumor volume differences of 0.24%, 0.40%, and 0.13% and mean square errors in CT intensities of 0.005, 0.005, and 0.004, computed between original and resampled moving-image tumors for Datasets I, II, and III, respectively. It also resulted in the smallest planned RT tumor dose difference computed between original and resampled moving images, 0.01 Gy and 0.013 Gy when using a female and a male reference, respectively.
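The "progressive sequence of deformations" produced by the recurrent CLSTM steps amounts to composing successive deformation fields. A minimal 2D sketch (nearest-neighbour lookup for brevity; real DIR pipelines use trilinear warping, and this is not the TRACER code):

```python
import numpy as np

def compose_dvfs(dvfs):
    """Compose a progressive sequence of dense deformation fields (sketch).
    Each field has shape (2, H, W) holding per-pixel (dy, dx) displacements.
    Composition rule: total_k(x) = d_k(x + total_{k-1}(x)) + total_{k-1}(x),
    i.e. each new field is sampled at the already-deformed positions."""
    total = np.zeros_like(dvfs[0])
    _, H, W = total.shape
    ys, xs = np.mgrid[0:H, 0:W]
    for d in dvfs:
        yy = np.clip(np.round(ys + total[0]).astype(int), 0, H - 1)
        xx = np.clip(np.round(xs + total[1]).astype(int), 0, W - 1)
        total = total + np.stack([d[0][yy, xx], d[1][yy, xx]])
    return total
```

Splitting one large deformation into several small composed steps is what lets each step stay smooth (and hence easier to keep topology-preserving) while the composition still reaches large inter-patient displacements.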
"Tumor aware recurrent inter-patient deformable image registration of computed tomography scans with lung cancer" — Jue Jiang, Chloe Min Seo Choi, Maria Thor, Joseph O. Deasy, Harini Veeraraghavan. arXiv:2409.11910 (arXiv - EE - Image and Video Processing, 2024-09-18).
Tianyu Zhang, Haotian Zhang, Yuqi Li, Li Li, Dong Liu
Learned image compression (LIC) has achieved state-of-the-art rate-distortion performance and is deemed promising for next-generation image compression techniques. However, pre-trained LIC models usually suffer from significant performance degradation when applied to out-of-training-domain images, implying poor generalization capabilities. To tackle this problem, we propose a few-shot domain adaptation method for LIC that integrates plug-and-play adapters into pre-trained models. Drawing inspiration from the analogy between latent channels and frequency components, we examine domain gaps in LIC and observe that out-of-training-domain images disrupt the pre-trained channel-wise decomposition. Consequently, we introduce a method for channel-wise re-allocation using convolution-based adapters and low-rank adapters, which are lightweight and compatible with mainstream LIC schemes. Extensive experiments across multiple domains and multiple representative LIC schemes demonstrate that our method significantly enhances pre-trained models, achieving performance comparable to H.266/VVC intra coding with merely 25 target-domain samples. Additionally, our method matches the performance of full-model fine-tuning while transmitting fewer than 2% of the parameters.
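The parameter economy of a low-rank adapter is easy to see in a sketch. This is a generic LoRA-style adapter on channel vectors, shown only to illustrate the parameter count, not the paper's exact adapter design:

```python
import numpy as np

def lowrank_adapter(x, A, B, scale=1.0):
    """LoRA-style low-rank adapter (sketch): y = x + scale * (x @ A) @ B.
    Only A (C, r) and B (r, C) are trained and transmitted: 2*C*r
    parameters, a fraction 2*r/C of a full C x C layer
    (about 1.6% for C = 512, r = 4)."""
    return x + scale * (x @ A) @ B

C, r = 512, 4
rng = np.random.default_rng(1)
x = rng.normal(size=(10, C))
A = rng.normal(size=(C, r)) / np.sqrt(C)
B = np.zeros((r, C))          # zero-init: the adapter starts as an identity
print(np.allclose(lowrank_adapter(x, A, B), x))  # True
```

Zero-initializing `B` is the usual trick so that the adapted model starts exactly at the pre-trained behavior and only drifts as the few target-domain samples are fitted.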
"Few-Shot Domain Adaptation for Learned Image Compression" — Tianyu Zhang, Haotian Zhang, Yuqi Li, Li Li, Dong Liu. arXiv:2409.11111 (arXiv - EE - Image and Video Processing, 2024-09-17).
Whole Slide Images (WSIs) are critical for various clinical applications, including histopathological analysis. However, current deep learning approaches in this field predominantly focus on individual tumor types, limiting model generalization and scalability. This relatively narrow focus ultimately stems from the inherent heterogeneity in histopathology and the diverse morphological and molecular characteristics of different tumors. To this end, we propose a novel approach for multi-cohort WSI analysis, designed to leverage the diversity of different tumor types. We introduce a Cohort-Aware Attention module, enabling the capture of both shared and tumor-specific pathological patterns, enhancing cross-tumor generalization. Furthermore, we construct an adversarial cohort regularization mechanism to minimize cohort-specific biases through mutual information minimization. Additionally, we develop a hierarchical sample balancing strategy to mitigate cohort imbalances and promote unbiased learning. Together, these form a cohesive framework for unbiased multi-cohort WSI analysis. Extensive experiments on a uniquely constructed multi-cancer dataset demonstrate significant improvements in generalization, providing a scalable solution for WSI classification across diverse cancer types. Our code for the experiments is publicly available at .
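Two of the ingredients above can be sketched in a few lines: attention pooling driven by a shared query plus a per-cohort query, and the gradient-reversal backward rule that implements the adversarial cohort regularization. Names and shapes are illustrative assumptions, not the authors' code.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def cohort_aware_pool(feats, q_shared, q_cohort):
    """Pool WSI patch features into a slide embedding using a shared
    attention query plus a per-cohort query, so shared and tumor-specific
    patterns both shape the weights. feats: (n_patches, d); queries: (d,)."""
    a = softmax(feats @ (q_shared + q_cohort))   # (n_patches,) weights
    return a @ feats                             # (d,) slide embedding

def grad_reversal(grad, lam=1.0):
    """Backward rule of a gradient reversal layer: identity in the forward
    pass, gradient flipped by -lam in the backward pass, so the encoder is
    pushed to *remove* cohort-identifying information (minimizing mutual
    information with the cohort label)."""
    return -lam * grad
```

The cohort classifier on top of the reversed gradient gets better at predicting the cohort while the encoder, receiving the flipped gradient, gets worse at exposing it, which is the adversarial balance the regularization relies on.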
"Multi-Cohort Framework with Cohort-Aware Attention and Adversarial Mutual-Information Minimization for Whole Slide Image Classification" — Sharon Peled, Yosef E. Maruvka, Moti Freiman. arXiv:2409.11119 (arXiv - EE - Image and Video Processing, 2024-09-17).
Out-of-distribution (OOD) detection is crucial for enhancing the generalization of AI models used in mammogram screening. Given the challenge of limited prior knowledge about OOD samples in external datasets, unsupervised generative learning is a preferable solution, training the model to discern the normal characteristics of in-distribution (ID) data. The hypothesis is that during inference the model reconstructs ID samples accurately, while OOD samples exhibit poorer reconstruction due to their divergence from normality. Inspired by state-of-the-art (SOTA) hybrid architectures combining CNNs and transformers, we developed a novel backbone, HAND, for detecting OOD samples in large-scale digital screening mammogram studies. To boost learning efficiency, we incorporated synthetic OOD samples and a parallel discriminator in the latent space to distinguish between ID and OOD samples. Gradient reversal applied to the OOD reconstruction loss penalizes the model for learning OOD reconstructions. An anomaly score is computed by weighting the reconstruction and discriminator losses. On an internal RSNA mammogram held-out test set and an external Mayo Clinic hand-curated dataset, the proposed HAND model outperformed encoder-based and GAN-based baselines and, interestingly, also outperformed the hybrid CNN+transformer baselines. The proposed HAND pipeline therefore offers an automated, efficient computational solution for domain-specific quality checks on external screening mammograms, yielding actionable insights without direct exposure to the private medical imaging data.
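The weighted anomaly score described above can be written down directly. The weights and the exact combination are assumptions for illustration (the paper does not state them here): high reconstruction error and a low discriminator probability of being in-distribution both push the score up.

```python
import numpy as np

def anomaly_score(x, x_hat, p_id, w_rec=0.5, w_disc=0.5):
    """Combine reconstruction error and discriminator output into one
    anomaly score (sketch; w_rec/w_disc are assumed, not the paper's
    values). p_id is the discriminator's probability that the latent
    came from an in-distribution sample."""
    rec = float(np.mean((x - x_hat) ** 2))      # reconstruction MSE
    return w_rec * rec + w_disc * (1.0 - p_id)  # higher = more anomalous
```

A perfectly reconstructed ID sample with a confident discriminator scores near zero; a badly reconstructed sample the discriminator rejects scores high, and a single threshold on this scalar then flags OOD studies.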
"Unsupervised Hybrid framework for ANomaly Detection (HAND) -- applied to Screening Mammogram" — Zhemin Zhang, Bhavika Patel, Bhavik Patel, Imon Banerjee. arXiv:2409.11534 (arXiv - EE - Image and Video Processing, 2024-09-17).
Micro-CT scanning of rocks significantly enhances our understanding of pore-scale physics in porous media. With advancements in pore-scale simulation methods, such as pore network models, it is now possible to accurately simulate multiphase flow properties, including relative permeability, from CT-scanned rock samples. However, the limited number of CT-scanned samples and the challenge of connecting pore-scale networks to field-scale rock properties often make it difficult to use pore-scale simulated properties in realistic field-scale reservoir simulations. Deep learning approaches that create synthetic 3D rock structures allow us to simulate variations in CT rock structures, which can then be used to compute representative rock properties and flow functions. However, most current deep learning methods for 3D rock structure synthesis do not consider rock properties derived from well observations, lacking a direct link between pore-scale structures and field-scale data. We present a method to construct 3D rock structures constrained to observed rock properties using generative adversarial networks (GANs), with conditioning accomplished through a gradual Gaussian deformation process. We begin by pre-training a Wasserstein GAN to reconstruct 3D rock structures. Subsequently, we use a pore network model simulator to compute rock properties. The latent vectors used for image generation in the GAN are progressively altered with the Gaussian deformation approach to produce 3D rock structures constrained by the well-derived conditioning data. This GAN and Gaussian deformation approach enables high-resolution synthetic image generation and reproduces user-defined rock properties such as porosity, permeability, and pore size distribution. Our research provides a novel way to link GAN-generated models to field-derived quantities.
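The gradual (Gaussian) deformation rule, well known from geostatistical history matching, has a one-line form. The key property is that the deformed latent remains a valid standard-normal GAN input for every value of the deformation parameter:

```python
import numpy as np

def gradual_deformation(z1, z2, t):
    """Gradual Gaussian deformation of a GAN latent vector:
    z(t) = z1*cos(2*pi*t) + z2*sin(2*pi*t). If z1 and z2 are independent
    standard normals, z(t) is standard normal for every t (since
    cos^2 + sin^2 = 1), so the latent stays on the generator's input
    distribution while t is tuned until the simulated rock properties
    match the well-derived conditioning data."""
    return z1 * np.cos(2.0 * np.pi * t) + z2 * np.sin(2.0 * np.pi * t)
```

In the workflow above, one would repeatedly draw a fresh `z2`, search over `t` for the best property match (porosity, permeability) as reported by the pore network simulator, and restart from the best `z(t)` — a sketch of the loop, not the authors' exact schedule.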
{"title":"Using Physics Informed Generative Adversarial Networks to Model 3D porous media","authors":"Zihan Ren, Sanjay Srinivasan","doi":"arxiv-2409.11541","url":"https://doi.org/arxiv-2409.11541","journal":"arXiv - EE - Image and Video Processing","publicationDate":"2024-09-17"}
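The gradual Gaussian deformation used for conditioning above can be sketched in a few lines. The key property is that a combination cos(t)·z1 + sin(t)·z2 of two independent standard-normal latent vectors is itself standard normal, so every intermediate vector remains a valid GAN input while the generated structure morphs smoothly. The sketch below is illustrative, not the authors' code: `simulate_property` stands in for the full generator-plus-pore-network-simulator pipeline, and the simple keep-the-best line search along each deformation path is an assumption.

```python
import numpy as np

def gaussian_deform(z_cur, z_new, t):
    """Combine two standard-normal latent vectors.

    For any t, cos(t)*z_cur + sin(t)*z_new is again standard normal,
    so the deformed vector stays a valid input to the GAN generator.
    """
    return np.cos(t) * z_cur + np.sin(t) * z_new

def calibrate(simulate_property, target, dim=128, n_outer=20, n_steps=50, seed=0):
    """Search latent space for a structure matching a target rock property.

    `simulate_property` is a stand-in for generator + pore-network
    simulation (e.g. returning porosity); it maps a latent vector to
    a scalar. Each outer iteration proposes a fresh random direction
    and keeps the best point found along the deformation path.
    """
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(dim)
    best_err = abs(simulate_property(z) - target)
    for _ in range(n_outer):
        z_new = rng.standard_normal(dim)   # fresh Gaussian proposal
        z_base = z
        for t in np.linspace(0.0, np.pi / 2, n_steps):
            z_try = gaussian_deform(z_base, z_new, t)
            err = abs(simulate_property(z_try) - target)
            if err < best_err:             # monotone improvement only
                best_err, z = err, z_try
    return z, best_err
```

Because the mismatch can only shrink across iterations, the loop progressively pulls the generated structure toward the well-derived conditioning value without ever leaving the generator's latent distribution.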
Hao Fang, Zhe Liu, Yi Feng, Zhen Qiu, Pierre Bagnaninchi, Yunjie Yang
Multi-frequency Electrical Impedance Tomography (mfEIT) is a promising biomedical imaging technique that estimates tissue conductivities across different frequencies. Current state-of-the-art (SOTA) algorithms, which rely on supervised learning and Multiple Measurement Vectors (MMV), require extensive training data, making them time-consuming, costly, and less practical for widespread applications. Moreover, the dependency on training data in supervised MMV methods can introduce erroneous conductivity contrasts across frequencies, posing significant concerns in biomedical applications. To address these challenges, we propose a novel unsupervised learning approach based on Multi-Branch Attention Image Prior (MAIP) for mfEIT reconstruction. Our method employs a carefully designed Multi-Branch Attention Network (MBA-Net) to represent multiple frequency-dependent conductivity images and simultaneously reconstructs mfEIT images by iteratively updating its parameters. By leveraging the implicit regularization capability of the MBA-Net, our algorithm can capture significant inter- and intra-frequency correlations, enabling robust mfEIT reconstruction without the need for training data. Through simulation and real-world experiments, our approach demonstrates performance comparable to, or better than, SOTA algorithms while exhibiting superior generalization capability. These results suggest that the MAIP-based method can be used to improve the reliability and applicability of mfEIT in various settings.
{"title":"Multi-frequency Electrical Impedance Tomography Reconstruction with Multi-Branch Attention Image Prior","authors":"Hao Fang, Zhe Liu, Yi Feng, Zhen Qiu, Pierre Bagnaninchi, Yunjie Yang","doi":"arxiv-2409.10794","url":"https://doi.org/arxiv-2409.10794","journal":"arXiv - EE - Image and Video Processing","publicationDate":"2024-09-17"}
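The training-data-free principle behind MAIP — an untrained network whose architecture acts as an implicit regularizer, with the image recovered by iteratively updating the network's parameters against the measurements — can be illustrated with a heavily simplified sketch. Everything here is an assumption, not the paper's method: the forward model is a fixed linear operator `A` (real EIT forward models are nonlinear), the network is a one-hidden-layer MLP rather than the multi-branch attention MBA-Net, and plain gradient descent with hand-derived gradients replaces the paper's optimizer.

```python
import numpy as np

def untrained_prior_recon(A, v, latent_dim=32, hidden=64, n_iter=300, lr=1e-4, seed=0):
    """Reconstruct x from measurements v ~= A @ x with no training data.

    The image is parameterized as x = W2 @ tanh(W1 @ z) for a fixed
    random code z; only the weights W1, W2 are fitted to the data, so
    the (untrained) network itself supplies the regularization.
    """
    m, n = A.shape
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(latent_dim)             # fixed input code
    W1 = rng.standard_normal((hidden, latent_dim)) * 0.1
    W2 = rng.standard_normal((n, hidden)) * 0.1
    losses = []
    for _ in range(n_iter):
        h = np.tanh(W1 @ z)                         # hidden activations
        x = W2 @ h                                  # current image estimate
        r = A @ x - v                               # data residual
        losses.append(float(r @ r))
        dx = 2.0 * (A.T @ r)                        # dLoss/dx
        dW2 = np.outer(dx, h)                       # dLoss/dW2
        dh = W2.T @ dx
        dW1 = np.outer(dh * (1.0 - h**2), z)        # tanh'(p) = 1 - tanh(p)^2
        W1 -= lr * dW1
        W2 -= lr * dW2
    return W2 @ np.tanh(W1 @ z), losses
```

In the same spirit, MAIP fits one network jointly to all frequencies, which is what lets it exploit inter- and intra-frequency correlations without a training set.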
Armand Collin, Arthur Boschet, Mathieu Boudreau, Julien Cohen-Adad
Quantifying axon and myelin properties (e.g., axon diameter, myelin thickness, g-ratio) in histology images can provide useful information about microstructural changes caused by neurodegenerative diseases. Automatic tissue segmentation is an important tool for these datasets, as a single stained section can contain up to thousands of axons. Advances in deep learning have made this task quick and reliable with minimal overhead, but a deep learning model trained by one research group will hardly ever be usable by other groups due to differences in their histology training data. This is partly due to subject diversity (different body parts, species, genetics, pathologies) and also to the range of modern microscopy imaging techniques resulting in a wide variability of image features (i.e., contrast, resolution). There is a pressing need to make AI accessible to neuroscience researchers to facilitate and accelerate their workflow, but publicly available models are scarce and poorly maintained. Our approach is to aggregate data from multiple imaging modalities (bright field, electron microscopy, Raman spectroscopy) and species (mouse, rat, rabbit, human), to create an open-source, durable tool for axon and myelin segmentation. Our generalist model makes it easier for researchers to process their data and can be fine-tuned for better performance on specific domains. We study the benefits of different aggregation schemes. This multi-domain segmentation model performs better than single-modality dedicated learners (p=0.03077), generalizes better on out-of-distribution data and is easier to use and maintain. Importantly, we package the segmentation tool into a well-maintained open-source software ecosystem (see https://github.com/axondeepseg/axondeepseg).
{"title":"Multi-Domain Data Aggregation for Axon and Myelin Segmentation in Histology Images","authors":"Armand Collin, Arthur Boschet, Mathieu Boudreau, Julien Cohen-Adad","doi":"arxiv-2409.11552","url":"https://doi.org/arxiv-2409.11552","journal":"arXiv - EE - Image and Video Processing","publicationDate":"2024-09-17"}
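One concrete way to aggregate training data across modalities and species is domain-balanced batch sampling, so that a large modality (e.g. electron microscopy) does not swamp a small one. This is a hypothetical illustration only — the paper compares several aggregation schemes, and this sketch is not claimed to be any of them; the domain names and round-robin policy are assumptions.

```python
import random

def balanced_domain_batches(datasets, batch_size, n_batches, seed=0):
    """Yield batches that mix samples evenly across domains.

    `datasets` maps a domain name to its list of samples. Slots in each
    batch are assigned round-robin over the domains, so every domain
    contributes (roughly) equally regardless of its dataset size.
    """
    rng = random.Random(seed)
    domains = sorted(datasets)
    for _ in range(n_batches):
        batch = []
        for i in range(batch_size):
            d = domains[i % len(domains)]          # round-robin over domains
            batch.append((d, rng.choice(datasets[d])))
        yield batch
```

With `datasets = {"EM": em_images, "BF": brightfield_images, "Raman": raman_images}` and a batch size divisible by the number of domains, every batch contains the same number of samples from each modality.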