Pub Date: 2024-08-08 | DOI: 10.1109/TMI.2024.3440311
Jun Gao, Qicheng Lao, Qingbo Kang, Paul Liu, Chenlin Du, Kang Li, Le Zhang
The recent advent of in-context learning (ICL) capabilities in large pre-trained models has yielded significant advancements in the generalization of segmentation models. By supplying domain-specific image-mask pairs, an ICL model can be effectively guided to produce optimal segmentation outcomes, eliminating the need for model fine-tuning or interactive prompting. However, existing ICL-based segmentation models exhibit significant limitations when applied to medical segmentation datasets with substantial diversity. To address this issue, we propose a dual similarity checkup approach that guarantees the effectiveness of the selected in-context samples so that their guidance can be maximally leveraged during inference. We first employ large pre-trained vision models to extract strong semantic representations from input images and construct a feature embedding memory bank for a semantic similarity checkup during inference. Having ensured similarity in the input semantic space, we then minimize the discrepancy in the mask appearance distribution between the support set and the estimated mask appearance prior through similarity-weighted sampling and augmentation. We validate the proposed dual similarity checkup approach on eight publicly available medical segmentation datasets, and extensive experimental results demonstrate that it significantly improves the performance of existing ICL-based segmentation models, particularly on medical image datasets characterized by substantial diversity.
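The semantic similarity checkup can be pictured as nearest-neighbour retrieval over a memory bank of L2-normalised embeddings. A minimal numpy sketch (the function names and two-step interface are our own illustration, not the paper's implementation):

```python
import numpy as np

def build_memory_bank(embeddings):
    # L2-normalise the support-set embeddings so a dot product equals cosine similarity
    e = np.asarray(embeddings, dtype=float)
    return e / np.linalg.norm(e, axis=1, keepdims=True)

def select_in_context(bank, query, k=2):
    # rank stored samples by cosine similarity to the query embedding
    q = np.asarray(query, dtype=float)
    q = q / np.linalg.norm(q)
    sims = bank @ q
    top = np.argsort(-sims)[:k]
    return top, sims[top]
```

At inference, the k retrieved image-mask pairs would then serve as the in-context samples supplied to the ICL model.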
Title: Boosting Your Context by Dual Similarity Checkup for In-Context Learning Medical Image Segmentation
Journal: IEEE Transactions on Medical Imaging
Pub Date: 2024-08-08 | DOI: 10.1109/TMI.2024.3440227
Wanyu Bian, Albert Jang, Liping Zhang, Xiaonan Yang, Zachary Stewart, Fang Liu
This study introduces a novel image reconstruction technique based on a diffusion model conditioned on the native data domain. Our method is applied to multi-coil MRI and quantitative MRI (qMRI) reconstruction, leveraging the domain-conditioned diffusion model within the frequency and parameter domains. Prior MRI physics is used as an embedding in the diffusion model, enforcing data consistency to guide the training and sampling process, characterizing k-space encoding for MRI reconstruction, and leveraging MR signal modeling for qMRI reconstruction. Furthermore, gradient descent optimization is incorporated into the diffusion steps, enhancing feature learning and improving denoising. The proposed method shows significant promise, particularly for reconstructing images at high acceleration factors. Notably, it maintains high reconstruction accuracy for static and quantitative MRI reconstruction across diverse anatomical structures. Beyond its immediate applications, the method offers potential generalization capability, making it adaptable to inverse problems across various domains.
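The data-consistency guidance amounts to interleaving a gradient-descent step on the acquisition model between denoising steps. A hedged numpy sketch for a generic linear forward operator A (the paper's k-space and parameter-domain operators are more elaborate):

```python
import numpy as np

def data_consistency_step(x, A, y, lam=0.1):
    # one gradient step on 0.5 * ||A x - y||^2, pulling the current
    # diffusion estimate x toward agreement with the measured data y
    grad = A.T @ (A @ x - y)
    return x - lam * grad
```

Iterating this step between denoising updates drives the reconstruction residual toward zero while the diffusion prior shapes the image content.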
Title: Diffusion Modeling with Domain-conditioned Prior Guidance for Accelerated MRI and qMRI Reconstruction
Pub Date: 2024-08-07 | DOI: 10.1109/TMI.2024.3440009
Yuyang Wang, Xiaomo Liu, Liang Li
In dental cone-beam computed tomography (CBCT), metal implants can cause metal artifacts, degrading image quality and the final medical diagnosis. To reduce the impact of these artifacts, our proposed metal artifact reduction (MAR) method takes a novel approach: it integrates CBCT data with intraoral optical scanning data, using information from these two different modalities to correct metal artifacts in the projection domain with a guided diffusion model. The intraoral optical scanning data provides a more accurate generation domain for the diffusion model. We also propose a multi-channel generation method for the training and generation stages of the diffusion model that accounts for the physical mechanism of CBCT, ensuring the consistency of the generated projections. We present experimental results that demonstrate the feasibility and efficacy of our approach, which, to the best of our knowledge, is the first to introduce intraoral optical scanning data into the analysis and processing of projection-domain data with a diffusion model, and which modifies the diffusion model to better match the physical model of CBCT.
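For intuition, the classical projection-domain MAR baseline simply interpolates across the metal trace in each sinogram row; the paper's contribution is to replace this crude inpainting with a guided diffusion model conditioned on the intraoral optical scan. The baseline, as a numpy sketch:

```python
import numpy as np

def interpolate_metal_trace(sinogram, metal_mask):
    # replace metal-corrupted detector samples with linear interpolation
    # along each row -- the crude baseline that learned methods improve on
    out = np.asarray(sinogram, dtype=float).copy()
    cols = np.arange(out.shape[1])
    for i in range(out.shape[0]):
        bad = metal_mask[i].astype(bool)
        if bad.any() and (~bad).any():
            out[i, bad] = np.interp(cols[bad], cols[~bad], out[i, ~bad])
    return out
```

A diffusion-based replacement would generate the masked region conditioned on the uncorrupted samples (and, in this paper, on the optical-scan prior) rather than interpolating linearly.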
Title: Metal Artifacts Reducing Method Based on Diffusion Model Using Intraoral Optical Scanning Data for Dental Cone-beam CT
The ability to recover tissue deformation from visual features is fundamental for many robotic surgery applications. This has been a long-standing research topic in computer vision; however, it remains unsolved due to the complex dynamics of soft tissues under manipulation by surgical instruments. The ambiguous pixel correspondence caused by homogeneous texture makes dense and accurate tissue tracking even more challenging. In this paper, we propose a novel self-supervised framework to recover tissue deformations from stereo surgical videos. Our approach integrates semantics, cross-frame motion flow, and long-range temporal dependencies so that the recovered deformations represent actual tissue dynamics. Moreover, we incorporate diffeomorphic mapping to regularize the warping field to be physically realistic. To comprehensively evaluate our method, we collected stereo surgical video clips containing three types of tissue manipulation (i.e., pushing, dissection, and retraction) from two different types of surgeries (i.e., hemicolectomy and mesorectal excision). Our method achieves impressive results in capturing deformation as a 3D mesh and generalizes well across manipulations and surgeries. It also outperforms current state-of-the-art methods on non-rigid registration and optical flow estimation. To the best of our knowledge, this is the first work on self-supervised learning for dense tissue deformation modeling from stereo surgical videos. Our code will be released.
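Diffeomorphic regularisation is commonly implemented by integrating a stationary velocity field with scaling and squaring, which keeps the warp invertible (positive Jacobian). A 1D numpy sketch of that standard construction, offered for intuition only (the paper's cyclic formulation is more involved):

```python
import numpy as np

def warp_1d(field, disp):
    # sample `field` at x + disp(x) with linear interpolation (unit spacing)
    x = np.clip(np.arange(field.size) + disp, 0, field.size - 1)
    i0 = np.floor(x).astype(int)
    i1 = np.minimum(i0 + 1, field.size - 1)
    w = x - i0
    return (1 - w) * field[i0] + w * field[i1]

def exp_velocity(v, steps=6):
    # scaling and squaring: repeatedly compose the flow with itself,
    # doubling the integration time each pass
    d = v / (2.0 ** steps)
    for _ in range(steps):
        d = d + warp_1d(d, d)
    return d
```

For a smooth, moderate velocity field the resulting displacement has a strictly positive Jacobian, i.e. the warp does not fold.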
Title: Self-Supervised Cyclic Diffeomorphic Mapping for Soft Tissue Deformation Recovery in Robotic Surgery Scenes
Authors: Shizhan Gong, Yonghao Long, Kai Chen, Jiaqi Liu, Yuliang Xiao, Alexis Cheng, Zerui Wang, Qi Dou
Pub Date: 2024-08-07 | DOI: 10.1109/TMI.2024.3439701
Ultrasound vascular imaging (UVI) is a valuable tool for monitoring physiological states and evaluating pathological conditions. Advancing from conventional two-dimensional (2D) to three-dimensional (3D) UVI would enhance vasculature visualization and thereby improve reliability. The row-column array (RCA) has emerged as a promising approach for cost-effective ultrafast 3D imaging with a low channel count. However, ultrafast RCA imaging is often hampered by high-level sidelobe artifacts and low signal-to-noise ratio (SNR), which makes RCA-based UVI challenging. In this study, we propose a spatial-temporal similarity weighting (St-SW) method that overcomes these challenges by exploiting the incoherence of sidelobe artifacts and noise between datasets acquired with orthogonal transmissions. Simulation, in vitro blood flow phantom, and in vivo experiments were conducted to compare the proposed method with existing orthogonal plane wave imaging (OPW), row-column-specific frame-multiply-and-sum beamforming (RC-FMAS), and XDoppler techniques. Qualitative and quantitative results demonstrate the superior performance of the proposed method. In simulations, it reduced the sidelobe level by 31.3 dB, 20.8 dB, and 14.0 dB relative to OPW, XDoppler, and RC-FMAS, respectively. In the blood flow phantom experiment, it improved the contrast-to-noise ratio (CNR) of the tube by 26.8 dB, 25.5 dB, and 19.7 dB relative to OPW, XDoppler, and RC-FMAS, respectively. In the human submandibular gland experiment, it not only reconstructed a more complete vasculature but also improved the CNR by more than 15 dB compared to the same three methods. In summary, the proposed method effectively suppresses sidelobe artifacts and noise in images collected with an RCA under low-SNR conditions, leading to improved visualization of 3D vasculature.
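The core idea of similarity weighting can be sketched in a few lines: the two orthogonal-transmission ensembles agree where real vessels are and disagree where sidelobes and noise dominate, so a normalized cross-correlation over the ensemble dimension makes a natural suppression weight. An illustrative numpy version (a simplification, not the exact St-SW estimator):

```python
import numpy as np

def similarity_weighted_compound(a, b, eps=1e-9):
    # a, b: (frames, H, W) ensembles from the two orthogonal transmit sequences
    num = np.sum(a * b, axis=0)
    den = np.sqrt(np.sum(a**2, axis=0) * np.sum(b**2, axis=0)) + eps
    w = np.clip(num / den, 0.0, 1.0)  # per-pixel temporal cross-correlation
    # weight the coherent mean so incoherent sidelobes/noise are suppressed
    return w * 0.5 * (a.mean(axis=0) + b.mean(axis=0))
```

Pixels where the two acquisitions are coherent pass through nearly unchanged; pixels where they are incoherent are driven toward zero.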
Title: Enhancing Row-column Array (RCA)-based 3D Ultrasound Vascular Imaging with Spatial-temporal Similarity Weighting
Authors: Jingke Zhang, Chengwu Huang, U-Wai Lok, Zhijie Dong, Hui Liu, Ping Gong, Pengfei Song, Shigao Chen
Pub Date: 2024-08-06 | DOI: 10.1109/TMI.2024.3439615
Pub Date: 2024-08-06 | DOI: 10.1109/TMI.2024.3439573
Yikun Zhang, Dianlin Hu, Wangyao Li, Weijie Zhang, Gaoyu Chen, Ronald C Chen, Yang Chen, Hao Gao
This work demonstrates the feasibility of two-orthogonal-projection-based CBCT (2V-CBCT) reconstruction and dose calculation for radiation therapy (RT) using real projection data; to the best of our knowledge, it is the first 2V-CBCT feasibility study with real projection data. RT treatments are often delivered in multiple fractions, for which on-board CBCT is desirable for calculating the delivered dose per fraction for RT delivery quality assurance and adaptive RT. However, not all RT treatments/fractions have CBCT acquired, whereas two orthogonal projections are always available. The question addressed in this work is the feasibility of 2V-CBCT for RT dose calculation. 2V-CBCT is a severely ill-posed inverse problem, for which we propose a coarse-to-fine learning strategy. First, a 3D deep neural network that extracts and exploits both inter-slice and intra-slice information predicts the initial 3D volumes. Then, a 2D deep neural network fine-tunes the initial 3D volumes slice by slice. During the fine-tuning stage, a perceptual loss based on multi-frequency features enhances the image reconstruction. Dose calculation results for both photon and proton RT demonstrate that 2V-CBCT provides accuracy comparable to that of full-view CBCT based on real projection data.
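A multi-frequency perceptual loss can be sketched as an L1 discrepancy accumulated over radial FFT bands; the band count and the absence of per-band weighting here are our assumptions, not the paper's exact loss:

```python
import numpy as np

def multi_frequency_l1(pred, target, n_bands=3):
    # L1 discrepancy between the two images, accumulated band by band
    # over radial frequency shells of the 2D FFT
    P, T = np.fft.fft2(pred), np.fft.fft2(target)
    h, w = pred.shape
    yy, xx = np.meshgrid(np.fft.fftfreq(h), np.fft.fftfreq(w), indexing="ij")
    r = np.sqrt(yy**2 + xx**2)
    edges = np.linspace(0.0, r.max() + 1e-9, n_bands + 1)
    loss = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        band = (r >= lo) & (r < hi)
        loss += np.abs(P - T)[band].sum()
    return loss
```

Splitting the spectrum lets low-frequency (contrast) and high-frequency (edge) errors be tracked, and optionally reweighted, separately.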
Title: 2V-CBCT: Two-Orthogonal-Projection based CBCT Reconstruction and Dose Calculation for Radiation Therapy using Real Projection Data
Pub Date: 2024-08-05 | DOI: 10.1109/TMI.2024.3438564
Md Hadiur Rahman Khan, Raffaella Righetti
Assessment of the mechanical and transport properties of tissues using ultrasound elasticity imaging requires accurate estimation of the spatiotemporal distribution of volumetric strain. Due to physical constraints such as pitch limitation and the lack of phase information in the lateral direction, the quality of lateral strain estimation is typically much lower than that of axial strain estimation. In this paper, a novel lateral strain estimation technique based on the physics of compressible porous media is developed, tested, and validated. The technique is referred to as Poroelastography-based Ultrasound Lateral Strain Estimation (PULSE). PULSE differs from previously proposed lateral strain estimators in that it uses the underlying physics of internal fluid flow within a local tissue region as its theoretical foundation. PULSE establishes a relation between spatiotemporal changes in the axial strains and corresponding spatiotemporal changes in the lateral strains, effectively allowing assessment of lateral strains with quality comparable to that of axial strain estimators. We demonstrate that PULSE can also be used to accurately track compression-induced solid stress and fluid pressure in cancers using ultrasound poroelastography (USPE). We report the theoretical formulation of PULSE and its validation using finite element (FE) and ultrasound simulations. PULSE-generated results exhibit less than 5% percentage relative error (PRE) and greater than 90% structural similarity (SSIM) compared to ground-truth simulations. Experimental results are included to qualitatively assess the performance of PULSE in vivo. The proposed method can overcome the inherent limitations of non-axial strain imaging and improve the clinical translatability of USPE.
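To make the quantities concrete: under a small-strain, axisymmetric approximation the volumetric strain follows from the axial and lateral components, and a homogeneous linear-elastic material gives the simplest (toy) axial-to-lateral relation. Both formulas below are textbook simplifications for orientation only; PULSE itself derives the relation from spatiotemporal poroelastic fluid-flow physics, which this sketch does not model:

```python
import numpy as np

def volumetric_strain(axial, lateral):
    # axisymmetric small-strain approximation: the two lateral components are equal
    return np.asarray(axial, float) + 2.0 * np.asarray(lateral, float)

def lateral_from_axial_elastic(axial, nu_eff=0.3):
    # homogeneous linear-elastic stand-in: lateral = -nu_eff * axial
    # (nu_eff = 0.5 corresponds to an incompressible medium)
    return -nu_eff * np.asarray(axial, float)
```

The incompressible limit (nu_eff = 0.5) makes the volumetric strain vanish, which is exactly the regime where fluid flow, and hence a poroelastic treatment, matters.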
Title: A Novel Poroelastography Method for High-quality Estimation of Lateral Strain, Solid Stress and Fluid Pressure In Vivo
Pub Date: 2024-08-05 | DOI: 10.1109/TMI.2024.3435000
Ning Bi, Arezoo Zakeri, Yan Xia, Nina Cheng, Zeike A Taylor, Alejandro F Frangi, Ali Gooya
We propose a novel recurrent variational network, SegMorph, to perform concurrent segmentation and motion estimation on cardiac cine magnetic resonance (CMR) image sequences. Our model establishes a recurrent latent space that captures spatiotemporal features from cine-MRI sequences for multitask inference and synthesis. The model follows a recurrent variational auto-encoder framework and adopts a learnt prior from the temporal inputs. We utilise a multi-branch decoder to handle bi-ventricular segmentation and motion estimation simultaneously. In addition to the spatiotemporal features from the latent space, motion estimation enriches the supervision of the sequential segmentation task by providing pseudo-ground truth. Conversely, the segmentation branch aids motion estimation by predicting deformation vector fields (DVFs) based on anatomical information. Experimental results demonstrate that the proposed method outperforms state-of-the-art approaches both qualitatively and quantitatively on segmentation and motion estimation. For segmentation, we achieved an average Dice similarity coefficient (DSC) of 81% and an average Hausdorff distance below 3.5 mm. For motion estimation, we achieved a DSC of over 79%, with approximately 0.14% of pixels displaying a negative Jacobian determinant in the estimated DVFs.
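The negative-Jacobian figure reported above is a standard folding metric for deformation fields: the fraction of pixels where the determinant of the Jacobian of (identity + DVF) is non-positive. A numpy sketch for 2D fields (finite-difference approximation, our own utility rather than the paper's code):

```python
import numpy as np

def negative_jacobian_fraction(dvf):
    # dvf: (2, H, W) displacement field, components (dy, dx);
    # the deformation is phi(p) = p + dvf(p)
    dy_dy, dy_dx = np.gradient(dvf[0])
    dx_dy, dx_dx = np.gradient(dvf[1])
    jac = (1.0 + dy_dy) * (1.0 + dx_dx) - dy_dx * dx_dy
    # fraction of pixels where the local deformation folds (det <= 0)
    return float(np.mean(jac <= 0))
```

A value near zero, as reported here, indicates an almost everywhere invertible (fold-free) deformation.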
Title: SegMorph: Concurrent Motion Estimation and Segmentation for Cardiac MRI Sequences
Pub Date : 2024-08-02 DOI: 10.1109/TMI.2024.3437295
Boah Kim, Yan Zhuang, Tejas Sudharshan Mathai, Ronald M Summers
Deformable image registration is an essential step in medical image analysis. In particular, when diagnosing abdominal diseases such as hepatic cancer and lymphoma, multi-domain images scanned with different modalities or different imaging protocols are often used. However, these images are typically misaligned due to differences in scan timing, patient breathing, movement, etc. Although recent learning-based approaches can provide deformations in real time with high performance, multi-domain abdominal image registration using deep learning remains challenging, since images in different domains have different characteristics such as image contrast and intensity ranges. To address this, we propose a novel unsupervised multi-domain image registration framework using neural optimal transport, dubbed OTMorph. Given moving and fixed volumes as input, a transport module of our proposed model learns the optimal transport plan to map the data distribution of the moving volume to that of the fixed volume and estimates a domain-transported volume. Subsequently, a registration module takes the transported volume and effectively estimates the deformation field, improving deformation performance. Experimental results on multi-domain image registration using multi-modality and multi-parametric abdominal medical images demonstrate that the proposed method provides superior deformable registration via the domain-transported image, which alleviates the domain gap between the input images. We also attain improvements on out-of-distribution data, indicating the strong generalizability of our model to the registration of various medical images. Our source code is available at https://github.com/boahK/OTMorph.
{"title":"OTMorph: Unsupervised Multi-domain Abdominal Medical Image Registration Using Neural Optimal Transport.","authors":"Boah Kim, Yan Zhuang, Tejas Sudharshan Mathai, Ronald M Summers","doi":"10.1109/TMI.2024.3437295","DOIUrl":"https://doi.org/10.1109/TMI.2024.3437295","url":null,"abstract":"<p><p>Deformable image registration is an essential step in medical image analysis. In particular, when diagnosing abdominal diseases such as hepatic cancer and lymphoma, multi-domain images scanned with different modalities or different imaging protocols are often used. However, these images are typically misaligned due to differences in scan timing, patient breathing, movement, etc. Although recent learning-based approaches can provide deformations in real time with high performance, multi-domain abdominal image registration using deep learning remains challenging, since images in different domains have different characteristics such as image contrast and intensity ranges. To address this, we propose a novel unsupervised multi-domain image registration framework using neural optimal transport, dubbed OTMorph. Given moving and fixed volumes as input, a transport module of our proposed model learns the optimal transport plan to map the data distribution of the moving volume to that of the fixed volume and estimates a domain-transported volume. Subsequently, a registration module takes the transported volume and effectively estimates the deformation field, improving deformation performance. Experimental results on multi-domain image registration using multi-modality and multi-parametric abdominal medical images demonstrate that the proposed method provides superior deformable registration via the domain-transported image, which alleviates the domain gap between the input images. We also attain improvements on out-of-distribution data, indicating the strong generalizability of our model to the registration of various medical images. 
Our source code is available at https://github.com/boahK/OTMorph.</p>","PeriodicalId":94033,"journal":{"name":"IEEE transactions on medical imaging","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-08-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141879979","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
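OTMorph learns its transport map with a neural module; as a purely illustrative classical stand-in, the closed-form 1-D optimal transport between two images' intensity distributions reduces to quantile matching. A minimal numpy sketch (function name and toy data are ours) of how transporting intensities can narrow the domain gap before registration:

```python
import numpy as np

def intensity_transport(moving, fixed):
    """Map the moving image's intensity distribution onto the fixed
    image's via the closed-form 1-D optimal transport plan (quantile
    matching). Requires both images to have the same number of pixels.
    This is only a classical stand-in for OTMorph's learned neural
    transport module, to illustrate the idea of domain transport.
    """
    m = moving.ravel()
    order = np.argsort(m, kind="stable")
    targets = np.sort(fixed.ravel())
    out = np.empty_like(m, dtype=float)
    out[order] = targets  # i-th smallest moving value -> i-th smallest fixed value
    return out.reshape(moving.shape)

rng = np.random.default_rng(0)
mov = rng.normal(2.0, 0.5, (32, 32))   # e.g. one imaging protocol
fix = rng.normal(10.0, 3.0, (32, 32))  # a different protocol/modality
transported = intensity_transport(mov, fix)
# After transport, the intensity statistics match the fixed domain.
print(np.allclose(np.sort(transported.ravel()), np.sort(fix.ravel())))  # True
```

A downstream registration module would then align `transported` to `fix`, no longer having to bridge the intensity-range gap itself.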
Pub Date : 2024-08-01 DOI: 10.1109/TMI.2024.3429340
Qingjie Zeng, Yutong Xie, Zilin Lu, Mengkang Lu, Jingfeng Zhang, Yuyin Zhou, Yong Xia
Semi-supervised learning (SSL) has proven beneficial for mitigating the issue of limited labeled data, especially in volumetric medical image segmentation. Unlike previous SSL methods, which focus on exploring highly confident pseudo-labels or developing consistency regularization schemes, our empirical findings suggest that differential decoder features emerge naturally when two decoders strive to generate consistent predictions. Based on this observation, we first analyze the value of such discrepancies in learning toward consistency, under both pseudo-labeling and consistency regularization settings, and subsequently propose a novel SSL method called LeFeD, which learns from the feature-level discrepancies between two decoders by feeding them back to the encoder as feedback signals. The core design of LeFeD is to enlarge the discrepancies by training differential decoders and then learn from the differential features iteratively. We evaluate LeFeD against eight state-of-the-art (SOTA) methods on three public datasets. Experiments show that LeFeD surpasses competitors without any bells and whistles, such as uncertainty estimation or strong constraints, and sets a new state of the art for semi-supervised medical image segmentation. Code has been released at https://github.com/maxwell0027/LeFeD.
{"title":"Consistency-guided Differential Decoding for Enhancing Semi-supervised Medical Image Segmentation.","authors":"Qingjie Zeng, Yutong Xie, Zilin Lu, Mengkang Lu, Jingfeng Zhang, Yuyin Zhou, Yong Xia","doi":"10.1109/TMI.2024.3429340","DOIUrl":"10.1109/TMI.2024.3429340","url":null,"abstract":"<p><p>Semi-supervised learning (SSL) has proven beneficial for mitigating the issue of limited labeled data, especially in volumetric medical image segmentation. Unlike previous SSL methods, which focus on exploring highly confident pseudo-labels or developing consistency regularization schemes, our empirical findings suggest that differential decoder features emerge naturally when two decoders strive to generate consistent predictions. Based on this observation, we first analyze the value of such discrepancies in learning toward consistency, under both pseudo-labeling and consistency regularization settings, and subsequently propose a novel SSL method called LeFeD, which learns from the feature-level discrepancies between two decoders by feeding them back to the encoder as feedback signals. The core design of LeFeD is to enlarge the discrepancies by training differential decoders and then learn from the differential features iteratively. We evaluate LeFeD against eight state-of-the-art (SOTA) methods on three public datasets. Experiments show that LeFeD surpasses competitors without any bells and whistles, such as uncertainty estimation or strong constraints, and sets a new state of the art for semi-supervised medical image segmentation. 
Code has been released at https://github.com/maxwell0027/LeFeD.</p>","PeriodicalId":94033,"journal":{"name":"IEEE transactions on medical imaging","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141876992","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
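The loop LeFeD describes — compute the two decoders' feature-level discrepancy, feed it back to the encoder, iterate — can be caricatured in a few lines. All names, shapes, and the 0.1 feedback scale below are our own toy choices; the actual method uses trained volumetric encoder/decoder networks and a segmentation objective:

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy stand-ins: one shared linear "encoder" and two differently
# initialised linear "decoders" reading the same latent code.
W_enc = rng.normal(size=(8, 16))
W_dec1 = rng.normal(size=(16, 8))
W_dec2 = rng.normal(size=(16, 8))

def forward(x, feedback=None):
    """One pass: the decoder feature discrepancy from the previous
    pass (if any) is injected back into the encoder input."""
    if feedback is not None:
        x = x + feedback
    z = np.tanh(x @ W_enc)                       # shared encoding
    f1, f2 = np.tanh(z @ W_dec1), np.tanh(z @ W_dec2)
    return f1, f2

x = rng.normal(size=(4, 8))
feedback = None
for step in range(3):                            # iterative refinement
    f1, f2 = forward(x, feedback)
    disc = f1 - f2                               # feature-level discrepancy
    feedback = 0.1 * disc                        # feedback signal to encoder
    print(step, float(np.abs(disc).mean()))
```

In LeFeD proper, the decoders are additionally trained to be differential (enlarging `disc`) so that the feedback carries informative signal rather than noise.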