Ultrasound vascular imaging (UVI) is a valuable tool for monitoring physiological states and evaluating pathological conditions. Advancing from conventional two-dimensional (2D) to three-dimensional (3D) UVI would enhance vasculature visualization, thereby improving its reliability. The row-column array (RCA) has emerged as a promising approach for cost-effective ultrafast 3D imaging with a low channel count. However, ultrafast RCA imaging is often hampered by high-level sidelobe artifacts and low signal-to-noise ratio (SNR), which makes RCA-based UVI challenging. In this study, we propose a spatial-temporal similarity weighting (St-SW) method to overcome these challenges by exploiting the incoherence of sidelobe artifacts and noise between datasets acquired using orthogonal transmissions. Simulation, in vitro blood flow phantom, and in vivo experiments were conducted to compare the proposed method with existing orthogonal plane wave imaging (OPW), row-column-specific frame-multiply-and-sum beamforming (RC-FMAS), and XDoppler techniques. Qualitative and quantitative results demonstrate the superior performance of the proposed method. In simulations, the proposed method reduced the sidelobe level by 31.3 dB, 20.8 dB, and 14.0 dB compared to OPW, XDoppler, and RC-FMAS, respectively. In the blood flow phantom experiment, the proposed method significantly improved the contrast-to-noise ratio (CNR) of the tube by 26.8 dB, 25.5 dB, and 19.7 dB compared to the OPW, XDoppler, and RC-FMAS methods, respectively. In the human submandibular gland experiment, it not only reconstructed a more complete vasculature but also improved the CNR by more than 15 dB compared to the OPW, XDoppler, and RC-FMAS methods. In summary, the proposed method effectively suppresses sidelobe artifacts and noise in images collected using an RCA under low-SNR conditions, leading to improved visualization of 3D vasculature.
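As a rough illustration of the weighting idea described above, the hypothetical sketch below compares the row-transmit and column-transmit datasets with a local normalized cross-correlation over a spatiotemporal window and uses it to weight the compounded volume. The actual St-SW similarity metric, window sizes, and the function name st_similarity_weight are assumptions for illustration, not details taken from the paper.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def st_similarity_weight(vol_row, vol_col, win=(3, 3, 3, 5), eps=1e-8):
    """Hypothetical sketch of spatial-temporal similarity weighting (St-SW).

    vol_row, vol_col: envelope/power volumes from row- and column-transmit
    acquisitions, shape (x, y, z, t). Sidelobe artifacts and noise are assumed
    to be incoherent between the two datasets, so a local normalized
    cross-correlation stays high only where true blood signal is present.
    """
    mean_r = uniform_filter(vol_row, size=win)
    mean_c = uniform_filter(vol_col, size=win)
    cov = uniform_filter(vol_row * vol_col, size=win) - mean_r * mean_c
    var_r = uniform_filter(vol_row ** 2, size=win) - mean_r ** 2
    var_c = uniform_filter(vol_col ** 2, size=win) - mean_c ** 2
    ncc = cov / np.sqrt(np.clip(var_r, 0, None) * np.clip(var_c, 0, None) + eps)
    weight = np.clip(ncc, 0.0, 1.0)            # suppress incoherent (artifact/noise) regions
    return weight * 0.5 * (vol_row + vol_col)  # similarity-weighted compounded volume
```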
{"title":"Enhancing Row-column array (RCA)-based 3D ultrasound vascular imaging with spatial-temporal similarity weighting.","authors":"Jingke Zhang, Chengwu Huang, U-Wai Lok, Zhijie Dong, Hui Liu, Ping Gong, Pengfei Song, Shigao Chen","doi":"10.1109/TMI.2024.3439615","DOIUrl":"https://doi.org/10.1109/TMI.2024.3439615","url":null,"abstract":"<p><p>Ultrasound vascular imaging (UVI) is a valuable tool for monitoring the physiological states and evaluating the pathological diseases. Advancing from conventional two-dimensional (2D) to three-dimensional (3D) UVI would enhance the vasculature visualization, thereby improving its reliability. Row-column array (RCA) has emerged as a promising approach for cost-effective ultrafast 3D imaging with a low channel count. However, ultrafast RCA imaging is often hampered by high-level sidelobe artifacts and low signal-to-noise ratio (SNR), which makes RCA-based UVI challenging. In this study, we propose a spatial-temporal similarity weighting (St-SW) method to overcome these challenges by exploiting the incoherence of sidelobe artifacts and noise between datasets acquired using orthogonal transmissions. Simulation, in vitro blood flow phantom, and in vivo experiments were conducted to compare the proposed method with existing orthogonal plane wave imaging (OPW), row-column-specific frame-multiply-and-sum beamforming (RC-FMAS), and XDoppler techniques. Qualitative and quantitative results demonstrate the superior performance of the proposed method. In simulations, the proposed method reduced the sidelobe level by 31.3 dB, 20.8 dB, and 14.0 dB, compared to OPW, XDoppler, and RC-FMAS, respectively. In the blood flow phantom experiment, the proposed method significantly improved the contrast-to-noise ratio (CNR) of the tube by 26.8 dB, 25.5 dB, and 19.7 dB, compared to OPW, XDoppler, and RC-FMAS methods, respectively. In the human submandibular gland experiment, it not only reconstructed a more complete vasculature but also improved the CNR by more than 15 dB, compared to OPW, XDoppler, and RC-FMAS methods. In summary, the proposed method effectively suppresses the side-lobe artifacts and noise in images collected using an RCA under low SNR conditions, leading to improved visualization of 3D vasculatures.</p>","PeriodicalId":94033,"journal":{"name":"IEEE transactions on medical imaging","volume":"PP ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141899235","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
2V-CBCT: Two-Orthogonal-Projection based CBCT Reconstruction and Dose Calculation for Radiation Therapy using Real Projection Data
Pub Date: 2024-08-06, DOI: 10.1109/TMI.2024.3439573
Yikun Zhang, Dianlin Hu, Wangyao Li, Weijie Zhang, Gaoyu Chen, Ronald C Chen, Yang Chen, Hao Gao
This work demonstrates the feasibility of two-orthogonal-projection-based CBCT (2V-CBCT) reconstruction and dose calculation for radiation therapy (RT) using real projection data; to the best of our knowledge, this is the first 2V-CBCT feasibility study with real projection data. RT treatments are often delivered in multiple fractions, for which on-board CBCT is desirable for calculating the delivered dose per fraction for RT delivery quality assurance and adaptive RT. However, not all RT treatments/fractions have CBCT acquired, whereas two orthogonal projections are always available. The question addressed in this work is the feasibility of 2V-CBCT for RT dose calculation. 2V-CBCT is a severely ill-posed inverse problem, for which we propose a coarse-to-fine learning strategy. First, a 3D deep neural network that can extract and exploit inter-slice and intra-slice information is adopted to predict the initial 3D volumes. Then, a 2D deep neural network is utilized to fine-tune the initial 3D volumes slice-by-slice. During the fine-tuning stage, a perceptual loss based on multi-frequency features is employed to enhance the image reconstruction. Dose calculation results from both photon and proton RT demonstrate that 2V-CBCT provides accuracy comparable to that of full-view CBCT based on real projection data.
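To make the coarse-to-fine strategy concrete, here is a minimal PyTorch skeleton assuming a cubic volume whose two orthogonal projections are lifted to 3D by simple repetition along their ray axes. The layer counts, channel sizes, and the class name CoarseToFine2VCBCT are illustrative assumptions rather than the architecture used in the paper, and the multi-frequency perceptual loss is omitted.

```python
import torch
import torch.nn as nn

class CoarseToFine2VCBCT(nn.Module):
    """Toy skeleton: a 3D network predicts an initial volume from two
    orthogonal projections, then a 2D network refines it slice-by-slice."""

    def __init__(self, depth=128, ch=16):
        super().__init__()
        self.depth = depth
        self.coarse3d = nn.Sequential(            # exploits inter- and intra-slice information
            nn.Conv3d(2, ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv3d(ch, ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv3d(ch, 1, 3, padding=1),
        )
        self.fine2d = nn.Sequential(              # slice-wise refinement
            nn.Conv2d(1, ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(ch, 1, 3, padding=1),
        )

    def forward(self, proj_ap, proj_lat):
        # proj_ap, proj_lat: (B, 1, H, W) orthogonal projections; assumes D == H == W.
        vol_ap = proj_ap.unsqueeze(2).repeat(1, 1, self.depth, 1, 1)
        vol_lat = proj_lat.unsqueeze(2).repeat(1, 1, self.depth, 1, 1).transpose(2, 4)
        coarse = self.coarse3d(torch.cat([vol_ap, vol_lat], dim=1))    # (B, 1, D, H, W)
        slices = [self.fine2d(coarse[:, :, d]) for d in range(coarse.shape[2])]
        return coarse, torch.stack(slices, dim=2)  # initial and refined volumes
```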
{"title":"2V-CBCT: Two-Orthogonal-Projection based CBCT Reconstruction and Dose Calculation for Radiation Therapy using Real Projection Data.","authors":"Yikun Zhang, Dianlin Hu, Wangyao Li, Weijie Zhang, Gaoyu Chen, Ronald C Chen, Yang Chen, Hao Gao","doi":"10.1109/TMI.2024.3439573","DOIUrl":"10.1109/TMI.2024.3439573","url":null,"abstract":"<p><p>This work demonstrates the feasibility of two-orthogonal-projection-based CBCT (2V-CBCT) reconstruction and dose calculation for radiation therapy (RT) using real projection data, which is the first 2V-CBCT feasibility study with real projection data, to the best of our knowledge. RT treatments are often delivered in multiple fractions, for which on-board CBCT is desirable to calculate the delivered dose per fraction for the purpose of RT delivery quality assurance and adaptive RT. However, not all RT treatments/fractions have CBCT acquired, but two orthogonal projections are always available. The question to be addressed in this work is the feasibility of 2V-CBCT for the purpose of RT dose calculation. 2V-CBCT is a severely ill-posed inverse problem for which we propose a coarse-to-fine learning strategy. First, a 3D deep neural network that can extract and exploit the inter-slice and intra-slice information is adopted to predict the initial 3D volumes. Then, a 2D deep neural network is utilized to fine-tune the initial 3D volumes slice-by-slice. During the fine-tuning stage, a perceptual loss based on multi-frequency features is employed to enhance the image reconstruction. Dose calculation results from both photon and proton RT demonstrate that 2V-CBCT provides comparable accuracy with full-view CBCT based on real projection data.</p>","PeriodicalId":94033,"journal":{"name":"IEEE transactions on medical imaging","volume":"PP ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141899234","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A Novel Poroelastography Method for High-quality Estimation of Lateral Strain, Solid Stress and Fluid Pressure In Vivo
Pub Date: 2024-08-05, DOI: 10.1109/TMI.2024.3438564
Md Hadiur Rahman Khan, Raffaella Righetti
Assessment of the mechanical and transport properties of tissues using ultrasound elasticity imaging requires accurate estimation of the spatiotemporal distribution of volumetric strain. Due to physical constraints such as pitch limitation and the lack of phase information in the lateral direction, the quality of lateral strain estimation is typically much lower than that of axial strain estimation. In this paper, a novel lateral strain estimation technique based on the physics of compressible porous media is developed, tested, and validated. This technique is referred to as "Poroelastography-based Ultrasound Lateral Strain Estimation" (PULSE). PULSE differs from previously proposed lateral strain estimators in that it uses the underlying physics of internal fluid flow within a local region of the tissue as its theoretical foundation. PULSE establishes a relation between spatiotemporal changes in the axial strains and corresponding spatiotemporal changes in the lateral strains, effectively allowing assessment of lateral strains with quality comparable to that of axial strain estimators. We demonstrate that PULSE can also be used to accurately track compression-induced solid stresses and fluid pressure in cancers using ultrasound poroelastography (USPE). In this study, we report the theoretical formulation of PULSE and its validation using finite element (FE) and ultrasound simulations. PULSE-generated results exhibit a percentage relative error (PRE) of less than 5% and a structural similarity index (SSIM) greater than 90% compared to ground-truth simulations. Experimental results are included to qualitatively assess the performance of PULSE in vivo. The proposed method can be used to overcome the inherent limitations of non-axial strain imaging and improve the clinical translatability of USPE.
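The abstract does not give the PULSE equations, so the sketch below only illustrates the general idea of tying the temporal evolution of axial strain to lateral strain through a simple poroelastic creep assumption: axisymmetric deformation (equal transverse strains), an exponential relaxation of the local response, and an assumed drained effective Poisson's ratio. The function name lateral_strain_from_axial and the specific model are hypothetical placeholders, not the paper's formulation.

```python
import numpy as np
from scipy.optimize import curve_fit

def lateral_strain_from_axial(eps_ax, t, nu_eff=0.3):
    """Illustrative placeholder, not the PULSE formulation.

    eps_ax: axial strain time histories, shape (ny, nx, nt), under sustained compression.
    t: acquisition times, shape (nt,).
    Assumes axisymmetry (lateral == elevational strain) and an exponential creep
    whose drained volumetric strain is set by an effective Poisson's ratio.
    """
    def creep(t, e0, e1, tau):
        return e0 + e1 * (1.0 - np.exp(-t / tau))

    eps_lat = np.zeros_like(eps_ax)
    for i in range(eps_ax.shape[0]):
        for j in range(eps_ax.shape[1]):
            ax = eps_ax[i, j]
            p0 = (ax[0], ax[-1] - ax[0], max(t[-1] / 3.0, 1e-3))
            (e0, e1, tau), _ = curve_fit(creep, t, ax, p0=p0, maxfev=5000)
            # Assumed volumetric strain: zero at t=0 (undrained, nearly incompressible)
            # and approaching a drained value governed by nu_eff with the same
            # local time constant as the fitted axial creep.
            eps_v = (e0 + e1) * (1.0 - 2.0 * nu_eff) * (1.0 - np.exp(-t / tau))
            # Axisymmetry: eps_v = eps_ax + 2 * eps_lat
            eps_lat[i, j] = 0.5 * (eps_v - ax)
    return eps_lat
```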
{"title":"A Novel Poroelastography Method for High-quality Estimation of Lateral Strain, Solid Stress and Fluid Pressure In Vivo.","authors":"Md Hadiur Rahman Khan, Raffaella Righetti","doi":"10.1109/TMI.2024.3438564","DOIUrl":"https://doi.org/10.1109/TMI.2024.3438564","url":null,"abstract":"<p><p>Assessment of mechanical and transport properties of tissues using ultrasound elasticity imaging requires accurate estimations of the spatiotemporal distribution of volumetric strain. Due to physical constraints such as pitch limitation and the lack of phase information in the lateral direction, the quality of lateral strain estimation is typically significantly lower than the quality of axial strain estimation. In this paper, a novel lateral strain estimation technique based on the physics of compressible porous media is developed, tested and validated. This technique is referred to as \"Poroelastography-based Ultrasound Lateral Strain Estimation\" (PULSE). PULSE differs from previously proposed lateral strain estimators as it uses the underlying physics of internal fluid flow within a local region of the tissue as theoretical foundation. PULSE establishes a relation between spatiotemporal changes in the axial strains and corresponding spatiotemporal changes in the lateral strains, effectively allowing assessment of lateral strains with comparable quality of axial strain estimators. We demonstrate that PULSE can also be used to accurately track compression-induced solid stresses and fluid pressure in cancers using ultrasound poroelastography (USPE). In this study, we report the theoretical formulation for PULSE and validation using finite element (FE) and ultrasound simulations. PULSE-generated results exhibit less than 5% percentage relative error (PRE) and greater than 90% structural similarity index (SSIM) compared to ground truth simulations. Experimental results are included to qualitatively assess the performance of PULSE in vivo. The proposed method can be used to overcome the inherent limitations of non-axial strain imaging and improve clinical translatability of USPE.</p>","PeriodicalId":94033,"journal":{"name":"IEEE transactions on medical imaging","volume":"PP ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141895018","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
SegMorph: Concurrent Motion Estimation and Segmentation for Cardiac MRI Sequences
Pub Date: 2024-08-05, DOI: 10.1109/TMI.2024.3435000
Ning Bi, Arezoo Zakeri, Yan Xia, Nina Cheng, Zeike A Taylor, Alejandro F Frangi, Ali Gooya
We propose a novel recurrent variational network, SegMorph, to perform concurrent segmentation and motion estimation on cardiac cine magnetic resonance (CMR) image sequences. Our model establishes a recurrent latent space that captures spatiotemporal features from cine-MRI sequences for multitask inference and synthesis. The proposed model follows a recurrent variational auto-encoder framework and adopts a learnt prior from the temporal inputs. We utilise a multi-branch decoder to handle bi-ventricular segmentation and motion estimation simultaneously. In addition to the spatiotemporal features from the latent space, motion estimation enriches the supervision of the sequential segmentation task by providing pseudo-ground truth. Conversely, the segmentation branch helps with motion estimation by predicting deformation vector fields (DVFs) based on anatomical information. Experimental results demonstrate that the proposed method performs better than state-of-the-art approaches, both qualitatively and quantitatively, on segmentation and motion estimation tasks. For segmentation, we achieved an average Dice similarity coefficient (DSC) of 81% and an average Hausdorff distance of less than 3.5 mm. For motion estimation, we achieved a DSC of over 79%, with approximately 0.14% of pixels displaying a negative Jacobian determinant in the estimated DVFs.
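A highly simplified skeleton of the recurrent, multi-branch design is sketched below. The ConvGRU-style latent update, channel sizes, and the class name SegMorphSketch are assumptions for illustration; the learnt prior, the losses, and the pseudo-ground-truth warping are omitted.

```python
import torch
import torch.nn as nn

class SegMorphSketch(nn.Module):
    """Toy skeleton: a recurrent latent space shared by a segmentation branch
    and a motion (DVF) branch, processing a cine sequence frame by frame."""

    def __init__(self, ch=32, n_classes=3):
        super().__init__()
        self.ch = ch
        self.encoder = nn.Sequential(nn.Conv2d(1, ch, 3, padding=1), nn.ReLU(inplace=True))
        self.to_mu = nn.Conv2d(2 * ch, ch, 1)        # posterior stats from frame features + hidden state
        self.to_logvar = nn.Conv2d(2 * ch, ch, 1)
        self.update_h = nn.Conv2d(2 * ch, ch, 3, padding=1)      # recurrent latent-space update
        self.seg_head = nn.Conv2d(ch, n_classes, 3, padding=1)   # bi-ventricular segmentation branch
        self.dvf_head = nn.Conv2d(ch, 2, 3, padding=1)           # deformation-vector-field branch

    def forward(self, frames):                       # frames: (B, T, 1, H, W)
        B, T, _, H, W = frames.shape
        h = frames.new_zeros(B, self.ch, H, W)
        segs, dvfs = [], []
        for t in range(T):
            feat = self.encoder(frames[:, t])
            mu = self.to_mu(torch.cat([feat, h], 1))
            logvar = self.to_logvar(torch.cat([feat, h], 1))
            z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterisation
            h = torch.tanh(self.update_h(torch.cat([z, h], 1)))
            segs.append(self.seg_head(h))
            dvfs.append(self.dvf_head(h))
        return torch.stack(segs, 1), torch.stack(dvfs, 1)
```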
{"title":"SegMorph: Concurrent Motion Estimation and Segmentation for Cardiac MRI Sequences.","authors":"Ning Bi, Arezoo Zakeri, Yan Xia, Nina Cheng, Zeike A Taylor, Alejandro F Frangi, Ali Gooya","doi":"10.1109/TMI.2024.3435000","DOIUrl":"https://doi.org/10.1109/TMI.2024.3435000","url":null,"abstract":"<p><p>We propose a novel recurrent variational network, SegMorph, to perform concurrent segmentation and motion estimation on cardiac cine magnetic resonance image (CMR) sequences. Our model establishes a recurrent latent space that captures spatiotemporal features from cine-MRI sequences for multitask inference and synthesis. The proposed model follows a recurrent variational auto-encoder framework and adopts a learnt prior from the temporal inputs. We utilise a multi-branch decoder to handle bi-ventricular segmentation and motion estimation simultaneously. In addition to the spatiotemporal features from the latent space, motion estimation enriches the supervision of sequential segmentation tasks by providing pseudo-ground truth. On the other hand, the segmentation branch helps with motion estimation by predicting deformation vector fields (DVFs) based on anatomical information. Experimental results demonstrate that the proposed method performs better than state-of-the-art approaches qualitatively and quantitatively for both segmentation and motion estimation tasks. We achieved an 81% average Dice Similarity Coefficient (DSC) and a less than 3.5 mm average Hausdorff distance on segmentation. Meanwhile, we achieved a motion estimation Dice Similarity Coefficient of over 79%, with approximately 0.14% of pixels displaying a negative Jacobian determinant in the estimated DVFs.</p>","PeriodicalId":94033,"journal":{"name":"IEEE transactions on medical imaging","volume":"PP ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141895072","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
OTMorph: Unsupervised Multi-domain Abdominal Medical Image Registration Using Neural Optimal Transport
Pub Date: 2024-08-02, DOI: 10.1109/TMI.2024.3437295
Boah Kim, Yan Zhuang, Tejas Sudharshan Mathai, Ronald M Summers
Deformable image registration is one of the essential processes in analyzing medical images. In particular, when diagnosing abdominal diseases such as hepatic cancer and lymphoma, multi-domain images scanned with different modalities or different imaging protocols are often used. However, these images are not aligned, due to differences in scanning time, patient breathing, movement, etc. Although recent learning-based approaches can provide deformations in real time with high performance, multi-domain abdominal image registration using deep learning remains challenging, since images in different domains have different characteristics such as image contrast and intensity ranges. To address this, this paper proposes a novel unsupervised multi-domain image registration framework using neural optimal transport, dubbed OTMorph. Given moving and fixed volumes as input, a transport module of the proposed model learns the optimal transport plan to map the data distribution of the moving volume to that of the fixed volume and estimates a domain-transported volume. Subsequently, a registration module taking the transported volume can effectively estimate the deformation field, leading to improved deformation performance. Experimental results on multi-domain image registration using multi-modality and multi-parametric abdominal medical images demonstrate that the proposed method provides superior deformable registration via the domain-transported image, which alleviates the domain gap between the input images. We also attain improvements on out-of-distribution data, which indicates the superior generalizability of our model for the registration of various medical images. Our source code is available at https://github.com/boahK/OTMorph.
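A minimal two-stage skeleton of this transport-then-register pipeline is sketched below, assuming simple 3D convolutional stacks. The class name OTMorphSketch and the layer choices are placeholders, and the neural optimal-transport training objective and the spatial warping layer are omitted.

```python
import torch
import torch.nn as nn

class OTMorphSketch(nn.Module):
    """Toy two-stage pipeline: a transport module produces a domain-transported
    moving volume, then a registration module predicts a 3D deformation field
    from the transported and fixed volumes."""

    def __init__(self, ch=16):
        super().__init__()
        self.transport = nn.Sequential(           # maps the moving volume toward the fixed domain
            nn.Conv3d(1, ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv3d(ch, 1, 3, padding=1),
        )
        self.register = nn.Sequential(            # estimates the deformation from the transported pair
            nn.Conv3d(2, ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv3d(ch, 3, 3, padding=1),
        )

    def forward(self, moving, fixed):              # (B, 1, D, H, W) volumes
        transported = self.transport(moving)       # trained (not shown) to match the fixed-domain distribution
        dvf = self.register(torch.cat([transported, fixed], 1))  # (B, 3, D, H, W) displacements
        return transported, dvf
```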
{"title":"OTMorph: Unsupervised Multi-domain Abdominal Medical Image Registration Using Neural Optimal Transport.","authors":"Boah Kim, Yan Zhuang, Tejas Sudharshan Mathai, Ronald M Summers","doi":"10.1109/TMI.2024.3437295","DOIUrl":"https://doi.org/10.1109/TMI.2024.3437295","url":null,"abstract":"<p><p>Deformable image registration is one of the essential processes in analyzing medical images. In particular, when diagnosing abdominal diseases such as hepatic cancer and lymphoma, multi-domain images scanned from different modalities or different imaging protocols are often used. However, they are not aligned due to scanning times, patient breathing, movement, etc. Although recent learning-based approaches can provide deformations in real-time with high performance, multi-domain abdominal image registration using deep learning is still challenging since the images in different domains have different characteristics such as image contrast and intensity ranges. To address this, this paper proposes a novel unsupervised multi-domain image registration framework using neural optimal transport, dubbed OTMorph. When moving and fixed volumes are given as input, a transport module of our proposed model learns the optimal transport plan to map data distributions from the moving to the fixed volumes and estimates a domain-transported volume. Subsequently, a registration module taking the transported volume can effectively estimate the deformation field, leading to deformation performance improvement. Experimental results on multi-domain image registration using multi-modality and multi-parametric abdominal medical images demonstrate that the proposed method provides superior deformable registration via the domain-transported image that alleviates the domain gap between the input images. Also, we attain the improvement even on out-of-distribution data, which indicates the superior generalizability of our model for the registration of various medical images. Our source code is available at https://github.com/boahK/OTMorph.</p>","PeriodicalId":94033,"journal":{"name":"IEEE transactions on medical imaging","volume":"PP ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141879979","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Semi-supervised learning (SSL) has proven beneficial for mitigating the issue of limited labeled data, especially for volumetric medical image segmentation. Unlike previous SSL methods, which focus on exploring highly confident pseudo-labels or developing consistency regularization schemes, our empirical findings suggest that differential decoder features emerge naturally when two decoders strive to generate consistent predictions. Based on this observation, we first analyze the value of such discrepancies in learning towards consistency, under both pseudo-labeling and consistency regularization settings, and subsequently propose a novel SSL method called LeFeD, which learns the feature-level discrepancies obtained from two decoders by feeding this information as a feedback signal to the encoder. The core design of LeFeD is to enlarge the discrepancies by training differential decoders and then learn from the differential features iteratively. We evaluate LeFeD against eight state-of-the-art (SOTA) methods on three public datasets. Experiments show that LeFeD surpasses competitors without any bells and whistles, such as uncertainty estimation or strong constraints, and sets a new state of the art for semi-supervised medical image segmentation. Code has been released at https://github.com/maxwell0027/LeFeD.
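A toy version of the discrepancy-feedback loop is sketched below. The single-convolution decoders, the number of feedback iterations, and the class name LeFeDSketch are illustrative assumptions, and the semi-supervised losses are omitted.

```python
import torch
import torch.nn as nn

class LeFeDSketch(nn.Module):
    """Toy discrepancy-feedback loop: two differential decoders share one
    encoder, and their discrepancy is embedded and fed back to the encoder
    as an extra input on the next pass."""

    def __init__(self, ch=16, n_classes=2, iters=2):
        super().__init__()
        self.iters = iters
        self.encoder = nn.Sequential(nn.Conv3d(1 + ch, ch, 3, padding=1), nn.ReLU(inplace=True))
        self.dec_a = nn.Conv3d(ch, n_classes, 3, padding=1)   # decoder variant A
        self.dec_b = nn.Conv3d(ch, n_classes, 3, padding=1)   # decoder variant B
        self.embed = nn.Conv3d(n_classes, ch, 1)              # discrepancy -> feedback features

    def forward(self, x):                                     # x: (B, 1, D, H, W)
        B, _, D, H, W = x.shape
        feedback = x.new_zeros(B, self.embed.out_channels, D, H, W)
        for _ in range(self.iters):
            feat = self.encoder(torch.cat([x, feedback], 1))
            pa, pb = self.dec_a(feat), self.dec_b(feat)
            feedback = self.embed(torch.abs(pa - pb))         # learn from the discrepancy iteratively
        return pa, pb
```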
{"title":"Consistency-guided Differential Decoding for Enhancing Semi-supervised Medical Image Segmentation.","authors":"Qingjie Zeng, Yutong Xie, Zilin Lu, Mengkang Lu, Jingfeng Zhang, Yuyin Zhou, Yong Xia","doi":"10.1109/TMI.2024.3429340","DOIUrl":"10.1109/TMI.2024.3429340","url":null,"abstract":"<p><p>Semi-supervised learning (SSL) has been proven beneficial for mitigating the issue of limited labeled data, especially on volumetric medical image segmentation. Unlike previous SSL methods which focus on exploring highly confident pseudo-labels or developing consistency regularization schemes, our empirical findings suggest that differential decoder features emerge naturally when two decoders strive to generate consistent predictions. Based on the observation, we first analyze the treasure of discrepancy in learning towards consistency, under both pseudo-labeling and consistency regularization settings, and subsequently propose a novel SSL method called LeFeD, which learns the feature-level discrepancies obtained from two decoders, by feeding such information as feedback signals to the encoder. The core design of LeFeD is to enlarge the discrepancies by training differential decoders, and then learn from the differential features iteratively. We evaluate LeFeD against eight state-of-the-art (SOTA) methods on three public datasets. Experiments show LeFeD surpasses competitors without any bells and whistles, such as uncertainty estimation and strong constraints, as well as setting a new state of the art for semi-supervised medical image segmentation. Code has been released at https://github.com/maxwell0027/LeFeD.</p>","PeriodicalId":94033,"journal":{"name":"IEEE transactions on medical imaging","volume":"PP ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141876992","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
IGU-Aug: Information-guided unsupervised augmentation and pixel-wise contrastive learning for medical image analysis
Pub Date: 2024-08-01, DOI: 10.1109/TMI.2024.3436713
Quan Quan, Qingsong Yao, Heqin Zhu, S Kevin Zhou
Contrastive learning (CL) is a form of self-supervised learning and has been widely used for various tasks. Different from the widely studied instance-level contrastive learning, pixel-wise contrastive learning mainly helps with pixel-wise dense prediction tasks. In pixel-wise CL, the counterpart to an instance in instance-level CL is a pixel together with its neighboring context. Aiming to build better feature representations, there is a vast literature on designing instance augmentation strategies for instance-level CL, but there is little comparable work on pixel augmentation for pixel-wise CL at a pixel granularity. In this paper, we attempt to bridge this gap. We first classify a pixel into three categories, namely low-, medium-, and high-informative, based on the information quantity the pixel contains. We then adaptively design separate augmentation strategies for each category in terms of augmentation intensity and sampling ratio. Extensive experiments validate that our information-guided pixel augmentation strategy succeeds in encoding more discriminative representations and surpasses other competitive approaches in unsupervised local feature matching. Furthermore, our pretrained model improves the performance of both one-shot and fully supervised models. To the best of our knowledge, we are the first to propose a pixel augmentation method with a pixel granularity for enhancing unsupervised pixel-wise contrastive learning. Code is available at https://github.com/Curli-quan/IGU-Aug.
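The pixel-categorization step can be illustrated as below, using local entropy as a stand-in for the information quantity and tercile thresholds for the three categories. The measure, thresholds, function name categorize_pixels, and the per-category schedule values (including the direction of the intensity/sampling schedule) are all assumptions, not the paper's settings.

```python
import numpy as np
from skimage.filters.rank import entropy
from skimage.morphology import disk

def categorize_pixels(img_uint8, low_q=0.33, high_q=0.66):
    """Split pixels into low (0), medium (1), and high (2) informative groups
    using local entropy as a stand-in for per-pixel information quantity."""
    info = entropy(img_uint8, disk(5))           # per-pixel local entropy
    lo, hi = np.quantile(info, [low_q, high_q])
    return np.digitize(info, [lo, hi])           # tercile-based category map

# Hypothetical per-category schedule: augmentation intensity and the sampling
# ratio of contrastive anchors are chosen separately for each category.
AUG_INTENSITY = {0: 0.8, 1: 0.5, 2: 0.2}
SAMPLE_RATIO = {0: 0.2, 1: 0.3, 2: 0.5}
```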
{"title":"IGU-Aug: Information-guided unsupervised augmentation and pixel-wise contrastive learning for medical image analysis.","authors":"Quan Quan, Qingsong Yao, Heqin Zhu, S Kevin Zhou","doi":"10.1109/TMI.2024.3436713","DOIUrl":"https://doi.org/10.1109/TMI.2024.3436713","url":null,"abstract":"<p><p>Contrastive learning (CL) is a form of self-supervised learning and has been widely used for various tasks. Different from widely studied instance-level contrastive learning, pixel-wise contrastive learning mainly helps with pixel-wise dense prediction tasks. The counter-part to an instance in instance-level CL is a pixel, along with its neighboring context, in pixel-wise CL. Aiming to build better feature representation, there is a vast literature about designing instance augmentation strategies for instance-level CL; but there is little similar work on pixel augmentation for pixel-wise CL with a pixel granularity. In this paper, we attempt to bridge this gap. We first classify a pixel into three categories, namely low-, medium-, and high-informative, based on the information quantity the pixel contains. We then adaptively design separate augmentation strategies for each category in terms of augmentation intensity and sampling ratio. Extensive experiments validate that our information-guided pixel augmentation strategy succeeds in encoding more discriminative representations and surpassing other competitive approaches in unsupervised local feature matching. Furthermore, our pretrained model improves the performance of both one-shot and fully supervised models. To the best of our knowledge, we are the first to propose a pixel augmentation method with a pixel granularity for enhancing unsupervised pixel-wise contrastive learning. Code is available at https: //github.com/Curli-quan/IGU-Aug.</p>","PeriodicalId":94033,"journal":{"name":"IEEE transactions on medical imaging","volume":"PP ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141876993","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Self-Supervised Medical Image Segmentation Using Deep Reinforced Adaptive Masking
Pub Date: 2024-08-01, DOI: 10.1109/TMI.2024.3436608
Zhenghua Xu, Yunxin Liu, Gang Xu, Thomas Lukasiewicz
Self-supervised learning aims to learn transferable representations from unlabeled data for downstream tasks. Inspired by masked language modeling in natural language processing, masked image modeling (MIM) has achieved some success in the field of computer vision, but its effectiveness on medical images remains unsatisfactory. This is mainly due to the high redundancy and small discriminative regions of medical images compared to natural images. Therefore, this paper proposes an adaptive hard masking (AHM) approach based on deep reinforcement learning to expand the application of MIM to medical images. Unlike predefined random masks, AHM uses an asynchronous advantage actor-critic (A3C) model to predict the reconstruction loss for each patch, enabling the model to learn where masking is valuable. By optimizing the non-differentiable sampling process using reinforcement learning, AHM enhances the understanding of key regions, thereby improving downstream task performance. Experimental results on two medical image datasets demonstrate that AHM outperforms state-of-the-art methods. Additional experiments under various settings validate the effectiveness of AHM in constructing masked images.
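A greedy, non-RL stand-in for the masking policy is sketched below: a small scorer predicts the per-patch value of masking and the highest-scoring patches are masked. The PatchScorer/adaptive_mask names, the mask ratio, and the replacement of the A3C training loop with top-k selection are all assumptions made for illustration.

```python
import torch
import torch.nn as nn

class PatchScorer(nn.Module):
    """Stand-in for the actor: predicts, per image patch, how valuable masking
    that patch would be (e.g. its expected reconstruction loss)."""

    def __init__(self, patch_dim, hidden=128):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(patch_dim, hidden), nn.ReLU(inplace=True),
                                 nn.Linear(hidden, 1))

    def forward(self, patches):                  # patches: (B, N, patch_dim)
        return self.net(patches).squeeze(-1)     # (B, N) per-patch scores


def adaptive_mask(patches, scorer, mask_ratio=0.6):
    """Greedy sketch of adaptive hard masking: mask the patches the scorer
    rates highest. The full method trains the scorer with A3C against the
    observed reconstruction loss; that RL loop is omitted here."""
    scores = scorer(patches)                     # (B, N)
    n_mask = int(mask_ratio * patches.shape[1])
    idx = scores.topk(n_mask, dim=1).indices     # hardest patches per sample
    mask = torch.zeros_like(scores)
    mask.scatter_(1, idx, 1.0)
    return mask.bool()                           # True = patch gets masked
```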
{"title":"Self-Supervised Medical Image Segmentation Using Deep Reinforced Adaptive Masking.","authors":"Zhenghua Xu, Yunxin Liu, Gang Xu, Thomas Lukasiewicz","doi":"10.1109/TMI.2024.3436608","DOIUrl":"https://doi.org/10.1109/TMI.2024.3436608","url":null,"abstract":"<p><p>Self-supervised learning aims to learn transferable representations from unlabeled data for downstream tasks. Inspired by masked language modeling in natural language processing, masked image modeling (MIM) has achieved certain success in the field of computer vision, but its effectiveness in medical images remains unsatisfactory. This is mainly due to the high redundancy and small discriminative regions in medical images compared to natural images. Therefore, this paper proposes an adaptive hard masking (AHM) approach based on deep reinforcement learning to expand the application of MIM in medical images. Unlike predefined random masks, AHM uses an asynchronous advantage actor-critic (A3C) model to predict reconstruction loss for each patch, enabling the model to learn where masking is valuable. By optimizing the non-differentiable sampling process using reinforcement learning, AHM enhances the understanding of key regions, thereby improving downstream task performance. Experimental results on two medical image datasets demonstrate that AHM outperforms state-of-the-art methods. Additional experiments under various settings validate the effectiveness of AHM in constructing masked images.</p>","PeriodicalId":94033,"journal":{"name":"IEEE transactions on medical imaging","volume":"PP ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141876994","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
IEEE Nuclear Science Symposium
Pub Date: 2024-08-01, DOI: 10.1109/TMI.2024.3372491 (IEEE Transactions on Medical Imaging, vol. 43, no. 8, p. 3057)
Spatially-constrained and -unconstrained bi-graph interaction network for multi-organ pathology image classification
Pub Date: 2024-07-31, DOI: 10.1109/TMI.2024.3436080
Doanh C Bui, Boram Song, Kyungeun Kim, Jin Tae Kwak
In computational pathology, graphs have shown promise for pathology image analysis. Various graph structures exist that can capture differing features of pathology images. However, the combination of, and interaction between, differing graph structures has not been fully studied and utilized for pathology image analysis. In this study, we propose a parallel bi-graph neural network, designated SCUBa-Net, equipped with both graph convolutional networks and Transformers, that processes a pathology image as two distinct graphs: a spatially-constrained graph and a spatially-unconstrained graph. For efficient and effective graph learning, we introduce two inter-graph interaction blocks and an intra-graph interaction block. The inter-graph interaction blocks learn the node-to-node interactions within each graph. The intra-graph interaction block learns the graph-to-graph interactions at both global and local levels with the help of virtual nodes that collect and summarize the information from the entire graphs. SCUBa-Net is systematically evaluated on four multi-organ datasets, covering colorectal, prostate, gastric, and bladder cancers. The experimental results demonstrate the effectiveness of SCUBa-Net in comparison to state-of-the-art convolutional neural networks, Transformers, and graph neural networks.
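The construction of the two graph views can be sketched as follows, using k-nearest-neighbour connectivity over patch coordinates (spatially constrained) and over patch features (spatially unconstrained), plus one virtual node per graph. The neighbourhood sizes, the single virtual node, and the function name build_bigraph are illustrative assumptions rather than the paper's exact construction.

```python
import numpy as np
from sklearn.neighbors import kneighbors_graph

def build_bigraph(coords, feats, k_spatial=8, k_feature=8):
    """Build two graph views for one pathology image: a spatially-constrained
    graph over patch coordinates and a spatially-unconstrained graph over
    patch features, each with a virtual node appended as the last index."""
    n = coords.shape[0]
    a_spatial = kneighbors_graph(coords, k_spatial, mode="connectivity").toarray()
    a_feature = kneighbors_graph(feats, k_feature, mode="connectivity").toarray()

    def add_virtual_node(adj):
        # The virtual node connects to every patch node so it can collect and
        # redistribute global context during intra-graph interaction.
        out = np.zeros((n + 1, n + 1), dtype=adj.dtype)
        out[:n, :n] = adj
        out[n, :n] = out[:n, n] = 1.0
        return out

    return add_virtual_node(a_spatial), add_virtual_node(a_feature)
```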
{"title":"Spatially-constrained and -unconstrained bi-graph interaction network for multi-organ pathology image classification.","authors":"Doanh C Bui, Boram Song, Kyungeun Kim, Jin Tae Kwak","doi":"10.1109/TMI.2024.3436080","DOIUrl":"https://doi.org/10.1109/TMI.2024.3436080","url":null,"abstract":"<p><p>In computational pathology, graphs have shown to be promising for pathology image analysis. There exist various graph structures that can discover differing features of pathology images. However, the combination and interaction between differing graph structures have not been fully studied and utilized for pathology image analysis. In this study, we propose a parallel, bi-graph neural network, designated as SCUBa-Net, equipped with both graph convolutional networks and Transformers, that processes a pathology image as two distinct graphs, including a spatially-constrained graph and a spatially-unconstrained graph. For efficient and effective graph learning, we introduce two inter-graph interaction blocks and an intra-graph interaction block. The inter-graph interaction blocks learn the node-to-node interactions within each graph. The intra-graph interaction block learns the graph-to-graph interactions at both global- and local-levels with the help of the virtual nodes that collect and summarize the information from the entire graphs. SCUBa-Net is systematically evaluated on four multi-organ datasets, including colorectal, prostate, gastric, and bladder cancers. The experimental results demonstrate the effectiveness of SCUBa-Net in comparison to the state-of-the-art convolutional neural networks, Transformer, and graph neural networks.</p>","PeriodicalId":94033,"journal":{"name":"IEEE transactions on medical imaging","volume":"PP ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-07-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141861927","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}