Zygomatic implant surgery is an essential treatment option for oral rehabilitation in patients with severe maxillary defects, and preoperative planning is an important approach to enhancing surgical outcomes. However, current planning still relies heavily on manual intervention, which is labor-intensive, experience-dependent, and poorly reproducible. We therefore propose ZygoPlanner, an efficient preoperative planning framework for zygomatic implantation that may be the first solution to seamlessly cover the positioning of the zygomatic bones, the generation of alternative paths, and the computation of optimal implantation paths. To achieve robust planning efficiently, we developed a graphics-based, interpretable method for zygomatic bone positioning that leverages shape prior knowledge. In addition, a surface-faithful point cloud filling algorithm that works for concave geometries was proposed to populate dense points within the zygomatic bones, facilitating the generation of alternative paths. Finally, we realized a graphical representation of bone-to-implant contact to obtain optimal results under multiple constraints. Clinical experiments confirmed the superiority of our framework across different scenarios. The source code is available at https://github.com/Haitao-Lee/auto_zygomatic_implantation.
{"title":"ZygoPlanner: A three-stage graphics-based framework for optimal preoperative planning of zygomatic implant placement.","authors":"Haitao Li, Xingqi Fan, Baoxin Tao, Wenying Wang, Yiqun Wu, Xiaojun Chen","doi":"10.1016/j.media.2024.103401","DOIUrl":"https://doi.org/10.1016/j.media.2024.103401","url":null,"abstract":"<p><p>Zygomatic implant surgery is an essential treatment option of oral rehabilitation for patients with severe maxillary defect, and preoperative planning is an important approach to enhance the surgical outcomes. However, the current planning still heavily relies on manual interventions, which is labor-intensive, experience-dependent, and poorly reproducible. Therefore, we propose ZygoPlanner, a pioneering efficient preoperative planning framework for zygomatic implantation, which may be the first solution that seamlessly involves the positioning of zygomatic bones, the generation of alternative paths, and the computation of optimal implantation paths. To efficiently achieve robust planning, we developed a graphics-based interpretable method for zygomatic bone positioning leveraging the shape prior knowledge. Meanwhile, a surface-faithful point cloud filling algorithm that works for concave geometries was proposed to populate dense points within the zygomatic bones, facilitating generation of alternative paths. Finally, we innovatively realized a graphical representation of the medical bone-to-implant contact to obtain the optimal results under multiple constraints. Clinical experiments confirmed the superiority of our framework across different scenarios. The source code is available at https://github.com/Haitao-Lee/auto_zygomatic_implantation.</p>","PeriodicalId":18328,"journal":{"name":"Medical image analysis","volume":"101 ","pages":"103401"},"PeriodicalIF":10.7,"publicationDate":"2024-11-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142818562","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Prediction of the upright articulated spine shape in the operating room using conditioned neural kernel fields
Medical Image Analysis, vol. 100, Article 103400 | Pub Date: 2024-11-27 | DOI: 10.1016/j.media.2024.103400
Sylvain Thibeault , Marjolaine Roy-Beaudry , Stefan Parent , Samuel Kadoury
Anterior vertebral tethering (AVT) is a non-invasive spine surgery technique, treating severe spine deformations and preserving lower back mobility. However, patient positioning and surgical strategies greatly influence postoperative results. Predicting the upright geometry of pediatric spines is needed to optimize patient positioning in the operating room (OR) and improve surgical outcomes, but remains a complex task due to immature bone properties. We propose a framework used in the OR to predict the upright spine geometry at the first visit following surgery in idiopathic scoliosis patients. The approach first creates a 3D model of the spine while the patient is on the operating table. For this, multiview Transformers that combine images from different viewpoints are used to generate the intraoperative pose. The postoperative upright shape is then predicted on-the-fly using implicit neural fields, which are trained from geometries at different time points and conditioned with surgical parameters. A Signed Distance Function for shape constellations is used to handle the variability in spine appearance, capturing a disentangled latent domain of the articulation vectors, with separate encoding vectors representing both articulation and shape parameters. A regularization criterion based on a pre-trained group-wise trajectory of spine transformations generates complete spine models. A training set of 652 patients with 3D models was used to train the model, which was then tested on a distinct cohort of 83 surgical patients. The framework based on neural kernels predicted upright 3D geometries with a mean 3D error of 1.3 ± 0.5 mm in landmark points and an IoU of 95.9% in vertebral shapes when compared to actual postoperative models, falling within the acceptable margin of error of 2 mm.
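For readers unfamiliar with conditioned implicit neural fields, the minimal sketch below shows the general pattern the abstract refers to: an MLP maps a 3D query point, concatenated with a conditioning vector (e.g., a latent shape/articulation code plus surgical parameters), to a signed distance value. The layer sizes, dimensions, and class name are illustrative assumptions, not the authors' architecture.

```python
import torch
import torch.nn as nn

class ConditionedSDFField(nn.Module):
    """Minimal conditioned implicit field: (x, y, z) point + conditioning vector
    -> signed distance to the predicted vertebral surface."""
    def __init__(self, cond_dim=64, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(3 + cond_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),           # signed distance value
        )

    def forward(self, xyz, cond):
        # xyz: (N, 3) query points; cond: (N, cond_dim) conditioning vectors
        return self.net(torch.cat([xyz, cond], dim=-1))
```

Querying such a field on a dense grid and extracting the zero level set (e.g., with marching cubes) would recover the predicted upright surface; the paper additionally regularizes the prediction with a group-wise trajectory of spine transformations.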
{"title":"Prediction of the upright articulated spine shape in the operating room using conditioned neural kernel fields","authors":"Sylvain Thibeault , Marjolaine Roy-Beaudry , Stefan Parent , Samuel Kadoury","doi":"10.1016/j.media.2024.103400","DOIUrl":"10.1016/j.media.2024.103400","url":null,"abstract":"<div><div>Anterior vertebral tethering (AVT) is a non-invasive spine surgery technique, treating severe spine deformations and preserving lower back mobility. However, patient positioning and surgical strategies greatly influences postoperative results. Predicting the upright geometry from pediatric spines is needed to optimize patient positioning in the operating room (OR) and improve surgical outcomes, but remains a complex task due to immature bone properties. We propose a framework used in the OR predicting the upright spine geometry at the first visit following surgery in idiopathic scoliosis patients. The approach first creates a 3D model of the spine while the patient is on the operating table. For this, multiview Transformers that combine images from different viewpoints are used to generate the intraoperative pose. The postoperative upright shape is then predicted on-the-fly using implicit neural fields, which are trained from geometries at different time points and conditioned with surgical parameters. A Signed Distance Function for shape constellations is used to handle the variability in spine appearance, capturing a disentangled latent domain of the articulation vectors, with separate encoding vectors representing both articulation and shape parameters. A regularization criterion based on a pre-trained group-wise trajectory of spine transformations generates complete spine models. A training set of 652 patients with 3D models was used to train the model, tested on a distinct cohort of 83 surgical patients. The framework based on neural kernels predicted upright 3D geometries with a mean 3D error of <span><math><mrow><mn>1</mn><mo>.</mo><mn>3</mn><mo>±</mo><mn>0</mn><mo>.</mo><mn>5</mn><mspace></mspace><mi>mm</mi></mrow></math></span> in landmarks points, and IoU of 95.9% in vertebral shapes when compared to actual postop models, falling within the acceptable margins of error below 2 mm.</div></div>","PeriodicalId":18328,"journal":{"name":"Medical image analysis","volume":"100 ","pages":"Article 103400"},"PeriodicalIF":10.7,"publicationDate":"2024-11-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142756929","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Multi-scale region selection network in deep features for full-field mammogram classification
Medical Image Analysis, vol. 100, Article 103399 | Pub Date: 2024-11-26 | DOI: 10.1016/j.media.2024.103399
Luhao Sun , Bowen Han , Wenzong Jiang , Weifeng Liu , Baodi Liu , Dapeng Tao , Zhiyong Yu , Chao Li
Early diagnosis and treatment of breast cancer can effectively reduce mortality. Since mammography is one of the most commonly used methods for the early diagnosis of breast cancer, the classification of mammogram images is an important task for computer-aided diagnosis (CAD) systems. With the development of deep learning in CAD, deep convolutional neural networks have been shown to classify breast cancer tumor patches with high quality, which is why most previous CNN-based full-field mammography classification methods rely on region-of-interest (ROI) or segmentation annotations to enable the model to locate and focus on small tumor regions. However, this dependence on ROIs greatly limits the development of CAD, because obtaining a large number of reliable ROI annotations is expensive and difficult. Some full-field mammography classification algorithms use multi-stage training or multiple feature extractors to remove the dependence on ROIs, which increases the computational cost of the model and introduces feature redundancy. To reduce the cost of model training and make full use of the feature extraction capability of CNNs, we propose a deep multi-scale region selection network (MRSN) in deep features for end-to-end training that classifies full-field mammograms without ROI or segmentation annotations. Inspired by the idea of multiple-instance learning and the patch classifier, MRSN filters the feature information and retains only the features of the tumor region, bringing the performance of the full-field image classifier closer to that of a patch classifier. MRSN first scores different regions under different dimensions to obtain the location information of tumor regions. A few high-scoring regions are then selected, based on this location information, as feature representations of the entire image, allowing the model to focus on the tumor region. Experiments on two public datasets and one private dataset show that the proposed MRSN achieves state-of-the-art performance.
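The score-then-select idea can be illustrated with a short sketch (a simplification for intuition, not the published MRSN): score every spatial location of a CNN feature map, keep the k highest-scoring locations, and aggregate their feature vectors into a whole-image representation for classification. All layer choices and the aggregation by averaging are assumptions.

```python
import torch
import torch.nn as nn

class TopKRegionSelect(nn.Module):
    """Illustrative region-selection head over a CNN feature map."""
    def __init__(self, channels, k=8, num_classes=2):
        super().__init__()
        self.scorer = nn.Conv2d(channels, 1, kernel_size=1)   # per-location score
        self.classifier = nn.Linear(channels, num_classes)
        self.k = k

    def forward(self, feats):                                  # feats: (B, C, H, W)
        b, c, h, w = feats.shape
        scores = self.scorer(feats).view(b, h * w)             # (B, H*W)
        flat = feats.view(b, c, h * w).permute(0, 2, 1)        # (B, H*W, C)
        idx = scores.topk(self.k, dim=1).indices               # top-k region indices
        picked = torch.gather(flat, 1, idx.unsqueeze(-1).expand(-1, -1, c))
        return self.classifier(picked.mean(dim=1))             # aggregate, then classify
```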
{"title":"Multi-scale region selection network in deep features for full-field mammogram classification","authors":"Luhao Sun , Bowen Han , Wenzong Jiang , Weifeng Liu , Baodi Liu , Dapeng Tao , Zhiyong Yu , Chao Li","doi":"10.1016/j.media.2024.103399","DOIUrl":"10.1016/j.media.2024.103399","url":null,"abstract":"<div><div>Early diagnosis and treatment of breast cancer can effectively reduce mortality. Since mammogram is one of the most commonly used methods in the early diagnosis of breast cancer, the classification of mammogram images is an important work of computer-aided diagnosis (CAD) systems. With the development of deep learning in CAD, deep convolutional neural networks have been shown to have the ability to complete the classification of breast cancer tumor patches with high quality, which makes most previous CNN-based full-field mammography classification methods rely on region of interest (ROI) or segmentation annotation to enable the model to locate and focus on small tumor regions. However, the dependence on ROI greatly limits the development of CAD, because obtaining a large number of reliable ROI annotations is expensive and difficult. Some full-field mammography image classification algorithms use multi-stage training or multi-feature extractors to get rid of the dependence on ROI, which increases the computational amount of the model and feature redundancy. In order to reduce the cost of model training and make full use of the feature extraction capability of CNN, we propose a deep multi-scale region selection network (MRSN) in deep features for end-to-end training to classify full-field mammography without ROI or segmentation annotation. Inspired by the idea of multi-example learning and the patch classifier, MRSN filters the feature information and saves only the feature information of the tumor region to make the performance of the full-field image classifier closer to the patch classifier. MRSN first scores different regions under different dimensions to obtain the location information of tumor regions. Then, a few high-scoring regions are selected by location information as feature representations of the entire image, allowing the model to focus on the tumor region. Experiments on two public datasets and one private dataset prove that the proposed MRSN achieves the most advanced performance.</div></div>","PeriodicalId":18328,"journal":{"name":"Medical image analysis","volume":"100 ","pages":"Article 103399"},"PeriodicalIF":10.7,"publicationDate":"2024-11-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142744060","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Incorporating spatial information in deep learning parameter estimation with application to the intravoxel incoherent motion model in diffusion-weighted MRI
Medical Image Analysis, vol. 101, Article 103414 | Pub Date: 2024-11-26 | DOI: 10.1016/j.media.2024.103414
Misha P T Kaandorp, Frank Zijlstra, Davood Karimi, Ali Gholipour, Peter T While
In medical image analysis, the utilization of biophysical models for signal analysis offers valuable insights into the underlying tissue types and microstructural processes. In diffusion-weighted magnetic resonance imaging (DWI), a major challenge lies in accurately estimating model parameters from the acquired data due to the inherently low signal-to-noise ratio (SNR) of the signal measurements and the complexity of solving the ill-posed inverse problem. Conventional model fitting approaches treat individual voxels as independent. However, the tissue microenvironment is typically homogeneous in a local environment, where neighboring voxels may contain correlated information. To harness the potential benefits of exploiting correlations among signals in adjacent voxels, this study introduces a novel approach to deep learning parameter estimation that effectively incorporates relevant spatial information. This is achieved by training neural networks on patches of synthetic data encompassing plausible combinations of direct correlations between neighboring voxels. We evaluated the approach on the intravoxel incoherent motion (IVIM) model in DWI. We explored the potential of several deep learning architectures to incorporate spatial information using self-supervised and supervised learning. We assessed performance quantitatively using novel fractal-noise-based synthetic data, which provide ground truths possessing spatial correlations. Additionally, we present results of the approach applied to in vivo DWI data consisting of twelve repetitions from a healthy volunteer. We demonstrate that supervised training on larger patch sizes using attention models leads to substantial performance improvements over both conventional voxelwise model fitting and convolution-based approaches.
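The IVIM model referenced above is the standard bi-exponential signal equation S(b) = S0 · (f · exp(−b·D*) + (1 − f) · exp(−b·D)), with perfusion fraction f, pseudo-diffusion coefficient D*, and tissue diffusion coefficient D. The sketch below writes out that equation and one hypothetical way to build a small, spatially correlated synthetic training patch with noise, in the spirit of the patch-based training described in the abstract; the parameter ranges, jitter model, and noise model are illustrative assumptions, not the paper's generator.

```python
import numpy as np

def ivim_signal(b, S0, f, D_star, D):
    """Bi-exponential IVIM model (b in s/mm^2, D and D* in mm^2/s)."""
    return S0 * (f * np.exp(-b * D_star) + (1 - f) * np.exp(-b * D))

def make_training_patch(b_values, size=3, snr=20.0, rng=None):
    """Hypothetical spatially correlated patch: one parameter set shared (with
    small jitter) across neighbouring voxels, plus Rician-like noise."""
    rng = rng or np.random.default_rng()
    b = np.asarray(b_values, dtype=float)
    f, D_star, D = rng.uniform(0.1, 0.4), rng.uniform(0.01, 0.1), rng.uniform(5e-4, 2e-3)
    patch = np.empty((size, size, len(b)))
    for i in range(size):
        for j in range(size):
            jit = 1.0 + 0.05 * rng.standard_normal(3)            # local correlation
            s = ivim_signal(b, 1.0, f * jit[0], D_star * jit[1], D * jit[2])
            n_re = rng.standard_normal(len(b)) / snr
            n_im = rng.standard_normal(len(b)) / snr
            patch[i, j] = np.sqrt((s + n_re) ** 2 + n_im ** 2)   # magnitude (Rician) noise
    return patch, (f, D_star, D)
```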
{"title":"Incorporating spatial information in deep learning parameter estimation with application to the intravoxel incoherent motion model in diffusion-weighted MRI.","authors":"Misha P T Kaandorp, Frank Zijlstra, Davood Karimi, Ali Gholipour, Peter T While","doi":"10.1016/j.media.2024.103414","DOIUrl":"https://doi.org/10.1016/j.media.2024.103414","url":null,"abstract":"<p><p>In medical image analysis, the utilization of biophysical models for signal analysis offers valuable insights into the underlying tissue types and microstructural processes. In diffusion-weighted magnetic resonance imaging (DWI), a major challenge lies in accurately estimating model parameters from the acquired data due to the inherently low signal-to-noise ratio (SNR) of the signal measurements and the complexity of solving the ill-posed inverse problem. Conventional model fitting approaches treat individual voxels as independent. However, the tissue microenvironment is typically homogeneous in a local environment, where neighboring voxels may contain correlated information. To harness the potential benefits of exploiting correlations among signals in adjacent voxels, this study introduces a novel approach to deep learning parameter estimation that effectively incorporates relevant spatial information. This is achieved by training neural networks on patches of synthetic data encompassing plausible combinations of direct correlations between neighboring voxels. We evaluated the approach on the intravoxel incoherent motion (IVIM) model in DWI. We explored the potential of several deep learning architectures to incorporate spatial information using self-supervised and supervised learning. We assessed performance quantitatively using novel fractal-noise-based synthetic data, which provide ground truths possessing spatial correlations. Additionally, we present results of the approach applied to in vivo DWI data consisting of twelve repetitions from a healthy volunteer. We demonstrate that supervised training on larger patch sizes using attention models leads to substantial performance improvements over both conventional voxelwise model fitting and convolution-based approaches.</p>","PeriodicalId":18328,"journal":{"name":"Medical image analysis","volume":"101 ","pages":"103414"},"PeriodicalIF":10.7,"publicationDate":"2024-11-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142910005","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
NCCT-to-CECT synthesis with contrast-enhanced knowledge and anatomical perception for multi-organ segmentation in non-contrast CT images
Medical Image Analysis, vol. 100, Article 103397 | Pub Date: 2024-11-26 | DOI: 10.1016/j.media.2024.103397
Liming Zhong , Ruolin Xiao , Hai Shu , Kaiyi Zheng , Xinming Li , Yuankui Wu , Jianhua Ma , Qianjin Feng , Wei Yang
Contrast-enhanced computed tomography (CECT) is routinely used for delineating organs-at-risk (OARs) in radiation therapy planning. The delineated OARs then need to be transferred from CECT to non-contrast CT (NCCT) for dose calculation. Yet, the use of iodinated contrast agents (CA) in CECT and the dose calculation errors caused by the spatial misalignment between NCCT and CECT images pose risks of adverse side effects. A promising solution is to synthesize CECT images from NCCT scans, which can improve the visibility of organs and abnormalities for more effective multi-organ segmentation in NCCT images. However, existing methods neglect the difference between tissues induced by CA and lack the ability to synthesize the details of organ edges and blood vessels. To address these issues, we propose a contrast-enhanced knowledge and anatomical perception network (CKAP-Net) for NCCT-to-CECT synthesis. CKAP-Net leverages a contrast-enhanced knowledge learning network to capture both similarities and dissimilarities in domain characteristics attributable to CA. Specifically, a CA-based perceptual loss function is introduced to enhance the synthesis of CA details. Furthermore, we design a multi-scale anatomical perception transformer that utilizes multi-scale anatomical information from NCCT images, enabling the precise synthesis of tissue details. Our CKAP-Net is evaluated on a multi-center abdominal NCCT-CECT dataset, a head and neck NCCT-CECT dataset, and an NCMRI-CEMRI dataset. It achieves an MAE of 25.96 ± 2.64, an SSIM of 0.855 ± 0.017, and a PSNR of 32.60 ± 0.02 for CECT synthesis, and a DSC of 81.21 ± 4.44 for segmentation on the internal dataset. Extensive experiments demonstrate that CKAP-Net outperforms state-of-the-art CA synthesis methods and generalizes better across different datasets.
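The exact form of the CA-based perceptual loss is not given in the abstract. As a loose stand-in to show the general idea, the sketch below weights an L1 synthesis loss toward voxels whose intensity rises from NCCT to CECT, a crude estimate of contrast-enhanced regions; the threshold, weight, and function name are assumptions and this is not CKAP-Net's formulation.

```python
import torch

def ca_weighted_l1(pred_cect, real_cect, ncct, enhance_thresh=0.05, ca_weight=5.0):
    """Illustrative contrast-agent-aware L1 loss: voxels judged contrast-enhanced
    (CECT brighter than NCCT by a margin, in normalized intensity) are weighted
    more heavily so the network focuses on synthesizing CA detail."""
    ca_mask = (real_cect - ncct > enhance_thresh).float()
    per_voxel = torch.abs(pred_cect - real_cect)
    weights = 1.0 + (ca_weight - 1.0) * ca_mask
    return (weights * per_voxel).mean()
```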
{"title":"NCCT-to-CECT synthesis with contrast-enhanced knowledge and anatomical perception for multi-organ segmentation in non-contrast CT images","authors":"Liming Zhong , Ruolin Xiao , Hai Shu , Kaiyi Zheng , Xinming Li , Yuankui Wu , Jianhua Ma , Qianjin Feng , Wei Yang","doi":"10.1016/j.media.2024.103397","DOIUrl":"10.1016/j.media.2024.103397","url":null,"abstract":"<div><div>Contrast-enhanced computed tomography (CECT) is constantly used for delineating organs-at-risk (OARs) in radiation therapy planning. The delineated OARs are needed to transfer from CECT to non-contrast CT (NCCT) for dose calculation. Yet, the use of iodinated contrast agents (CA) in CECT and the dose calculation errors caused by the spatial misalignment between NCCT and CECT images pose risks of adverse side effects. A promising solution is synthesizing CECT images from NCCT scans, which can improve the visibility of organs and abnormalities for more effective multi-organ segmentation in NCCT images. However, existing methods neglect the difference between tissues induced by CA and lack the ability to synthesize the details of organ edges and blood vessels. To address these issues, we propose a contrast-enhanced knowledge and anatomical perception network (CKAP-Net) for NCCT-to-CECT synthesis. CKAP-Net leverages a contrast-enhanced knowledge learning network to capture both similarities and dissimilarities in domain characteristics attributable to CA. Specifically, a CA-based perceptual loss function is introduced to enhance the synthesis of CA details. Furthermore, we design a multi-scale anatomical perception transformer that utilizes multi-scale anatomical information from NCCT images, enabling the precise synthesis of tissue details. Our CKAP-Net is evaluated on a multi-center abdominal NCCT-CECT dataset, a head an neck NCCT-CECT dataset, and an NCMRI-CEMRI dataset. It achieves a MAE of 25.96 ± 2.64, a SSIM of 0.855 ± 0.017, and a PSNR of 32.60 ± 0.02 for CECT synthesis, and a DSC of 81.21 ± 4.44 for segmentation on the internal dataset. Extensive experiments demonstrate that CKAP-Net outperforms state-of-the-art CA synthesis methods and has better generalizability across different datasets.</div></div>","PeriodicalId":18328,"journal":{"name":"Medical image analysis","volume":"100 ","pages":"Article 103397"},"PeriodicalIF":10.7,"publicationDate":"2024-11-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142744163","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Multidimensional Directionality-Enhanced Segmentation via large vision model
Medical Image Analysis, vol. 101, Article 103395 | Pub Date: 2024-11-25 | DOI: 10.1016/j.media.2024.103395
Xingru Huang, Changpeng Yue, Yihao Guo, Jian Huang, Zhengyao Jiang, Mingkuan Wang, Zhaoyang Xu, Guangyuan Zhang, Jin Liu, Tianyun Zhang, Zhiwen Zheng, Xiaoshuai Zhang, Hong He, Shaowei Jiang, Yaoqi Sun
Optical Coherence Tomography (OCT) facilitates a comprehensive examination of macular edema and associated lesions. Manual delineation of retinal fluid is labor-intensive and error-prone, necessitating an automated diagnostic and therapeutic planning mechanism. Conventional supervised learning models are hindered by dataset limitations, while Transformer-based large vision models exhibit challenges in medical image segmentation, particularly in detecting small, subtle lesions in OCT images. This paper introduces the Multidimensional Directionality-Enhanced Retinal Fluid Segmentation framework (MD-DERFS), which reduces the limitations inherent in conventional supervised models by adapting a transformer-based large vision model for macular edema segmentation. The proposed MD-DERFS introduces a Multi-Dimensional Feature Re-Encoder Unit (MFU) that augments the model's proficiency in recognizing specific textures and pathological features through directional prior extraction and an Edema Texture Mapping Unit (ETMU), while a Cross-scale Directional Insight Network (CDIN) furnishes a holistic perspective spanning local to global details, mitigating the large vision model's deficiencies in capturing localized feature information. Additionally, the framework is augmented by a Harmonic Minutiae Segmentation Equilibrium loss (L_HMSE) that addresses the challenges of data imbalance and annotation scarcity in macular edema datasets. Empirical validation on the MacuScan-8k dataset shows that MD-DERFS surpasses existing segmentation methodologies, demonstrating its efficacy in adapting large vision models for boundary-sensitive medical imaging tasks. The code is publicly available at https://github.com/IMOP-lab/MD-DERFS-Pytorch.git.
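The abstract does not specify the form of L_HMSE. As a generic illustration of how segmentation losses counter the heavy foreground/background imbalance typical of retinal fluid masks, the sketch below combines soft Dice with binary cross-entropy; it is a standard stand-in, not the paper's loss, and the weighting constant is an assumption.

```python
import torch
import torch.nn.functional as F

def dice_bce_loss(logits, target, eps=1e-6, bce_weight=0.5):
    """Generic imbalance-aware segmentation loss (soft Dice + BCE).
    `logits`: raw network outputs; `target`: binary fluid mask (same shape)."""
    target = target.float()
    prob = torch.sigmoid(logits)
    inter = (prob * target).sum()
    dice = 1.0 - (2.0 * inter + eps) / (prob.sum() + target.sum() + eps)
    bce = F.binary_cross_entropy_with_logits(logits, target)
    return dice + bce_weight * bce
```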
{"title":"Multidimensional Directionality-Enhanced Segmentation via large vision model.","authors":"Xingru Huang, Changpeng Yue, Yihao Guo, Jian Huang, Zhengyao Jiang, Mingkuan Wang, Zhaoyang Xu, Guangyuan Zhang, Jin Liu, Tianyun Zhang, Zhiwen Zheng, Xiaoshuai Zhang, Hong He, Shaowei Jiang, Yaoqi Sun","doi":"10.1016/j.media.2024.103395","DOIUrl":"https://doi.org/10.1016/j.media.2024.103395","url":null,"abstract":"<p><p>Optical Coherence Tomography (OCT) facilitates a comprehensive examination of macular edema and associated lesions. Manual delineation of retinal fluid is labor-intensive and error-prone, necessitating an automated diagnostic and therapeutic planning mechanism. Conventional supervised learning models are hindered by dataset limitations, while Transformer-based large vision models exhibit challenges in medical image segmentation, particularly in detecting small, subtle lesions in OCT images. This paper introduces the Multidimensional Directionality-Enhanced Retinal Fluid Segmentation framework (MD-DERFS), which reduces the limitations inherent in conventional supervised models by adapting a transformer-based large vision model for macular edema segmentation. The proposed MD-DERFS introduces a Multi-Dimensional Feature Re-Encoder Unit (MFU) to augment the model's proficiency in recognizing specific textures and pathological features through directional prior extraction and an Edema Texture Mapping Unit (ETMU), a Cross-scale Directional Insight Network (CDIN) furnishes a holistic perspective spanning local to global details, mitigating the large vision model's deficiencies in capturing localized feature information. Additionally, the framework is augmented by a Harmonic Minutiae Segmentation Equilibrium loss (L<sub>HMSE</sub>) that can address the challenges of data imbalance and annotation scarcity in macular edema datasets. Empirical validation on the MacuScan-8k dataset shows that MD-DERFS surpasses existing segmentation methodologies, demonstrating its efficacy in adapting large vision models for boundary-sensitive medical imaging tasks. The code is publicly available at https://github.com/IMOP-lab/MD-DERFS-Pytorch.git.</p>","PeriodicalId":18328,"journal":{"name":"Medical image analysis","volume":"101 ","pages":"103395"},"PeriodicalIF":10.7,"publicationDate":"2024-11-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142791576","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
CLMS: Bridging domain gaps in medical imaging segmentation with source-free continual learning for robust knowledge transfer and adaptation
Medical Image Analysis, vol. 100, Article 103404 | Pub Date: 2024-11-24 | DOI: 10.1016/j.media.2024.103404
Weilu Li , Yun Zhang , Hao Zhou, Wenhan Yang, Zhi Xie, Yao He
Deep learning shows promise for medical image segmentation but suffers performance declines when applied to diverse healthcare sites due to data discrepancies among the different sites. Translating deep learning models to new clinical environments is challenging, especially when the original source data used for training is unavailable due to privacy restrictions. Source-free domain adaptation (SFDA) aims to adapt models to new unlabeled target domains without requiring access to the original source data. However, existing SFDA methods face challenges such as error propagation, misalignment of visual and structural features, and inability to preserve source knowledge. This paper introduces Continual Learning Multi-Scale domain adaptation (CLMS), an end-to-end SFDA framework integrating multi-scale reconstruction, continual learning, and style alignment to bridge domain gaps across medical sites using only unlabeled target data or publicly available data. Compared to the current state-of-the-art methods, CLMS consistently and significantly achieved top performance for different tasks, including prostate MRI segmentation (improved Dice of 10.87 %), colonoscopy polyp segmentation (improved Dice of 17.73 %), and plus disease classification from retinal images (improved AUC of 11.19 %). Crucially, CLMS preserved source knowledge for all the tasks, avoiding catastrophic forgetting. CLMS demonstrates a promising solution for translating deep learning models to new clinical imaging domains towards safe, reliable deployment across diverse healthcare settings.
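The abstract does not detail the adaptation mechanics, so the sketch below only shows the general source-free recipe that methods in this space build on: pseudo-labels from a frozen copy of the source model on unlabeled target images, plus a weight-anchoring penalty against forgetting. The confidence threshold, penalty weight, and function names are assumptions; the actual CLMS additionally uses multi-scale reconstruction and style alignment.

```python
import torch
import torch.nn.functional as F

def sfda_step(model, frozen_source, images, optimizer, conf_thresh=0.9, anchor=1e-3):
    """One schematic source-free adaptation step (not CLMS itself)."""
    with torch.no_grad():
        probs = torch.softmax(frozen_source(images), dim=1)      # (B, C, H, W)
        conf, pseudo = probs.max(dim=1)                           # per-pixel pseudo-labels
    loss_map = F.cross_entropy(model(images), pseudo, reduction="none")
    loss = (loss_map * (conf > conf_thresh).float()).mean()       # keep confident pixels only
    for p, p_src in zip(model.parameters(), frozen_source.parameters()):
        loss = loss + anchor * (p - p_src.detach()).pow(2).sum()  # discourage forgetting
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return float(loss.detach())
```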
{"title":"CLMS: Bridging domain gaps in medical imaging segmentation with source-free continual learning for robust knowledge transfer and adaptation","authors":"Weilu Li , Yun Zhang , Hao Zhou, Wenhan Yang, Zhi Xie, Yao He","doi":"10.1016/j.media.2024.103404","DOIUrl":"10.1016/j.media.2024.103404","url":null,"abstract":"<div><div>Deep learning shows promise for medical image segmentation but suffers performance declines when applied to diverse healthcare sites due to data discrepancies among the different sites. Translating deep learning models to new clinical environments is challenging, especially when the original source data used for training is unavailable due to privacy restrictions. Source-free domain adaptation (SFDA) aims to adapt models to new unlabeled target domains without requiring access to the original source data. However, existing SFDA methods face challenges such as error propagation, misalignment of visual and structural features, and inability to preserve source knowledge. This paper introduces Continual Learning Multi-Scale domain adaptation (CLMS), an end-to-end SFDA framework integrating multi-scale reconstruction, continual learning, and style alignment to bridge domain gaps across medical sites using only unlabeled target data or publicly available data. Compared to the current state-of-the-art methods, CLMS consistently and significantly achieved top performance for different tasks, including prostate MRI segmentation (improved Dice of 10.87 %), colonoscopy polyp segmentation (improved Dice of 17.73 %), and plus disease classification from retinal images (improved AUC of 11.19 %). Crucially, CLMS preserved source knowledge for all the tasks, avoiding catastrophic forgetting. CLMS demonstrates a promising solution for translating deep learning models to new clinical imaging domains towards safe, reliable deployment across diverse healthcare settings.</div></div>","PeriodicalId":18328,"journal":{"name":"Medical image analysis","volume":"100 ","pages":"Article 103404"},"PeriodicalIF":10.7,"publicationDate":"2024-11-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142756236","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Highly accelerated MRI via implicit neural representation guided posterior sampling of diffusion models
Medical Image Analysis, vol. 100, Article 103398 | Pub Date: 2024-11-23 | DOI: 10.1016/j.media.2024.103398
Jiayue Chu , Chenhe Du , Xiyue Lin , Xiaoqun Zhang , Lihui Wang , Yuyao Zhang , Hongjiang Wei
Reconstructing high-fidelity magnetic resonance (MR) images from under-sampled k-space is a commonly used strategy to reduce scan time. Posterior sampling of diffusion models based on the real measurement data holds significant promise for improved reconstruction accuracy. However, traditional posterior sampling methods often lack effective data consistency guidance, leading to inaccurate and unstable reconstructions. Implicit neural representation (INR) has emerged as a powerful paradigm for solving inverse problems by modeling a signal’s attributes as a continuous function of spatial coordinates. In this study, we present a novel posterior sampler for diffusion models using INR, named DiffINR. The INR-based component incorporates both the diffusion prior distribution and the MRI physical model to ensure high data fidelity. DiffINR demonstrates superior performance on in-distribution datasets with remarkable accuracy, even under high acceleration factors (up to R = 12 in single-channel reconstruction). Furthermore, DiffINR exhibits excellent generalizability across various tissue contrasts and anatomical structures with low uncertainty. Overall, DiffINR significantly improves MRI reconstruction in terms of accuracy, generalizability and stability, paving the way for further accelerating MRI acquisition. Notably, the proposed framework can also serve as a general framework for solving inverse problems in other medical imaging tasks.
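Data-consistency guidance is the crux of the problem the abstract highlights. The standard k-space consistency update that posterior samplers interleave with their prior steps can be written in a few lines; this is a generic sketch of that update, not DiffINR's exact formulation, and the step size is an assumption.

```python
import torch

def data_consistency(x, y, mask, step=1.0):
    """Generic k-space data-consistency update: x <- x - step * F^H M (M F x - y),
    where F is the 2D Fourier transform, M the binary sampling mask, y the
    acquired (zero-filled) k-space data, and x the current complex image estimate."""
    k = torch.fft.fft2(x, norm="ortho")
    residual = mask * (k - y)                     # error only at sampled locations
    return x - step * torch.fft.ifft2(residual, norm="ortho")
```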
{"title":"Highly accelerated MRI via implicit neural representation guided posterior sampling of diffusion models","authors":"Jiayue Chu , Chenhe Du , Xiyue Lin , Xiaoqun Zhang , Lihui Wang , Yuyao Zhang , Hongjiang Wei","doi":"10.1016/j.media.2024.103398","DOIUrl":"10.1016/j.media.2024.103398","url":null,"abstract":"<div><div>Reconstructing high-fidelity magnetic resonance (MR) images from under-sampled k-space is a commonly used strategy to reduce scan time. The posterior sampling of diffusion models based on the real measurement data holds significant promise of improved reconstruction accuracy. However, traditional posterior sampling methods often lack effective data consistency guidance, leading to inaccurate and unstable reconstructions. Implicit neural representation (INR) has emerged as a powerful paradigm for solving inverse problems by modeling a signal’s attributes as a continuous function of spatial coordinates. In this study, we present a novel posterior sampler for diffusion models using INR, named DiffINR. The INR-based component incorporates both the diffusion prior distribution and the MRI physical model to ensure high data fidelity. DiffINR demonstrates superior performance on in-distribution datasets with remarkable accuracy, even under high acceleration factors (up to R <span><math><mo>=</mo></math></span> 12 in single-channel reconstruction). Furthermore, DiffINR exhibits excellent generalizability across various tissue contrasts and anatomical structures with low uncertainty. Overall, DiffINR significantly improves MRI reconstruction in terms of accuracy, generalizability and stability, paving the way for further accelerating MRI acquisition. Notably, our proposed framework can be a generalizable framework to solve inverse problems in other medical imaging tasks.</div></div>","PeriodicalId":18328,"journal":{"name":"Medical image analysis","volume":"100 ","pages":"Article 103398"},"PeriodicalIF":10.7,"publicationDate":"2024-11-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142721145","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Noise-aware dynamic image denoising and positron range correction for Rubidium-82 cardiac PET imaging via self-supervision
Medical Image Analysis, vol. 100, Article 103391 | Pub Date: 2024-11-20 | DOI: 10.1016/j.media.2024.103391
Huidong Xie , Liang Guo , Alexandre Velo , Zhao Liu , Qiong Liu , Xueqi Guo , Bo Zhou , Xiongchao Chen , Yu-Jung Tsai , Tianshun Miao , Menghua Xia , Yi-Hwa Liu , Ian S. Armstrong , Ge Wang , Richard E. Carson , Albert J. Sinusas , Chi Liu
<div><div>Rubidium-82 (<span><math><mrow><msup><mrow></mrow><mrow><mn>82</mn></mrow></msup><mtext>Rb</mtext></mrow></math></span>) is a radioactive isotope widely used for cardiac PET imaging. Despite numerous benefits of <span><math><mrow><msup><mrow></mrow><mrow><mn>82</mn></mrow></msup><mtext>Rb</mtext></mrow></math></span>, there are several factors that limits its image quality and quantitative accuracy. First, the short half-life of <span><math><mrow><msup><mrow></mrow><mrow><mn>82</mn></mrow></msup><mtext>Rb</mtext></mrow></math></span> results in noisy dynamic frames. Low signal-to-noise ratio would result in inaccurate and biased image quantification. Noisy dynamic frames also lead to highly noisy parametric images. The noise levels also vary substantially in different dynamic frames due to radiotracer decay and short half-life. Existing denoising methods are not applicable for this task due to the lack of paired training inputs/labels and inability to generalize across varying noise levels. Second, <span><math><mrow><msup><mrow></mrow><mrow><mn>82</mn></mrow></msup><mtext>Rb</mtext></mrow></math></span> emits high-energy positrons. Compared with other tracers such as <span><math><mrow><msup><mrow></mrow><mrow><mn>18</mn></mrow></msup><mtext>F</mtext></mrow></math></span>, <span><math><mrow><msup><mrow></mrow><mrow><mn>82</mn></mrow></msup><mtext>Rb</mtext></mrow></math></span> travels a longer distance before annihilation, which negatively affect image spatial resolution. Here, the goal of this study is to propose a self-supervised method for simultaneous (1) noise-aware dynamic image denoising and (2) positron range correction for <span><math><mrow><msup><mrow></mrow><mrow><mn>82</mn></mrow></msup><mtext>Rb</mtext></mrow></math></span> cardiac PET imaging. Tested on a series of PET scans from a cohort of normal volunteers, the proposed method produced images with superior visual quality. To demonstrate the improvement in image quantification, we compared image-derived input functions (IDIFs) with arterial input functions (AIFs) from continuous arterial blood samples. The IDIF derived from the proposed method led to lower AUC differences, decreasing from 11.09<span><math><mtext>%</mtext></math></span> to 7.58<span><math><mtext>%</mtext></math></span> on average, compared to the original dynamic frames. The proposed method also improved the quantification of myocardium blood flow (MBF), as validated against <span><math><mrow><msup><mrow></mrow><mrow><mn>15</mn></mrow></msup><mtext>O-water</mtext></mrow></math></span> scans, with mean MBF differences decreased from 0.43 to 0.09, compared to the original dynamic frames. We also conducted a generalizability experiment on 37 patient scans obtained from a different country using a different scanner. The presented method enhanced defect contrast and resulted in lower regional MBF in areas with perfusion defects. Lastly, comparison with other related methods is included to show the effectivenes
{"title":"Noise-aware dynamic image denoising and positron range correction for Rubidium-82 cardiac PET imaging via self-supervision","authors":"Huidong Xie , Liang Guo , Alexandre Velo , Zhao Liu , Qiong Liu , Xueqi Guo , Bo Zhou , Xiongchao Chen , Yu-Jung Tsai , Tianshun Miao , Menghua Xia , Yi-Hwa Liu , Ian S. Armstrong , Ge Wang , Richard E. Carson , Albert J. Sinusas , Chi Liu","doi":"10.1016/j.media.2024.103391","DOIUrl":"10.1016/j.media.2024.103391","url":null,"abstract":"<div><div>Rubidium-82 (<span><math><mrow><msup><mrow></mrow><mrow><mn>82</mn></mrow></msup><mtext>Rb</mtext></mrow></math></span>) is a radioactive isotope widely used for cardiac PET imaging. Despite numerous benefits of <span><math><mrow><msup><mrow></mrow><mrow><mn>82</mn></mrow></msup><mtext>Rb</mtext></mrow></math></span>, there are several factors that limits its image quality and quantitative accuracy. First, the short half-life of <span><math><mrow><msup><mrow></mrow><mrow><mn>82</mn></mrow></msup><mtext>Rb</mtext></mrow></math></span> results in noisy dynamic frames. Low signal-to-noise ratio would result in inaccurate and biased image quantification. Noisy dynamic frames also lead to highly noisy parametric images. The noise levels also vary substantially in different dynamic frames due to radiotracer decay and short half-life. Existing denoising methods are not applicable for this task due to the lack of paired training inputs/labels and inability to generalize across varying noise levels. Second, <span><math><mrow><msup><mrow></mrow><mrow><mn>82</mn></mrow></msup><mtext>Rb</mtext></mrow></math></span> emits high-energy positrons. Compared with other tracers such as <span><math><mrow><msup><mrow></mrow><mrow><mn>18</mn></mrow></msup><mtext>F</mtext></mrow></math></span>, <span><math><mrow><msup><mrow></mrow><mrow><mn>82</mn></mrow></msup><mtext>Rb</mtext></mrow></math></span> travels a longer distance before annihilation, which negatively affect image spatial resolution. Here, the goal of this study is to propose a self-supervised method for simultaneous (1) noise-aware dynamic image denoising and (2) positron range correction for <span><math><mrow><msup><mrow></mrow><mrow><mn>82</mn></mrow></msup><mtext>Rb</mtext></mrow></math></span> cardiac PET imaging. Tested on a series of PET scans from a cohort of normal volunteers, the proposed method produced images with superior visual quality. To demonstrate the improvement in image quantification, we compared image-derived input functions (IDIFs) with arterial input functions (AIFs) from continuous arterial blood samples. The IDIF derived from the proposed method led to lower AUC differences, decreasing from 11.09<span><math><mtext>%</mtext></math></span> to 7.58<span><math><mtext>%</mtext></math></span> on average, compared to the original dynamic frames. The proposed method also improved the quantification of myocardium blood flow (MBF), as validated against <span><math><mrow><msup><mrow></mrow><mrow><mn>15</mn></mrow></msup><mtext>O-water</mtext></mrow></math></span> scans, with mean MBF differences decreased from 0.43 to 0.09, compared to the original dynamic frames. We also conducted a generalizability experiment on 37 patient scans obtained from a different country using a different scanner. The presented method enhanced defect contrast and resulted in lower regional MBF in areas with perfusion defects. 
Lastly, comparison with other related methods is included to show the effectivenes","PeriodicalId":18328,"journal":{"name":"Medical image analysis","volume":"100 ","pages":"Article 103391"},"PeriodicalIF":10.7,"publicationDate":"2024-11-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142695596","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
DDKG: A Dual Domain Knowledge Guidance strategy for localization and diagnosis of non-displaced femoral neck fractures
Medical Image Analysis, vol. 100, Article 103393 | Pub Date: 2024-11-19 | DOI: 10.1016/j.media.2024.103393
Jing Yang , Lianxin Wang , Chen Lin , Jiacheng Wang , Liansheng Wang
X-ray is the primary tool for diagnosing fractures, crucial for determining their type, location, and severity. However, non-displaced femoral neck fractures (ND-FNF) can pose challenges in identification due to subtle cracks and complex anatomical structures. Most deep learning-based methods for diagnosing ND-FNF rely on cropped images, necessitating manual annotation of the hip location, which increases annotation costs. To address this challenge, we propose Dual Domain Knowledge Guidance (DDKG), which harnesses spatial and semantic domain knowledge to guide the model in acquiring robust representations of ND-FNF across the whole X-ray image. Specifically, DDKG comprises two key modules: the Spatial Aware Module (SAM) and the Semantic Coordination Module (SCM). SAM employs limited positional supervision to guide the model in focusing on the hip joint region and reducing background interference. SCM integrates information from radiological reports, utilizes prior knowledge from large language models to extract critical information related to ND-FNF, and guides the model to learn relevant visual representations. During inference, the model only requires the whole X-ray image for accurate diagnosis without additional information. The model was validated on datasets from four different centers, showing consistent accuracy and robustness. Codes and models are available at https://github.com/Yjing07/DDKG.
{"title":"DDKG: A Dual Domain Knowledge Guidance strategy for localization and diagnosis of non-displaced femoral neck fractures","authors":"Jing Yang , Lianxin Wang , Chen Lin , Jiacheng Wang , Liansheng Wang","doi":"10.1016/j.media.2024.103393","DOIUrl":"10.1016/j.media.2024.103393","url":null,"abstract":"<div><div>X-ray is the primary tool for diagnosing fractures, crucial for determining their type, location, and severity. However, non-displaced femoral neck fractures (ND-FNF) can pose challenges in identification due to subtle cracks and complex anatomical structures. Most deep learning-based methods for diagnosing ND-FNF rely on cropped images, necessitating manual annotation of the hip location, which increases annotation costs. To address this challenge, we propose Dual Domain Knowledge Guidance (DDKG), which harnesses spatial and semantic domain knowledge to guide the model in acquiring robust representations of ND-FNF across the whole X-ray image. Specifically, DDKG comprises two key modules: the Spatial Aware Module (SAM) and the Semantic Coordination Module (SCM). SAM employs limited positional supervision to guide the model in focusing on the hip joint region and reducing background interference. SCM integrates information from radiological reports, utilizes prior knowledge from large language models to extract critical information related to ND-FNF, and guides the model to learn relevant visual representations. During inference, the model only requires the whole X-ray image for accurate diagnosis without additional information. The model was validated on datasets from four different centers, showing consistent accuracy and robustness. Codes and models are available at <span><span>https://github.com/Yjing07/DDKG</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":18328,"journal":{"name":"Medical image analysis","volume":"100 ","pages":"Article 103393"},"PeriodicalIF":10.7,"publicationDate":"2024-11-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142696321","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}