Unsupervised lung CT image registration via stochastic decomposition of deformation fields
Pub Date : 2024-05-07 DOI: 10.1016/j.compmedimag.2024.102397
Jing Zou , Youyi Song , Lihao Liu , Angelica I. Aviles-Rivero , Jing Qin
We address the problem of lung CT image registration, which underpins various diagnoses and treatments for lung diseases. The main crux of the problem is the large deformation that the lungs undergo during respiration. This physiological process imposes several challenges from a learning point of view. In this paper, we propose a novel training scheme, called stochastic decomposition, which enables deep networks to effectively learn such a difficult deformation field during lung CT image registration. The key idea is to stochastically decompose the deformation field, and supervise the registration by synthetic data that have the corresponding appearance discrepancy. The stochastic decomposition allows for revealing all possible decompositions of the deformation field. At the learning level, these decompositions can be seen as a prior that reduces the ill-posedness of the registration and thereby boosts performance. We demonstrate the effectiveness of our framework on lung CT data. We show, through extensive numerical and visual results, that our technique outperforms existing methods.
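The abstract does not detail how the decomposition is realized; a minimal PyTorch sketch of the idea — splitting a known displacement field at a random point and warping the moving image to create a synthetic intermediate supervision target — might look as follows (the function names, the linear split, and the pixel-space warping are our assumptions, not the authors' implementation):

```python
import torch
import torch.nn.functional as F

def warp(img, disp):
    """Warp an image batch (B, C, H, W) by a displacement field (B, 2, H, W) given in pixels."""
    B, _, H, W = disp.shape
    ys, xs = torch.meshgrid(torch.arange(H), torch.arange(W), indexing="ij")
    base = torch.stack((xs, ys)).float().to(disp.device)   # (2, H, W), (x, y) order
    coords = base.unsqueeze(0) + disp                      # absolute sampling positions
    gx = 2 * coords[:, 0] / (W - 1) - 1                    # normalise to [-1, 1] for grid_sample
    gy = 2 * coords[:, 1] / (H - 1) - 1
    return F.grid_sample(img, torch.stack((gx, gy), dim=-1), align_corners=True)

def stochastic_decompose(disp):
    """Randomly split a displacement field into two parts. Composing t*disp and
    (1 - t)*disp only approximates the full field (exact composition would resample
    the second field along the first), which suffices for a synthetic-supervision sketch."""
    t = torch.rand(()).item()                              # random split point in (0, 1)
    return t * disp, (1.0 - t) * disp

# usage: create a synthetic intermediate frame as extra supervision
# d1, d2 = stochastic_decompose(disp)
# intermediate = warp(moving, d1)   # mid-respiration appearance for the network to match
```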
{"title":"Unsupervised lung CT image registration via stochastic decomposition of deformation fields","authors":"Jing Zou , Youyi Song , Lihao Liu , Angelica I. Aviles-Rivero , Jing Qin","doi":"10.1016/j.compmedimag.2024.102397","DOIUrl":"https://doi.org/10.1016/j.compmedimag.2024.102397","url":null,"abstract":"<div><p>We address the problem of lung CT image registration, which underpins various diagnoses and treatments for lung diseases. The main crux of the problem is the large deformation that the lungs undergo during respiration. This physiological process imposes several challenges from a learning point of view. In this paper, we propose a novel training scheme, called stochastic decomposition, which enables deep networks to effectively learn such a difficult deformation field during lung CT image registration. The key idea is to stochastically decompose the deformation field, and supervise the registration by synthetic data that have the corresponding appearance discrepancy. The stochastic decomposition allows for revealing all possible decompositions of the deformation field. At the learning level, these decompositions can be seen as a prior to reduce the ill-posedness of the registration yielding to boost the performance. We demonstrate the effectiveness of our framework on Lung CT data. We show, through extensive numerical and visual results, that our technique outperforms existing methods.</p></div>","PeriodicalId":50631,"journal":{"name":"Computerized Medical Imaging and Graphics","volume":null,"pages":null},"PeriodicalIF":5.7,"publicationDate":"2024-05-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140910060","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Weakly-supervised preclinical tumor localization associated with survival prediction from lung cancer screening Chest X-ray images
Pub Date : 2024-05-07 DOI: 10.1016/j.compmedimag.2024.102395
Renato Hermoza , Jacinto C. Nascimento , Gustavo Carneiro
In this paper, we hypothesize that it is possible to localize image regions of preclinical tumors in a Chest X-ray (CXR) image by weakly-supervised training of a survival prediction model using a dataset containing CXR images of healthy patients and their time-to-death labels. These visual explanations can empower clinicians in early lung cancer detection and increase patient awareness of their susceptibility to the disease. To test this hypothesis, we train a censor-aware multi-class survival prediction deep learning classifier that is robust to imbalanced training, where classes represent quantized numbers of days for time-to-death prediction. Such a multi-class model allows us to use post-hoc interpretability methods, such as Grad-CAM, to localize image regions of preclinical tumors. For the experiments, we propose a new benchmark based on the National Lung Screening Trial (NLST) dataset to test weakly-supervised preclinical tumor localization and survival prediction models, and results suggest that our proposed method achieves state-of-the-art C-index survival prediction and weakly-supervised preclinical tumor localization results. To our knowledge, this constitutes a pioneering approach in the field that is able to produce visual explanations of preclinical events associated with survival prediction results.
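For illustration, a censor-aware loss over quantized time-to-death bins can be sketched as below; this follows the standard treatment of right-censoring (a censored patient is only known to survive past their last follow-up bin) and is our reading of the idea, not necessarily the authors' exact formulation:

```python
import torch
import torch.nn.functional as F

def censor_aware_ce(logits, bin_idx, is_censored):
    """logits: (B, K) class scores over K quantized time-to-event bins.
    bin_idx: (B,) bin of the observed time (death, or last follow-up if censored).
    is_censored: (B,) bool; True means the patient was alive at last follow-up."""
    log_p = F.log_softmax(logits, dim=1)
    losses = []
    for lp, k, c in zip(log_p, bin_idx, is_censored):
        k = int(k)
        if c:
            # censored: the event happens at bin k or later, so maximise
            # the total probability mass of bins >= k
            losses.append(-torch.logsumexp(lp[k:], dim=0))
        else:
            losses.append(-lp[k])  # uncensored: plain cross-entropy on the true bin
    return torch.stack(losses).mean()
```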
{"title":"Weakly-supervised preclinical tumor localization associated with survival prediction from lung cancer screening Chest X-ray images","authors":"Renato Hermoza , Jacinto C. Nascimento , Gustavo Carneiro","doi":"10.1016/j.compmedimag.2024.102395","DOIUrl":"https://doi.org/10.1016/j.compmedimag.2024.102395","url":null,"abstract":"<div><p>In this paper, we hypothesize that it is possible to localize image regions of preclinical tumors in a Chest X-ray (CXR) image by a weakly-supervised training of a survival prediction model using a dataset containing CXR images of healthy patients and their time-to-death label. These visual explanations can empower clinicians in early lung cancer detection and increase patient awareness of their susceptibility to the disease. To test this hypothesis, we train a censor-aware multi-class survival prediction deep learning classifier that is robust to imbalanced training, where classes represent quantized number of days for time-to-death prediction. Such multi-class model allows us to use post-hoc interpretability methods, such as Grad-CAM, to localize image regions of preclinical tumors. For the experiments, we propose a new benchmark based on the National Lung Cancer Screening Trial (NLST) dataset to test weakly-supervised preclinical tumor localization and survival prediction models, and results suggest that our proposed method shows state-of-the-art C-index survival prediction and weakly-supervised preclinical tumor localization results. To our knowledge, this constitutes a pioneer approach in the field that is able to produce visual explanations of preclinical events associated with survival prediction results.</p></div>","PeriodicalId":50631,"journal":{"name":"Computerized Medical Imaging and Graphics","volume":null,"pages":null},"PeriodicalIF":5.7,"publicationDate":"2024-05-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S0895611124000727/pdfft?md5=13bd653784bd57b091f5c80e427ca52e&pid=1-s2.0-S0895611124000727-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140901751","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Analyzing the basal ganglia following an early brain lesion is crucial due to their noteworthy role in sensory–motor functions. However, the segmentation of these subcortical structures on MRI is challenging in children and is further complicated by the presence of a lesion. Although current deep neural networks (DNN) perform well in segmenting subcortical brain structures in healthy brains, they lack robustness when faced with lesion variability, leading to structural inconsistencies. Given the established spatial organization of the basal ganglia, we propose enhancing the DNN-based segmentation through post-processing with a graph neural network (GNN). The GNN conducts node classification on graphs encoding both class probabilities and spatial information regarding the regions segmented by the DNN. In this study, we focus on neonatal arterial ischemic stroke (NAIS) in children. The approach is evaluated on both healthy children and children after NAIS using three DNN backbones: U-Net, UNETr, and MSGSE-Net. The results show an improvement in segmentation performance, with an increase in the median Dice score by up to 4% and a reduction in the median Hausdorff distance (HD) by up to 93% for healthy children (from 36.45 to 2.57) and up to 91% for children suffering from NAIS (from 40.64 to 3.50). The performance of the method is compared with atlas-based methods. Severe cases of neonatal stroke result in a decline in performance in the injured hemisphere, without negatively affecting the segmentation of the contra-injured hemisphere. Furthermore, the approach demonstrates resilience to small training datasets, a widespread challenge in the medical field, particularly in pediatrics and for rare pathologies.
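A minimal sketch of GNN-style node classification over a region graph, assuming node features built from the DNN's class probabilities plus region centroids (the paper's exact graph construction and architecture are not given in the abstract):

```python
import torch
import torch.nn as nn

class SimpleGNN(nn.Module):
    """Two rounds of neighbourhood averaging + linear layers for node (region)
    classification. Node features could be the DNN class-probability vector
    concatenated with region centroid coordinates; exact features are our guess."""
    def __init__(self, in_dim, n_classes, hidden=64):
        super().__init__()
        self.lin1 = nn.Linear(in_dim, hidden)
        self.lin2 = nn.Linear(hidden, n_classes)

    def forward(self, x, adj):
        # x: (N, in_dim) node features; adj: (N, N) row-normalised adjacency with self-loops
        x = torch.relu(self.lin1(adj @ x))   # aggregate neighbours, then transform
        return self.lin2(adj @ x)            # class logits per segmented region

# usage sketch: one graph per subject
# x = torch.cat([probs, centroids], dim=1)          # (N, C + 3) node features
# logits = SimpleGNN(x.shape[1], n_structures)(x, adj)
```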
{"title":"GNN-based structural information to improve DNN-based basal ganglia segmentation in children following early brain lesion","authors":"Patty Coupeau , Jean-Baptiste Fasquel , Lucie Hertz-Pannier , Mickaël Dinomais","doi":"10.1016/j.compmedimag.2024.102396","DOIUrl":"https://doi.org/10.1016/j.compmedimag.2024.102396","url":null,"abstract":"<div><p>Analyzing the basal ganglia following an early brain lesion is crucial due to their noteworthy role in sensory–motor functions. However, the segmentation of these subcortical structures on MRI is challenging in children and is further complicated by the presence of a lesion. Although current deep neural networks (DNN) perform well in segmenting subcortical brain structures in healthy brains, they lack robustness when faced with lesion variability, leading to structural inconsistencies. Given the established spatial organization of the basal ganglia, we propose enhancing the DNN-based segmentation through post-processing with a graph neural network (GNN). The GNN conducts node classification on graphs encoding both class probabilities and spatial information regarding the regions segmented by the DNN. In this study, we focus on neonatal arterial ischemic stroke (NAIS) in children. The approach is evaluated on both healthy children and children after NAIS using three DNN backbones: U-Net, UNETr, and MSGSE-Net. The results show an improvement in segmentation performance, with an increase in the median Dice score by up to 4% and a reduction in the median Hausdorff distance (HD) by up to 93% for healthy children (from 36.45 to 2.57) and up to 91% for children suffering from NAIS (from 40.64 to 3.50). The performance of the method is compared with atlas-based methods. Severe cases of neonatal stroke result in a decline in performance in the injured hemisphere, without negatively affecting the segmentation of the contra-injured hemisphere. Furthermore, the approach demonstrates resilience to small training datasets, a widespread challenge in the medical field, particularly in pediatrics and for rare pathologies.</p></div>","PeriodicalId":50631,"journal":{"name":"Computerized Medical Imaging and Graphics","volume":null,"pages":null},"PeriodicalIF":5.7,"publicationDate":"2024-05-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140914459","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Colonoscopy is the procedure of choice to diagnose, screen, and treat colon and rectum cancer, from early detection of small precancerous lesions (polyps) to confirmation of malignant masses. However, the high variability of the organ appearance and the complex shape of both the colon wall and structures of interest make this exploration difficult. Learned visuospatial and perceptual abilities mitigate technical limitations in clinical practice by proper estimation of the intestinal depth. This work introduces a novel methodology to estimate colon depth maps in single frames from monocular colonoscopy videos. The generated depth map is inferred from the shading variation of the colon wall with respect to the light source, as learned from a realistic synthetic database. Briefly, a classic convolutional neural network architecture is trained from scratch to estimate the depth map, improving sharp depth estimations in haustral folds and polyps by a custom loss function that minimizes the estimation error in edges and curvatures. The network was trained on a custom synthetic colonoscopy database constructed and released herein, composed of 248,400 frames (47 videos) with pixel-level depth annotations. This collection comprises 5 subsets of videos with progressively higher levels of visual complexity. Evaluation of the depth estimation with the synthetic database reached a threshold accuracy of 95.65% and a mean RMSE of 0.451 cm, while a qualitative assessment with a real database showed consistent depth estimations, visually evaluated by the expert gastroenterologist coauthoring this paper. Finally, the method achieved competitive performance with respect to another state-of-the-art method using a public synthetic database, and comparable results with five other state-of-the-art methods on a set of images. Additionally, three-dimensional reconstructions demonstrated useful approximations of the gastrointestinal tract geometry. Code for reproducing the reported results and the dataset are available at https://github.com/Cimalab-unal/ColonDepthEstimation.
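The custom loss is described only as minimizing error in edges and curvatures; one plausible reading is an edge-weighted L1 depth loss, sketched here (the Sobel-based weighting and the hyperparameter edge_w are our assumptions):

```python
import torch
import torch.nn.functional as F

def edge_weighted_depth_loss(pred, target, edge_w=4.0):
    """L1 depth loss with extra weight on depth discontinuities of the ground
    truth (haustral folds, polyp boundaries). pred, target: (B, 1, H, W)."""
    kx = torch.tensor([[[[-1., 0., 1.],
                         [-2., 0., 2.],
                         [-1., 0., 1.]]]], device=target.device)  # Sobel x
    ky = kx.transpose(2, 3)                                       # Sobel y
    gx = F.conv2d(target, kx, padding=1)
    gy = F.conv2d(target, ky, padding=1)
    edges = torch.sqrt(gx ** 2 + gy ** 2)
    # up-weight pixels near edges/curvature changes, keep weight 1 elsewhere
    weight = 1.0 + edge_w * edges / (edges.amax() + 1e-8)
    return (weight * (pred - target).abs()).mean()
```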
{"title":"Leveraging a realistic synthetic database to learn Shape-from-Shading for estimating the colon depth in colonoscopy images","authors":"Josué Ruano , Martín Gómez , Eduardo Romero , Antoine Manzanera","doi":"10.1016/j.compmedimag.2024.102390","DOIUrl":"https://doi.org/10.1016/j.compmedimag.2024.102390","url":null,"abstract":"<div><p>Colonoscopy is the choice procedure to diagnose, screening, and treat the colon and rectum cancer, from early detection of small precancerous lesions (polyps), to confirmation of malign masses. However, the high variability of the organ appearance and the complex shape of both the colon wall and structures of interest make this exploration difficult. Learned visuospatial and perceptual abilities mitigate technical limitations in clinical practice by proper estimation of the intestinal depth. This work introduces a novel methodology to estimate colon depth maps in single frames from monocular colonoscopy videos. The generated depth map is inferred from the shading variation of the colon wall with respect to the light source, as learned from a realistic synthetic database. Briefly, a classic convolutional neural network architecture is trained from scratch to estimate the depth map, improving sharp depth estimations in haustral folds and polyps by a custom loss function that minimizes the estimation error in edges and curvatures. The network was trained by a custom synthetic colonoscopy database herein constructed and released, composed of 248<!--> <!-->400 frames (47 videos), with depth annotations at the level of pixels. This collection comprehends 5 subsets of videos with progressively higher levels of visual complexity. Evaluation of the depth estimation with the synthetic database reached a threshold accuracy of 95.65%, and a mean-RMSE of <span><math><mrow><mn>0</mn><mo>.</mo><mn>451</mn><mi>cm</mi></mrow></math></span>, while a qualitative assessment with a real database showed consistent depth estimations, visually evaluated by the expert gastroenterologist coauthoring this paper. Finally, the method achieved competitive performance with respect to another state-of-the-art method using a public synthetic database and comparable results in a set of images with other five state-of-the-art methods. Additionally, three-dimensional reconstructions demonstrated useful approximations of the gastrointestinal tract geometry. Code for reproducing the reported results and the dataset are available at <span>https://github.com/Cimalab-unal/ColonDepthEstimation</span><svg><path></path></svg>.</p></div>","PeriodicalId":50631,"journal":{"name":"Computerized Medical Imaging and Graphics","volume":null,"pages":null},"PeriodicalIF":5.7,"publicationDate":"2024-05-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S0895611124000673/pdfft?md5=bb898a4ca8669cb2b3cd2808af60a2b6&pid=1-s2.0-S0895611124000673-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140842801","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Automated Motion Artefact Detection (MAD) in Magnetic Resonance Imaging (MRI) is a field of study that aims to automatically flag motion artefacts in order to prevent the requirement for a repeat scan. In this paper, we identify and tackle the three current challenges in the field of automated MAD: (1) reliance on fully-supervised training, meaning existing techniques require specific examples of Motion Artefacts (MA); (2) inconsistent use of benchmark datasets across different works and use of private datasets for testing and training of newly proposed MAD techniques; and (3) a lack of sufficiently large datasets for MRI MAD. To address these challenges, we demonstrate how MAs can be identified by formulating the problem as an unsupervised Anomaly Detection (AD) task. We compare the performance of three State-of-the-Art AD algorithms, DeepSVDD, Interpolated Gaussian Descriptor and FewSOME, on two open-source Brain MRI datasets on the task of MAD and MA severity classification, with FewSOME achieving a MAD AUC > 90% on both datasets and a Spearman Rank Correlation Coefficient of 0.8 on the task of MA severity classification. These models are trained in the few shot setting, meaning large Brain MRI datasets are not required to build robust MAD algorithms. This work also sets a standard protocol for testing MAD algorithms on open-source benchmark datasets. In addition to addressing these challenges, we demonstrate how our proposed 'anomaly-aware' scoring function improves FewSOME's MAD performance in the setting where one and two shots of the anomalous class are available for training. Code available at https://github.com/niamhbelton/Unsupervised-Brain-MRI-Motion-Artefact-Detection/.
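As an illustration of the one/two-shot setting, an "anomaly-aware" score can be built by contrasting distances to normal and anomalous embedding banks; the formulation below is our sketch of the idea, not FewSOME's published scoring function:

```python
import torch

def anomaly_aware_score(z, normal_bank, anomaly_bank=None, margin=1.0):
    """Score embeddings z (N, D) by distance to a bank of normal embeddings
    (K, D); when one or two anomalous shots are available, reward proximity
    to them as well. Higher score = more likely a motion artefact."""
    d_norm = torch.cdist(z[None], normal_bank[None])[0].min(dim=1).values
    if anomaly_bank is None:
        return d_norm                      # zero-shot: plain nearest-normal distance
    d_anom = torch.cdist(z[None], anomaly_bank[None])[0].min(dim=1).values
    return d_norm - d_anom + margin        # contrast the two nearest-neighbour distances
```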
{"title":"Towards a unified approach for unsupervised brain MRI Motion Artefact Detection with few shot Anomaly Detection","authors":"Niamh Belton , Misgina Tsighe Hagos , Aonghus Lawlor , Kathleen M. Curran","doi":"10.1016/j.compmedimag.2024.102391","DOIUrl":"https://doi.org/10.1016/j.compmedimag.2024.102391","url":null,"abstract":"<div><p>Automated Motion Artefact Detection (MAD) in Magnetic Resonance Imaging (MRI) is a field of study that aims to automatically flag motion artefacts in order to prevent the requirement for a repeat scan. In this paper, we identify and tackle the three current challenges in the field of automated MAD; (1) reliance on fully-supervised training, meaning they require specific examples of Motion Artefacts (MA), (2) inconsistent use of benchmark datasets across different works and use of private datasets for testing and training of newly proposed MAD techniques and (3) a lack of sufficiently large datasets for MRI MAD. To address these challenges, we demonstrate how MAs can be identified by formulating the problem as an unsupervised Anomaly Detection (AD) task. We compare the performance of three State-of-the-Art AD algorithms DeepSVDD, Interpolated Gaussian Descriptor and FewSOME on two open-source Brain MRI datasets on the task of MAD and MA severity classification, with FewSOME achieving a MAD AUC <span><math><mrow><mo>></mo><mn>90</mn><mtext>%</mtext></mrow></math></span> on both datasets and a Spearman Rank Correlation Coefficient of 0.8 on the task of MA severity classification. These models are trained in the few shot setting, meaning large Brain MRI datasets are not required to build robust MAD algorithms. This work also sets a standard protocol for testing MAD algorithms on open-source benchmark datasets. In addition to addressing these challenges, we demonstrate how our proposed ‘anomaly-aware’ scoring function improves FewSOME’s MAD performance in the setting where one and two shots of the anomalous class are available for training. Code available at <span>https://github.com/niamhbelton/Unsupervised-Brain-MRI-Motion-Artefact-Detection/</span><svg><path></path></svg>.</p></div>","PeriodicalId":50631,"journal":{"name":"Computerized Medical Imaging and Graphics","volume":null,"pages":null},"PeriodicalIF":5.7,"publicationDate":"2024-05-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S0895611124000685/pdfft?md5=8275185e5cfc03cae6d8bed048a27239&pid=1-s2.0-S0895611124000685-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140844149","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
3DFRINet: A Framework for the Detection and Diagnosis of Fracture Related Infection in Low Extremities Based on 18F-FDG PET/CT 3D Images
Pub Date : 2024-05-03 DOI: 10.1016/j.compmedimag.2024.102394
Chengfan Li , Liangbing Nie , Zhenkui Sun , Xuehai Ding , Quanyong Luo , Chentian Shen
Fracture related infection (FRI) is one of the most devastating complications after fracture surgery in the lower extremities, which can lead to extremely high morbidity and medical costs. Therefore, early comprehensive evaluation and accurate diagnosis of patients are critical for appropriate treatment, prevention of complications, and good prognosis. 18F-fluorodeoxyglucose positron emission tomography/computed tomography (18F-FDG PET/CT) is one of the most commonly used medical imaging modalities for diagnosing FRI. With the development of deep learning, more neural networks have been proposed and have become powerful computer-aided diagnosis tools in medical imaging. Therefore, a fully automated two-stage framework for FRI detection and diagnosis, 3DFRINet (Three Dimension FRI Network), is proposed for 18F-FDG PET/CT 3D imaging. The first stage effectively extracts and fuses the features of both modalities to accurately locate the lesion through a dual-branch design and an attention module. The second stage reduces the dimensionality of the image by using the maximum intensity projection, which retains the effective features while reducing the computational effort and achieving excellent diagnostic performance. The diagnostic performance for lesions reached 91.55% accuracy, 0.9331 AUC, and 0.9250 F1 score. 3DFRINet has an advantage over six nuclear medicine experts in each classification metric. The statistical analysis shows that 3DFRINet is equivalent or superior to the primary nuclear medicine physicians and comparable to the senior nuclear medicine physicians. In conclusion, this study is the first to propose a method based on 18F-FDG PET/CT three-dimensional imaging for FRI localization and diagnosis. This method shows superior lesion detection rate and diagnostic efficiency and therefore has good prospects for clinical application.
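The maximum-intensity-projection step in the second stage is straightforward to reproduce; a sketch (the choice of projecting along all three axes is our assumption):

```python
import torch

def mip_views(volume):
    """Reduce a 3D PET/CT patch (B, C, D, H, W) to 2D maximum-intensity
    projections, one per axis — the dimensionality-reduction step the paper
    uses ahead of its 2D diagnosis stage (projection axes are our guess)."""
    axial    = volume.amax(dim=2)   # (B, C, H, W)
    coronal  = volume.amax(dim=3)   # (B, C, D, W)
    sagittal = volume.amax(dim=4)   # (B, C, D, H)
    return axial, coronal, sagittal
```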
{"title":"3DFRINet: A Framework for the Detection and Diagnosis of Fracture Related Infection in Low Extremities Based on 18F-FDG PET/CT 3D Images","authors":"Chengfan Li , Liangbing Nie , Zhenkui Sun , Xuehai Ding , Quanyong Luo , Chentian Shen","doi":"10.1016/j.compmedimag.2024.102394","DOIUrl":"https://doi.org/10.1016/j.compmedimag.2024.102394","url":null,"abstract":"<div><p>Fracture related infection (FRI) is one of the most devastating complications after fracture surgery in the lower extremities, which can lead to extremely high morbidity and medical costs. Therefore, early comprehensive evaluation and accurate diagnosis of patients are critical for appropriate treatment, prevention of complications, and good prognosis. <sup>18</sup>Fluoro-deoxyglucose positron emission tomography/computed tomography (<sup>18</sup>F-FDG PET/CT) is one of the most commonly used medical imaging modalities for diagnosing FRI. With the development of deep learning, more neural networks have been proposed and become powerful computer-aided diagnosis tools in medical imaging. Therefore, a fully automated two-stage framework for FRI detection and diagnosis, 3DFRINet (Three Dimension FRI Network), is proposed for <sup>18</sup>F-FDG PET/CT 3D imaging. The first stage can effectively extract and fuse the features of both modalities to accurately locate the lesion by the dual-branch design and attention module. The second stage reduces the dimensionality of the image by using the maximum intensity projection, which retains the effective features while reducing the computational effort and achieving excellent diagnostic performance. The diagnostic performance of lesions reached 91.55% accuracy, 0.9331 AUC, and 0.9250 F1 score. 3DFRINet has an advantage over six nuclear medicine experts in each classification metric. The statistical analysis shows that 3DFRINet is equivalent or superior to the primary nuclear medicine physicians and comparable to the senior nuclear medicine physicians. In conclusion, this study first proposed a method based on <sup>18</sup>F-FDG PET/CT three-dimensional imaging for FRI location and diagnosis. This method shows superior lesion detection rate and diagnostic efficiency and therefore has good prospects for clinical application.</p></div>","PeriodicalId":50631,"journal":{"name":"Computerized Medical Imaging and Graphics","volume":null,"pages":null},"PeriodicalIF":5.7,"publicationDate":"2024-05-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140844148","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
CAVE: Cerebral artery–vein segmentation in digital subtraction angiography
Pub Date : 2024-05-01 DOI: 10.1016/j.compmedimag.2024.102392
Ruisheng Su , P. Matthijs van der Sluijs , Yuan Chen , Sandra Cornelissen , Ruben van den Broek , Wim H. van Zwam , Aad van der Lugt , Wiro J. Niessen , Danny Ruijters , Theo van Walsum
Cerebral X-ray digital subtraction angiography (DSA) is a widely used imaging technique in patients with neurovascular disease, allowing for vessel and flow visualization with high spatio-temporal resolution. Automatic artery–vein segmentation in DSA plays a fundamental role in vascular analysis with quantitative biomarker extraction, facilitating a wide range of clinical applications. The widely adopted U-Net applied on static DSA frames often struggles with disentangling vessels from subtraction artifacts. Further, it falls short in effectively separating arteries and veins as it disregards the temporal perspectives inherent in DSA. To address these limitations, we propose to simultaneously leverage spatial vasculature and temporal cerebral flow characteristics to segment arteries and veins in DSA. The proposed network, coined CAVE, encodes a 2D+time DSA series using spatial modules, aggregates all the features using temporal modules, and decodes it into 2D segmentation maps. On a large multi-center clinical dataset, CAVE achieves a vessel segmentation Dice of 0.84 (±0.04) and an artery–vein segmentation Dice of 0.79 (±0.06). CAVE surpasses traditional Frangi-based k-means clustering (P < 0.001) and U-Net (P < 0.001) by a significant margin, demonstrating the advantages of harvesting spatio-temporal features. This study represents the first investigation into automatic artery–vein segmentation in DSA using deep learning. The code is publicly available at https://github.com/RuishengSu/CAVE_DSA.
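A skeleton of the encode-aggregate-decode design, with per-frame spatial convolutions, a temporal convolution per pixel, and max-pooling over time (channel sizes and the pooling choice are our simplifications, not CAVE's published architecture):

```python
import torch
import torch.nn as nn

class SpatioTemporalSeg(nn.Module):
    """Minimal CAVE-style skeleton: per-frame spatial encoding, aggregation
    over the DSA time axis, 2D decoding into artery/vein/background maps."""
    def __init__(self, n_classes=3, feat=16):
        super().__init__()
        self.spatial = nn.Sequential(
            nn.Conv2d(1, feat, 3, padding=1), nn.ReLU(),
            nn.Conv2d(feat, feat, 3, padding=1), nn.ReLU())
        self.temporal = nn.Conv1d(feat, feat, kernel_size=3, padding=1)
        self.decode = nn.Conv2d(feat, n_classes, 1)

    def forward(self, x):                                       # x: (B, T, H, W) DSA series
        B, T, H, W = x.shape
        f = self.spatial(x.reshape(B * T, 1, H, W))             # (B*T, F, H, W)
        F_ = f.shape[1]
        f = f.view(B, T, F_, H, W).permute(0, 3, 4, 2, 1)       # (B, H, W, F, T)
        f = self.temporal(f.reshape(B * H * W, F_, T))          # temporal context per pixel
        f = f.amax(dim=2).view(B, H, W, F_).permute(0, 3, 1, 2) # max-pool over time
        return self.decode(f)                                   # (B, n_classes, H, W)
```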
{"title":"CAVE: Cerebral artery–vein segmentation in digital subtraction angiography","authors":"Ruisheng Su , P. Matthijs van der Sluijs , Yuan Chen , Sandra Cornelissen , Ruben van den Broek , Wim H. van Zwam , Aad van der Lugt , Wiro J. Niessen , Danny Ruijters , Theo van Walsum","doi":"10.1016/j.compmedimag.2024.102392","DOIUrl":"https://doi.org/10.1016/j.compmedimag.2024.102392","url":null,"abstract":"<div><p>Cerebral X-ray digital subtraction angiography (DSA) is a widely used imaging technique in patients with neurovascular disease, allowing for vessel and flow visualization with high spatio-temporal resolution. Automatic artery–vein segmentation in DSA plays a fundamental role in vascular analysis with quantitative biomarker extraction, facilitating a wide range of clinical applications. The widely adopted U-Net applied on static DSA frames often struggles with disentangling vessels from subtraction artifacts. Further, it falls short in effectively separating arteries and veins as it disregards the temporal perspectives inherent in DSA. To address these limitations, we propose to simultaneously leverage spatial vasculature and temporal cerebral flow characteristics to segment arteries and veins in DSA. The proposed network, coined CAVE, encodes a 2D+time DSA series using spatial modules, aggregates all the features using temporal modules, and decodes it into 2D segmentation maps. On a large multi-center clinical dataset, CAVE achieves a vessel segmentation Dice of 0.84 (<span><math><mo>±</mo></math></span>0.04) and an artery–vein segmentation Dice of 0.79 (<span><math><mo>±</mo></math></span>0.06). CAVE surpasses traditional Frangi-based <span><math><mi>k</mi></math></span>-means clustering (P <span><math><mo><</mo></math></span> 0.001) and U-Net (P <span><math><mo><</mo></math></span> 0.001) by a significant margin, demonstrating the advantages of harvesting spatio-temporal features. This study represents the first investigation into automatic artery–vein segmentation in DSA using deep learning. The code is publicly available at <span>https://github.com/RuishengSu/CAVE_DSA</span><svg><path></path></svg>.</p></div>","PeriodicalId":50631,"journal":{"name":"Computerized Medical Imaging and Graphics","volume":null,"pages":null},"PeriodicalIF":5.7,"publicationDate":"2024-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S0895611124000697/pdfft?md5=b8c9ddb6b9334a5a30392653d4a487b2&pid=1-s2.0-S0895611124000697-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140842800","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cross-modality cerebrovascular segmentation based on pseudo-label generation via paired data
Pub Date : 2024-05-01 DOI: 10.1016/j.compmedimag.2024.102393
Zhanqiang Guo , Jianjiang Feng , Wangsheng Lu , Yin Yin , Guangming Yang , Jie Zhou
Accurate segmentation of cerebrovascular structures from Computed Tomography Angiography (CTA), Magnetic Resonance Angiography (MRA), and Digital Subtraction Angiography (DSA) is crucial for clinical diagnosis of cranial vascular diseases. Recent advancements in deep Convolutional Neural Networks (CNNs) have significantly improved the segmentation process. However, training segmentation networks for all modalities requires extensive data labeling for each modality, which is often expensive and time-consuming. To circumvent this limitation, we introduce an approach to train a cross-modality cerebrovascular segmentation network based on paired data from source and target domains. Our approach involves training a universal vessel segmentation network with manually labeled source domain data, which automatically produces initial labels for target domain training images. We improve the initial labels of target domain training images by fusing paired images, which are then used to refine the target domain segmentation network. A series of experimental arrangements is presented to assess the efficacy of our method in various practical application scenarios. The experiments conducted on an MRA-CTA dataset and a DSA-CTA dataset demonstrate that the proposed method is effective for cross-modality cerebrovascular segmentation and achieves state-of-the-art performance.
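A minimal sketch of pseudo-label fusion from registered paired predictions, using a simple consensus rule with an ignore label (the paper's actual fusion rule is not specified in the abstract):

```python
import torch

def fuse_pseudo_labels(prob_target, prob_source_warped, thr=0.5, ignore=255.0):
    """Fuse the universal network's vessel probability on the target-modality
    image with its probability on the registered source-modality pair. Voxels
    where both modalities agree become confident foreground/background; the
    rest are flagged to be ignored when refining the target-domain network.
    (Consensus fusion is our assumption; the paper's rule may differ.)"""
    fg = (prob_target > thr) & (prob_source_warped > thr)
    bg = (prob_target <= thr) & (prob_source_warped <= thr)
    label = torch.full_like(prob_target, fill_value=ignore)
    label[fg] = 1.0
    label[bg] = 0.0
    return label
```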
{"title":"Cross-modality cerebrovascular segmentation based on pseudo-label generation via paired data","authors":"Zhanqiang Guo , Jianjiang Feng , Wangsheng Lu , Yin Yin , Guangming Yang , Jie Zhou","doi":"10.1016/j.compmedimag.2024.102393","DOIUrl":"https://doi.org/10.1016/j.compmedimag.2024.102393","url":null,"abstract":"<div><p>Accurate segmentation of cerebrovascular structures from Computed Tomography Angiography (CTA), Magnetic Resonance Angiography (MRA), and Digital Subtraction Angiography (DSA) is crucial for clinical diagnosis of cranial vascular diseases. Recent advancements in deep Convolution Neural Network (CNN) have significantly improved the segmentation process. However, training segmentation networks for all modalities requires extensive data labeling for each modality, which is often expensive and time-consuming. To circumvent this limitation, we introduce an approach to train cross-modality cerebrovascular segmentation network based on paired data from source and target domains. Our approach involves training a universal vessel segmentation network with manually labeled source domain data, which automatically produces initial labels for target domain training images. We improve the initial labels of target domain training images by fusing paired images, which are then used to refine the target domain segmentation network. A series of experimental arrangements is presented to assess the efficacy of our method in various practical application scenarios. The experiments conducted on an MRA-CTA dataset and a DSA-CTA dataset demonstrate that the proposed method is effective for cross-modality cerebrovascular segmentation and achieves state-of-the-art performance.</p></div>","PeriodicalId":50631,"journal":{"name":"Computerized Medical Imaging and Graphics","volume":null,"pages":null},"PeriodicalIF":5.7,"publicationDate":"2024-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140822866","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Motion correction and super-resolution for multi-slice cardiac magnetic resonance imaging via an end-to-end deep learning approach
Pub Date : 2024-04-29 DOI: 10.1016/j.compmedimag.2024.102389
Zhennong Chen, Hui Ren, Quanzheng Li, Xiang Li
Accurate reconstruction of a high-resolution 3D volume of the heart is critical for comprehensive cardiac assessments. However, cardiac magnetic resonance (CMR) data is usually acquired as a stack of 2D short-axis (SAX) slices, which suffers from inter-slice misalignment due to cardiac motion and data sparsity from large gaps between SAX slices. Therefore, we aim to propose an end-to-end deep learning (DL) model to address these two challenges simultaneously, employing specific model components for each challenge. The objective is to reconstruct a high-resolution 3D volume of the heart (V_HR) from acquired CMR SAX slices (V_LR). We define the transformation from V_LR to V_HR as a sequential process of motion correction and super-resolution. Accordingly, our DL model incorporates two distinct components. The first component conducts motion correction by predicting displacement vectors to re-position each SAX slice accurately. The second component takes the motion-corrected SAX slices from the first component and performs the super-resolution to fill the data gaps. These two components operate in a sequential way, and the entire model is trained end-to-end. Our model significantly reduced inter-slice misalignment from originally 3.33±0.74 mm to 1.36±0.63 mm and generated accurate high-resolution 3D volumes with Dice of 0.974±0.010 for left ventricle (LV) and 0.938±0.017 for myocardium in a simulation dataset. When compared to the LAX contours in a real-world dataset, our model achieved Dice of 0.945±0.023 for LV and 0.786±0.060 for myocardium. In both datasets, our model with specific components for motion correction and super-resolution significantly enhances the performance compared to the model without such design considerations. The codes for our model are available at https://github.com/zhennongchen/CMR_MC_SR_End2End.
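A condensed skeleton of the sequential two-component design — per-slice displacement prediction followed by through-plane super-resolution — under heavy simplifications (toy sub-networks, in-plane shifts only; the authors' components are full CNNs):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MCSRNet(nn.Module):
    """Stage 1 predicts an in-plane (dx, dy) shift per SAX slice; stage 2
    super-resolves the re-positioned stack along the slice axis."""
    def __init__(self, n_slices, upscale=4):
        super().__init__()
        self.mc = nn.Sequential(nn.Flatten(), nn.LazyLinear(2 * n_slices))
        self.sr = nn.Sequential(
            nn.Conv3d(1, 8, 3, padding=1), nn.ReLU(),
            nn.Upsample(scale_factor=(upscale, 1, 1), mode="trilinear",
                        align_corners=False),
            nn.Conv3d(8, 1, 3, padding=1))

    def forward(self, stack):                   # stack: (B, 1, S, H, W)
        B, _, S, H, W = stack.shape
        shifts = self.mc(stack).view(B, S, 2)   # predicted displacement vectors
        corrected = self.apply_shifts(stack, shifts)
        return self.sr(corrected)               # fills inter-slice gaps

    @staticmethod
    def apply_shifts(stack, shifts):
        # differentiable per-slice translation via grid_sample
        B, _, S, H, W = stack.shape
        ys, xs = torch.meshgrid(torch.linspace(-1, 1, H),
                                torch.linspace(-1, 1, W), indexing="ij")
        grid = torch.stack((xs, ys), -1).to(stack.device)      # (H, W, 2)
        grid = grid[None, None].expand(B, S, H, W, 2).clone()
        grid += 2 * shifts[:, :, None, None, :] / torch.tensor(
            [W, H], dtype=stack.dtype, device=stack.device)    # pixels -> [-1, 1]
        frames = stack.permute(0, 2, 1, 3, 4).reshape(B * S, 1, H, W)
        out = F.grid_sample(frames, grid.reshape(B * S, H, W, 2),
                            align_corners=True)
        return out.view(B, S, 1, H, W).permute(0, 2, 1, 3, 4)
```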
{"title":"Motion correction and super-resolution for multi-slice cardiac magnetic resonance imaging via an end-to-end deep learning approach","authors":"Zhennong Chen, Hui Ren, Quanzheng Li, Xiang Li","doi":"10.1016/j.compmedimag.2024.102389","DOIUrl":"https://doi.org/10.1016/j.compmedimag.2024.102389","url":null,"abstract":"<div><p>Accurate reconstruction of a high-resolution 3D volume of the heart is critical for comprehensive cardiac assessments. However, cardiac magnetic resonance (CMR) data is usually acquired as a stack of 2D short-axis (SAX) slices, which suffers from the inter-slice misalignment due to cardiac motion and data sparsity from large gaps between SAX slices. Therefore, we aim to propose an end-to-end deep learning (DL) model to address these two challenges simultaneously, employing specific model components for each challenge. The objective is to reconstruct a high-resolution 3D volume of the heart (<span><math><msub><mrow><mi>V</mi></mrow><mrow><mi>HR</mi></mrow></msub></math></span>) from acquired CMR SAX slices (<span><math><msub><mrow><mi>V</mi></mrow><mrow><mi>LR</mi></mrow></msub></math></span>). We define the transformation from <span><math><msub><mrow><mi>V</mi></mrow><mrow><mi>LR</mi></mrow></msub></math></span> to <span><math><msub><mrow><mi>V</mi></mrow><mrow><mi>HR</mi></mrow></msub></math></span> as a sequential process of motion correction and super-resolution. Accordingly, our DL model incorporates two distinct components. The first component conducts motion correction by predicting displacement vectors to re-position each SAX slice accurately. The second component takes the motion-corrected SAX slices from the first component and performs the super-resolution to fill the data gaps. These two components operate in a sequential way, and the entire model is trained end-to-end. Our model significantly reduced inter-slice misalignment from originally 3.33<span><math><mo>±</mo></math></span>0.74 mm to 1.36<span><math><mo>±</mo></math></span>0.63 mm and generated accurate high resolution 3D volumes with Dice of 0.974<span><math><mo>±</mo></math></span>0.010 for left ventricle (LV) and 0.938<span><math><mo>±</mo></math></span>0.017 for myocardium in a simulation dataset. When compared to the LAX contours in a real-world dataset, our model achieved Dice of 0.945<span><math><mo>±</mo></math></span>0.023 for LV and 0.786<span><math><mo>±</mo></math></span>0.060 for myocardium. In both datasets, our model with specific components for motion correction and super-resolution significantly enhance the performance compared to the model without such design considerations. The codes for our model are available at <span>https://github.com/zhennongchen/CMR_MC_SR_End2End</span><svg><path></path></svg>.</p></div>","PeriodicalId":50631,"journal":{"name":"Computerized Medical Imaging and Graphics","volume":null,"pages":null},"PeriodicalIF":5.7,"publicationDate":"2024-04-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140816871","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A deep learning-based pipeline for developing multi-rib shape generative model with populational percentiles or anthropometrics as predictors
Pub Date : 2024-04-25 DOI: 10.1016/j.compmedimag.2024.102388
Yuan Huang , Sven A. Holcombe , Stewart C. Wang , Jisi Tang
Rib cross-sectional shapes (characterized by the outer contour and cortical bone thickness) affect the rib mechanical response under impact loading, and thereby influence the rib injury pattern and risk. A statistical description of the rib shapes or their correlations to anthropometrics is a prerequisite to the development of numerical human body models representing target demographics. Variational autoencoders (VAE) as anatomical shape generators remain to be explored in terms of utilizing the latent vectors to control or interpret the representativeness of the generated results. In this paper, we propose a pipeline for developing a multi-rib cross-sectional shape generative model from CT images, which consists of the extraction of rib cross-sectional shape data from CT images using an anatomical indexing system and regular grids, and a unified framework to fit shape distributions and associate shapes to anthropometrics for different rib categories. Specifically, we collected CT images including 3193 ribs; a regular surface grid is generated for each rib based on anatomical coordinates, and the rib cross-sectional shapes are characterized by nodal coordinates and cortical bone thickness. The tensor structure of the shape data based on regular grids enables the implementation of CNNs in the conditional variational autoencoder (CVAE). The CVAE is trained against an auxiliary classifier to decouple the low-dimensional representations of the inter- and intra-class variations and to fit each intra-class variation by a Gaussian distribution simultaneously. Random tree regressors are further leveraged to associate each continuous intra-class space with the corresponding anthropometrics of the subjects, i.e., age, height, and weight. As a result, with the rib class labels and the latent vectors sampled from Gaussian distributions or predicted from anthropometrics as the inputs, the decoder can generate valid rib cross-sectional shapes of given class labels (male/female, 2nd to 11th ribs) for arbitrary populational percentiles or specific age, height, and weight, which paves the way for future biomedical and biomechanical studies considering the diversity of rib shapes across the population.
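A condensed sketch of the CVAE-with-auxiliary-classifier idea on flattened shape vectors (the paper uses CNNs on regular grids; layer sizes, the adversarial weighting, and all names here are placeholders):

```python
import torch
import torch.nn as nn

class RibCVAE(nn.Module):
    """CVAE whose decoder is conditioned on the rib class label, trained
    against an auxiliary classifier that tries to predict the class from z —
    pushing inter-class (sex/rib-level) variation out of the latent so z
    captures only intra-class shape variation."""
    def __init__(self, shape_dim, n_classes, z_dim=16):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(shape_dim + n_classes, 128), nn.ReLU())
        self.mu, self.logvar = nn.Linear(128, z_dim), nn.Linear(128, z_dim)
        self.dec = nn.Sequential(nn.Linear(z_dim + n_classes, 128), nn.ReLU(),
                                 nn.Linear(128, shape_dim))
        self.aux = nn.Linear(z_dim, n_classes)   # adversary: class from z

    def forward(self, x, y_onehot):
        h = self.enc(torch.cat([x, y_onehot], dim=1))
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()   # reparameterise
        recon = self.dec(torch.cat([z, y_onehot], dim=1))
        return recon, mu, logvar, self.aux(z)

# training signal (sketch): reconstruction + KL, minus the auxiliary
# classifier's success, so the encoder learns class-free latents:
#   loss = mse(recon, x) + kl(mu, logvar) - lambda_adv * ce(aux_logits, y)
# at generation time, z is sampled from the fitted Gaussian for a percentile,
# or predicted from age/height/weight by the random tree regressors.
```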
{"title":"A deep learning-based pipeline for developing multi-rib shape generative model with populational percentiles or anthropometrics as predictors","authors":"Yuan Huang , Sven A. Holcombe , Stewart C. Wang , Jisi Tang","doi":"10.1016/j.compmedimag.2024.102388","DOIUrl":"10.1016/j.compmedimag.2024.102388","url":null,"abstract":"<div><p>Rib cross-sectional shapes (characterized by the outer contour and cortical bone thickness) affect the rib mechanical response under impact loading, thereby influence the rib injury pattern and risk. A statistical description of the rib shapes or their correlations to anthropometrics is a prerequisite to the development of numerical human body models representing target demographics. Variational autoencoders (VAE) as anatomical shape generators remain to be explored in terms of utilizing the latent vectors to control or interpret the representativeness of the generated results. In this paper, we propose a pipeline for developing a multi-rib cross-sectional shape generative model from CT images, which consists of the achievement of rib cross-sectional shape data from CT images using an anatomical indexing system and regular grids, and a unified framework to fit shape distributions and associate shapes to anthropometrics for different rib categories. Specifically, we collected CT images including 3193 ribs, surface regular grid is generated for each rib based on anatomical coordinates, the rib cross-sectional shapes are characterized by nodal coordinates and cortical bone thickness. The tensor structure of shape data based on regular grids enable the implementation of CNNs in the conditional variational autoencoder (CVAE). The CVAE is trained against an auxiliary classifier to decouple the low-dimensional representations of the inter- and intra- variations and fit each intra-variation by a Gaussian distribution simultaneously. Random tree regressors are further leveraged to associate each continuous intra-class space with the corresponding anthropometrics of the subjects, i.e., age, height and weight. As a result, with the rib class labels and the latent vectors sampled from Gaussian distributions or predicted from anthropometrics as the inputs, the decoder can generate valid rib cross-sectional shapes of given class labels (male/female, 2nd to 11th ribs) for arbitrary populational percentiles or specific age, height and weight, which paves the road for future biomedical and biomechanical studies considering the diversity of rib shapes across the population.</p></div>","PeriodicalId":50631,"journal":{"name":"Computerized Medical Imaging and Graphics","volume":null,"pages":null},"PeriodicalIF":5.7,"publicationDate":"2024-04-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140791093","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}