Pub Date: 2022-09-01 | Epub Date: 2022-12-16 | DOI: 10.1007/978-3-031-21014-3_21
Hao Guan, Siyuan Liu, Weili Lin, Pew-Thian Yap, Mingxia Liu
Pooling structural magnetic resonance imaging (MRI) data from different imaging sites increases sample size and thereby facilitates machine-learning-based neuroimage analysis, but the pooled data usually suffer from significant cross-site and/or cross-scanner heterogeneity. Existing studies often reduce this heterogeneity at the handcrafted feature level for specific tasks (e.g., classification or segmentation), which limits their adaptability in clinical practice. Research on image-level MRI harmonization targeting a broad range of applications is very limited. In this paper, we develop a spectrum swapping based image-level MRI harmonization (SSIMH) framework. Unlike previous work, our method alleviates cross-scanner heterogeneity at the raw image level. We first conduct a spectrum analysis to explore how different frequency components influence MRI harmonization. We then use a spectrum swapping method to harmonize raw MRIs acquired by different scanners. Our method does not rely on complex model training and can be applied directly to fast, real-time MRI harmonization. Experimental results on T1- and T2-weighted MRIs of phantom subjects acquired with different scanners from the public ABCD dataset suggest that our method is effective for structural MRI harmonization at the image level.
Title: Fast Image-Level MRI Harmonization via Spectrum Analysis. Source: Machine learning in medical imaging. MLMI (Workshop), volume 13583, pages 201-209. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9805301/pdf/nihms-1859376.pdf
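The spectrum swapping idea in the SSIMH abstract above can be illustrated with a short NumPy sketch. The abstract does not state which frequency band is exchanged, so this example assumes that the low-frequency amplitude of a source image is replaced by that of a reference image while the source phase is kept. The function name `swap_low_freq_amplitude` and the band-size parameter `beta` are hypothetical and are not taken from the paper.

```python
import numpy as np


def swap_low_freq_amplitude(source, reference, beta=0.1):
    """Return `source` with its low-frequency amplitude taken from `reference`.

    Assumption (not from the paper): only the amplitude inside a centered
    window covering a fraction `beta` of each axis is swapped; the phase of
    `source` is left untouched. Works for 2D slices and 3D volumes.
    """
    if source.shape != reference.shape:
        raise ValueError("source and reference must have the same shape")
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta must lie in [0, 1]")

    # Centered spectra: after fftshift the zero frequency sits at index n // 2.
    f_src = np.fft.fftshift(np.fft.fftn(source))
    f_ref = np.fft.fftshift(np.fft.fftn(reference))
    amp_src, phase_src = np.abs(f_src), np.angle(f_src)
    amp_ref = np.abs(f_ref)

    # Symmetric window around the zero frequency on every axis.
    window = tuple(
        slice(n // 2 - int(n * beta / 2), n // 2 + int(n * beta / 2) + 1)
        for n in source.shape
    )
    amp_new = amp_src.copy()
    amp_new[window] = amp_ref[window]

    # Recombine the swapped amplitude with the original phase and invert.
    f_new = amp_new * np.exp(1j * phase_src)
    return np.real(np.fft.ifftn(np.fft.ifftshift(f_new)))
```

Because no model is trained, the cost is two forward transforms and one inverse transform per image, which is consistent with the abstract's claim that the approach is fast. With a very small `beta` only the mean intensity is exchanged; larger values transfer progressively more of the reference image's coarse contrast.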
Pub Date: 2022-09-01 | Epub Date: 2022-12-16 | DOI: 10.1007/978-3-031-21014-3_18
Zheyuan Zhang, Ulas Bagci
Transformer-based neural networks have shown promising performance on many biomedical image segmentation tasks because the self-attention mechanism models global information better. However, most methods are still designed for 2D medical images and ignore the essential 3D volume information. The main challenge for 3D Transformer-based segmentation methods is the quadratic complexity introduced by the self-attention mechanism [17]. In this paper, we address these two research gaps (the lack of 3D methods and the computational complexity of Transformers) by proposing a novel encoder-decoder Transformer architecture with linear complexity. Furthermore, we introduce a dynamic token concept to further reduce the number of tokens used in the self-attention calculation. Taking advantage of the global information modeling, we provide uncertainty maps from different hierarchy stages. We evaluate this method on multiple challenging CT pancreas segmentation datasets. Our results show that our novel 3D Transformer-based segmentor can provide promising, highly feasible segmentation performance and accurate uncertainty quantification using a single annotation. Code is available at https://github.com/freshman97/LinTransUNet.
Title: Dynamic Linear Transformer for 3D Biomedical Image Segmentation. Source: Machine learning in medical imaging. MLMI (Workshop), volume 13583, pages 171-180. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9911329/pdf/nihms-1870553.pdf
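To make the contrast between quadratic and linear complexity in the abstract above concrete, here is a minimal NumPy sketch of one common way to obtain linear-cost attention: a positive kernel feature map combined with a reordering of the matrix products. The abstract does not say which formulation the authors use, so the feature map `elu(x) + 1` and the function names are assumptions chosen for illustration, not the paper's architecture.

```python
import numpy as np


def feature_map(x):
    """elu(x) + 1, a strictly positive feature map (illustrative choice)."""
    return np.where(x > 0, x + 1.0, np.exp(x))


def linear_attention(q, k, v):
    """Attention in O(n * d * d_v) time for q, k of shape (n, d), v of shape (n, d_v).

    The (n, n) attention matrix is never formed: keys and values are first
    summarized into a (d, d_v) matrix that every query reuses.
    """
    phi_q, phi_k = feature_map(q), feature_map(k)
    kv = phi_k.T @ v              # (d, d_v) summary of all keys and values
    z = phi_k.sum(axis=0)         # (d,) normalizer summary
    return (phi_q @ kv) / (phi_q @ z)[:, None]


def quadratic_reference(q, k, v):
    """Same result computed the O(n^2) way, for comparison only."""
    phi_q, phi_k = feature_map(q), feature_map(k)
    weights = phi_q @ phi_k.T     # explicit (n, n) attention matrix
    weights = weights / weights.sum(axis=1, keepdims=True)
    return weights @ v
```

Both functions return the same values up to floating-point error; only the order of the multiplications differs. For a 3D volume the token count `n` grows with the cube of the resolution, which is why avoiding the `(n, n)` matrix matters and why reducing the token count further, as the dynamic token concept in the abstract aims to do, is complementary.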
Pub Date: 2022-09-01 | DOI: 10.1007/978-3-031-21014-3_2
Junghwan Lee, Tingyi Wanyan, Qingyu Chen, Tiarnan D L Keenan, Benjamin S Glicksberg, Emily Y Chew, Zhiyong Lu, Fei Wang, Yifan Peng
Accurately predicting a patient's risk of progressing to late age-related macular degeneration (AMD) is difficult but crucial for personalized medicine. While existing risk prediction models for progression to late AMD are useful for triaging patients, none uses the longitudinal color fundus photographs (CFPs) in a patient's history to estimate the risk of late AMD in a given subsequent time interval. In this work, we evaluate how deep neural networks capture the sequential information in longitudinal CFPs and improve the prediction of 2-year and 5-year risk of progression to late AMD. Specifically, we propose two deep learning models, CNN-LSTM and CNN-Transformer, which combine a convolutional neural network (CNN) with a Long Short-Term Memory (LSTM) network and a Transformer, respectively, to capture the sequential information in longitudinal CFPs. We evaluated our models against baselines on the Age-Related Eye Disease Study, one of the largest longitudinal AMD cohorts with CFPs. The proposed models outperformed the baseline models that used only single-visit CFPs to predict the risk of late AMD (0.879 vs 0.868 in AUC for 2-year prediction, and 0.879 vs 0.862 for 5-year prediction). Further experiments showed that using longitudinal CFPs over a longer time period helped deep learning models predict the risk of late AMD. We made the source code available at https://github.com/bionlplab/AMD_prognosis_mlmi2022 to catalyze future work on deep learning models for late AMD prediction.
Title: Predicting Age-related Macular Degeneration Progression with Longitudinal Fundus Images Using Deep Learning. Source: Machine learning in medical imaging. MLMI (Workshop), volume 13583, pages 11-20. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9842432/pdf/nihms-1859202.pdf
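The AMD abstract above compares models by AUC (for example 0.879 vs 0.868). As a reminder of what that number measures, the following pure-Python sketch computes the area under the ROC curve as the probability that a randomly chosen positive case receives a higher risk score than a randomly chosen negative case. The function name `roc_auc` and the toy inputs are illustrative and unrelated to the study's data.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    labels: iterable of 0/1 outcomes (1 = progressed to late AMD in this toy setting).
    scores: predicted risk for each case, higher meaning more likely positive.
    Ties between a positive and a negative count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example: 3 of the 4 positive/negative pairs are ranked correctly.
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

This pairwise form is quadratic in the number of cases and is meant only to show the definition; rank-based implementations give the same value more efficiently.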
Pub Date: 2022-09-01 | Epub Date: 2022-12-16 | DOI: 10.1007/978-3-031-21014-3_23
Sahar Ahmad, Fang Nan, Ye Wu, Zhengwang Wu, Weili Lin, Li Wang, Gang Li, Di Wu, Pew-Thian Yap
Neuroimaging data harmonization has become a prerequisite in integrative data analytics for standardizing a wide variety of data collected from multiple studies and enabling interdisciplinary research. The lack of standardized image acquisition and computational procedures introduces non-biological variability and inconsistency in multi-site data, complicating downstream statistical analyses. Here, we propose a novel statistical technique to retrospectively harmonize multi-site cortical data collected longitudinally and cross-sectionally between birth and 100 years. We demonstrate that our method can effectively eliminate non-biological disparities from cortical thickness and myelination measurements, while preserving biological variation across the entire lifespan. Our harmonization method will foster large-scale population studies by providing comparable data required for investigating developmental and aging processes.
Title: Harmonization of Multi-site Cortical Data Across the Human Lifespan. Source: Machine learning in medical imaging. MLMI (Workshop), volume 13583, pages 220-229. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10134963/pdf/
Pub Date: 2022-06-01 | DOI: 10.48550/arXiv.2206.00771
Zheyuan Zhang, Ulas Bagci
Transformer-based neural networks have shown promising performance on many biomedical image segmentation tasks because the self-attention mechanism models global information better. However, most methods are still designed for 2D medical images and ignore the essential 3D volume information. The main challenge for 3D Transformer-based segmentation methods is the quadratic complexity introduced by the self-attention mechanism [17]. In this paper, we address these two research gaps (the lack of 3D methods and the computational complexity of Transformers) by proposing a novel encoder-decoder Transformer architecture with linear complexity. Furthermore, we introduce a dynamic token concept to further reduce the number of tokens used in the self-attention calculation. Taking advantage of the global information modeling, we provide uncertainty maps from different hierarchy stages. We evaluate this method on multiple challenging CT pancreas segmentation datasets. Our results show that our novel 3D Transformer-based segmentor can provide promising, highly feasible segmentation performance and accurate uncertainty quantification using a single annotation. Code is available at https://github.com/freshman97/LinTransUNet.
Title: Dynamic Linear Transformer for 3D Biomedical Image Segmentation (arXiv record). Source: Machine learning in medical imaging. MLMI (Workshop), volume 10 1, pages 171-180.
Pub Date: 2021-09-21 | DOI: 10.1007/978-3-030-87589-3_72
C. Lian, Xiaohuan Cao, I. Rekik, Xuanang Xu, Pingkun Yan
Title: Correction to: Machine Learning in Medical Imaging. Source: Machine learning in medical imaging. MLMI (Workshop), volume 114 1.
We propose a novel 3D fully convolutional deep network for automated pancreas segmentation from both MRI and CT scans. More specifically, the proposed model consists of a 3D encoder that learns to extract volume features at different scales; features taken at different points of the encoder hierarchy are then sent to multiple 3D decoders that individually predict intermediate segmentation maps. Finally, all segmentation maps are combined to obtain a single detailed segmentation mask. We test our model on both CT and MRI data: the publicly available NIH Pancreas-CT dataset (82 contrast-enhanced CTs) and a private MRI dataset (40 MRI scans). Experimental results show that our model outperforms existing methods on CT pancreas segmentation, obtaining an average Dice score of about 88%, and yields promising segmentation performance on a very challenging MRI dataset (average Dice score of about 77%). Additional control experiments demonstrate that the achieved performance is due to the combination of our 3D fully convolutional deep network and the hierarchical representation decoding, thus substantiating our architectural design.
Title: Hierarchical 3D Feature Learning for Pancreas Segmentation. Authors: Federica Proietto Salanitri, Giovanni Bellitto, Ismail Irmakci, Simone Palazzo, Ulas Bagci, Concetto Spampinato. Pub Date: 2021-09-01 | DOI: 10.1007/978-3-030-87589-3_25. Source: Machine learning in medical imaging. MLMI (Workshop), volume 12966, pages 238-247. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9921296/pdf/nihms-1871453.pdf
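The pancreas segmentation abstract above reports its results as average Dice scores (about 88% on CT and about 77% on MRI). For readers unfamiliar with the metric, this pure-Python sketch computes the Dice coefficient, 2|A ∩ B| / (|A| + |B|), between a predicted and a reference binary mask. The function name `dice_score` and the toy masks are illustrative only; treating two empty masks as a perfect match is a convention assumed here, not something stated in the abstract.

```python
def dice_score(pred, target):
    """Dice coefficient between two flattened binary masks of equal length.

    Returns a value in [0, 1]; 1.0 means the masks overlap perfectly.
    Two empty masks are treated as a perfect match (assumed convention).
    """
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of voxels")

    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


# Toy example: each mask has 3 foreground voxels, 2 of which overlap.
pred = [1, 1, 1, 0, 0, 0]
target = [0, 1, 1, 1, 0, 0]
print(round(dice_score(pred, target), 4))  # 0.6667
```

Because the pancreas occupies a small fraction of an abdominal volume, Dice is more informative than voxel accuracy here: a prediction that labels everything as background scores near-perfect accuracy but a Dice of zero.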
Elbow fracture diagnosis often requires both frontal and lateral elbow X-ray radiographs. In this paper, we propose a multiview deep learning method for an elbow fracture subtype classification task. Our strategy leverages transfer learning by first training two single-view models, one for the frontal view and the other for the lateral view, and then transferring the weights to the corresponding layers in the proposed multiview network architecture. Meanwhile, quantitative medical knowledge is integrated into the training process through a curriculum learning framework, which enables the model to first learn from "easier" samples and then transition to "harder" samples to reach better performance. In addition, our multiview network can work both in a dual-view setting and with a single view as input. We evaluate our method through extensive experiments on an elbow fracture classification task with a dataset of 1,964 images. Results show that our method outperforms two related methods from bone fracture studies in multiple settings, and that our technique can boost the performance of the compared methods. The code is available at https://github.com/ljaiverson/multiview-curriculum.
Title: Knowledge-Guided Multiview Deep Curriculum Learning for Elbow Fracture Classification. Authors: Jun Luo, Gene Kitamura, Dooman Arefan, Emine Doganay, Ashok Panigrahy, Shandong Wu. Pub Date: 2021-09-01 | DOI: 10.1007/978-3-030-87589-3_57. Source: Machine learning in medical imaging. MLMI (Workshop), volume 12966, pages 555-564. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10557058/pdf/nihms-1933007.pdf
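The easy-to-hard training order described in the elbow fracture abstract above can be sketched as a sampling schedule. The abstract does not describe how difficulty is scored or how the schedule evolves, so everything here is an assumption for illustration: each training sample carries a hypothetical difficulty score in [0, 1] (imagined as coming from medical knowledge about fracture subtypes), and the sampling weights move linearly from favoring easy samples to uniform sampling. The function name `curriculum_weights` is likewise hypothetical.

```python
def curriculum_weights(difficulty, epoch, total_epochs):
    """Per-sample sampling probabilities for one epoch of a linear curriculum.

    difficulty: hypothetical scores in [0, 1], higher meaning harder.
    At epoch 0 the weight of a sample is proportional to (1 - difficulty);
    by the last epoch all samples are drawn with equal probability.
    """
    if total_epochs < 2:
        raise ValueError("total_epochs must be at least 2")
    if any(d < 0.0 or d > 1.0 for d in difficulty):
        raise ValueError("difficulty scores must lie in [0, 1]")

    # progress runs from 0.0 (first epoch) to 1.0 (last epoch).
    progress = min(max(epoch / (total_epochs - 1), 0.0), 1.0)

    # Blend the "easy first" weight with a uniform weight; the small floor
    # keeps the hardest samples from having exactly zero probability.
    raw = [max((1.0 - progress) * (1.0 - d) + progress, 1e-6) for d in difficulty]
    total = sum(raw)
    return [r / total for r in raw]


scores = [0.1, 0.5, 0.9]  # easy, medium, hard (made-up values)
early = curriculum_weights(scores, epoch=0, total_epochs=5)
late = curriculum_weights(scores, epoch=4, total_epochs=5)
```

In this sketch `early` concentrates probability on the easy sample, while `late` is uniform. A training loop would pass these probabilities to whatever weighted sampler its framework provides.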
Pub Date: 2021-09-01 | Epub Date: 2021-09-21 | DOI: 10.1007/978-3-030-87589-3_63
Qin Liu, Chunfeng Lian, Deqiang Xiao, Lei Ma, Han Deng, Xu Chen, Dinggang Shen, Pew-Thian Yap, James J Xia
Skull segmentation from three-dimensional (3D) cone-beam computed tomography (CBCT) images is critical for the diagnosis and treatment planning of patients with craniomaxillofacial (CMF) deformities. Convolutional neural network (CNN)-based methods currently dominate volumetric image segmentation, but they are constrained by limited GPU memory and large image sizes (e.g., 512 × 512 × 448). Typical ad-hoc strategies, such as down-sampling or patch cropping, degrade segmentation accuracy because they fail to capture either local fine details or global contextual information. Other methods, such as Global-Local Networks (GLNet), focus on improving the network itself, aiming to combine local details and global contextual information in a GPU memory-efficient manner. However, all these methods operate on regular grids, which is computationally inefficient for volumetric image segmentation. In this work, we propose a novel VoxelRend-based network (VR-U-Net) that combines a memory-efficient variant of 3D U-Net with a voxel-based rendering (VoxelRend) module that refines local details via voxel-based predictions on non-regular grids. Building on relatively coarse feature maps, the VoxelRend module significantly improves segmentation accuracy while consuming only a fraction of the GPU memory. We evaluate the proposed VR-U-Net on the skull segmentation task using a high-resolution CBCT dataset collected from local hospitals. Experimental results show that the proposed VR-U-Net yields high-quality segmentation results in a memory-efficient manner, highlighting the practical value of our method.
Title: Skull Segmentation from CBCT Images via Voxel-Based Rendering. Source: Machine learning in medical imaging. MLMI (Workshop), pages 615-623. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8675180/pdf/nihms-1762343.pdf
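The skull segmentation abstract above says the VoxelRend module refines local details through voxel-based predictions on non-regular grids. One plausible reading, assumed here rather than confirmed by the abstract, is that only the voxels where a coarse prediction is least certain are selected for refinement, so that computation concentrates near the object boundary. The sketch below shows only that selection step on a flattened probability map; the function name `most_uncertain_voxels` and the sample values are hypothetical.

```python
def most_uncertain_voxels(prob, num_points):
    """Indices of the voxels whose foreground probability is closest to 0.5.

    prob: flattened coarse foreground probabilities, one value per voxel.
    num_points: how many voxels to hand to a (hypothetical) refinement head.
    """
    if num_points < 0:
        raise ValueError("num_points must be non-negative")

    # Distance from 0.5 serves as a simple confidence measure:
    # 0.0 means maximally uncertain, 0.5 means fully confident.
    ranked = sorted(range(len(prob)), key=lambda i: abs(prob[i] - 0.5))
    return ranked[:num_points]


coarse = [0.02, 0.48, 0.95, 0.55, 0.30]    # made-up coarse predictions
print(most_uncertain_voxels(coarse, 2))    # [1, 3]
```

Refining a few thousand selected voxels instead of every voxel in a 512 × 512 × 448 volume is one way a method could gain accuracy at a fraction of the memory cost, which matches the trade-off the abstract describes.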
Pub Date: 2021-09-01 | DOI: 10.1007/978-3-030-87589-3_71
N. Islam, S. Gehlot, Zongwei Zhou, M. Gotway, Jianming Liang
Title: Seeking an Optimal Approach for Computer-Aided Pulmonary Embolism Detection. Source: Machine learning in medical imaging. MLMI (Workshop), volume 10 1, pages 692-702.