An Automated Multi-scale Feature Fusion Network for Spine Fracture Segmentation Using Computed Tomography Images
Pub Date : 2024-04-15 | DOI: 10.1007/s10278-024-01091-0
Muhammad Usman Saeed, Wang Bin, Jinfang Sheng, Hussain Mobarak Albarakati
Spine fractures represent a critical health concern with far-reaching implications for patient care and clinical decision-making. Accurate segmentation of spine fractures from medical images is a challenging task because of variability in fracture location, shape, type, and severity. Addressing these challenges often requires advanced machine learning and deep learning techniques. In this research, a novel multi-scale feature fusion deep learning model is proposed for automated spine fracture segmentation from Computed Tomography (CT) images. The proposed model consists of six modules: a Feature Fusion Module (FFM), a Squeeze-and-Excitation Module (SEM), Atrous Spatial Pyramid Pooling (ASPP), a Residual Convolution Block Attention Module (RCBAM), a Residual Border Refinement Attention Block (RBRAB), and a Local Position Residual Attention Block (LPRAB). These modules perform multi-scale feature fusion, spatial feature extraction, channel-wise feature improvement, segmentation border refinement, and positional focus on the region of interest. A decoder network then predicts the fractured spine. Experimental results show that the proposed approach addresses the above challenges with better accuracy and compares favorably with existing segmentation methods.
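For readers unfamiliar with the building blocks named above, the following is a minimal PyTorch sketch of a generic squeeze-and-excitation block feeding an atrous spatial pyramid pooling module. It illustrates channel-wise recalibration and multi-scale dilated convolution in general; all layer sizes and dilation rates are illustrative assumptions, not the authors' exact FFM/SEM/ASPP implementation.

```python
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    """Channel-wise recalibration (squeeze-and-excitation) of a feature map."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        w = x.mean(dim=(2, 3))           # squeeze: global average pooling
        w = self.fc(w).view(b, c, 1, 1)  # excitation: per-channel weights
        return x * w                     # reweight feature maps

class ASPP(nn.Module):
    """Atrous spatial pyramid pooling: parallel dilated convolutions fused by a 1x1 conv."""
    def __init__(self, in_ch: int, out_ch: int, rates=(1, 6, 12, 18)):
        super().__init__()
        self.branches = nn.ModuleList(
            [nn.Conv2d(in_ch, out_ch, 3, padding=r, dilation=r) for r in rates]
        )
        self.project = nn.Conv2d(out_ch * len(rates), out_ch, 1)

    def forward(self, x):
        return self.project(torch.cat([b(x) for b in self.branches], dim=1))

# Illustrative usage on a dummy CT feature map
feat = torch.randn(1, 64, 128, 128)
out = ASPP(64, 64)(SEBlock(64)(feat))
print(out.shape)  # torch.Size([1, 64, 128, 128])
```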
{"title":"An Automated Multi-scale Feature Fusion Network for Spine Fracture Segmentation Using Computed Tomography Images","authors":"Muhammad Usman Saeed, Wang Bin, Jinfang Sheng, Hussain Mobarak Albarakati","doi":"10.1007/s10278-024-01091-0","DOIUrl":"https://doi.org/10.1007/s10278-024-01091-0","url":null,"abstract":"<p>Spine fractures represent a critical health concern with far-reaching implications for patient care and clinical decision-making. Accurate segmentation of spine fractures from medical images is a crucial task due to its location, shape, type, and severity. Addressing these challenges often requires the use of advanced machine learning and deep learning techniques. In this research, a novel multi-scale feature fusion deep learning model is proposed for the automated spine fracture segmentation using Computed Tomography (CT) to these challenges. The proposed model consists of six modules; Feature Fusion Module (FFM), Squeeze and Excitation (SEM), Atrous Spatial Pyramid Pooling (ASPP), Residual Convolution Block Attention Module (RCBAM), Residual Border Refinement Attention Block (RBRAB), and Local Position Residual Attention Block (LPRAB). These modules are used to apply multi-scale feature fusion, spatial feature extraction, channel-wise feature improvement, segmentation border results border refinement, and positional focus on the region of interest. After that, a decoder network is used to predict the fractured spine. The experimental results show that the proposed approach achieves better accuracy results in solving the above challenges and also performs well compared to the existing segmentation methods.</p>","PeriodicalId":50214,"journal":{"name":"Journal of Digital Imaging","volume":"169 1","pages":""},"PeriodicalIF":4.4,"publicationDate":"2024-04-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140591064","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pure Vision Transformer (CT-ViT) with Noise2Neighbors Interpolation for Low-Dose CT Image Denoising
Pub Date : 2024-04-15 | DOI: 10.1007/s10278-024-01108-8
Luella Marcos, Paul Babyn, Javad Alirezaie
Convolutional neural networks (CNNs) have been used for a wide variety of deep learning applications, especially in computer vision. For medical image processing, researchers have identified certain challenges associated with CNNs: the generation of less informative features, limitations in capturing both high- and low-frequency information within feature maps, and the computational cost of enlarging receptive fields by deepening the network. Transformers have emerged as an approach to address these limitations of CNNs in medical image analysis. Because accurate patient diagnosis requires preserving all spatial details of medical images, this research introduces a pure Vision Transformer (ViT) denoising network for medical image processing, specifically for low-dose computed tomography (LDCT) image denoising. The proposed model follows a U-Net framework that contains ViT modules with an integrated Noise2Neighbor (N2N) interpolation operation. Five datasets containing LDCT and normal-dose CT (NDCT) image pairs were used in the experiments. To test the efficacy of the proposed model, quantitative and visual results were compared among CNN-based methods (BM3D, RED-CNN, DRL-E-MP), a hybrid CNN-ViT method (TED-Net), and the proposed pure ViT-based denoising model. The findings show an increase of about 15–20% in SSIM and PSNR when using self-attention transformers compared with a typical pure CNN. Visual results also improved, particularly in rendering fine structural details of CT images.
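As a concrete illustration of the metrics reported above, PSNR and SSIM between a denoised LDCT slice and its NDCT reference can be computed with scikit-image as below. This is a generic evaluation sketch, not the authors' code, and the normalization to [0, 1] is an illustrative assumption.

```python
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def evaluate_denoising(denoised: np.ndarray, reference: np.ndarray) -> dict:
    """Compare a denoised LDCT slice against its normal-dose (NDCT) reference.

    Both arrays are assumed to be 2-D slices normalized to [0, 1]."""
    data_range = 1.0
    return {
        "psnr": peak_signal_noise_ratio(reference, denoised, data_range=data_range),
        "ssim": structural_similarity(reference, denoised, data_range=data_range),
    }

# Illustrative usage with synthetic data
rng = np.random.default_rng(0)
ndct = rng.random((512, 512))
ldct_denoised = np.clip(ndct + 0.05 * rng.standard_normal((512, 512)), 0, 1)
print(evaluate_denoising(ldct_denoised, ndct))
```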
{"title":"Pure Vision Transformer (CT-ViT) with Noise2Neighbors Interpolation for Low-Dose CT Image Denoising","authors":"Luella Marcos, Paul Babyn, Javad Alirezaie","doi":"10.1007/s10278-024-01108-8","DOIUrl":"https://doi.org/10.1007/s10278-024-01108-8","url":null,"abstract":"<p>Convolutional neural networks (CNN) have been used for a wide variety of deep learning applications, especially in computer vision. For medical image processing, researchers have identified certain challenges associated with CNNs. These challenges encompass the generation of less informative features, limitations in capturing both high and low-frequency information within feature maps, and the computational cost incurred when enhancing receptive fields by deepening the network. Transformers have emerged as an approach aiming to address and overcome these specific limitations of CNNs in the context of medical image analysis. Preservation of all spatial details of medical images is necessary to ensure accurate patient diagnosis. Hence, this research introduced the use of a pure Vision Transformer (ViT) for a denoising artificial neural network for medical image processing specifically for low-dose computed tomography (LDCT) image denoising. The proposed model follows a U-Net framework that contains ViT modules with the integration of Noise2Neighbor (N2N) interpolation operation. Five different datasets containing LDCT and normal-dose CT (NDCT) image pairs were used to carry out this experiment. To test the efficacy of the proposed model, this experiment includes comparisons between the quantitative and visual results among CNN-based (BM3D, RED-CNN, DRL-E-MP), hybrid CNN-ViT-based (TED-Net), and the proposed pure ViT-based denoising model. The findings of this study showed that there is about 15–20% increase in SSIM and PSNR when using self-attention transformers than using the typical pure CNN. Visual results also showed improvements especially when it comes to showing fine structural details of CT images.</p>","PeriodicalId":50214,"journal":{"name":"Journal of Digital Imaging","volume":"2 1","pages":""},"PeriodicalIF":4.4,"publicationDate":"2024-04-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140591432","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Automated Pulmonary Tuberculosis Severity Assessment on Chest X-rays
Pub Date : 2024-04-08 | DOI: 10.1007/s10278-024-01052-7
Karthik Kantipudi, Jingwen Gu, Vy Bui, Hang Yu, Stefan Jaeger, Ziv Yaniv
According to the 2022 World Health Organization Global Tuberculosis (TB) report, an estimated 10.6 million people fell ill with TB and 1.6 million died from the disease in 2021. In addition, 2021 saw a reversal of a decades-long decline in TB infections and deaths, with an estimated 4.5% increase in the number of people who fell ill with TB compared to 2020 and an estimated yearly increase of 450,000 cases of drug-resistant TB. Estimating the severity of pulmonary TB from frontal chest X-rays (CXR) can enable better resource allocation in resource-constrained settings and support monitoring of treatment response, allowing prompt treatment modifications if disease severity does not decrease over time. The Timika score is a clinically used TB severity score based on a CXR reading. This work proposes and evaluates three deep learning-based approaches for predicting the Timika score with varying levels of explainability. The first approach uses two deep learning-based models, one to explicitly detect lesion regions using YOLOv5n and another to predict the presence of cavitation using DenseNet121, whose outputs are then used in the score calculation. The second approach uses a DenseNet121-based regression model to directly predict the affected lung percentage and a DenseNet121-based classification model to predict cavitation presence. Finally, the third approach directly predicts the Timika score using a DenseNet121-based regression model. The best performance is achieved by the second approach, with a mean absolute error of 13–14% and a Pearson correlation of 0.7–0.84 across three held-out datasets used to evaluate generalization.
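For context, the Timika score is commonly described as the percentage of visible lung affected on the frontal CXR plus 40 points if cavitation is present. The sketch below shows how the second approach's two DenseNet121 outputs (predicted affected percentage and cavitation probability) could be combined into that score; the function name and the 0.5 threshold are illustrative assumptions, not the authors' code.

```python
def timika_score(affected_lung_percent: float, cavitation_prob: float,
                 cavitation_threshold: float = 0.5) -> float:
    """Combine the two model outputs into a Timika-style severity score.

    Score = percentage of lung affected, plus 40 if cavitation is judged present."""
    score = affected_lung_percent
    if cavitation_prob >= cavitation_threshold:
        score += 40.0
    return score

# Example: regression head predicts 22% affected lung, classifier gives p(cavitation) = 0.8
print(timika_score(22.0, 0.8))  # 62.0
```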
{"title":"Automated Pulmonary Tuberculosis Severity Assessment on Chest X-rays","authors":"Karthik Kantipudi, Jingwen Gu, Vy Bui, Hang Yu, Stefan Jaeger, Ziv Yaniv","doi":"10.1007/s10278-024-01052-7","DOIUrl":"https://doi.org/10.1007/s10278-024-01052-7","url":null,"abstract":"<p>According to the 2022 World Health Organization's Global Tuberculosis (TB) report, an estimated 10.6 million people fell ill with TB, and 1.6 million died from the disease in 2021. In addition, 2021 saw a reversal of a decades-long trend of declining TB infections and deaths, with an estimated increase of 4.5% in the number of people who fell ill with TB compared to 2020, and an estimated yearly increase of 450,000 cases of drug resistant TB. Estimating the severity of pulmonary TB using frontal chest X-rays (CXR) can enable better resource allocation in resource constrained settings and monitoring of treatment response, enabling prompt treatment modifications if disease severity does not decrease over time. The Timika score is a clinically used TB severity score based on a CXR reading. This work proposes and evaluates three deep learning-based approaches for predicting the Timika score with varying levels of explainability. The first approach uses two deep learning-based models, one to explicitly detect lesion regions using YOLOV5n and another to predict the presence of cavitation using DenseNet121, which are then utilized in score calculation. The second approach uses a DenseNet121-based regression model to directly predict the affected lung percentage and another to predict cavitation presence using a DenseNet121-based classification model. Finally, the third approach directly predicts the Timika score using a DenseNet121-based regression model. The best performance is achieved by the second approach with a mean absolute error of 13-14% and a Pearson correlation of 0.7-0.84 using three held-out datasets for evaluating generalization.</p>","PeriodicalId":50214,"journal":{"name":"Journal of Digital Imaging","volume":"95 1","pages":""},"PeriodicalIF":4.4,"publicationDate":"2024-04-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140591143","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cascade-EC Network: Recognition of Gastrointestinal Multiple Lesions Based on EfficientNet and CA_stm_Retinanet
Pub Date : 2024-04-08 | DOI: 10.1007/s10278-024-01096-9
Xudong Guo, Lei Xu, Shengnan Li, Meidong Xu, Yuan Chu, Qinfen Jiang
Capsule endoscopy (CE) is non-invasive and painless during gastrointestinal examination. However, it increases the image-review workload for clinicians, making examinations prone to missed lesions and misdiagnoses. Previous research has primarily concentrated on binary classifiers, multi-class classifiers targeting fewer than four abnormality types, detectors limited to a specific segment of the digestive tract, and segmenters for a single type of anomaly. Because of intra-class variations, creating a unified scheme for detecting multiple gastrointestinal diseases is particularly challenging. Cascade-EC, a cascade neural network designed in this study, can automatically identify and localize four types of gastrointestinal lesions in CE images: angiectasis, bleeding, erosion, and polyp. Cascade-EC consists of EfficientNet for image classification and CA_stm_Retinanet for lesion detection and localization. As the first stage of Cascade-EC, the EfficientNet network classifies CE images. CA_stm_Retinanet, as the second stage, performs target detection and localization on the classified images. CA_stm_Retinanet adopts the general architecture of RetinaNet; its feature extraction module is the CA_stm_Backbone, built from stacked CA_stm Blocks. The CA_stm Block adopts a split-transform-merge strategy and introduces coordinate attention. The dataset in this study is from Shanghai East Hospital, collected with PillCam SB3 and AnKon capsule endoscopes, and contains 7936 images of 317 patients from 2017 to 2021. On the test set, Cascade-EC achieved an average precision of 94.55%, an average recall of 90.60%, and an average F1 score of 92.26% in the multi-lesion classification task; its mean mAP@0.5 for detecting the four types of diseases was 85.88%. The experimental results show that, compared with a single target detection network, Cascade-EC performs better and can effectively assist clinicians in classifying and detecting multiple lesions in CE images.
An Automated Deep Learning-Based Framework for Uptake Segmentation and Classification on PSMA PET/CT Imaging of Patients with Prostate Cancer
Pub Date : 2024-04-08 | DOI: 10.1007/s10278-024-01104-y
Yang Li, Maliha R. Imami, Linmei Zhao, Alireza Amindarolzarbi, Esther Mena, Jeffrey Leal, Junyu Chen, Andrei Gafita, Andrew F. Voter, Xin Li, Yong Du, Chengzhang Zhu, Peter L. Choyke, Beiji Zou, Zhicheng Jiao, Steven P. Rowe, Martin G. Pomper, Harrison X. Bai
Uptake segmentation and classification on PSMA PET/CT are important for automating whole-body tumor burden determinations. We developed and evaluated an automated deep learning (DL)-based framework that segments and classifies uptake on PSMA PET/CT. We identified 193 [18F]DCFPyL PET/CT scans of patients with biochemically recurrent prostate cancer from two institutions: 137 scans for training and internal testing, and 56 scans from another institution for external testing. Two radiologists segmented and labelled foci as suspicious or non-suspicious for malignancy. A DL-based segmentation model was developed with two independent CNNs, and anatomical prior guidance was applied to focus the framework on PSMA-avid lesions. Segmentation performance was evaluated by Dice, IoU, precision, and recall. The classification model was constructed with a multi-modal decision fusion framework and evaluated by accuracy, AUC, F1 score, precision, and recall. Automatic segmentation of suspicious lesions improved under prior guidance, with mean Dice, IoU, precision, and recall of 0.700, 0.566, 0.809, and 0.660 on the internal test set and 0.680, 0.548, 0.749, and 0.740 on the external test set. Our multi-modal decision fusion framework outperformed single-modal and multi-modal CNNs, with accuracy, AUC, F1 score, precision, and recall of 0.764, 0.863, 0.844, 0.841, and 0.847 in distinguishing suspicious and non-suspicious foci on the internal test set and 0.796, 0.851, 0.865, 0.814, and 0.923 on the external test set. DL-based lesion segmentation on PSMA PET is facilitated by our anatomical prior guidance strategy, and our classification framework differentiates suspicious foci from non-suspicious foci with good accuracy.
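The segmentation metrics reported above are standard overlap measures between predicted and reference lesion masks; a minimal NumPy sketch (not the authors' evaluation code) is:

```python
import numpy as np

def dice_and_iou(pred: np.ndarray, target: np.ndarray, eps: float = 1e-8):
    """Dice and IoU between two binary lesion masks of the same shape."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    dice = 2.0 * intersection / (pred.sum() + target.sum() + eps)
    iou = intersection / (np.logical_or(pred, target).sum() + eps)
    return dice, iou

# Illustrative usage on toy masks
pred = np.zeros((4, 4)); pred[1:3, 1:3] = 1
target = np.zeros((4, 4)); target[1:4, 1:4] = 1
print(dice_and_iou(pred, target))  # (~0.615, ~0.444)
```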
{"title":"An Automated Deep Learning-Based Framework for Uptake Segmentation and Classification on PSMA PET/CT Imaging of Patients with Prostate Cancer","authors":"Yang Li, Maliha R. Imami, Linmei Zhao, Alireza Amindarolzarbi, Esther Mena, Jeffrey Leal, Junyu Chen, Andrei Gafita, Andrew F. Voter, Xin Li, Yong Du, Chengzhang Zhu, Peter L. Choyke, Beiji Zou, Zhicheng Jiao, Steven P. Rowe, Martin G. Pomper, Harrison X. Bai","doi":"10.1007/s10278-024-01104-y","DOIUrl":"https://doi.org/10.1007/s10278-024-01104-y","url":null,"abstract":"<p>Uptake segmentation and classification on PSMA PET/CT are important for automating whole-body tumor burden determinations. We developed and evaluated an automated deep learning (DL)-based framework that segments and classifies uptake on PSMA PET/CT. We identified 193 [<sup>18</sup>F] DCFPyL PET/CT scans of patients with biochemically recurrent prostate cancer from two institutions, including 137 [<sup>18</sup>F] DCFPyL PET/CT scans for training and internally testing, and 56 scans from another institution for external testing. Two radiologists segmented and labelled foci as suspicious or non-suspicious for malignancy. A DL-based segmentation was developed with two independent CNNs. An anatomical prior guidance was applied to make the DL framework focus on PSMA-avid lesions. Segmentation performance was evaluated by Dice, IoU, precision, and recall. Classification model was constructed with multi-modal decision fusion framework evaluated by accuracy, AUC, F1 score, precision, and recall. Automatic segmentation of suspicious lesions was improved under prior guidance, with mean Dice, IoU, precision, and recall of 0.700, 0.566, 0.809, and 0.660 on the internal test set and 0.680, 0.548, 0.749, and 0.740 on the external test set. Our multi-modal decision fusion framework outperformed single-modal and multi-modal CNNs with accuracy, AUC, F1 score, precision, and recall of 0.764, 0.863, 0.844, 0.841, and 0.847 in distinguishing suspicious and non-suspicious foci on the internal test set and 0.796, 0.851, 0.865, 0.814, and 0.923 on the external test set. DL-based lesion segmentation on PSMA PET is facilitated through our anatomical prior guidance strategy. Our classification framework differentiates suspicious foci from those not suspicious for cancer with good accuracy.</p>","PeriodicalId":50214,"journal":{"name":"Journal of Digital Imaging","volume":"89 1","pages":""},"PeriodicalIF":4.4,"publicationDate":"2024-04-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140574050","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A Classification-Based Adaptive Segmentation Pipeline: Feasibility Study Using Polycystic Liver Disease and Metastases from Colorectal Cancer CT Images
Pub Date : 2024-04-08 | DOI: 10.1007/s10278-024-01072-3
Peilong Wang, Timothy L. Kline, Andrew D. Missert, Cole J. Cook, Matthew R. Callstrom, Alex Chan, Robert P. Hartman, Zachary S. Kelm, Panagiotis Korfiatis
Automated segmentation tools often encounter accuracy and adaptability issues when applied to images with different pathologies. The purpose of this study is to explore the feasibility of building a workflow that efficiently routes images to specifically trained segmentation models. By implementing a deep learning classifier that automatically classifies images and routes them to the appropriate segmentation model, the workflow aims to segment images with different pathologies accurately. The data used in this study comprise 350 CT images from patients affected by polycystic liver disease and 350 CT images from patients presenting with liver metastases from colorectal cancer. All images had the liver manually segmented by trained imaging analysts. Our proposed adaptive segmentation workflow achieved a statistically significant improvement on the task of total liver segmentation compared to a generic single-segmentation model (non-parametric Wilcoxon signed-rank test, n = 100, p-value << 0.001). This approach is applicable in a wide range of scenarios and should prove useful in clinical implementations of segmentation pipelines.
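The routing idea can be sketched as a small dispatcher: a classifier predicts the pathology class, and the image is sent to the segmentation model trained for that class. The model objects and class labels below are illustrative placeholders, not the study's trained networks.

```python
from typing import Callable, Dict
import numpy as np

def adaptive_segmentation(volume: np.ndarray,
                          classify: Callable[[np.ndarray], str],
                          segmenters: Dict[str, Callable[[np.ndarray], np.ndarray]],
                          default_key: str = "generic") -> np.ndarray:
    """Route a CT volume to the segmentation model trained for its predicted pathology."""
    pathology = classify(volume)                     # e.g. "polycystic" or "metastatic"
    segmenter = segmenters.get(pathology, segmenters[default_key])
    return segmenter(volume)

# Illustrative usage with dummy components
segmenters = {
    "polycystic": lambda v: (v > 0.6).astype(np.uint8),
    "metastatic": lambda v: (v > 0.4).astype(np.uint8),
    "generic":    lambda v: (v > 0.5).astype(np.uint8),
}
mask = adaptive_segmentation(np.random.rand(8, 64, 64), lambda v: "polycystic", segmenters)
print(mask.shape, mask.dtype)
```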
{"title":"A Classification-Based Adaptive Segmentation Pipeline: Feasibility Study Using Polycystic Liver Disease and Metastases from Colorectal Cancer CT Images","authors":"Peilong Wang, Timothy L. Kline, Andrew D. Missert, Cole J. Cook, Matthew R. Callstrom, Alex Chan, Robert P. Hartman, Zachary S. Kelm, Panagiotis Korfiatis","doi":"10.1007/s10278-024-01072-3","DOIUrl":"https://doi.org/10.1007/s10278-024-01072-3","url":null,"abstract":"<p>Automated segmentation tools often encounter accuracy and adaptability issues when applied to images of different pathology. The purpose of this study is to explore the feasibility of building a workflow to efficiently route images to specifically trained segmentation models. By implementing a deep learning classifier to automatically classify the images and route them to appropriate segmentation models, we hope that our workflow can segment the images with different pathology accurately. The data we used in this study are 350 CT images from patients affected by polycystic liver disease and 350 CT images from patients presenting with liver metastases from colorectal cancer. All images had the liver manually segmented by trained imaging analysts. Our proposed adaptive segmentation workflow achieved a statistically significant improvement for the task of total liver segmentation compared to the generic single-segmentation model (non-parametric Wilcoxon signed rank test, <i>n</i> = 100, <i>p</i>-value << 0.001). This approach is applicable in a wide range of scenarios and should prove useful in clinical implementations of segmentation pipelines.</p>","PeriodicalId":50214,"journal":{"name":"Journal of Digital Imaging","volume":"30 1","pages":""},"PeriodicalIF":4.4,"publicationDate":"2024-04-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140574741","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A Method for Efficient De-identification of DICOM Metadata and Burned-in Pixel Text
Pub Date : 2024-04-08 | DOI: 10.1007/s10278-024-01098-7
Jacob A. Macdonald, Katelyn R. Morgan, Brandon Konkel, Kulsoom Abdullah, Mark Martin, Cory Ennis, Joseph Y. Lo, Marissa Stroo, Denise C. Snyder, Mustafa R. Bashir
De-identification of DICOM images is an essential component of medical image research. While many established methods exist for the safe removal of protected health information (PHI) from DICOM metadata, approaches for removing PHI “burned in” to image pixel data are typically manual, and automated high-throughput approaches are not well validated. Emerging optical character recognition (OCR) models can potentially detect and remove PHI-bearing text from medical images but are very time-consuming to run on the high volume of images found in typical research studies. We present a data processing method that performs metadata de-identification on all images combined with a targeted approach that applies OCR only to images with a high likelihood of burned-in text. The method was validated on a dataset of 415,182 images across ten modalities, representative of the de-identification requests submitted at our institution over a 20-year span. Of the 12,578 images in this dataset with burned-in text of any kind, only 10 went undetected by the method. OCR was required for only 6050 images (1.5% of the dataset).
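A minimal sketch of the two-stage idea, full metadata de-identification plus routing only "likely burned-in text" images to OCR, is shown below using pydicom. The tag list, the modality-based heuristic for deciding which images need OCR, and the function name are assumptions for illustration; they are not the authors' validated rule set.

```python
import pydicom

# Illustrative subset of PHI-bearing attributes; a real rule set is far larger.
PHI_KEYWORDS = ["PatientName", "PatientID", "PatientBirthDate",
                "OtherPatientIDs", "InstitutionName", "ReferringPhysicianName"]

# Heuristic assumption: modalities that frequently contain burned-in annotations.
OCR_MODALITIES = {"US", "SC", "OT", "XA"}

def deidentify(path_in: str, path_out: str) -> bool:
    """Blank common PHI tags, drop private tags, and report whether OCR is warranted."""
    ds = pydicom.dcmread(path_in)
    for keyword in PHI_KEYWORDS:
        if keyword in ds:
            ds.data_element(keyword).value = ""  # blank the attribute in place
    ds.remove_private_tags()                     # private tags often carry PHI
    ds.save_as(path_out)
    needs_ocr = (ds.get("Modality", "") in OCR_MODALITIES
                 or ds.get("BurnedInAnnotation", "") == "YES")
    return needs_ocr
```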
{"title":"A Method for Efficient De-identification of DICOM Metadata and Burned-in Pixel Text","authors":"Jacob A. Macdonald, Katelyn R. Morgan, Brandon Konkel, Kulsoom Abdullah, Mark Martin, Cory Ennis, Joseph Y. Lo, Marissa Stroo, Denise C. Snyder, Mustafa R. Bashir","doi":"10.1007/s10278-024-01098-7","DOIUrl":"https://doi.org/10.1007/s10278-024-01098-7","url":null,"abstract":"<p>De-identification of DICOM images is an essential component of medical image research. While many established methods exist for the safe removal of protected health information (PHI) in DICOM metadata, approaches for the removal of PHI “burned-in” to image pixel data are typically manual, and automated high-throughput approaches are not well validated. Emerging optical character recognition (OCR) models can potentially detect and remove PHI-bearing text from medical images but are very time-consuming to run on the high volume of images found in typical research studies. We present a data processing method that performs metadata de-identification for all images combined with a targeted approach to only apply OCR to images with a high likelihood of burned-in text. The method was validated on a dataset of 415,182 images across ten modalities representative of the de-identification requests submitted at our institution over a 20-year span. Of the 12,578 images in this dataset with burned-in text of any kind, only 10 passed undetected with the method. OCR was only required for 6050 images (1.5% of the dataset).</p>","PeriodicalId":50214,"journal":{"name":"Journal of Digital Imaging","volume":"30 1","pages":""},"PeriodicalIF":4.4,"publicationDate":"2024-04-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140574049","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A Comparative Analysis of Deep Learning-Based Approaches for Classifying Dental Implants Decision Support System
Pub Date : 2024-04-02 | DOI: 10.1007/s10278-024-01086-x
Mohammed A. H. Lubbad, Ikbal Leblebicioglu Kurtulus, Dervis Karaboga, Kerem Kilic, Alper Basturk, Bahriye Akay, Ozkan Ufuk Nalbantoglu, Ozden Melis Durmaz Yilmaz, Mustafa Ayata, Serkan Yilmaz, Ishak Pacal
This study aims to provide an effective solution for the autonomous identification of dental implant brands through a deep learning-based computer diagnostic system. It also seeks to ascertain the system's potential in clinical practice and to offer a strategic framework for improving diagnosis and treatment processes in implantology. The study employed 28 different deep learning models, including 18 convolutional neural network (CNN) models (VGG, ResNet, DenseNet, EfficientNet, RegNet, ConvNeXt) and 10 vision transformer models (Swin and Vision Transformer). The dataset comprises 1258 panoramic radiographs from patients who received implant treatment at Erciyes University Faculty of Dentistry between 2012 and 2023; it was used for training and evaluating the deep learning models and contains prototypes from six different implant systems provided by six manufacturers. The deep learning-based system achieved high classification accuracy across the different dental implant brands. Among all architectures evaluated, the small ConvNeXt model reached an accuracy of 94.2%, demonstrating a high level of classification success. This study emphasizes the effectiveness of deep learning-based systems in achieving high classification accuracy for dental implant types. These findings pave the way for integrating advanced deep learning tools into clinical practice, promising significant improvements in patient care and treatment outcomes.
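As an illustration of how one of the best-performing architectures could be set up for this six-brand classification task, the sketch below adapts torchvision's pretrained ConvNeXt-Small head. The number of classes is taken from the abstract, but the head indexing, input size, and overall setup are assumptions that may differ from the authors' configuration and from the torchvision version in use.

```python
import torch
import torch.nn as nn
from torchvision import models

NUM_IMPLANT_BRANDS = 6  # six implant systems in the study

# Load an ImageNet-pretrained ConvNeXt-Small and swap its final linear layer.
# Note: the classifier head layout may vary across torchvision versions.
model = models.convnext_small(weights=models.ConvNeXt_Small_Weights.DEFAULT)
in_features = model.classifier[2].in_features       # final Linear layer of the head
model.classifier[2] = nn.Linear(in_features, NUM_IMPLANT_BRANDS)

# Illustrative forward pass on a dummy panoramic-radiograph crop
x = torch.randn(2, 3, 224, 224)
logits = model(x)
print(logits.shape)  # torch.Size([2, 6])
```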
{"title":"A Comparative Analysis of Deep Learning-Based Approaches for Classifying Dental Implants Decision Support System","authors":"Mohammed A. H. Lubbad, Ikbal Leblebicioglu Kurtulus, Dervis Karaboga, Kerem Kilic, Alper Basturk, Bahriye Akay, Ozkan Ufuk Nalbantoglu, Ozden Melis Durmaz Yilmaz, Mustafa Ayata, Serkan Yilmaz, Ishak Pacal","doi":"10.1007/s10278-024-01086-x","DOIUrl":"https://doi.org/10.1007/s10278-024-01086-x","url":null,"abstract":"<p>This study aims to provide an effective solution for the autonomous identification of dental implant brands through a deep learning-based computer diagnostic system. It also seeks to ascertain the system’s potential in clinical practices and to offer a strategic framework for improving diagnosis and treatment processes in implantology. This study employed a total of 28 different deep learning models, including 18 convolutional neural network (CNN) models (VGG, ResNet, DenseNet, EfficientNet, RegNet, ConvNeXt) and 10 vision transformer models (Swin and Vision Transformer). The dataset comprises 1258 panoramic radiographs from patients who received implant treatments at Erciyes University Faculty of Dentistry between 2012 and 2023. It is utilized for the training and evaluation process of deep learning models and consists of prototypes from six different implant systems provided by six manufacturers. The deep learning-based dental implant system provided high classification accuracy for different dental implant brands using deep learning models. Furthermore, among all the architectures evaluated, the small model of the ConvNeXt architecture achieved an impressive accuracy rate of 94.2%, demonstrating a high level of classification success.This study emphasizes the effectiveness of deep learning-based systems in achieving high classification accuracy in dental implant types. These findings pave the way for integrating advanced deep learning tools into clinical practice, promising significant improvements in patient care and treatment outcomes.</p>","PeriodicalId":50214,"journal":{"name":"Journal of Digital Imaging","volume":"27 1","pages":""},"PeriodicalIF":4.4,"publicationDate":"2024-04-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140574569","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Fusing Diverse Decision Rules in 3D-Radiomics for Assisting Diagnosis of Lung Adenocarcinoma
Pub Date : 2024-04-02 | DOI: 10.1007/s10278-024-00967-5
He Ren, Qiubo Wang, Zhengguang Xiao, Runwei Mo, Jiachen Guo, Gareth Richard Hide, Mengting Tu, Yanan Zeng, Chen Ling, Ping Li
This study aimed to develop an interpretable diagnostic model for subtyping of pulmonary adenocarcinoma, including minimally invasive adenocarcinoma (MIA), adenocarcinoma in situ (AIS), and invasive adenocarcinoma (IAC), by integrating 3D-radiomic features and clinical data. Data from multiple hospitals were collected, and 10 key features were selected from 1600 3D radiomic signatures and 11 radiological features. Diverse decision rules were extracted using ensemble learning methods (gradient boosting, random forest, and AdaBoost), fused, ranked, and selected via RuleFit and SHAP to construct a rule-based diagnostic model. The model’s performance was evaluated using AUC, precision, accuracy, recall, and F1-score and compared with other models. The rule-based diagnostic model exhibited excellent performance in the training, testing, and validation cohorts, with AUC values of 0.9621, 0.9529, and 0.8953, respectively. This model outperformed counterparts relying solely on selected features and previous research models. Specifically, the AUC values for the previous research models in the three cohorts were 0.851, 0.893, and 0.836. It is noteworthy that individual models employing GBDT, random forest, and AdaBoost demonstrated AUC values of 0.9391, 0.8681, and 0.9449 in the training cohort, 0.9093, 0.8722, and 0.9363 in the testing cohort, and 0.8440, 0.8640, and 0.8750 in the validation cohort, respectively. These results highlight the superiority of the rule-based diagnostic model in the assessment of lung adenocarcinoma subtypes, while also providing insights into the performance of individual models. Integrating diverse decision rules enhanced the accuracy and interpretability of the diagnostic model for lung adenocarcinoma subtypes. This approach bridges the gap between complex predictive models and clinical utility, offering valuable support to healthcare professionals and patients.
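One step of such a pipeline, ranking contributions from a gradient-boosted model with SHAP, can be sketched as below. This is a simplified binary stand-in on synthetic features; it is not the authors' full RuleFit/SHAP rule-fusion pipeline, and the dataset, model settings, and ranking choice are illustrative assumptions.

```python
import numpy as np
import shap
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

# Stand-in for the selected radiomic + radiological feature matrix (10 features).
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

gbdt = GradientBoostingClassifier(random_state=0).fit(X, y)

# TreeExplainer yields per-feature contributions; mean |SHAP| gives a global ranking.
explainer = shap.TreeExplainer(gbdt)
shap_values = explainer.shap_values(X)
ranking = np.argsort(np.abs(shap_values).mean(axis=0))[::-1]
print("features ranked by mean |SHAP|:", ranking)
```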
{"title":"Fusing Diverse Decision Rules in 3D-Radiomics for Assisting Diagnosis of Lung Adenocarcinoma","authors":"He Ren, Qiubo Wang, Zhengguang Xiao, Runwei Mo, Jiachen Guo, Gareth Richard Hide, Mengting Tu, Yanan Zeng, Chen Ling, Ping Li","doi":"10.1007/s10278-024-00967-5","DOIUrl":"https://doi.org/10.1007/s10278-024-00967-5","url":null,"abstract":"<p>This study aimed to develop an interpretable diagnostic model for subtyping of pulmonary adenocarcinoma, including minimally invasive adenocarcinoma (MIA), adenocarcinoma in situ (AIS), and invasive adenocarcinoma (IAC), by integrating 3D-radiomic features and clinical data. Data from multiple hospitals were collected, and 10 key features were selected from 1600 3D radiomic signatures and 11 radiological features. Diverse decision rules were extracted using ensemble learning methods (gradient boosting, random forest, and AdaBoost), fused, ranked, and selected via RuleFit and SHAP to construct a rule-based diagnostic model. The model’s performance was evaluated using AUC, precision, accuracy, recall, and <i>F</i>1-score and compared with other models. The rule-based diagnostic model exhibited excellent performance in the training, testing, and validation cohorts, with AUC values of 0.9621, 0.9529, and 0.8953, respectively. This model outperformed counterparts relying solely on selected features and previous research models. Specifically, the AUC values for the previous research models in the three cohorts were 0.851, 0.893, and 0.836. It is noteworthy that individual models employing GBDT, random forest, and AdaBoost demonstrated AUC values of 0.9391, 0.8681, and 0.9449 in the training cohort, 0.9093, 0.8722, and 0.9363 in the testing cohort, and 0.8440, 0.8640, and 0.8750 in the validation cohort, respectively. These results highlight the superiority of the rule-based diagnostic model in the assessment of lung adenocarcinoma subtypes, while also providing insights into the performance of individual models. Integrating diverse decision rules enhanced the accuracy and interpretability of the diagnostic model for lung adenocarcinoma subtypes. This approach bridges the gap between complex predictive models and clinical utility, offering valuable support to healthcare professionals and patients.</p>","PeriodicalId":50214,"journal":{"name":"Journal of Digital Imaging","volume":"24 1","pages":""},"PeriodicalIF":4.4,"publicationDate":"2024-04-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140574030","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
End-to-End Multi-task Learning Architecture for Brain Tumor Analysis with Uncertainty Estimation in MRI Images
Pub Date : 2024-04-02 | DOI: 10.1007/s10278-024-01009-w
Maria Nazir, Sadia Shakil, Khurram Khurshid
Brain tumors are a threat to life for adults and children alike. Gliomas are among the deadliest brain tumors and are extremely difficult to diagnose: their complex and heterogeneous structure gives rise to both subjective and objective errors, and their irregular appearance makes manual segmentation a laborious task. To address these issues, extensive research has been and is being conducted on AI-based solutions that can help doctors and radiologists diagnose gliomas effectively with the fewest subjective and objective errors, but an end-to-end system is still missing. This research proposes an all-in-one framework. The developed end-to-end multi-task learning (MTL) architecture with a feature attention module can classify, segment, and predict the overall survival of gliomas by leveraging relationships between similar tasks. Uncertainty estimation has also been incorporated into the framework to increase the confidence of healthcare practitioners. Extensive experiments were performed using combinations of MRI sequences on the Brain Tumor Segmentation (BraTS) challenge datasets of 2019 and 2020. The best model, using four sequences, achieved 95.1% accuracy for classification, an 86.3% Dice score for segmentation, and a mean absolute error (MAE) of 456.59 for survival prediction on the test data. The results show that deep learning-based MTL models have the potential to automate the whole brain tumor analysis process and produce efficient results with minimal inference time and without human intervention. Uncertainty quantification supports the idea that more data can improve generalization ability and, in turn, produce more accurate results with less uncertainty. The proposed model has the potential to be utilized in a clinical setup for the initial screening of glioma patients.
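A minimal sketch of the general pattern, a shared encoder with classification, segmentation, and survival heads plus Monte-Carlo dropout for uncertainty, is given below. This is a generic 2-D skeleton under assumed layer sizes, not the authors' attention-based architecture or their uncertainty method.

```python
import torch
import torch.nn as nn

class MultiTaskBrainTumorNet(nn.Module):
    """Generic multi-task skeleton: shared encoder with classification,
    segmentation, and survival-regression heads."""
    def __init__(self, in_channels: int = 4, num_classes: int = 3):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(in_channels, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
            nn.Dropout2d(0.2),   # kept stochastic at test time for MC-dropout uncertainty
        )
        self.classifier = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, num_classes))
        self.segmenter = nn.Conv2d(64, 1, 1)   # per-pixel tumor logits
        self.survival = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, 1))

    def forward(self, x):
        feats = self.encoder(x)
        return self.classifier(feats), self.segmenter(feats), self.survival(feats)

def mc_dropout_survival(model, x, n_samples: int = 20):
    """Monte-Carlo dropout: keep dropout active, sample repeatedly, report mean and std."""
    model.train()   # keeps Dropout2d stochastic (no batch norm in this sketch)
    with torch.no_grad():
        preds = torch.stack([model(x)[2] for _ in range(n_samples)])
    return preds.mean(dim=0), preds.std(dim=0)

x = torch.randn(1, 4, 128, 128)   # four MRI sequences as input channels
mean_days, std_days = mc_dropout_survival(MultiTaskBrainTumorNet(), x)
print(mean_days.shape, std_days.shape)
```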
{"title":"End-to-End Multi-task Learning Architecture for Brain Tumor Analysis with Uncertainty Estimation in MRI Images","authors":"Maria Nazir, Sadia Shakil, Khurram Khurshid","doi":"10.1007/s10278-024-01009-w","DOIUrl":"https://doi.org/10.1007/s10278-024-01009-w","url":null,"abstract":"<p>Brain tumors are a threat to life for every other human being, be it adults or children. Gliomas are one of the deadliest brain tumors with an extremely difficult diagnosis. The reason is their complex and heterogenous structure which gives rise to subjective as well as objective errors. Their manual segmentation is a laborious task due to their complex structure and irregular appearance. To cater to all these issues, a lot of research has been done and is going on to develop AI-based solutions that can help doctors and radiologists in the effective diagnosis of gliomas with the least subjective and objective errors, but an end-to-end system is still missing. An all-in-one framework has been proposed in this research. The developed end-to-end multi-task learning (MTL) architecture with a feature attention module can classify, segment, and predict the overall survival of gliomas by leveraging task relationships between similar tasks. Uncertainty estimation has also been incorporated into the framework to enhance the confidence level of healthcare practitioners. Extensive experimentation was performed by using combinations of MRI sequences. Brain tumor segmentation (BraTS) challenge datasets of 2019 and 2020 were used for experimental purposes. Results of the best model with four sequences show 95.1% accuracy for classification, 86.3% dice score for segmentation, and a mean absolute error (MAE) of 456.59 for survival prediction on the test data. It is evident from the results that deep learning–based MTL models have the potential to automate the whole brain tumor analysis process and give efficient results with least inference time without human intervention<b>.</b> Uncertainty quantification confirms the idea that more data can improve the generalization ability and in turn can produce more accurate results with less uncertainty. The proposed model has the potential to be utilized in a clinical setup for the initial screening of glioma patients.</p>","PeriodicalId":50214,"journal":{"name":"Journal of Digital Imaging","volume":"4 1","pages":""},"PeriodicalIF":4.4,"publicationDate":"2024-04-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140574029","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}