The aim of this study was to investigate whether super-resolution deep learning reconstruction (SR-DLR) is superior to conventional deep learning reconstruction (DLR) with respect to interobserver agreement in the evaluation of neuroforaminal stenosis using 1.5T cervical spine MRI. This retrospective study included 39 patients who underwent 1.5T cervical spine MRI. T2-weighted sagittal images were reconstructed with SR-DLR and DLR. Three blinded radiologists independently evaluated the images in terms of the degree of neuroforaminal stenosis; depictions of the vertebrae, spinal cord and neural foramina; sharpness; noise; artefacts; and diagnostic acceptability. In quantitative image analyses, a fourth radiologist evaluated the signal-to-noise ratio (SNR), using a circular or ovoid region of interest placed on the spinal cord, and the edge slope, based on a linear region of interest placed across the surface of the spinal cord. Interobserver agreement (kappa) in the evaluations of neuroforaminal stenosis was 0.422–0.571 with SR-DLR and 0.410–0.542 with DLR, and kappa differed significantly between the two reconstructions for the reader 1 vs. reader 2 and reader 2 vs. reader 3 comparisons. Two of the three readers rated depictions of the spinal cord, sharpness, and diagnostic acceptability as significantly better with SR-DLR than with DLR. Both SNR and edge slope (/mm) were also significantly better with SR-DLR (12.9 and 6031, respectively) than with DLR (11.5 and 3741, respectively) (p < 0.001 for both). In conclusion, compared to DLR, SR-DLR improved interobserver agreement in the evaluations of neuroforaminal stenosis using 1.5T cervical spine MRI.
Title: Super-resolution Deep Learning Reconstruction Cervical Spine 1.5T MRI: Improved Interobserver Agreement in Evaluations of Neuroforaminal Stenosis Compared to Conventional Deep Learning Reconstruction
Authors: Koichiro Yasaka, Shunichi Uehara, Shimpei Kato, Yusuke Watanabe, Taku Tajima, Hiroyuki Akai, Naoki Yoshioka, Masaaki Akahane, Kuni Ohtomo, Osamu Abe, Shigeru Kiryu
Journal of Digital Imaging, 2024-04-26, DOI: 10.1007/s10278-024-01112-y
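The pairwise kappa statistics reported above can be reproduced from two readers' ordinal stenosis grades. A minimal numpy sketch of unweighted Cohen's kappa, assuming integer grades 0..n_grades-1 (the study's exact grading scale and weighting are not stated here):

```python
import numpy as np

def cohens_kappa(reader_a, reader_b, n_grades):
    """Unweighted Cohen's kappa between two raters' ordinal grades (0..n_grades-1)."""
    cm = np.zeros((n_grades, n_grades))
    for a, b in zip(reader_a, reader_b):
        cm[a, b] += 1                                       # pairwise confusion matrix
    n = cm.sum()
    p_obs = np.trace(cm) / n                                # observed agreement
    p_exp = (cm.sum(axis=1) @ cm.sum(axis=0)) / n ** 2      # agreement expected by chance
    return (p_obs - p_exp) / (1 - p_exp)
```

For ordinal grades a weighted kappa is often preferred; sklearn's `cohen_kappa_score(y1, y2, weights="linear")` provides one, though which variant the study used is not specified in the abstract.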
Pub Date: 2024-04-19 | DOI: 10.1007/s10278-024-01105-x
Sara El-Ateif, Ali Idri
Multimodality fusion has gained significance in medical applications, particularly in diagnosing challenging conditions such as eye diseases, notably diabetic eye diseases, which pose risks of vision loss and blindness. Mono-modality eye disease diagnosis is difficult and often misses crucial disease indicators. In response, researchers advocate multimodality-based approaches to enhance diagnostics. This study evaluates three multimodality fusion strategies—early, joint, and late—in conjunction with state-of-the-art convolutional neural network models for automated binary eye disease detection across three datasets: fundus fluorescein angiography, macula, and a combination of the digital retinal images for vessel extraction (DRIVE), structured analysis of the retina (STARE), and high-resolution fundus (HRF) datasets. Findings reveal the efficacy of each fusion strategy: type 0 early fusion with DenseNet121 achieves a 99.45% average accuracy. InceptionResNetV2 emerges as the top-performing joint fusion architecture, with an average accuracy of 99.58%. Late fusion with ResNet50V2 achieves a perfect score of 100% across all metrics, surpassing both early and joint fusion. Comparative analysis demonstrates that late fusion ResNet50V2 matches the accuracy of a state-of-the-art feature-level fusion model for multiview learning. In conclusion, this study substantiates late fusion as the optimal strategy for eye disease diagnosis compared to early and joint fusion, showcasing its superiority in leveraging multimodal information.
Title: Multimodality Fusion Strategies in Eye Disease Diagnosis (Journal of Digital Imaging)
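The joint-versus-late distinction can be illustrated with toy linear "encoders"; every array, weight, and dimension below is a made-up stand-in, not the paper's CNN architectures:

```python
import numpy as np

rng = np.random.default_rng(0)

def encoder(x, w):
    """Toy modality-specific branch: linear layer followed by a sigmoid head."""
    return 1.0 / (1.0 + np.exp(-(x @ w)))

# hypothetical feature vectors for two modalities of the same eyes
fundus = rng.normal(size=(4, 8))          # 4 patients, 8 features per modality
angiography = rng.normal(size=(4, 8))
w_fundus = rng.normal(size=8)
w_angio = rng.normal(size=8)

# late fusion: each branch predicts on its own; the decisions are averaged
p_late = (encoder(fundus, w_fundus) + encoder(angiography, w_angio)) / 2

# joint (feature-level) fusion: features are concatenated before a single head
w_joint = np.concatenate([w_fundus, w_angio])
p_joint = encoder(np.concatenate([fundus, angiography], axis=1), w_joint)
```

Early fusion, by contrast, would stack the raw images channel-wise before the first convolution; only the fusion point differs, not the backbone.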
Pub Date: 2024-04-19 | DOI: 10.1007/s10278-024-01119-5
Kuan-Chih Huang, Donna Shu-Han Lin, Geng-Shi Jeng, Ting-Tse Lin, Lian-Yu Lin, Chih-Kuo Lee, Lung-Chun Lin
The left ventricular global longitudinal strain (LVGLS) is a crucial prognostic indicator. However, inconsistencies in measurements due to the speckle tracking algorithm and manual adjustments have hindered its standardization and democratization. To solve this issue, we proposed a fully automated strain measurement based on artificial intelligence-assisted LV segmentation contours. The LV segmentation model was trained on echocardiograms of 368 adults (11,125 frames). In experiment 1, we compared the registration-like effects of dynamic time warping (DTW) with speckle tracking on a synthetic echocardiographic dataset. In experiment 2, we enrolled 80 patients to compare the DTW method with commercially available software. In experiment 3, we combined the segmentation model and the DTW method to create the artificial intelligence (AI)-DTW method, which was then tested on 40 patients with general LV morphology, 20 with dilated cardiomyopathy (DCMP), 20 with transthyretin-associated cardiac amyloidosis (ATTR-CA), 20 with severe aortic stenosis (AS), and 20 with severe mitral regurgitation (MR). Experiments 1 and 2 revealed that the DTW method is consistent with dedicated software. In experiment 3, the AI-DTW strain method showed comparable results for general LV morphology (bias − 0.137 ± 0.398%), DCMP (− 0.397 ± 0.607%), ATTR-CA (0.095 ± 0.581%), AS (0.334 ± 0.358%), and MR (0.237 ± 0.490%). Moreover, the strain curves showed a high correlation in their characteristics, with R-squared values of 0.8879–0.9452 across the LV morphologies in experiment 3. Measuring LVGLS through dynamic warping of segmentation contours is a feasible alternative to traditional tracking techniques. This approach has the potential to decrease the need for manual demarcation and make LVGLS measurements more efficient and user-friendly for daily practice.
Title: Left Ventricular Segmentation, Warping, and Myocardial Registration for Automated Strain Measurement (Journal of Digital Imaging)
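The contour alignment at the heart of the AI-DTW method rests on dynamic time warping. The classic O(nm) recurrence can be sketched on 1-D sequences (a generic textbook implementation, not the authors' code, which operates on segmentation contour points):

```python
import numpy as np

def dtw_path(a, b):
    """Classic dynamic time warping between two 1-D sequences.
    Returns the minimal cumulative cost and the alignment path."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    # backtrack from (n, m) to recover the warping path
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        step = np.argmin([D[i - 1, j - 1], D[i - 1, j], D[i, j - 1]])
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1
    return D[n, m], path[::-1]
```

Applied frame-to-frame, such a path maps each contour point in one frame to its counterpart in the next, which is the registration-like effect the study compares against speckle tracking.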
Pub Date: 2024-04-19 | DOI: 10.1007/s10278-024-01117-7
Vishal Patel, Shengzhen Tao, Xiangzhi Zhou, Chen Lin, Erin Westerhold, Sanjeet Grewal, Erik H. Middlebrooks
Deep brain stimulation (DBS) is a method of electrical neuromodulation used to treat a variety of neuropsychiatric conditions including essential tremor, Parkinson’s disease, epilepsy, and obsessive–compulsive disorder. The procedure requires precise placement of electrodes such that the electrical contacts lie within or in close proximity to specific target nuclei and tracts located deep within the brain. DBS electrode trajectory planning has become increasingly dependent on direct targeting, with a corresponding need for precise visualization of targets. MRI is the primary tool for direct visualization, and this has led to the development of numerous sequences to aid in visualization of different targets. Synthetic inversion recovery images, specified by an inversion time parameter, can be generated from T1 relaxation maps; this represents a promising way of modifying the contrast of deep brain structures to accentuate target areas using a single acquisition. However, there is currently no accessible method for dynamically adjusting the inversion time parameter and observing the effects in real time in order to choose the optimal value. In this work, we examine three approaches to implementing an application for real-time optimal synthetic inversion recovery image selection and evaluate them on their ability to display continually updated synthetic inversion recovery images as the user modifies the inversion time parameter. The three approaches are: continuously computing the inversion recovery equation at each voxel in the image volume; limiting the computation to the voxels of the orthogonal slices currently displayed on screen; or using a series of lookup tables with precomputed solutions to the inversion recovery equation. We find that the lookup-table implementation provides the quickest display updates, both when modifying the inversion time and when scrolling through the image.
We introduce a publicly available cross-platform application built around this conclusion. We also briefly discuss other details of the implementations and considerations for extensions to other use cases.
Title: Real-Time Optimal Synthetic Inversion Recovery Image Selection (RT-OSIRIS) for Deep Brain Stimulation Targeting (Journal of Digital Imaging)
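The lookup-table strategy the authors favor can be sketched with the standard magnitude inversion-recovery signal model, S(TI) = |1 − 2 exp(−TI/T1)| (TR effects ignored); the T1 grid spacing and TI values below are illustrative assumptions, not the application's actual parameters:

```python
import numpy as np

def synthetic_ir(t1_map, ti):
    """Per-voxel magnitude IR signal |1 - 2 exp(-TI/T1)| computed directly."""
    return np.abs(1 - 2 * np.exp(-ti / t1_map))

def build_luts(ti_values, t1_grid):
    """One precomputed table per candidate TI over a quantized T1 axis."""
    return {ti: np.abs(1 - 2 * np.exp(-ti / t1_grid)) for ti in ti_values}

def synthetic_ir_lut(t1_map, ti, luts, t1_grid):
    """Approximate the direct computation by indexing the nearest grid entry,
    so interactive TI changes cost only a table lookup per voxel."""
    idx = np.clip(np.searchsorted(t1_grid, t1_map), 0, len(t1_grid) - 1)
    return luts[ti][idx]
```

The trade-off is the one the paper describes: a modest precomputation and quantization error in exchange for display updates that no longer require evaluating the exponential per voxel.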
Pub Date: 2024-04-18 | DOI: 10.1007/s10278-024-01115-9
Wutong Chen, Du Junsheng, Yanzhen Chen, Yifeng Fan, Hengzhi Liu, Chang Tan, Xuanming Shao, Xinzhi Li
We aimed to develop and validate a deep convolutional neural network (DCNN) model capable of accurately identifying spondylolysis or spondylolisthesis on lateral or dynamic X-ray images. A total of 2449 lumbar lateral and dynamic X-ray images were collected from two tertiary hospitals. These images were proportionally categorized into lumbar spondylolysis (LS), degenerative lumbar spondylolisthesis (DLS), and normal lumbar spine. Subsequently, the images were randomly divided into training, validation, and test sets to establish a classification network. Model training and validation used the EfficientNetV2-M network. The model’s ability to generalize was assessed through a rigorous evaluation on an entirely independent test set, comparing its performance with the diagnoses made by three orthopedists and three radiologists. The evaluation metrics were accuracy, precision, sensitivity, specificity, and F1 score. Additionally, the weight distribution of the network was visualized using gradient-weighted class activation mapping (Grad-CAM). For the doctor group, accuracy ranged from 87.9 to 90.0% (mean, 89.0%), precision from 87.2 to 90.5% (mean, 89.0%), sensitivity from 87.1 to 91.0% (mean, 89.2%), specificity from 93.7 to 94.7% (mean, 94.3%), and F1 score from 88.2 to 89.9% (mean, 89.1%). The DCNN model achieved an accuracy of 92.0%, precision of 91.9%, sensitivity of 92.2%, specificity of 95.7%, and F1 score of 92.0%. Grad-CAM showed that the highlighted areas concentrated in the intervertebral foraminal region. We developed a DCNN model that intelligently distinguished spondylolysis or spondylolisthesis on lumbar lateral or lumbar dynamic radiographs.
Title: The Classification of Lumbar Spondylolisthesis X-Ray Images Using Convolutional Neural Networks (Journal of Digital Imaging)
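The reader-versus-model comparison rests on standard confusion-matrix metrics. A small helper shows how accuracy, precision, sensitivity, specificity, and F1 relate (generic definitions, with counts to be taken from any binary confusion matrix):

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)                 # positive predictive value
    sensitivity = tp / (tp + fn)               # recall / true positive rate
    specificity = tn / (tn + fp)               # true negative rate
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return accuracy, precision, sensitivity, specificity, f1
```

For the three-class task in the study these would be computed per class in a one-vs-rest fashion and then averaged; the abstract does not state which averaging was used.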
While dual-energy computed tomography (DECT) introduces energy-specific information into clinical practice, single-energy CT (SECT) is predominantly used, limiting the number of people who can benefit from DECT. This study proposed a novel method to generate synthetic low-energy virtual monochromatic images at 50 keV (sVMI50keV) from SECT images using a transformer-based deep learning model, SwinUNETR. Data were obtained from 85 patients who underwent head and neck radiotherapy. The model was built using data from the 70 patients for whom only DECT images were available; the remaining 15 patients, for whom both DECT and SECT images were available, were used to generate predictions from the actual SECT images. We used the SwinUNETR model to generate sVMI50keV. Image quality was evaluated, and the results were compared with those of a convolutional neural network-based model, Unet. The mean absolute errors from the true VMI50keV were 36.5 ± 4.9 Hounsfield units for Unet and 33.0 ± 4.4 Hounsfield units for SwinUNETR; SwinUNETR yielded smaller errors in tissue attenuation values than Unet. The contrast changes in sVMI50keV generated by SwinUNETR from SECT were closer to those of DECT-derived VMI50keV than were the contrast changes in Unet-generated sVMI50keV. This study demonstrated the potential of transformer-based models for generating synthetic low-energy VMIs from SECT images, thereby improving the image quality of head and neck cancer imaging.
It provides a practical and feasible solution to obtain low-energy VMIs from SECT data that can benefit a large number of facilities and patients without access to DECT technology.
Title: Synthetic Low-Energy Monochromatic Image Generation in Single-Energy Computed Tomography System Using a Transformer-Based Deep Learning Model
Authors: Yuhei Koike, Shingo Ohira, Sayaka Kihara, Yusuke Anetai, Hideki Takegawa, Satoaki Nakamura, Masayoshi Miyazaki, Koji Konishi, Noboru Tanigawa
Journal of Digital Imaging, 2024-04-18, DOI: 10.1007/s10278-024-01111-z
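The mean-absolute-error figures (in Hounsfield units) are straightforward to compute between a synthetic and a true VMI volume. A sketch follows; the optional body mask is an assumption for illustration, since the paper's masking procedure is not described in the abstract:

```python
import numpy as np

def mae_hu(pred, truth, mask=None):
    """Mean absolute error in Hounsfield units between two CT volumes,
    optionally restricted to a boolean mask (e.g. the patient body)."""
    diff = np.abs(pred.astype(float) - truth.astype(float))
    return float(diff[mask].mean()) if mask is not None else float(diff.mean())
```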
Pub Date: 2024-04-16 | DOI: 10.1007/s10278-024-01106-w
Uzma Saghir, Shailendra Kumar Singh, Moin Hasan
Skin cancer affects people of all ages and is a common disease. The death toll from skin cancer rises with late diagnosis, so an automated mechanism for early-stage skin cancer detection is needed to reduce the mortality rate. Visual examination with scanning or imaging screening is a common way of detecting this disease, but because skin cancer resembles other skin conditions, visual inspection alone is the least accurate approach. This article introduces a segmentation mechanism that operates on the ISIC dataset to divide skin images into critical and non-critical sections; the main objective of the research is to segment lesions from dermoscopic skin images. The suggested framework has two steps. The first step pre-processes the image: a bottom-hat filter removes hair, and the image is enhanced by applying the discrete cosine transform (DCT) and color-coefficient adjustment. In the second phase, a background subtraction method with midpoint analysis is applied for segmentation to extract the region of interest, achieving an accuracy of 95.30%.
The ground truth for the validation of segmentation is accomplished by comparing the segmented images with the validation data provided with the ISIC dataset.
Title: Skin Cancer Image Segmentation Based on Midpoint Analysis Approach (Journal of Digital Imaging)
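The bottom-hat hair-removal step is a standard morphological operation: grayscale closing minus the original image, which responds to thin dark structures such as hairs. This scipy-based sketch assumes a flat square structuring element, which may differ from the authors' choice:

```python
import numpy as np
from scipy import ndimage

def bottom_hat(img, size=9):
    """Morphological bottom-hat transform: grey closing minus the original.
    Thin dark structures (e.g. hairs on bright skin) light up; flat regions stay ~0."""
    closed = ndimage.grey_closing(img, size=(size, size))
    return closed - img
```

In a hair-removal pipeline the bottom-hat response is typically thresholded into a hair mask and the masked pixels are inpainted before segmentation.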
Pub Date: 2024-04-16 | DOI: 10.1007/s10278-024-01085-y
Ting-Wei Ou, Tzu-Chieh Weng, Ruey-Feng Chang
Architectural distortion (AD) is one of the most common findings on mammograms, and it may represent not only cancer but also a lesion, such as a radial scar, that may have an associated cancer. AD accounts for 18–45% of missed cancers, and the positive predictive value of AD is approximately 74.5%. Early detection of AD leads to early diagnosis and treatment of the cancer and improves the overall prognosis. However, detection of AD is a challenging task. In this work, we propose a new approach for detecting architectural distortion in mammography images by combining preprocessing methods with a novel structure fusion attention model. The proposed structure-focused weighted orientation preprocessing method is composed of the original image, the architecture enhancement map, and the weighted orientation map, highlighting suspicious AD locations. The proposed structure fusion attention model captures information from different channels and outperforms other models in terms of false positives and top sensitivity (the maximum sensitivity a model can achieve when the largest number of false positives is accepted), reaching a top sensitivity of 0.92 with only 0.6590 false positives per image. The findings suggest that combining preprocessing methods with a novel network architecture can lead to more accurate and reliable AD detection. Overall, the proposed approach offers a novel perspective on detecting ADs, and we believe that our method can be applied in clinical settings in the future, assisting radiologists in the early detection of ADs from mammography and ultimately leading to early treatment of breast cancer patients.
{"title":"A Novel Structure Fusion Attention Model to Detect Architectural Distortion on Mammography","authors":"Ting-Wei Ou, Tzu-Chieh Weng, Ruey-Feng Chang","doi":"10.1007/s10278-024-01085-y","DOIUrl":"https://doi.org/10.1007/s10278-024-01085-y","url":null,"abstract":"<p>Architectural distortion (AD) is one of the most common findings on mammograms, and it may represent not only cancer but also a lesion such as a radial scar that may have an associated cancer. AD accounts for 18–45% missed cancer, and the positive predictive value of AD is approximately 74.5%. Early detection of AD leads to early diagnosis and treatment of the cancer and improves the overall prognosis. However, detection of AD is a challenging task. In this work, we propose a new approach for detecting architectural distortion in mammography images by combining preprocessing methods and a novel structure fusion attention model. The proposed structure-focused weighted orientation preprocessing method is composed of the original image, the architecture enhancement map, and the weighted orientation map, highlighting suspicious AD locations. The proposed structure fusion attention model captures the information from different channels and outperforms other models in terms of false positives and top sensitivity, which refers to the maximum sensitivity that a model can achieve under the acceptance of the highest number of false positives, reaching 0.92 top sensitivity with only 0.6590 false positive per image. The findings suggest that the combination of preprocessing methods and a novel network architecture can lead to more accurate and reliable AD detection. 
Overall, the proposed approach offers a novel perspective on detecting ADs, and we believe that our method can be applied to clinical settings in the future, assisting radiologists in the early detection of ADs from mammography, ultimately leading to early treatment of breast cancer patients.</p>","PeriodicalId":50214,"journal":{"name":"Journal of Digital Imaging","volume":"306 1","pages":""},"PeriodicalIF":4.4,"publicationDate":"2024-04-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140616243","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Is the radiomic approach, utilizing diffusion-weighted imaging (DWI), capable of predicting the various pathological grades of intrahepatic mass-forming cholangiocarcinoma (IMCC)? Furthermore, which model demonstrates superior performance among the diverse algorithms currently available? The objective of our study is to develop DWI radiomic models based on different machine learning algorithms and identify the optimal prediction model. We undertook a retrospective analysis of the DWI data of 77 patients with IMCC confirmed by pathological testing. Fifty-seven patients initially included in the study were randomly assigned to either the training set or the validation set in a ratio of 7:3. We established four different classifier models, namely random forest (RF), support vector machines (SVM), logistic regression (LR), and gradient boosting decision tree (GBDT), by manually contouring the region of interest and extracting prominent radiomic features. An external validation of the model was performed with the DWI data of 20 patients with IMCC who were subsequently included in the study. The area under the receiver operating characteristic (ROC) curve (AUC), accuracy (ACC), precision (PRE), sensitivity (REC), and F1 score were used to evaluate the diagnostic performance of the model. Following the process of feature selection, a total of nine features were retained, with skewness being the most crucial radiomic feature demonstrating the highest diagnostic performance, followed by Gray Level Co-occurrence Matrix Imc1 (glcm-Imc1) and kurtosis, whose diagnostic performances were slightly inferior to skewness. Skewness and kurtosis showed a negative correlation with the pathological grading of IMCC, while glcm-Imc1 exhibited a positive correlation with the IMCC pathological grade.
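Skewness and kurtosis, the first-order features highlighted above, can be computed directly from the ROI intensities. A minimal NumPy sketch (the study used a radiomics pipeline; the exact conventions, e.g. Fisher vs. Pearson kurtosis, are an assumption here):

```python
import numpy as np

def first_order_features(roi):
    """Skewness and excess (Fisher) kurtosis of ROI intensities,
    two of the first-order radiomic features retained by the study."""
    x = np.asarray(roi, dtype=float).ravel()
    mu, sd = x.mean(), x.std()          # population standard deviation
    skew = ((x - mu) ** 3).mean() / sd ** 3
    kurt = ((x - mu) ** 4).mean() / sd ** 4 - 3.0  # 0 for a normal distribution
    return skew, kurt
```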
Compared with the other three models, the SVM radiomic model had the best diagnostic performance with an AUC of 0.957, an accuracy of 88.2%, a sensitivity of 85.7%, a precision of 85.7%, and an F1 score of 85.7% in the training set, as well as an AUC of 0.829, an accuracy of 76.5%, a sensitivity of 71.4%, a precision of 71.4%, and an F1 score of 71.4% in the external validation set. The DWI-based radiomic model proved to be efficacious in predicting the pathological grade of IMCC. The model with the SVM classifier algorithm had the best prediction efficiency and robustness. Consequently, this SVM-based model can be further explored as an option for a non-invasive preoperative prediction method in clinical practice.
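The evaluation metrics reported above (AUC, ACC, PRE, REC, F1) can be reproduced from predicted scores in a few lines of NumPy. A sketch assuming binary labels and no tied scores for the rank-based AUC:

```python
import numpy as np

def evaluate(y_true, y_score, threshold=0.5):
    """AUC, ACC, PRE, REC and F1 for a binary classifier."""
    y_true = np.asarray(y_true)
    y_score = np.asarray(y_score, dtype=float)
    y_pred = (y_score >= threshold).astype(int)
    tp = np.sum((y_pred == 1) & (y_true == 1))
    fp = np.sum((y_pred == 1) & (y_true == 0))
    fn = np.sum((y_pred == 0) & (y_true == 1))
    tn = np.sum((y_pred == 0) & (y_true == 0))
    acc = (tp + tn) / len(y_true)
    pre = tp / (tp + fp) if tp + fp else 0.0
    rec = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * pre * rec / (pre + rec) if pre + rec else 0.0
    # AUC via the rank-sum (Mann-Whitney U) formulation
    ranks = np.empty(len(y_score))
    ranks[np.argsort(y_score)] = np.arange(1, len(y_score) + 1)
    n_pos = y_true.sum()
    n_neg = len(y_true) - n_pos
    auc = (ranks[y_true == 1].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
    return {"AUC": auc, "ACC": acc, "PRE": pre, "REC": rec, "F1": f1}
```

In practice a library such as scikit-learn handles ties and multi-class cases; this sketch only shows what the five reported numbers measure.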
{"title":"Comparison of Machine Learning Models Using Diffusion-Weighted Images for Pathological Grade of Intrahepatic Mass-Forming Cholangiocarcinoma","authors":"Li-Hong Xing, Shu-Ping Wang, Li-Yong Zhuo, Yu Zhang, Jia-Ning Wang, Ze-Peng Ma, Ying-Jia Zhao, Shuang-Rui Yuan, Qian-He Zu, Xiao-Ping Yin","doi":"10.1007/s10278-024-01103-z","DOIUrl":"https://doi.org/10.1007/s10278-024-01103-z","url":null,"abstract":"<p>Is the radiomic approach, utilizing diffusion-weighted imaging (DWI), capable of predicting the various pathological grades of intrahepatic mass-forming cholangiocarcinoma (IMCC)? Furthermore, which model demonstrates superior performance among the diverse algorithms currently available? The objective of our study is to develop DWI radiomic models based on different machine learning algorithms and identify the optimal prediction model. We undertook a retrospective analysis of the DWI data of 77 patients with IMCC confirmed by pathological testing. Fifty-seven patients initially included in the study were randomly assigned to either the training set or the validation set in a ratio of 7:3. We established four different classifier models, namely random forest (RF), support vector machines (SVM), logistic regression (LR), and gradient boosting decision tree (GBDT), by manually contouring the region of interest and extracting prominent radiomic features. An external validation of the model was performed with the DWI data of 20 patients with IMCC who were subsequently included in the study. The area under the receiver operating curve (AUC), accuracy (ACC), precision (PRE), sensitivity (REC), and F1 score were used to evaluate the diagnostic performance of the model. 
Following the process of feature selection, a total of nine features were retained, with skewness being the most crucial radiomic feature demonstrating the highest diagnostic performance, followed by Gray Level Co-occurrence Matrix lmc1 (glcm-lmc1) and kurtosis, whose diagnostic performances were slightly inferior to skewness. Skewness and kurtosis showed a negative correlation with the pathological grading of IMCC, while glcm-lmc1 exhibited a positive correlation with the IMCC pathological grade. Compared with the other three models, the SVM radiomic model had the best diagnostic performance with an AUC of 0.957, an accuracy of 88.2%, a sensitivity of 85.7%, a precision of 85.7%, and an F1 score of 85.7% in the training set, as well as an AUC of 0.829, an accuracy of 76.5%, a sensitivity of 71.4%, a precision of 71.4%, and an F1 score of 71.4% in the external validation set. The DWI-based radiomic model proved to be efficacious in predicting the pathological grade of IMCC. The model with the SVM classifier algorithm had the best prediction efficiency and robustness. Consequently, this SVM-based model can be further explored as an option for a non-invasive preoperative prediction method in clinical practice.</p>","PeriodicalId":50214,"journal":{"name":"Journal of Digital Imaging","volume":"43 1","pages":""},"PeriodicalIF":4.4,"publicationDate":"2024-04-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140616083","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-04-15  DOI: 10.1007/s10278-024-01082-1
Qingbo Ji, Tingshuo Yin, Pengfei Zhang, Qingquan Liu, Changbo Hou
Morphological analysis of urine red blood cells, sometimes described as an "extracorporeal renal biopsy," is of considerable importance in clinical laboratory testing. However, the accuracy of existing urine red blood cell morphology analyzers is suboptimal, and they are not widely utilized in medical examinations. Challenges include low image spatial resolution, blurred distinguishing features between cells, difficulty in fine-grained feature extraction, and insufficient data volume. This article aims to improve the classification accuracy of low-resolution urine red blood cells. This paper proposes a super-resolution method based on a category-aware loss and an RBC-MIX data augmentation approach. It optimizes the cross-entropy loss to maximize the classification boundary and improve intra-class tightness and inter-class difference, achieving fine-grained classification of low-resolution urine red blood cells. Experimental outcomes demonstrate that with this method, an accuracy of 97.8% can be achieved for low-resolution urine red blood cell images. This algorithm attains outstanding classification performance for low-resolution urine red blood cells with only category labels required. This method can serve as a practical reference for urine red blood cell morphology examinations.
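The idea of optimizing cross-entropy to maximize the classification boundary can be sketched with an additive-margin cross-entropy, in which the true-class logit is penalized by a margin before the softmax. This is a common boundary-widening device and only a sketch; the paper's exact category-aware loss is not specified in the abstract:

```python
import numpy as np

def margin_cross_entropy(logits, labels, margin=0.5):
    """Mean cross-entropy with an additive margin subtracted from the
    true-class logit, forcing a wider decision boundary between classes."""
    z = np.asarray(logits, dtype=float).copy()
    rows = np.arange(len(labels))
    z[rows, labels] -= margin               # penalise the correct class
    z -= z.max(axis=1, keepdims=True)       # numerical stability
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -log_probs[rows, labels].mean()
```

With margin=0 this reduces to ordinary cross-entropy; a positive margin produces a larger loss even on correctly classified samples, pushing the true-class logit further from the others and thereby tightening intra-class and separating inter-class distributions.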