Exploring the Low-Dose Limit for Focal Hepatic Lesion Detection with a Deep Learning-Based CT Reconstruction Algorithm: A Simulation Study on Patient Images
Journal of Digital Imaging | Pub Date: 2024-03-19 | DOI: 10.1007/s10278-024-01080-3
Yongchun You, Sihua Zhong, Guozhi Zhang, Yuting Wen, Dian Guo, Wanjiang Li, Zhenlin Li
This study aims to investigate the maximum achievable dose reduction for applying a new deep learning-based reconstruction algorithm, namely the artificial intelligence iterative reconstruction (AIIR), in computed tomography (CT) for hepatic lesion detection. A total of 40 patients with 98 clinically confirmed hepatic lesions were retrospectively included. The mean volume CT dose index was 13.66 ± 1.73 mGy in routine-dose portal venous CT examinations, where the images were originally obtained with hybrid iterative reconstruction (HIR). Low-dose simulations were performed in the projection domain for 40%-, 20%-, and 10%-dose levels, followed by reconstruction using both HIR and AIIR. Two radiologists were asked to detect hepatic lesions on each set of low-dose images in separate sessions. Qualitative metrics including lesion conspicuity, diagnostic confidence, and overall image quality were evaluated using a 5-point scale. The contrast-to-noise ratio (CNR) for lesions was also calculated for quantitative assessment. The lesion CNR on AIIR at reduced doses was significantly higher than that on routine-dose HIR (all p < 0.05). Lower qualitative image quality was observed as the radiation dose was reduced, while there were no significant differences between 40%-dose AIIR and routine-dose HIR images. The lesion detection rate was 100%, 98% (96/98), and 73.5% (72/98) on 40%-, 20%-, and 10%-dose AIIR, respectively, whereas it was 98% (96/98), 73.5% (72/98), and 40% (39/98) on the corresponding low-dose HIR. AIIR outperformed HIR in simulated low-dose CT examinations of the liver. The use of AIIR allows up to 60% dose reduction for lesion detection while maintaining image quality comparable to routine-dose HIR.
{"title":"Exploring the Low-Dose Limit for Focal Hepatic Lesion Detection with a Deep Learning-Based CT Reconstruction Algorithm: A Simulation Study on Patient Images","authors":"Yongchun You, Sihua Zhong, Guozhi Zhang, Yuting Wen, Dian Guo, Wanjiang Li, Zhenlin Li","doi":"10.1007/s10278-024-01080-3","DOIUrl":"https://doi.org/10.1007/s10278-024-01080-3","url":null,"abstract":"<p>This study aims to investigate the maximum achievable dose reduction for applying a new deep learning-based reconstruction algorithm, namely the artificial intelligence iterative reconstruction (AIIR), in computed tomography (CT) for hepatic lesion detection. A total of 40 patients with 98 clinically confirmed hepatic lesions were retrospectively included. The mean volume CT dose index was 13.66 ± 1.73 mGy in routine-dose portal venous CT examinations, where the images were originally obtained with hybrid iterative reconstruction (HIR). Low-dose simulations were performed in projection domain for 40%-, 20%-, and 10%-dose levels, followed by reconstruction using both HIR and AIIR. Two radiologists were asked to detect hepatic lesion on each set of low-dose image in separate sessions. Qualitative metrics including lesion conspicuity, diagnostic confidence, and overall image quality were evaluated using a 5-point scale. The contrast-to-noise ratio (CNR) for lesion was also calculated for quantitative assessment. The lesion CNR on AIIR at reduced doses were significantly higher than that on routine-dose HIR (all <i>p</i> < 0.05). Lower qualitative image quality was observed as the radiation dose reduced, while there were no significant differences between 40%-dose AIIR and routine-dose HIR images. The lesion detection rate was 100%, 98% (96/98), and 73.5% (72/98) on 40%-, 20%-, and 10%-dose AIIR, respectively, whereas it was 98% (96/98), 73.5% (72/98), and 40% (39/98) on the corresponding low-dose HIR, respectively. AIIR outperformed HIR in simulated low-dose CT examinations of the liver. The use of AIIR allows up to 60% dose reduction for lesion detection while maintaining comparable image quality to routine-dose HIR.</p>","PeriodicalId":50214,"journal":{"name":"Journal of Digital Imaging","volume":"46 1","pages":""},"PeriodicalIF":4.4,"publicationDate":"2024-03-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140168349","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Auto-segmentation of Adult-Type Diffuse Gliomas: Comparison of Transfer Learning-Based Convolutional Neural Network Model vs. Radiologists
Journal of Digital Imaging | Pub Date: 2024-02-21 | DOI: 10.1007/s10278-024-01044-7
Abstract
Segmentation of glioma is crucial for quantitative brain tumor assessment, to guide therapeutic research and clinical management, but it is very time-consuming. Fully automated tools for the segmentation of multi-sequence MRI are needed. We developed and pretrained a deep learning (DL) model using publicly available datasets A (n = 210) and B (n = 369) containing FLAIR, T2WI, and contrast-enhanced (CE)-T1WI. This was then fine-tuned with our institutional dataset (n = 197) containing ADC, T2WI, and CE-T1WI, manually annotated by radiologists, and split into training (n = 100) and testing (n = 97) sets. The Dice similarity coefficient (DSC) was used to compare model outputs and manual labels. A third independent radiologist assessed segmentation quality using a semi-quantitative 5-point score. Differences in DSC between newly diagnosed and recurrent gliomas, and between unifocal and multifocal gliomas, were analyzed using the Mann–Whitney test. Semi-quantitative analyses were compared using the chi-square test. We found good agreement between segmentations from the fine-tuned DL model and ground-truth manual segmentations (median DSC: 0.729, std-dev: 0.134). DSC was higher for newly diagnosed (0.807) than recurrent (0.698) cases (p < 0.001), and higher for unifocal (0.747) than multifocal (0.613) cases (p = 0.001). Semi-quantitative scores of DL and manual segmentation were not significantly different (mean: 3.567 vs. 3.639; 93.8% vs. 97.9% scoring ≥ 3, p = 0.107). In conclusion, the proposed transfer learning DL model performed similarly to human radiologists in glioma segmentation on both structural and ADC sequences. Further improvement in segmenting challenging postoperative and multifocal glioma cases is needed.
{"title":"Auto-segmentation of Adult-Type Diffuse Gliomas: Comparison of Transfer Learning-Based Convolutional Neural Network Model vs. Radiologists","authors":"","doi":"10.1007/s10278-024-01044-7","DOIUrl":"https://doi.org/10.1007/s10278-024-01044-7","url":null,"abstract":"<h3>Abstract</h3> <p>Segmentation of glioma is crucial for quantitative brain tumor assessment, to guide therapeutic research and clinical management, but very time-consuming. Fully automated tools for the segmentation of multi-sequence MRI are needed. We developed and pretrained a deep learning (DL) model using publicly available datasets A (<em>n</em> = 210) and B (<em>n</em> = 369) containing FLAIR, T2WI, and contrast-enhanced (CE)-T1WI. This was then fine-tuned with our institutional dataset (<em>n</em> = 197) containing ADC, T2WI, and CE-T1WI, manually annotated by radiologists, and split into training (<em>n</em> = 100) and testing (<em>n</em> = 97) sets. The Dice similarity coefficient (DSC) was used to compare model outputs and manual labels. A third independent radiologist assessed segmentation quality on a semi-quantitative 5-scale score. Differences in DSC between new and recurrent gliomas, and between uni or multifocal gliomas were analyzed using the Mann–Whitney test. Semi-quantitative analyses were compared using the chi-square test. We found that there was good agreement between segmentations from the fine-tuned DL model and ground truth manual segmentations (median DSC: 0.729, std-dev: 0.134). DSC was higher for newly diagnosed (0.807) than recurrent (0.698) (<em>p</em> < 0.001), and higher for unifocal (0.747) than multi-focal (0.613) cases (<em>p</em> = 0.001). Semi-quantitative scores of DL and manual segmentation were not significantly different (mean: 3.567 vs. 3.639; 93.8% vs. 97.9% scoring ≥ 3, <em>p</em> = 0.107). In conclusion, the proposed transfer learning DL performed similarly to human radiologists in glioma segmentation on both structural and ADC sequences. Further improvement in segmenting challenging postoperative and multifocal glioma cases is needed.</p>","PeriodicalId":50214,"journal":{"name":"Journal of Digital Imaging","volume":"72 1","pages":""},"PeriodicalIF":4.4,"publicationDate":"2024-02-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139918921","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Developing a Radiomics Atlas Dataset of normal Abdominal and Pelvic computed Tomography (RADAPT)
Journal of Digital Imaging | Pub Date: 2024-02-21 | DOI: 10.1007/s10278-024-01028-7
Abstract
Atlases of normal genomics, transcriptomics, proteomics, and metabolomics have been published in an attempt to understand the biological phenotype in health and disease and to set the basis of comprehensive comparative omics studies. No such atlas exists for radiomics data. The purpose of this study was to systematically create a dataset of normal abdominal and pelvic radiomics that can be used for model development and validation. Young adults without any previously known disease, aged > 17 and ≤ 36 years old, were retrospectively included. All patients had undergone CT scanning for emergency indications. Where abnormal findings were identified, the relevant anatomical structures were excluded. Deep learning was used to automatically segment the majority of visible anatomical structures with the TotalSegmentator model as applied in 3D Slicer. Radiomics features including first-order, texture, wavelet, and Laplacian-of-Gaussian transformed features were extracted with PyRadiomics. A GitHub repository was created to host the resulting dataset. Radiomics data were extracted from a total of 531 patients with a mean age of 26.8 ± 5.19 years, including 250 female and 281 male patients. A maximum of 53 anatomical structures were segmented and used for subsequent radiomics data extraction. Radiomics features were derived from a total of 526 non-contrast and 400 contrast-enhanced (portal venous) series. The dataset is publicly available for model development and validation purposes.
{"title":"Developing a Radiomics Atlas Dataset of normal Abdominal and Pelvic computed Tomography (RADAPT)","authors":"","doi":"10.1007/s10278-024-01028-7","DOIUrl":"https://doi.org/10.1007/s10278-024-01028-7","url":null,"abstract":"<h3>Abstract</h3> <p>Atlases of normal genomics, transcriptomics, proteomics, and metabolomics have been published in an attempt to understand the biological phenotype in health and disease and to set the basis of comprehensive comparative omics studies. No such atlas exists for radiomics data. The purpose of this study was to systematically create a radiomics dataset of normal abdominal and pelvic radiomics that can be used for model development and validation. Young adults without any previously known disease, aged > 17 and ≤ 36 years old, were retrospectively included. All patients had undergone CT scanning for emergency indications. In case abnormal findings were identified, the relevant anatomical structures were excluded. Deep learning was used to automatically segment the majority of visible anatomical structures with the TotalSegmentator model as applied in 3DSlicer. Radiomics features including first order, texture, wavelet, and Laplacian of Gaussian transformed features were extracted with PyRadiomics. A Github repository was created to host the resulting dataset. Radiomics data were extracted from a total of 531 patients with a mean age of 26.8 ± 5.19 years, including 250 female and 281 male patients. A maximum of 53 anatomical structures were segmented and used for subsequent radiomics data extraction. Radiomics features were derived from a total of 526 non-contrast and 400 contrast-enhanced (portal venous) series. The dataset is publicly available for model development and validation purposes.</p>","PeriodicalId":50214,"journal":{"name":"Journal of Digital Imaging","volume":"2 1","pages":""},"PeriodicalIF":4.4,"publicationDate":"2024-02-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139919002","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Automatic Tracking of Hyoid Bone Displacement and Rotation Relative to Cervical Vertebrae in Videofluoroscopic Swallow Studies Using Deep Learning
Journal of Digital Imaging | Pub Date: 2024-02-21 | DOI: 10.1007/s10278-024-01039-4
Wuqi Li, Shitong Mao, Amanda S. Mahoney, James L. Coyle, Ervin Sejdić
Hyoid bone displacement and rotation are critical kinematic events of the swallowing process in the assessment of videofluoroscopic swallow studies (VFSS). However, quantitative analysis of such events requires frame-by-frame manual annotation, which is labor-intensive and time-consuming. Our work aims to develop a method for automatically tracking hyoid bone displacement and rotation in VFSS. We proposed a full high-resolution network, a deep learning architecture, to detect the anterior and posterior points of the hyoid bone to identify its location and rotation. Meanwhile, the anterior-inferior corners of the C2 and C4 vertebrae were detected simultaneously to automatically establish a new coordinate system and eliminate the effect of posture change. The proposed model was developed using 59,468 VFSS frames collected from 1488 swallowing samples, and it achieved an average landmark localization error of 2.38 pixels (around 0.5% of the 448 × 448-pixel image) and an average angle prediction error of 0.065 radians in predicting C2–C4 and hyoid bone angles. In addition, the displacement of the hyoid bone center was automatically tracked on a frame-by-frame basis, achieving mean absolute errors of 2.22 pixels and 2.78 pixels in the x-axis and y-axis, respectively. The results of this study support the effectiveness and accuracy of the proposed method in detecting hyoid bone displacement and rotation. Our study provides an automatic method for analyzing hyoid bone kinematics during VFSS, which could contribute to early diagnosis and effective disease management.
{"title":"Automatic Tracking of Hyoid Bone Displacement and Rotation Relative to Cervical Vertebrae in Videofluoroscopic Swallow Studies Using Deep Learning","authors":"Wuqi Li, Shitong Mao, Amanda S. Mahoney, James L. Coyle, Ervin Sejdić","doi":"10.1007/s10278-024-01039-4","DOIUrl":"https://doi.org/10.1007/s10278-024-01039-4","url":null,"abstract":"<p>The hyoid bone displacement and rotation are critical kinematic events of the swallowing process in the assessment of videofluoroscopic swallow studies (VFSS). However, the quantitative analysis of such events requires frame-by-frame manual annotation, which is labor-intensive and time-consuming. Our work aims to develop a method of automatically tracking hyoid bone displacement and rotation in VFSS. We proposed a full high-resolution network, a deep learning architecture, to detect the anterior and posterior of the hyoid bone to identify its location and rotation. Meanwhile, the anterior-inferior corners of the C2 and C4 vertebrae were detected simultaneously to automatically establish a new coordinate system and eliminate the effect of posture change. The proposed model was developed by 59,468 VFSS frames collected from 1488 swallowing samples, and it achieved an average landmark localization error of 2.38 pixels (around 0.5% of the image with 448 × 448 pixels) and an average angle prediction error of 0.065 radians in predicting C2–C4 and hyoid bone angles. In addition, the displacement of the hyoid bone center was automatically tracked on a frame-by-frame analysis, achieving an average mean absolute error of 2.22 pixels and 2.78 pixels in the <i>x</i>-axis and <i>y</i>-axis, respectively. The results of this study support the effectiveness and accuracy of the proposed method in detecting hyoid bone displacement and rotation. Our study provided an automatic method of analyzing hyoid bone kinematics during VFSS, which could contribute to early diagnosis and effective disease management.</p>","PeriodicalId":50214,"journal":{"name":"Journal of Digital Imaging","volume":"5 1","pages":""},"PeriodicalIF":4.4,"publicationDate":"2024-02-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139918916","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Personalized Impression Generation for PET Reports Using Large Language Models
Journal of Digital Imaging | Pub Date: 2024-02-02 | DOI: 10.1007/s10278-024-00985-3
Xin Tie, Muheon Shin, Ali Pirasteh, Nevein Ibrahim, Zachary Huemann, Sharon M. Castellino, Kara M. Kelly, John Garrett, Junjie Hu, Steve Y. Cho, Tyler J. Bradshaw
Large language models (LLMs) have shown promise in accelerating radiology reporting by summarizing clinical findings into impressions. However, automatic impression generation for whole-body PET reports presents unique challenges and has received little attention. Our study aimed to evaluate whether LLMs can create clinically useful impressions for PET reporting. To this end, we fine-tuned twelve open-source language models on a corpus of 37,370 retrospective PET reports collected from our institution. All models were trained using the teacher-forcing algorithm, with the report findings and patient information as input and the original clinical impressions as reference. An extra input token encoded the reading physician’s identity, allowing models to learn physician-specific reporting styles. To compare the performances of different models, we computed various automatic evaluation metrics and benchmarked them against physician preferences, ultimately selecting PEGASUS as the top LLM. To evaluate its clinical utility, three nuclear medicine physicians assessed the PEGASUS-generated impressions and original clinical impressions across 6 quality dimensions (3-point scales) and an overall utility score (5-point scale). Each physician reviewed 12 of their own reports and 12 reports from other physicians. When physicians assessed LLM impressions generated in their own style, 89% were considered clinically acceptable, with a mean utility score of 4.08/5. On average, physicians rated these personalized impressions as comparable in overall utility to the impressions dictated by other physicians (4.03, P = 0.41). In summary, our study demonstrated that personalized impressions generated by PEGASUS were clinically useful in most cases, highlighting its potential to expedite PET reporting by automatically drafting impressions.
{"title":"Personalized Impression Generation for PET Reports Using Large Language Models","authors":"Xin Tie, Muheon Shin, Ali Pirasteh, Nevein Ibrahim, Zachary Huemann, Sharon M. Castellino, Kara M. Kelly, John Garrett, Junjie Hu, Steve Y. Cho, Tyler J. Bradshaw","doi":"10.1007/s10278-024-00985-3","DOIUrl":"https://doi.org/10.1007/s10278-024-00985-3","url":null,"abstract":"<p>Large language models (LLMs) have shown promise in accelerating radiology reporting by summarizing clinical findings into impressions. However, automatic impression generation for whole-body PET reports presents unique challenges and has received little attention. Our study aimed to evaluate whether LLMs can create clinically useful impressions for PET reporting. To this end, we fine-tuned twelve open-source language models on a corpus of 37,370 retrospective PET reports collected from our institution. All models were trained using the teacher-forcing algorithm, with the report findings and patient information as input and the original clinical impressions as reference. An extra input token encoded the reading physician’s identity, allowing models to learn physician-specific reporting styles. To compare the performances of different models, we computed various automatic evaluation metrics and benchmarked them against physician preferences, ultimately selecting PEGASUS as the top LLM. To evaluate its clinical utility, three nuclear medicine physicians assessed the PEGASUS-generated impressions and original clinical impressions across 6 quality dimensions (3-point scales) and an overall utility score (5-point scale). Each physician reviewed 12 of their own reports and 12 reports from other physicians. When physicians assessed LLM impressions generated in their own style, 89% were considered clinically acceptable, with a mean utility score of 4.08/5. On average, physicians rated these personalized impressions as comparable in overall utility to the impressions dictated by other physicians (4.03, P = 0.41). In summary, our study demonstrated that personalized impressions generated by PEGASUS were clinically useful in most cases, highlighting its potential to expedite PET reporting by automatically drafting impressions.</p>","PeriodicalId":50214,"journal":{"name":"Journal of Digital Imaging","volume":"245 1","pages":""},"PeriodicalIF":4.4,"publicationDate":"2024-02-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139669520","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Multi-Class Deep Learning Model for Detecting Pediatric Distal Forearm Fractures Based on the AO/OTA Classification
Journal of Digital Imaging | Pub Date: 2024-02-02 | DOI: 10.1007/s10278-024-00968-4
Le Nguyen Binh, Nguyen Thanh Nhu, Vu Pham Thao Vy, Do Le Hoang Son, Truong Nguyen Khanh Hung, Nguyen Bach, Hoang Quoc Huy, Le Van Tuan, Nguyen Quoc Khanh Le, Jiunn-Horng Kang
Common pediatric distal forearm fractures necessitate precise detection. To support prompt treatment planning by clinicians, our study aimed to create a multi-class convolutional neural network (CNN) model for pediatric distal forearm fractures, guided by the AO Foundation/Orthopaedic Trauma Association (AO/OTA) classification system for pediatric fractures. The GRAZPEDWRI-DX dataset (2008–2018) of wrist X-ray images was used. We labeled images into four fracture classes (FRM, FUM, FRE, and FUE, with F, fracture; R, radius; U, ulna; M, metaphysis; and E, epiphysis) based on the pediatric AO/OTA classification. We performed multi-class classification by training a YOLOv4-based CNN object detection model with 7006 images from 1809 patients (80% for training and 20% for validation). An 88-image test set from 34 patients was used to evaluate the model performance, which was then compared to the diagnostic performance of two readers, an orthopedist and a radiologist. The model's overall mean average precision on the validation set for the four classes was 0.97, 0.92, 0.95, and 0.94, respectively. On the test set, the model's performance included sensitivities of 0.86, 0.71, 0.88, and 0.89; specificities of 0.88, 0.94, 0.97, and 0.98; and area under the curve (AUC) values of 0.87, 0.83, 0.93, and 0.94, respectively. The best performance among the three readers belonged to the radiologist, with a mean AUC of 0.922, followed by our model (0.892) and the orthopedist (0.830). Therefore, using the AO/OTA concept, our multi-class fracture detection model excelled in identifying pediatric distal forearm fractures.
{"title":"Multi-Class Deep Learning Model for Detecting Pediatric Distal Forearm Fractures Based on the AO/OTA Classification","authors":"Le Nguyen Binh, Nguyen Thanh Nhu, Vu Pham Thao Vy, Do Le Hoang Son, Truong Nguyen Khanh Hung, Nguyen Bach, Hoang Quoc Huy, Le Van Tuan, Nguyen Quoc Khanh Le, Jiunn-Horng Kang","doi":"10.1007/s10278-024-00968-4","DOIUrl":"https://doi.org/10.1007/s10278-024-00968-4","url":null,"abstract":"<p>Common pediatric distal forearm fractures necessitate precise detection. To support prompt treatment planning by clinicians, our study aimed to create a multi-class convolutional neural network (CNN) model for pediatric distal forearm fractures, guided by the AO Foundation/Orthopaedic Trauma Association (AO/ATO) classification system for pediatric fractures. The GRAZPEDWRI-DX dataset (2008–2018) of wrist X-ray images was used. We labeled images into four fracture classes (FRM, FUM, FRE, and FUE with F, fracture; R, radius; U, ulna; M, metaphysis; and E, epiphysis) based on the pediatric AO/ATO classification. We performed multi-class classification by training a YOLOv4-based CNN object detection model with 7006 images from 1809 patients (80% for training and 20% for validation). An 88-image test set from 34 patients was used to evaluate the model performance, which was then compared to the diagnosis performances of two readers—an orthopedist and a radiologist. The overall mean average precision levels on the validation set in four classes of the model were 0.97, 0.92, 0.95, and 0.94, respectively. On the test set, the model’s performance included sensitivities of 0.86, 0.71, 0.88, and 0.89; specificities of 0.88, 0.94, 0.97, and 0.98; and area under the curve (AUC) values of 0.87, 0.83, 0.93, and 0.94, respectively. The best performance among the three readers belonged to the radiologist, with a mean AUC of 0.922, followed by our model (0.892) and the orthopedist (0.830). Therefore, using the AO/OTA concept, our multi-class fracture detection model excelled in identifying pediatric distal forearm fractures.</p>","PeriodicalId":50214,"journal":{"name":"Journal of Digital Imaging","volume":"11 1","pages":""},"PeriodicalIF":4.4,"publicationDate":"2024-02-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139669516","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Evaluation of Mucosal Healing in Crohn’s Disease: Radiomics Models of Intestinal Wall and Mesenteric Fat Based on Dual-Energy CT
Journal of Digital Imaging | Pub Date: 2024-02-01 | DOI: 10.1007/s10278-024-00989-z
Abstract
This study aims to assess the effectiveness of radiomics signatures obtained from dual-energy computed tomography enterography (DECTE) in the evaluation of mucosal healing (MH) in patients diagnosed with Crohn’s disease (CD). In this study, 106 CD patients with a total of 221 diseased intestinal segments (79 with MH and 142 non-MH) from two medical centers were included and randomly divided into training and testing cohorts at a ratio of 7:3. Radiomics features were extracted from the enteric phase iodine maps and 40-keV and 70-keV virtual monoenergetic images (VMIs) of the diseased intestinal segments, as well as from mesenteric fat. Feature selection was performed using least absolute shrinkage and selection operator (LASSO) logistic regression. Radiomics models were subsequently established, and the accuracy of these models in identifying MH in CD was assessed by calculating the area under the receiver operating characteristic curve (AUC). The combined-iodine model, formulated by integrating the intestinal and mesenteric fat radiomics features of the iodine maps, exhibited the most favorable performance in evaluating MH, with AUCs of 0.989 (95% confidence interval (CI) 0.977–1.000) in the training cohort and 0.947 (95% CI 0.884–1.000) in the testing cohort. Patients categorized as high risk by the combined-iodine model displayed a greater probability of experiencing disease progression when contrasted with low-risk patients. The combined-iodine radiomics model, which is built upon iodine maps of diseased intestinal segments and mesenteric fat, has demonstrated promising performance in evaluating MH in CD patients.
{"title":"Evaluation of Mucosal Healing in Crohn’s Disease: Radiomics Models of Intestinal Wall and Mesenteric Fat Based on Dual-Energy CT","authors":"","doi":"10.1007/s10278-024-00989-z","DOIUrl":"https://doi.org/10.1007/s10278-024-00989-z","url":null,"abstract":"<h3>Abstract</h3> <p>This study aims to assess the effectiveness of radiomics signatures obtained from dual-energy computed tomography enterography (DECTE) in the evaluation of mucosal healing (MH) in patients diagnosed with Crohn’s disease (CD). In this study, 106 CD patients with a total of 221 diseased intestinal segments (79 with MH and 142 non-MH) from two medical centers were included and randomly divided into training and testing cohorts at a ratio of 7:3. Radiomics features were extracted from the enteric phase iodine maps and 40-kev and 70-kev virtual monoenergetic images (VMIs) of the diseased intestinal segments, as well as from mesenteric fat. Feature selection was performed using the least absolute shrinkage and selection operator (LASSO) logistic regression. Radiomics models were subsequently established, and the accuracy of these models in identifying MH in CD was assessed by calculating the area under the receiver operating characteristic curve (AUC). The combined-iodine model formulated by integrating the intestinal and mesenteric fat radiomics features of iodine maps exhibited the most favorable performance in evaluating MH, with AUCs of 0.989 (95% confidence interval (CI) 0.977–1.000) in the training cohort and 0.947 (95% CI 0.884–1.000) in the testing cohort. Patients categorized as high risk by the combined-iodine model displayed a greater probability of experiencing disease progression when contrasted with low-risk patients. The combined-iodine radiomics model, which is built upon iodine maps of diseased intestinal segments and mesenteric fat, has demonstrated promising performance in evaluating MH in CD patients.</p>","PeriodicalId":50214,"journal":{"name":"Journal of Digital Imaging","volume":"20 1","pages":""},"PeriodicalIF":4.4,"publicationDate":"2024-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139669648","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Impacts of Adaptive Statistical Iterative Reconstruction-V and Deep Learning Image Reconstruction Algorithms on Robustness of CT Radiomics Features: Opportunity for Minimizing Radiomics Variability Among Scans of Different Dose Levels
Journal of Digital Imaging | Pub Date: 2024-01-29 | DOI: 10.1007/s10278-023-00901-1
Jingyu Zhong, Zhiyuan Wu, Lingyun Wang, Yong Chen, Yihan Xia, Lan Wang, Jianying Li, Wei Lu, Xiaomeng Shi, Jianxing Feng, Haipeng Dong, Huan Zhang, Weiwu Yao
This study aims to investigate the influence of adaptive statistical iterative reconstruction-V (ASIR-V) and deep learning image reconstruction (DLIR) on CT radiomics feature robustness. A standardized phantom was scanned under single-energy CT (SECT) and dual-energy CT (DECT) modes at standard and low (20 and 10 mGy) dose levels. SECT 120-kVp images and corresponding DECT 120-kVp-like virtual monochromatic images were generated with filtered back-projection (FBP), ASIR-V at 40% (AV-40) and 100% (AV-100) blending levels, and the DLIR algorithm at low (DLIR-L), medium (DLIR-M), and high (DLIR-H) strength levels. Ninety-four features were extracted via PyRadiomics. Reproducibility of features was calculated between standard and low dose levels, between reconstruction algorithms in reference to FBP images, and within scan mode, using the intraclass correlation coefficient (ICC) and concordance correlation coefficient (CCC). The average percentage of features with ICC > 0.90 and CCC > 0.90 between the two dose levels was 21.28% and 20.75% in AV-40 images and 39.90% and 35.11% in AV-100 images, respectively, and increased from 15.43 to 45.22% and from 15.43 to 44.15% with increasing strength level of DLIR. The average percentage of features with ICC > 0.90 and CCC > 0.90 in reference to FBP images was 26.07% and 25.80% in AV-40 images and 18.88% and 18.62% in AV-100 images, respectively, and decreased from 27.93 to 17.82% and from 27.66 to 17.29% with increasing strength level of DLIR. The DLIR and ASIR-V algorithms showed low reproducibility in reference to FBP images, while the high-strength DLIR algorithm provides an opportunity for minimizing radiomics variability due to dose reduction.
{"title":"Impacts of Adaptive Statistical Iterative Reconstruction-V and Deep Learning Image Reconstruction Algorithms on Robustness of CT Radiomics Features: Opportunity for Minimizing Radiomics Variability Among Scans of Different Dose Levels","authors":"Jingyu Zhong, Zhiyuan Wu, Lingyun Wang, Yong Chen, Yihan Xia, Lan Wang, Jianying Li, Wei Lu, Xiaomeng Shi, Jianxing Feng, Haipeng Dong, Huan Zhang, Weiwu Yao","doi":"10.1007/s10278-023-00901-1","DOIUrl":"https://doi.org/10.1007/s10278-023-00901-1","url":null,"abstract":"<p>This study aims to investigate the influence of adaptive statistical iterative reconstruction-V (ASIR-V) and deep learning image reconstruction (DLIR) on CT radiomics feature robustness. A standardized phantom was scanned under single-energy CT (SECT) and dual-energy CT (DECT) modes at standard and low (20 and 10 mGy) dose levels. Images of SECT 120 kVp and corresponding DECT 120 kVp-like virtual monochromatic images were generated with filtered back-projection (FBP), ASIR-V at 40% (AV-40) and 100% (AV-100) blending levels, and DLIR algorithm at low (DLIR-L), medium (DLIR-M), and high (DLIR-H) strength levels. Ninety-four features were extracted via Pyradiomics. Reproducibility of features was calculated between standard and low dose levels, between reconstruction algorithms in reference to FBP images, and within scan mode, using intraclass correlation coefficient (ICC) and concordance correlation coefficient (CCC). The average percentage of features with ICC > 0.90 and CCC > 0.90 between the two dose levels was 21.28% and 20.75% in AV-40 images, and 39.90% and 35.11% in AV-100 images, respectively, and increased from 15.43 to 45.22% and from 15.43 to 44.15% with an increasing strength level of DLIR. The average percentage of features with ICC > 0.90 and CCC > 0.90 in reference to FBP images was 26.07% and 25.80% in AV-40 images, and 18.88% and 18.62% in AV-100 images, respectively, and decreased from 27.93 to 17.82% and from 27.66 to 17.29% with an increasing strength level of DLIR. DLIR and ASIR-V algorithms showed low reproducibility in reference to FBP images, while the high-strength DLIR algorithm provides an opportunity for minimizing radiomics variability due to dose reduction.</p>","PeriodicalId":50214,"journal":{"name":"Journal of Digital Imaging","volume":"9 1","pages":""},"PeriodicalIF":4.4,"publicationDate":"2024-01-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139586136","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
An Automatic Framework for Nasal Esthetic Assessment by ResNet Convolutional Neural Network
Journal of Digital Imaging | Pub Date: 2024-01-29 | DOI: 10.1007/s10278-024-00973-7
Abstract
Nasal base aesthetics is an interesting and challenging issue that has attracted the attention of researchers in recent years. With that insight, in this study we propose a novel automatic framework (AF) for evaluating the nasal base, which can be useful for improving symmetry in rhinoplasty and reconstruction. The introduced AF includes a hybrid model for nasal base landmark recognition and a combined model for predicting nasal base symmetry. The proposed state-of-the-art nasal base landmark detection model is trained on nasal base images for comprehensive qualitative and quantitative assessments. Then, deep convolutional neural network (CNN) and multi-layer perceptron neural network (MLP) models are integrated by concatenating their last hidden layers to evaluate nasal base symmetry based on geometry features and tiled images of the nasal base. This study explores the concept of data augmentation by applying methods motivated by commonly used image augmentation techniques. According to the experimental findings, the results of the AF are closely related to the otolaryngologists’ ratings and are useful for preoperative planning, intraoperative decision-making, and postoperative assessment. Furthermore, the visualization indicates that the proposed AF is capable of predicting nasal base symmetry and capturing asymmetry areas to facilitate semantic predictions. The code is accessible at https://github.com/AshooriMaryam/Nasal-Aesthetic-Assessment-Deep-learning.
{"title":"An Automatic Framework for Nasal Esthetic Assessment by ResNet Convolutional Neural Network","authors":"","doi":"10.1007/s10278-024-00973-7","DOIUrl":"https://doi.org/10.1007/s10278-024-00973-7","url":null,"abstract":"<h3>Abstract</h3> <p>Nasal base aesthetics is an interesting and challenging issue that attracts the attention of researchers in recent years. With that insight, in this study, we propose a novel automatic framework (AF) for evaluating the nasal base which can be useful to improve the symmetry in rhinoplasty and reconstruction. The introduced AF includes a hybrid model for nasal base landmarks recognition and a combined model for predicting nasal base symmetry. The proposed state-of-the-art nasal base landmark detection model is trained on the nasal base images for comprehensive qualitative and quantitative assessments. Then, the deep convolutional neural networks (CNN) and multi-layer perceptron neural network (MLP) models are integrated by concatenating their last hidden layer to evaluate the nasal base symmetry based on geometry features and tiled images of the nasal base. This study explores the concept of data augmentation by applying the methods motivated via commonly used image augmentation techniques. According to the experimental findings, the results of the AF are closely related to the otolaryngologists’ ratings and are useful for preoperative planning, intraoperative decision-making, and postoperative assessment. Furthermore, the visualization indicates that the proposed AF is capable of predicting the nasal base symmetry and capturing asymmetry areas to facilitate semantic predictions. The codes are accessible at https://github.com/AshooriMaryam/Nasal-Aesthetic-Assessment-Deep-learning.</p>","PeriodicalId":50214,"journal":{"name":"Journal of Digital Imaging","volume":"28 1","pages":""},"PeriodicalIF":4.4,"publicationDate":"2024-01-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139586393","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Review of the Free Research Software for Computer-Assisted Interventions
Journal of Digital Imaging | Pub Date: 2024-01-29 | DOI: 10.1007/s10278-023-00912-y
Zaiba Amla, Parminder Singh Khehra, Ashley Mathialagan, Elodie Lugez
Research software is continuously developed to facilitate progress and innovation in the medical field. Over time, numerous research software programs have been created, making it challenging to keep abreast of what is available. This work aims to evaluate the software most frequently utilized by the computer-assisted intervention (CAI) research community. The software assessments encompass a range of criteria, including load time, stress load, multi-tasking, extensibility and range of functionalities, user-friendliness, documentation, and technical support. A total of eight software programs were selected: 3D Slicer, Elastix, ITK-SNAP, MedInria, MeVisLab, MIPAV, and Seg3D. While none of the software was found to be perfect on all evaluation criteria, 3D Slicer and ITK-SNAP emerged with the highest overall rankings. These two software programs can frequently complement each other, as 3D Slicer has a broad and customizable range of features, while ITK-SNAP excels at performing fundamental tasks efficiently. Nonetheless, each software had distinctive features that may better fit the requirements of certain research projects. This review provides valuable information to CAI researchers seeking the best-suited software to support their projects. The evaluation also offers insights for the software development teams, as it highlights areas where the software can be improved.
{"title":"Review of the Free Research Software for Computer-Assisted Interventions","authors":"Zaiba Amla, Parminder Singh Khehra, Ashley Mathialagan, Elodie Lugez","doi":"10.1007/s10278-023-00912-y","DOIUrl":"https://doi.org/10.1007/s10278-023-00912-y","url":null,"abstract":"<p>Research software is continuously developed to facilitate progress and innovation in the medical field. Over time, numerous research software programs have been created, making it challenging to keep abreast of what is available. This work aims to evaluate the most frequently utilized software by the computer-assisted intervention (CAI) research community. The software assessments encompass a range of criteria, including load time, stress load, multi-tasking, extensibility and range of functionalities, user-friendliness, documentation, and technical support. A total of eight software programs were selected: 3D Slicer, Elastix, ITK-SNAP, MedInria, MeVisLab, MIPAV, and Seg3D. While none of the software was found to be perfect on all evaluation criteria, 3D Slicer and ITK-SNAP emerged with the highest rankings overall. These two software programs could frequently complement each other, as 3D Slicer has a broad and customizable range of features, while ITK-SNAP excels at performing fundamental tasks in an efficient manner. Nonetheless, each software had distinctive features that may better fit the requirements of certain research projects. This review provides valuable information to CAI researchers seeking the best-suited software to support their projects. The evaluation also offers insights for the software development teams, as it highlights areas where the software can be improved.</p>","PeriodicalId":50214,"journal":{"name":"Journal of Digital Imaging","volume":"10 1","pages":""},"PeriodicalIF":4.4,"publicationDate":"2024-01-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139586139","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}