
Journal of Medical Imaging: Latest Publications

TFKT V2: task-focused knowledge transfer from natural images for computed tomography perceptual image quality assessment.
IF 1.9 Q3 RADIOLOGY, NUCLEAR MEDICINE & MEDICAL IMAGING Pub Date : 2025-09-01 Epub Date: 2025-05-28 DOI: 10.1117/1.JMI.12.5.051805
Kazi Ramisa Rifa, Md Atik Ahamed, Jie Zhang, Abdullah Imran

Purpose: The accurate assessment of computed tomography (CT) image quality is crucial for ensuring diagnostic reliability while minimizing radiation dose. Radiologists' evaluations are time-consuming and labor-intensive. Existing automated approaches often require large CT datasets with predefined image quality assessment (IQA) scores that frequently do not align well with clinical evaluations. We aim to develop a reference-free, automated method for CT IQA that closely reflects radiologists' evaluations, reducing the dependency on large annotated datasets.

Approach: We propose Task-Focused Knowledge Transfer (TFKT), a deep learning-based IQA method leveraging knowledge transfer from task-similar natural image datasets. TFKT incorporates a hybrid convolutional neural network-transformer model, enabling accurate quality predictions by learning from natural image distortions with human-annotated mean opinion scores. The model is pre-trained on natural image datasets and fine-tuned on low-dose computed tomography perceptual image quality assessment data to ensure task-specific adaptability.
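
The two-stage transfer described in the approach can be sketched as follows. This is a minimal illustration in PyTorch: the backbone, hyperparameters, and random tensors standing in for the natural-image and CT datasets are assumptions, not the authors' implementation.

```python
# Sketch of task-focused transfer for perceptual IQA (assumed architecture, not the authors' code).
import torch
import torch.nn as nn

class HybridIQA(nn.Module):
    """Toy CNN + transformer regressor predicting a single quality score per image."""
    def __init__(self, dim=64):
        super().__init__()
        self.cnn = nn.Sequential(                      # convolutional feature extractor
            nn.Conv2d(1, dim, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(dim, dim, 3, stride=2, padding=1), nn.ReLU(),
        )
        enc = nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True)
        self.transformer = nn.TransformerEncoder(enc, num_layers=2)
        self.head = nn.Linear(dim, 1)                  # regress a mean-opinion-style score

    def forward(self, x):
        f = self.cnn(x)                                # (B, dim, H', W')
        tokens = f.flatten(2).transpose(1, 2)          # (B, H'*W', dim)
        return self.head(self.transformer(tokens).mean(dim=1)).squeeze(-1)

def fit(model, images, scores, epochs=2, lr=1e-4):
    """One generic training loop reused for pre-training and fine-tuning."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(model(images), scores)
        loss.backward()
        opt.step()

model = HybridIQA()
# Stage 1: pre-train on natural-image distortions with human mean opinion scores (random stand-ins here).
fit(model, torch.randn(8, 1, 64, 64), torch.rand(8) * 100)
# Stage 2: fine-tune on low-dose CT slices with radiologist-style quality scores (also stand-ins).
fit(model, torch.randn(8, 1, 64, 64), torch.rand(8) * 4, lr=1e-5)
```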

Results: Extensive evaluations demonstrate that the proposed TFKT method effectively predicts IQA scores aligned with radiologists' assessments on in-domain datasets and generalizes well to out-of-domain clinical pediatric CT exams. The model achieves robust performance without requiring high-dose reference images. Our model is capable of assessing the quality of approximately 30 CT image slices per second.

Conclusions: The proposed TFKT approach provides a scalable, accurate, and reference-free solution for CT IQA. The model bridges the gap between traditional and deep learning-based IQA, offering clinically relevant and computationally efficient assessments applicable to real-world clinical settings.

Citations: 0
Full-head segmentation of MRI with abnormal brain anatomy: model and data release.
IF 1.7 Q3 RADIOLOGY, NUCLEAR MEDICINE & MEDICAL IMAGING Pub Date : 2025-09-01 Epub Date: 2025-09-17 DOI: 10.1117/1.JMI.12.5.054001
Andrew M Birnbaum, Adam Buchwald, Peter Turkeltaub, Adam Jacks, George Carr, Shreya Kannan, Yu Huang, Abhisheck Datta, Lucas C Parra, Lukas A Hirsch

Purpose: Our goal was to develop a deep network for whole-head segmentation, including clinical magnetic resonance imaging (MRI) with abnormal anatomy, and compile the first public benchmark dataset for this purpose. We collected 98 MRIs with volumetric segmentation labels for a diverse set of human subjects, including normal and abnormal anatomy in clinical cases of stroke and disorders of consciousness.

Approach: Training labels were generated by manually correcting initial automated segmentations for skin/scalp, skull, cerebrospinal fluid, gray matter, white matter, air cavity, and extracephalic air. We developed a "MultiAxial" network consisting of three 2D U-Nets that operate independently in sagittal, axial, and coronal planes, which are then combined to produce a single 3D segmentation.
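
As a rough illustration of the plane-wise design, the sketch below runs a placeholder 2D segmenter along the three orthogonal axes of a volume and averages the per-class probabilities; the averaging fusion rule and the seven-class setup are assumptions rather than the published combination step.

```python
# Sketch of fusing per-plane 2D segmentations into one 3D labelmap
# (the probability-averaging fusion rule is an assumption, not necessarily the paper's exact scheme).
import numpy as np

N_CLASSES = 7  # scalp, skull, CSF, gray matter, white matter, air cavity, extracephalic air

def segment_slice(slice_2d: np.ndarray) -> np.ndarray:
    """Stand-in for one trained 2D U-Net: returns per-class probabilities (C, H, W)."""
    logits = np.random.rand(N_CLASSES, *slice_2d.shape)
    return logits / logits.sum(axis=0, keepdims=True)

def multiaxial_segment(volume: np.ndarray) -> np.ndarray:
    """Run the 2D model along the three orthogonal axes and average class probabilities."""
    probs = np.zeros((N_CLASSES, *volume.shape))
    for axis in range(3):                              # 0/1/2: sagittal/coronal/axial (convention-dependent)
        for i in range(volume.shape[axis]):
            sl = np.take(volume, i, axis=axis)         # extract one 2D slice
            p = segment_slice(sl)                      # (C, H, W) probabilities for that slice
            idx = [slice(None)] * 4
            idx[axis + 1] = i                          # put the slice back at its 3D position
            probs[tuple(idx)] += p
    probs /= 3.0                                       # average the three plane-wise passes
    return probs.argmax(axis=0)                        # final 3D label volume

labels = multiaxial_segment(np.random.rand(32, 32, 32))
print(labels.shape)  # (32, 32, 32)
```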

Results: The MultiAxial network achieved a test-set Dice score of 0.88 ± 0.04 (median ± interquartile range) on whole-head segmentation, including gray and white matter. This was compared with 0.86 ± 0.04 for Multipriors and 0.79 ± 0.10 for SPM12, two standard tools currently available for this task. The MultiAxial network gains in robustness by avoiding the need for coregistration with an atlas. It performed well in regions with abnormal anatomy and on images that have been de-identified. It enables more accurate and robust current flow modeling when incorporated into ROAST, a widely used modeling toolbox for transcranial electric stimulation.

Conclusions: We are releasing a new state-of-the-art tool for whole-head MRI segmentation in abnormal anatomy, along with the largest volume of labeled clinical head MRIs, including labels for nonbrain structures. Together, the model and data may serve as a benchmark for future efforts.

Citations: 0
Joint CT reconstruction of anatomy and implants using a mixed prior model.
IF 1.7 Q3 RADIOLOGY, NUCLEAR MEDICINE & MEDICAL IMAGING Pub Date : 2025-09-01 Epub Date: 2025-10-18 DOI: 10.1117/1.JMI.12.5.053502
Xiao Jiang, Grace J Gang, J Webster Stayman

Purpose: Medical implants, often made of dense materials, pose significant challenges to accurate computed tomography (CT) reconstruction, especially near implants due to beam hardening and partial-volume artifacts. Moreover, diagnostics involving implants often require separate visualization for implants and anatomy. In this work, we propose an approach for joint estimation of anatomy and implants as separate volumes using a mixed prior model.

Approach: We leverage a learning-based prior for anatomy and a sparsity prior for implants to decouple the two volumes. In addition, a hybrid mono-polyenergetic forward model is employed to accommodate the spectral effects of implants, and a multiresolution object model is used to achieve high-resolution implant reconstruction. The reconstruction process alternates between diffusion posterior sampling for anatomy updates and classic optimization for implants and spectral coefficients.
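
The alternating scheme can be outlined as in the sketch below. The simple linear forward model, the data-consistency step standing in for diffusion posterior sampling, and the ISTA-style soft-thresholding update are placeholders; the paper's hybrid mono-polyenergetic model and multiresolution implant grid are not reproduced here.

```python
# Skeleton of alternating joint reconstruction: a prior-driven step for the anatomy volume
# and a sparsity-regularized (ISTA-style) step for the implant volume. All operators are
# simplified stand-ins for the method described in the abstract.
import numpy as np

rng = np.random.default_rng(0)
n_vox, n_meas = 256, 384
A = rng.standard_normal((n_meas, n_vox)) / np.sqrt(n_meas)   # stand-in projection operator
x_true = rng.standard_normal(n_vox)
y = A @ x_true                                                # simulated measurements

def anatomy_step(x_anatomy, x_implant, y, A, step=0.2):
    """Placeholder for diffusion posterior sampling: here only a data-consistency gradient step."""
    grad = A.T @ (A @ (x_anatomy + x_implant) - y)
    return x_anatomy - step * grad

def implant_step(x_anatomy, x_implant, y, A, step=0.2, lam=0.05):
    """Proximal gradient (ISTA) step enforcing a sparsity prior on the implant volume."""
    grad = A.T @ (A @ (x_anatomy + x_implant) - y)
    z = x_implant - step * grad
    return np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)  # soft-threshold

x_a = np.zeros(n_vox)    # anatomy estimate
x_i = np.zeros(n_vox)    # implant estimate
for _ in range(100):     # alternate the two updates
    x_a = anatomy_step(x_a, x_i, y, A)
    x_i = implant_step(x_a, x_i, y, A)

print(np.linalg.norm(A @ (x_a + x_i) - y))  # data residual shrinks across iterations
```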

Results: Evaluations were performed on emulated cardiac imaging with a stent and spine imaging with pedicle screws. The structures of the cardiac stent with 0.25 mm wires were clearly visualized in the implant images, whereas the blooming artifacts around the stent were effectively suppressed in the anatomical reconstruction. For pedicle screws, the proposed algorithm mitigated streaking and beam-hardening artifacts in the anatomy volume, demonstrating significant improvements in SSIM and PSNR compared with frequency-splitting metal artifact reduction and model-based reconstruction on slices containing implants.

Conclusion: The proposed mixed prior model coupled with a hybrid spectral and multiresolution model can help to separate spatially and spectrally distinct objects that differ from anatomical features in single-energy CT, improving both image quality and separate visualization of implants and anatomy.

Citations: 0
Segmentation variability and radiomics stability for predicting triple-negative breast cancer subtype using magnetic resonance imaging.
IF 1.7 Q3 RADIOLOGY, NUCLEAR MEDICINE & MEDICAL IMAGING Pub Date : 2025-09-01 Epub Date: 2025-09-17 DOI: 10.1117/1.JMI.12.5.054501
Isabella Cama, Alejandro Guzmán, Cristina Campi, Michele Piana, Karim Lekadir, Sara Garbarino, Oliver Díaz

Purpose: Many studies caution against using radiomic features that are sensitive to contouring variability in predictive models for disease stratification. Consequently, metrics such as the intraclass correlation coefficient (ICC) are recommended to guide feature selection based on stability. However, the direct impact of segmentation variability on the performance of predictive models remains underexplored. We examine how segmentation variability affects both feature stability and predictive performance in the radiomics-based classification of triple-negative breast cancer (TNBC) using breast magnetic resonance imaging.

Approach: We analyzed 244 images from the Duke dataset, introducing segmentation variability through controlled modifications of manual segmentations. For each segmentation mask, explainable radiomic features were selected using Shapley Additive exPlanations and used to train logistic regression models. Feature stability across segmentations was assessed via ICC, Pearson's correlation, and reliability scores quantifying the relationship between segmentation variability and feature robustness.
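
A minimal sketch of the stability metrics named above, applied to one radiomic feature measured on the same cases under two segmentation variants; the values are synthetic and the specific ICC form (two-way random effects, single measurement) is an assumption.

```python
# Stability check for one radiomic feature across two segmentation variants:
# Pearson correlation plus a two-way random-effects ICC(2,1) (Shrout & Fleiss).
# The feature values below are synthetic, not study data.
import numpy as np
from scipy.stats import pearsonr

def icc_2_1(X: np.ndarray) -> float:
    """ICC(2,1) for an (n_cases, n_raters) matrix of feature values."""
    n, k = X.shape
    grand = X.mean()
    row_means, col_means = X.mean(axis=1), X.mean(axis=0)
    msr = k * ((row_means - grand) ** 2).sum() / (n - 1)            # between-case mean square
    msc = n * ((col_means - grand) ** 2).sum() / (k - 1)            # between-segmentation mean square
    sse = ((X - row_means[:, None] - col_means[None, :] + grand) ** 2).sum()
    mse = sse / ((n - 1) * (k - 1))                                 # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)

rng = np.random.default_rng(1)
feature_orig = rng.normal(size=244)                                 # feature from manual segmentation
feature_perturbed = feature_orig + rng.normal(scale=0.3, size=244)  # same feature, modified mask

X = np.column_stack([feature_orig, feature_perturbed])
print("ICC(2,1):", round(icc_2_1(X), 3))
print("Pearson r:", round(pearsonr(feature_orig, feature_perturbed)[0], 3))
```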

Results: Model performances in predicting TNBC do not exhibit a significant difference across varying segmentations. The most explicative and predictive features exhibit decreasing ICC as segmentation accuracy decreases. However, their predictive power remains intact due to low ICC combined with high Pearson's correlation. No shared numerical relationship is found between feature stability and segmentation variability among the most predictive features.

Conclusions: Moderate segmentation variability has a limited impact on model performance. Although incorporating peritumoral information may reduce feature reproducibility, it does not compromise predictive utility. Notably, feature stability is not a strict prerequisite for predictive relevance, highlighting that exclusive reliance on ICC or stability metrics for feature selection may inadvertently discard informative features.

Citations: 0
Gamification for emergency radiology education and image perception: stab the diagnosis.
IF 1.7 Q3 RADIOLOGY, NUCLEAR MEDICINE & MEDICAL IMAGING Pub Date : 2025-09-01 Epub Date: 2025-09-24 DOI: 10.1117/1.JMI.12.5.051808
William F Auffermann, Nathan Barber, Ryan Stockard, Soham Banerjee

Purpose: Gamification can be a helpful adjunct to education and is increasingly used in radiology. We aim to determine if using a gamified framework to teach medical trainees about emergency radiology can improve perceptual and interpretive skills and facilitate learning.

Approach: We obtained approval from the Institutional Review Board, and participation was voluntary. Participants received training at the RadSimPE radiology workstation simulator and were shown three sets of computed tomography images related to emergency radiology diagnoses. Participants were asked to state their certainty that an abnormality was not present, localize it if present, and give their confidence in localization. Between case sets 1 and 2, the experimental group was provided with gamified emergency radiology training on the Stab the Diagnosis program, whereas the control group was not. Following the session, participants completed an eight-question survey to assess their thoughts about the training.

Results: A total of 36 medical trainees participated. Both the experimental group and control group improved in localization accuracy, but the experimental group's localization confidence was significantly greater than the control group's (p = 0.0364). Survey results were generally positive and were statistically significantly greater than the neutral value of 3, with p-values < 0.05 for all eight questions. For example, survey results indicated that participants felt the training was a helpful educational experience (p < 0.001) and that the session was more effective for learning than traditional educational techniques (p = 0.001).
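
The survey comparison reported here, testing Likert responses against the neutral value of 3, can be illustrated as below; the response vector is hypothetical and the one-sided alternative is an assumption about the test actually used.

```python
# Illustrative one-sample t-test of Likert responses against the neutral value 3
# (made-up responses, not the study's survey data).
import numpy as np
from scipy.stats import ttest_1samp

responses = np.array([4, 5, 4, 4, 3, 5, 4, 5, 4, 4, 5, 3, 4, 5, 4])  # hypothetical 1-5 ratings
t_stat, p_value = ttest_1samp(responses, popmean=3, alternative="greater")
print(f"t = {t_stat:.2f}, one-sided p = {p_value:.4f}")
```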

Conclusions: Gamification may be a valuable adjunct to conventional methods in radiology education and may improve trainee confidence.

Citations: 0
Correlation of objective image quality metrics with radiologists' diagnostic confidence depends on the clinical task performed.
IF 1.9 Q3 RADIOLOGY, NUCLEAR MEDICINE & MEDICAL IMAGING Pub Date : 2025-09-01 Epub Date: 2025-04-11 DOI: 10.1117/1.JMI.12.5.051803
Michelle C Pryde, James Rioux, Adela Elena Cora, David Volders, Matthias H Schmidt, Mohammed Abdolell, Chris Bowen, Steven D Beyea

Purpose: Objective image quality metrics (IQMs) are widely used as outcome measures to assess acquisition and reconstruction strategies for diagnostic images. For nonpathological magnetic resonance (MR) images, these IQMs correlate to varying degrees with expert radiologists' confidence scores of overall perceived diagnostic image quality. However, it is unclear whether IQMs also correlate with task-specific diagnostic image quality or expert radiologists' confidence in performing a specific diagnostic task, which calls into question their use as surrogates for radiologist opinion.

Approach: 0.5 T MR images from 16 stroke patients and two healthy volunteers were retrospectively undersampled (R = 1 to 7×) and reconstructed via compressed sensing. Three neuroradiologists reported the presence/absence of acute ischemic stroke (AIS) and assigned a Fazekas score describing the extent of chronic ischemic lesion burden. Neuroradiologists ranked their confidence in performing each task using a 1 to 5 Likert scale. Confidence scores were correlated with noise quality measure, the visual information fidelity criterion, the feature similarity index, root mean square error, and structural similarity (SSIM) via nonlinear regression modeling.
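
The nonlinear regression step might look like the following sketch, which fits a logistic-shaped curve between an IQM (SSIM here) and reader confidence; the functional form and the synthetic data are assumptions for illustration only.

```python
# Sketch of relating an objective IQM to reader confidence by nonlinear regression.
import numpy as np
from scipy.optimize import curve_fit

def logistic(x, lo, hi, x0, k):
    """Four-parameter logistic curve mapping an IQM value to a 1-5 confidence score."""
    return lo + (hi - lo) / (1.0 + np.exp(-k * (x - x0)))

rng = np.random.default_rng(2)
ssim = rng.uniform(0.5, 1.0, size=60)                            # IQM values across accelerations
confidence = logistic(ssim, 1, 5, 0.75, 20) + rng.normal(scale=0.3, size=60)

params, _ = curve_fit(logistic, ssim, confidence, p0=[1, 5, 0.75, 10], maxfev=10000)
print("fitted (lo, hi, x0, k):", np.round(params, 2))
```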

Results: Although acceleration alters image quality, neuroradiologists remain able to report pathology. All of the IQMs tested correlated to some degree with diagnostic confidence for assessing chronic ischemic lesion burden, but none correlated with diagnostic confidence in diagnosing the presence/absence of AIS due to consistent radiologist performance regardless of image degradation.

Conclusions: Accelerated images were helpful for understanding the ability of IQMs to assess task-specific diagnostic image quality in the context of chronic ischemic lesion burden, although not in the case of AIS diagnosis. These findings suggest that commonly used IQMs, such as the SSIM index, do not necessarily indicate an image's utility when performing certain diagnostic tasks.

Citations: 0
Contrast-enhanced spectral mammography demonstrates better inter-reader repeatability than digital mammography for screening breast cancer patients.
IF 1.9 Q3 RADIOLOGY, NUCLEAR MEDICINE & MEDICAL IMAGING Pub Date : 2025-09-01 Epub Date: 2025-06-18 DOI: 10.1117/1.JMI.12.5.051806
Alisa Mohebbi, Ali Abdi, Saeed Mohammadzadeh, Mohammad Mirza-Aghazadeh-Attari, Ali Abbasian Ardakani, Afshin Mohammadi

Purpose: Our purpose is to assess the inter-rater agreement between digital mammography (DM) and contrast-enhanced spectral mammography (CESM) in evaluating the Breast Imaging Reporting and Data System (BI-RADS) grading.

Approach: This retrospective study included 326 patients recruited between January 2019 and February 2021. The study protocol was pre-registered on the Open Science Framework platform. Two expert radiologists interpreted the CESM and DM findings. Pathological data were used for radiologically suspicious or malignant-appearing lesions, whereas follow-up was considered the gold standard for benign-appearing lesions and breasts without lesions.

Results: For intra-device agreement, both imaging modalities showed "almost perfect" agreement, indicating that different radiologists are expected to report the same BI-RADS score for the same image. Despite showing a similar interpretation, a paired t-test showed significantly higher agreement for CESM compared with DM (p < 0.001). Subgrouping based on the side or view did not show a considerable difference for both imaging modalities. For inter-device agreement, "almost perfect" agreement was also achieved. However, for proven malignant lesions, an overall higher BI-RADS score was achieved for CESM, whereas for benign or normal breasts, a lower BI-RADS score was reported, indicating a more precise BI-RADS classification for CESM compared with DM.
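
Agreement of this kind is commonly summarized with a weighted Cohen's kappa, for which the Landis-Koch label "almost perfect" corresponds to values above 0.8; the sketch below uses synthetic BI-RADS-like grades and is not the study's exact computation.

```python
# Sketch of quantifying inter-reader agreement on ordinal BI-RADS grades with weighted kappa
# (synthetic grades; the paper's exact agreement statistic is not reproduced here).
import numpy as np
from sklearn.metrics import cohen_kappa_score

rng = np.random.default_rng(3)
reader1 = rng.integers(1, 6, size=326)                         # BI-RADS-like grades 1-5
noise = rng.integers(-1, 2, size=326) * (rng.random(326) < 0.15)
reader2 = np.clip(reader1 + noise, 1, 5)                       # mostly agreeing second reader

kappa = cohen_kappa_score(reader1, reader2, weights="quadratic")
print(f"quadratic-weighted kappa = {kappa:.3f}")               # >0.8 is "almost perfect"
```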

Conclusions: Our findings demonstrated strong agreement among readers regarding the identification of DM and CESM findings in breast images from various views. Moreover, the results indicate that CESM is as precise as DM and can be used as an alternative in clinical centers.

Citations: 0
Breast cancer survivors' perceptual map of breast reconstruction appearance outcomes.
IF 1.9 Q3 RADIOLOGY, NUCLEAR MEDICINE & MEDICAL IMAGING Pub Date : 2025-09-01 Epub Date: 2025-03-19 DOI: 10.1117/1.JMI.12.5.051802
Haoqi Wang, Xiomara T Gonzalez, Gabriela A Renta-López, Mary Catherine Bordes, Michael C Hout, Seung W Choi, Gregory P Reece, Mia K Markey

Purpose: It is often hard for patients to articulate their expectations about breast reconstruction appearance outcomes to their providers. Our overarching goal is to develop a tool to help patients visually express what they expect to look like after reconstruction. We aim to comprehensively understand how breast cancer survivors perceive diverse breast appearance states by mapping them onto a low-dimensional Euclidean space, which simplifies the complex information about perceptual similarity relationships into a more interpretable form.

Approach: We recruited breast cancer survivors and conducted observer experiments to assess the visual similarities among clinical photographs depicting a range of appearances of the torso relevant to breast reconstruction. Then, we developed a perceptual map to illuminate how breast cancer survivors perceive and distinguish among these appearance states.
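
One common way to turn pooled pairwise dissimilarity judgments into such a low-dimensional map is metric multidimensional scaling; the sketch below uses random dissimilarities and scikit-learn's MDS, and the authors' exact scaling procedure may differ.

```python
# Sketch of building a 2D perceptual map from pairwise dissimilarity ratings with metric MDS
# (random dissimilarities stand in for the pooled observer judgments).
import numpy as np
from sklearn.manifold import MDS

rng = np.random.default_rng(4)
n_photos = 100
d = rng.random((n_photos, n_photos))
dissimilarity = (d + d.T) / 2.0                  # symmetrize pooled observer ratings
np.fill_diagonal(dissimilarity, 0.0)

mds = MDS(n_components=2, dissimilarity="precomputed", random_state=0)
coords = mds.fit_transform(dissimilarity)        # (100, 2) map coordinates
print(coords.shape, f"stress = {mds.stress_:.1f}")
```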

Results: We sampled 100 photographs as stimuli and recruited 34 breast cancer survivors locally. The resulting perceptual map, constructed in two dimensions, offers valuable insights into factors influencing breast cancer survivors' perceptions of breast reconstruction outcomes. Our findings highlight specific aspects, such as the number of nipples, symmetry, ptosis, scars, and breast shape, that emerge as particularly noteworthy for breast cancer survivors.

Conclusions: Analysis of the perceptual map identified factors associated with breast cancer survivors' perceptions of breast appearance states that should be emphasized in the appearance consultation process. The perceptual map could be used to assist patients in visually expressing what they expect to look like. Our study lays the groundwork for evaluating interventions intended to help patients form realistic expectations.

Citations: 0
Convolutional neural network model observers discount signal-like anatomical structures during search in virtual digital breast tomosynthesis phantoms.
IF 1.7 Q3 RADIOLOGY, NUCLEAR MEDICINE & MEDICAL IMAGING Pub Date : 2025-09-01 Epub Date: 2025-10-16 DOI: 10.1117/1.JMI.12.5.051809
Aditya Jonnalagadda, Bruno B Barufaldi, Andrew D A Maidment, Susan P Weinstein, Craig K Abbey, Miguel P Eckstein

Purpose: We aim to assess the perceptual tasks in which convolutional neural networks (CNNs) might be better tools than commonly used linear model observers (LMOs) to evaluate medical image quality.

Approach: We compared the LMOs (channelized Hotelling [CHO] and frequency convolution channels observers [FCO]) and CNN detection accuracies for tasks with a few possible signal locations (location known exactly) and for the search for mass and microcalcification signals embedded in 2D/3D breast tomosynthesis phantoms. We also compared the LMOs and CNN accuracies to those of radiologists in the search tasks. We analyzed radiologists' eye position to assess whether they fixate longer at locations considered suspicious by the LMOs or those by the CNN.
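
For reference, a minimal channelized Hotelling observer on synthetic white-noise images with Gaussian channels is sketched below; the channel set, backgrounds, and signal are toy stand-ins, not the study's tomosynthesis phantoms or its frequency-convolution channels.

```python
# Minimal channelized Hotelling observer (CHO) for a signal-known-exactly detection task.
import numpy as np

rng = np.random.default_rng(5)
size = 32
yy, xx = np.mgrid[:size, :size] - size // 2
r2 = xx**2 + yy**2
signal = 0.3 * np.exp(-r2 / (2 * 2.0**2))                      # small Gaussian "lesion"

channels = np.stack([np.exp(-r2 / (2 * s**2)) for s in (1, 2, 4, 8)])  # 4 Gaussian channels
U = channels.reshape(4, -1).T                                  # (n_pixels, n_channels)

def make_images(n, with_signal):
    imgs = rng.normal(size=(n, size * size))                   # white-noise backgrounds
    if with_signal:
        imgs += signal.ravel()
    return imgs

train_sp, train_sa = make_images(200, True), make_images(200, False)
v_sp, v_sa = train_sp @ U, train_sa @ U                        # channel outputs (n, 4)
S = 0.5 * (np.cov(v_sp, rowvar=False) + np.cov(v_sa, rowvar=False))
w = np.linalg.solve(S, v_sp.mean(0) - v_sa.mean(0))            # Hotelling template in channel space

t_sp = make_images(200, True) @ U @ w                          # test statistics, signal present
t_sa = make_images(200, False) @ U @ w                         # test statistics, signal absent
auc = (t_sp[:, None] > t_sa[None, :]).mean()                   # empirical AUC (ties ignored)
print(f"CHO AUC = {auc:.3f}")
```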

Results: LMOs resulted in similar detection accuracies [area under the receiver operating characteristic curve (AUC)] to the CNN for tasks with up to 100 signal locations but lower accuracies in the search task for microcalcification and mass 3D images. Radiologists' AUC was significantly higher (p < 1e-4) than that of LMOs for the microcalcification 2D search (CHO, FCO) and 3D mass search (p < 0.05, CHO) but was not higher than the CNN's AUC. For both signal types, radiologists fixated longer on the locations of the highest response scores of the CNN than those of the LMOs but only reached statistical significance for the mass (masses: p = 0.009 versus CHO and p = 0.004 versus FCO).

Conclusion: We show that CNNs are a more suitable model observer for search tasks. Like radiologists but not traditional LMOs, CNNs can discount false positives arising from anatomical backgrounds.

Citations: 0
Machine learning evaluation of pneumonia severity: subgroup performance in the Medical Imaging and Data Resource Center modified radiographic assessment of lung edema mastermind challenge.
IF 1.7 Q3 RADIOLOGY, NUCLEAR MEDICINE & MEDICAL IMAGING Pub Date : 2025-09-01 Epub Date: 2025-10-07 DOI: 10.1117/1.JMI.12.5.054502
Karen Drukker, Samuel G Armato, Lubomir Hadjiiski, Judy Gichoya, Nicholas Gruszauskas, Jayashree Kalpathy-Cramer, Hui Li, Kyle J Myers, Robert M Tomek, Heather M Whitney, Zi Zhang, Maryellen L Giger

Purpose: The Medical Imaging and Data Resource Center Mastermind Grand Challenge of modified radiographic assessment of lung edema (mRALE) tasked participants with developing machine learning techniques for automated COVID-19 severity assessment via mRALE scores on portable chest radiographs (CXRs). We examine potential biases across demographic subgroups for the best-performing models of the nine teams participating in the test phase of the challenge.

Approach: Models were evaluated against a nonpublic test set of CXRs (814 patients) annotated by radiologists for disease severity (mRALE score 0 to 24). Participants used a variety of data and methods for training. Performance was measured using quadratic-weighted kappa (QWK). Bias analyses considered demographics (sex, age, race, ethnicity, and their intersections) using QWK. In addition, for distinguishing no/mild versus moderate/severe disease, equal opportunity difference (EOD) and average absolute odds difference (AAOD) were calculated. Bias was defined as statistically significant QWK subgroup differences, or EOD outside [-0.1; 0.1], or AAOD outside [0; 0.1].
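
The metrics used in the bias analysis can be computed as in the sketch below; the scores, subgroup labels, and the cut-off used to binarize mRALE into no/mild versus moderate/severe are synthetic assumptions, and the EOD/AAOD formulas follow common fairness-metric definitions.

```python
# Sketch of the evaluation metrics named above: quadratic-weighted kappa for ordinal severity
# scores, plus EOD and AAOD for a binary no/mild vs moderate/severe split (synthetic data).
import numpy as np
from sklearn.metrics import cohen_kappa_score

rng = np.random.default_rng(6)
ref = rng.integers(0, 25, size=814)                               # reference mRALE scores (0-24)
pred = np.clip(ref + rng.integers(-3, 4, size=814), 0, 24)        # model scores close to reference
group = rng.integers(0, 2, size=814)                              # two demographic subgroups

print("QWK:", round(cohen_kappa_score(ref, pred, weights="quadratic"), 3))

def rates(y_true, y_pred):
    """True-positive and false-positive rates for the binary moderate/severe task."""
    tpr = np.mean(y_pred[y_true == 1] == 1)
    fpr = np.mean(y_pred[y_true == 0] == 1)
    return tpr, fpr

y_true, y_pred = (ref >= 8).astype(int), (pred >= 8).astype(int)  # assumed severity cut-off
tpr0, fpr0 = rates(y_true[group == 0], y_pred[group == 0])
tpr1, fpr1 = rates(y_true[group == 1], y_pred[group == 1])
eod = tpr1 - tpr0                                                 # equal opportunity difference
aaod = 0.5 * (abs(fpr1 - fpr0) + abs(tpr1 - tpr0))                # average absolute odds difference
print(f"EOD = {eod:+.3f}, AAOD = {aaod:.3f}")
```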

Results: The nine models demonstrated good agreement with the reference standard (QWK 0.74 to 0.88). The winning model (QWK = 0.884 [0.819; 0.949]) was the only model without biases identified in terms of QWK. The runner-up model (QWK = 0.874 [0.813; 0.936]) showed no identified biases in terms of EOD and AAOD, whereas the winning model disadvantaged three subgroups in each of these metrics. The median number of disadvantaged subgroups for all models was 3.

Conclusions: The challenge demonstrated strong model performances but identified subgroup disparities. Bias analysis is essential as models with similar accuracy may exhibit varying fairness.

Citations: 0