Purpose: Self-supervised pre-training can reduce the amount of labeled training data needed by pre-learning fundamental visual characteristics of the medical imaging data. We investigate several self-supervised training strategies for chest computed tomography exams and their effects on downstream applications.
Approach: We benchmark five well-known self-supervision strategies (masked image region prediction, next slice prediction, rotation prediction, flip prediction, and denoising) on 15 M chest computed tomography (CT) slices collected from four sites of the Mayo Clinic enterprise, United States. These models were evaluated for two downstream tasks on public datasets: pulmonary embolism (PE) detection (classification) and lung nodule segmentation. Image embeddings generated by these models were also evaluated for prediction of patient age, race, and gender to study inherent biases in models' understanding of chest CT exams.
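For concreteness, the sketch below shows the general shape of one of these pretext tasks, masked image region prediction, in PyTorch. The backbone, patch size, batch, and optimizer settings are placeholder assumptions for illustration only; they are not the models trained in this study.

import torch
import torch.nn as nn

def mask_random_region(x, patch=32):
    """Zero out one random square patch per slice in a batch shaped (N, 1, H, W)."""
    x_masked = x.clone()
    n, _, h, w = x.shape
    for i in range(n):
        top = torch.randint(0, h - patch, (1,)).item()
        left = torch.randint(0, w - patch, (1,)).item()
        x_masked[i, :, top:top + patch, left:left + patch] = 0.0
    return x_masked

# Stand-in encoder-decoder; the study's actual backbones are not reproduced here.
encoder_decoder = nn.Sequential(
    nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 1, 3, padding=1),
)
optimizer = torch.optim.Adam(encoder_decoder.parameters(), lr=1e-4)
loss_fn = nn.MSELoss()

slices = torch.rand(8, 1, 256, 256)      # dummy batch standing in for CT slices
masked = mask_random_region(slices)
recon = encoder_decoder(masked)
loss = loss_fn(recon, slices)            # reconstruct the original, unmasked slice
loss.backward()
optimizer.step()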
Results: The use of pre-training weights, especially masked region prediction-based weights, improved performance and reduced the computational effort needed for downstream tasks compared with task-specific state-of-the-art (SOTA) models. Performance improvement for PE detection was observed for training dataset sizes as large as ∼380 K, with a maximum gain of 5% over SOTA. The segmentation model initialized with pre-training weights learned twice as fast as the randomly initialized model. While gender and age predictors built using self-supervised training weights showed no performance improvement over randomly initialized predictors, the race predictor experienced a 10% performance boost when using self-supervised training weights.
Conclusion: We released self-supervised models and weights under an open-source academic license. These models can then be fine-tuned with limited task-specific annotated data for a variety of downstream imaging tasks, thus accelerating research in biomedical imaging informatics.
{"title":"Self-supervised learning for chest computed tomography: training strategies and effect on downstream applications.","authors":"Amara Tariq, Gokul Ramasamy, Bhavik Patel, Imon Banerjee","doi":"10.1117/1.JMI.11.6.064003","DOIUrl":"https://doi.org/10.1117/1.JMI.11.6.064003","url":null,"abstract":"<p><strong>Purpose: </strong>Self-supervised pre-training can reduce the amount of labeled training data needed by pre-learning fundamental visual characteristics of the medical imaging data. We investigate several self-supervised training strategies for chest computed tomography exams and their effects on downstream applications.</p><p><strong>Approach: </strong>We benchmark five well-known self-supervision strategies (masked image region prediction, next slice prediction, rotation prediction, flip prediction, and denoising) on 15 M chest computed tomography (CT) slices collected from four sites of the Mayo Clinic enterprise, United States. These models were evaluated for two downstream tasks on public datasets: pulmonary embolism (PE) detection (classification) and lung nodule segmentation. Image embeddings generated by these models were also evaluated for prediction of patient age, race, and gender to study inherent biases in models' understanding of chest CT exams.</p><p><strong>Results: </strong>The use of pre-training weights especially masked region prediction-based weights, improved performance, and reduced computational effort needed for downstream tasks compared with task-specific state-of-the-art (SOTA) models. Performance improvement for PE detection was observed for training dataset sizes as large as <math><mrow><mo>∼</mo> <mn>380</mn> <mtext> </mtext> <mi>K</mi></mrow> </math> with a maximum gain of 5% over SOTA. The segmentation model initialized with pre-training weights learned twice as fast as the randomly initialized model. While gender and age predictors built using self-supervised training weights showed no performance improvement over randomly initialized predictors, the race predictor experienced a 10% performance boost when using self-supervised training weights.</p><p><strong>Conclusion: </strong>We released self-supervised models and weights under an open-source academic license. These models can then be fine-tuned with limited task-specific annotated data for a variety of downstream imaging tasks, thus accelerating research in biomedical imaging informatics.</p>","PeriodicalId":47707,"journal":{"name":"Journal of Medical Imaging","volume":"11 6","pages":"064003"},"PeriodicalIF":1.9,"publicationDate":"2024-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11550486/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142630349","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-11-01. Epub Date: 2024-11-05. DOI: 10.1117/1.JMI.11.6.067501
Lucas W Remedios, Shunxing Bao, Samuel W Remedios, Ho Hin Lee, Leon Y Cai, Thomas Li, Ruining Deng, Nancy R Newlin, Adam M Saunders, Can Cui, Jia Li, Qi Liu, Ken S Lau, Joseph T Roland, Mary K Washington, Lori A Coburn, Keith T Wilson, Yuankai Huo, Bennett A Landman
Purpose: Cells are building blocks for human physiology; consequently, understanding the way cells communicate, co-locate, and interrelate is essential to furthering our understanding of how the body functions in both health and disease. Hematoxylin and eosin (H&E) is the standard stain used in histological analysis of tissues in both clinical and research settings. Although H&E is ubiquitous and reveals tissue microanatomy, the classification and mapping of cell subtypes often require the use of specialized stains. The recent CoNIC Challenge focused on artificial intelligence classification of six types of cells on colon H&E but was unable to classify epithelial subtypes (progenitor, enteroendocrine, goblet), lymphocyte subtypes (B, helper T, cytotoxic T), and connective subtypes (fibroblasts). We propose to use inter-modality learning to label previously un-labelable cell types on H&E.
Approach: We took advantage of the cell classification information inherent in multiplexed immunofluorescence (MxIF) histology to create cell-level annotations for 14 subclasses. Then, we performed style transfer on the MxIF to synthesize realistic virtual H&E. We assessed the efficacy of a supervised learning scheme using the virtual H&E and 14 subclass labels. We evaluated our model on virtual H&E and real H&E.
Results: On virtual H&E, we were able to classify helper T cells and epithelial progenitors with positive predictive values of (prevalence ) and (prevalence ), respectively, when using ground truth centroid information. On real H&E, we needed to compute bounded metrics instead of direct metrics because our fine-grained virtual H&E predicted classes had to be matched to the closest available parent classes in the coarser labels from the real H&E dataset. For the real H&E, we could classify bounded metrics for the helper T cells and epithelial progenitors with upper bound positive predictive values of (parent class prevalence 0.21) and (parent class prevalence 0.49) when using ground truth centroid information.
Conclusions: This is the first work to provide cell type classification for helper T and epithelial progenitor nuclei on H&E.
{"title":"Data-driven nucleus subclassification on colon hematoxylin and eosin using style-transferred digital pathology.","authors":"Lucas W Remedios, Shunxing Bao, Samuel W Remedios, Ho Hin Lee, Leon Y Cai, Thomas Li, Ruining Deng, Nancy R Newlin, Adam M Saunders, Can Cui, Jia Li, Qi Liu, Ken S Lau, Joseph T Roland, Mary K Washington, Lori A Coburn, Keith T Wilson, Yuankai Huo, Bennett A Landman","doi":"10.1117/1.JMI.11.6.067501","DOIUrl":"10.1117/1.JMI.11.6.067501","url":null,"abstract":"<p><strong>Purpose: </strong>Cells are building blocks for human physiology; consequently, understanding the way cells communicate, co-locate, and interrelate is essential to furthering our understanding of how the body functions in both health and disease. Hematoxylin and eosin (H&E) is the standard stain used in histological analysis of tissues in both clinical and research settings. Although H&E is ubiquitous and reveals tissue microanatomy, the classification and mapping of cell subtypes often require the use of specialized stains. The recent CoNIC Challenge focused on artificial intelligence classification of six types of cells on colon H&E but was unable to classify epithelial subtypes (progenitor, enteroendocrine, goblet), lymphocyte subtypes (B, helper T, cytotoxic T), and connective subtypes (fibroblasts). We propose to use inter-modality learning to label previously un-labelable cell types on H&E.</p><p><strong>Approach: </strong>We took advantage of the cell classification information inherent in multiplexed immunofluorescence (MxIF) histology to create cell-level annotations for 14 subclasses. Then, we performed style transfer on the MxIF to synthesize realistic virtual H&E. We assessed the efficacy of a supervised learning scheme using the virtual H&E and 14 subclass labels. We evaluated our model on virtual H&E and real H&E.</p><p><strong>Results: </strong>On virtual H&E, we were able to classify helper T cells and epithelial progenitors with positive predictive values of <math><mrow><mn>0.34</mn> <mo>±</mo> <mn>0.15</mn></mrow> </math> (prevalence <math><mrow><mn>0.03</mn> <mo>±</mo> <mn>0.01</mn></mrow> </math> ) and <math><mrow><mn>0.47</mn> <mo>±</mo> <mn>0.1</mn></mrow> </math> (prevalence <math><mrow><mn>0.07</mn> <mo>±</mo> <mn>0.02</mn></mrow> </math> ), respectively, when using ground truth centroid information. On real H&E, we needed to compute bounded metrics instead of direct metrics because our fine-grained virtual H&E predicted classes had to be matched to the closest available parent classes in the coarser labels from the real H&E dataset. 
For the real H&E, we could classify bounded metrics for the helper T cells and epithelial progenitors with upper bound positive predictive values of <math><mrow><mn>0.43</mn> <mo>±</mo> <mn>0.03</mn></mrow> </math> (parent class prevalence 0.21) and <math><mrow><mn>0.94</mn> <mo>±</mo> <mn>0.02</mn></mrow> </math> (parent class prevalence 0.49) when using ground truth centroid information.</p><p><strong>Conclusions: </strong>This is the first work to provide cell type classification for helper T and epithelial progenitor nuclei on H&E.</p>","PeriodicalId":47707,"journal":{"name":"Journal of Medical Imaging","volume":"11 6","pages":"067501"},"PeriodicalIF":1.9,"publicationDate":"2024-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11537205/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142591962","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-11-01. Epub Date: 2024-11-06. DOI: 10.1117/1.JMI.11.6.064001
Yihao Liu, Junyu Chen, Lianrui Zuo, Aaron Carass, Jerry L Prince
Purpose: Deformable image registration establishes non-linear spatial correspondences between fixed and moving images. Deep learning-based deformable registration methods have been widely studied in recent years due to their speed advantage over traditional algorithms as well as their better accuracy. Most existing deep learning-based methods require neural networks to encode location information in their feature maps and predict displacement or deformation fields through convolutional or fully connected layers from these high-dimensional feature maps. We present vector field attention (VFA), a novel framework that enhances the efficiency of the existing network design by enabling direct retrieval of location correspondences.
Approach: VFA uses neural networks to extract multi-resolution feature maps from the fixed and moving images and then retrieves pixel-level correspondences based on feature similarity. The retrieval is achieved with a novel attention module without the need for learnable parameters. VFA is trained end-to-end in either a supervised or unsupervised manner.
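The following is a minimal, parameter-free sketch of the kind of similarity-based correspondence retrieval described above: a local window search whose feature similarities are softmaxed over candidate displacements to produce an expected displacement field. It illustrates the general idea only and is not the published VFA implementation; the window radius and feature shapes are arbitrary assumptions.

import torch
import torch.nn.functional as F

def similarity_attention_flow(fixed_feat, moving_feat, radius=3):
    """For every fixed-image location, softmax the feature similarity to moving-image
    locations in a (2r+1)^2 window and return the similarity-weighted displacement."""
    c, h, w = fixed_feat.shape
    pad = radius
    moving_pad = F.pad(moving_feat.unsqueeze(0), (pad, pad, pad, pad)).squeeze(0)
    offsets, scores = [], []
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            shifted = moving_pad[:, pad + dy:pad + dy + h, pad + dx:pad + dx + w]
            scores.append((fixed_feat * shifted).sum(dim=0))   # (H, W) dot products
            offsets.append(torch.tensor([dy, dx], dtype=torch.float32))
    scores = torch.stack(scores, dim=0)                        # (K, H, W)
    weights = torch.softmax(scores, dim=0)                     # attention over displacements
    offsets = torch.stack(offsets, dim=0)                      # (K, 2)
    return torch.einsum('khw,kd->dhw', weights, offsets)       # expected (dy, dx) per pixel

flow = similarity_attention_flow(torch.rand(16, 64, 64), torch.rand(16, 64, 64))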
Results: We evaluated VFA for intra- and inter-modality registration and unsupervised and semi-supervised registration using public datasets as well as the Learn2Reg challenge. VFA demonstrated comparable or superior registration accuracy compared with several state-of-the-art methods.
Conclusions: VFA offers a novel approach to deformable image registration by directly retrieving spatial correspondences from feature maps, leading to improved performance in registration tasks. It holds potential for broader applications.
{"title":"Vector field attention for deformable image registration.","authors":"Yihao Liu, Junyu Chen, Lianrui Zuo, Aaron Carass, Jerry L Prince","doi":"10.1117/1.JMI.11.6.064001","DOIUrl":"https://doi.org/10.1117/1.JMI.11.6.064001","url":null,"abstract":"<p><strong>Purpose: </strong>Deformable image registration establishes non-linear spatial correspondences between fixed and moving images. Deep learning-based deformable registration methods have been widely studied in recent years due to their speed advantage over traditional algorithms as well as their better accuracy. Most existing deep learning-based methods require neural networks to encode location information in their feature maps and predict displacement or deformation fields through convolutional or fully connected layers from these high-dimensional feature maps. We present vector field attention (VFA), a novel framework that enhances the efficiency of the existing network design by enabling direct retrieval of location correspondences.</p><p><strong>Approach: </strong>VFA uses neural networks to extract multi-resolution feature maps from the fixed and moving images and then retrieves pixel-level correspondences based on feature similarity. The retrieval is achieved with a novel attention module without the need for learnable parameters. VFA is trained end-to-end in either a supervised or unsupervised manner.</p><p><strong>Results: </strong>We evaluated VFA for intra- and inter-modality registration and unsupervised and semi-supervised registration using public datasets as well as the Learn2Reg challenge. VFA demonstrated comparable or superior registration accuracy compared with several state-of-the-art methods.</p><p><strong>Conclusions: </strong>VFA offers a novel approach to deformable image registration by directly retrieving spatial correspondences from feature maps, leading to improved performance in registration tasks. It holds potential for broader applications.</p>","PeriodicalId":47707,"journal":{"name":"Journal of Medical Imaging","volume":"11 6","pages":"064001"},"PeriodicalIF":1.9,"publicationDate":"2024-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11540117/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142606811","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-11-01. Epub Date: 2024-09-13. DOI: 10.1117/1.JMI.11.6.062604
Gesiren Zhang, Trong N Nguyen, Hadi Fooladi-Talari, Tyler Salvador, Kia Thomas, Daragh Crowley, R Scott Dingeman, Raj Shekhar
Significance: Conventional ultrasound-guided vascular access procedures are challenging due to the need for anatomical understanding, precise needle manipulation, and hand-eye coordination. Recently, augmented reality (AR)-based guidance has emerged as an aid to improve procedural efficiency and potential outcomes. However, its application in pediatric vascular access has not been comprehensively evaluated.
Aim: We developed an AR ultrasound application, HoloUS, using the Microsoft HoloLens 2 to display live ultrasound images directly in the proceduralist's field of view. We presented our evaluation of the effect of using the Microsoft HoloLens 2 for point-of-care ultrasound (POCUS)-guided vascular access in 30 pediatric patients.
Approach: A custom software module was developed on a tablet capable of capturing the moving ultrasound image from any ultrasound machine's screen. The captured image was compressed and sent to the HoloLens 2 via a hotspot without needing Internet access. On the HoloLens 2, we developed a custom software module to receive, decompress, and display the live ultrasound image. Hand gesture and voice command features were implemented for the user to reposition, resize, and change the gain and the contrast of the image. We evaluated 30 (15 successful control and 12 successful interventional) cases completed in a single-center, prospective, randomized study.
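As an illustration of the capture-compress-send pattern described here (not the HoloUS code itself), the sketch below JPEG-compresses one captured frame and pushes it over the hotspot as a UDP datagram. The headset address, capture source, frame size, and quality are placeholder assumptions; a real streamer would run in a loop, chunk oversized frames, and decode on the headset side.

import socket
import cv2

HOLOLENS_ADDR = ("192.168.137.2", 5005)    # hypothetical headset address on the hotspot
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

capture = cv2.VideoCapture(0)              # stand-in for the ultrasound screen grabber
ok, frame = capture.read()
if ok:
    frame = cv2.resize(frame, (480, 360))  # keep the JPEG payload within one datagram
    ok, payload = cv2.imencode(".jpg", frame, [cv2.IMWRITE_JPEG_QUALITY, 70])
    if ok:
        sock.sendto(payload.tobytes(), HOLOLENS_ADDR)   # receiver would decode with cv2.imdecode
capture.release()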
Results: The mean overall rendering latency and the rendering frame rate of the HoloUS application were 139.30 ms (σ = 32.02 ms) and 30 frames per second, respectively. The average procedure completion time was 17.3% shorter using AR guidance. The numbers of puncture attempts and needle redirections were similar between the two groups, and the number of head adjustments was minimal in the interventional group.
Conclusion: We presented our evaluation of the results from the first study using the Microsoft HoloLens 2 that investigates AR-based POCUS-guided vascular access in pediatric patients. Our evaluation confirmed clinical feasibility and potential improvement in procedural efficiency.
{"title":"Augmented reality for point-of-care ultrasound-guided vascular access in pediatric patients using Microsoft HoloLens 2: a preliminary evaluation.","authors":"Gesiren Zhang, Trong N Nguyen, Hadi Fooladi-Talari, Tyler Salvador, Kia Thomas, Daragh Crowley, R Scott Dingeman, Raj Shekhar","doi":"10.1117/1.JMI.11.6.062604","DOIUrl":"https://doi.org/10.1117/1.JMI.11.6.062604","url":null,"abstract":"<p><strong>Significance: </strong>Conventional ultrasound-guided vascular access procedures are challenging due to the need for anatomical understanding, precise needle manipulation, and hand-eye coordination. Recently, augmented reality (AR)-based guidance has emerged as an aid to improve procedural efficiency and potential outcomes. However, its application in pediatric vascular access has not been comprehensively evaluated.</p><p><strong>Aim: </strong>We developed an AR ultrasound application, HoloUS, using the Microsoft HoloLens 2 to display live ultrasound images directly in the proceduralist's field of view. We presented our evaluation of the effect of using the Microsoft HoloLens 2 for point-of-care ultrasound (POCUS)-guided vascular access in 30 pediatric patients.</p><p><strong>Approach: </strong>A custom software module was developed on a tablet capable of capturing the moving ultrasound image from any ultrasound machine's screen. The captured image was compressed and sent to the HoloLens 2 via a hotspot without needing Internet access. On the HoloLens 2, we developed a custom software module to receive, decompress, and display the live ultrasound image. Hand gesture and voice command features were implemented for the user to reposition, resize, and change the gain and the contrast of the image. We evaluated 30 (15 successful control and 12 successful interventional) cases completed in a single-center, prospective, randomized study.</p><p><strong>Results: </strong>The mean overall rendering latency and the rendering frame rate of the HoloUS application were 139.30 ms <math><mrow><mo>(</mo> <mi>σ</mi> <mo>=</mo> <mn>32.02</mn> <mtext> </mtext> <mi>ms</mi> <mo>)</mo></mrow> </math> and 30 frames per second, respectively. The average procedure completion time was 17.3% shorter using AR guidance. The numbers of puncture attempts and needle redirections were similar between the two groups, and the number of head adjustments was minimal in the interventional group.</p><p><strong>Conclusion: </strong>We presented our evaluation of the results from the first study using the Microsoft HoloLens 2 that investigates AR-based POCUS-guided vascular access in pediatric patients. Our evaluation confirmed clinical feasibility and potential improvement in procedural efficiency.</p>","PeriodicalId":47707,"journal":{"name":"Journal of Medical Imaging","volume":"11 6","pages":"062604"},"PeriodicalIF":1.9,"publicationDate":"2024-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11393663/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142298700","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-11-01. Epub Date: 2024-11-14. DOI: 10.1117/1.JMI.11.6.064004
Ho Hin Lee, Adam M Saunders, Michael E Kim, Samuel W Remedios, Lucas W Remedios, Yucheng Tang, Qi Yang, Xin Yu, Shunxing Bao, Chloe Cho, Louise A Mawn, Tonia S Rex, Kevin L Schey, Blake E Dewey, Jeffrey M Spraggins, Jerry L Prince, Yuankai Huo, Bennett A Landman
Purpose: Eye morphology varies significantly across the population, especially for the orbit and optic nerve. These variations limit the feasibility and robustness of generalizing population-wise features of eye organs to an unbiased spatial reference.
Approach: To tackle these limitations, we propose a process for creating high-resolution unbiased eye atlases. First, to restore spatial details from scans with a low through-plane resolution compared with a high in-plane resolution, we apply a deep learning-based super-resolution algorithm. Then, we generate an initial unbiased reference with an iterative metric-based registration using a small portion of subject scans. We register the remaining scans to this template and refine the template using an unsupervised deep probabilistic approach that generates a more expansive deformation field to enhance the organ boundary alignment. We demonstrate this framework using magnetic resonance images across four different tissue contrasts, generating four atlases in separate spatial alignments.
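A minimal sketch of the iterative template-building loop referenced above (not the authors' pipeline): every scan is registered to the current template and the template is updated to the mean of the warped scans. Here register_to is a deliberately empty placeholder for a real rigid/affine/deformable registration call, and the volumes are dummy arrays.

import numpy as np

def register_to(image, template):
    # Placeholder: a real implementation would return `image` warped to `template`
    # using an actual registration toolkit.
    return image

def build_unbiased_template(scans, n_iters=5):
    template = np.mean(scans, axis=0)            # initial reference
    for _ in range(n_iters):
        warped = [register_to(s, template) for s in scans]
        template = np.mean(warped, axis=0)       # update toward the group mean
    return template

scans = [np.random.rand(64, 64, 64) for _ in range(10)]   # dummy volumes
atlas = build_unbiased_template(scans)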
Results: When refining the template with sufficient subjects, we find a statistically significant improvement (Wilcoxon signed-rank test) in the average Dice score across four labeled regions compared with a standard registration framework consisting of rigid, affine, and deformable transformations. These results highlight the effective alignment of eye organs and boundaries using our proposed process.
Conclusions: By combining super-resolution preprocessing and deep probabilistic models, we address the challenge of generating an eye atlas to serve as a standardized reference across a largely variable population.
{"title":"Super-resolution multi-contrast unbiased eye atlases with deep probabilistic refinement.","authors":"Ho Hin Lee, Adam M Saunders, Michael E Kim, Samuel W Remedios, Lucas W Remedios, Yucheng Tang, Qi Yang, Xin Yu, Shunxing Bao, Chloe Cho, Louise A Mawn, Tonia S Rex, Kevin L Schey, Blake E Dewey, Jeffrey M Spraggins, Jerry L Prince, Yuankai Huo, Bennett A Landman","doi":"10.1117/1.JMI.11.6.064004","DOIUrl":"10.1117/1.JMI.11.6.064004","url":null,"abstract":"<p><strong>Purpose: </strong>Eye morphology varies significantly across the population, especially for the orbit and optic nerve. These variations limit the feasibility and robustness of generalizing population-wise features of eye organs to an unbiased spatial reference.</p><p><strong>Approach: </strong>To tackle these limitations, we propose a process for creating high-resolution unbiased eye atlases. First, to restore spatial details from scans with a low through-plane resolution compared with a high in-plane resolution, we apply a deep learning-based super-resolution algorithm. Then, we generate an initial unbiased reference with an iterative metric-based registration using a small portion of subject scans. We register the remaining scans to this template and refine the template using an unsupervised deep probabilistic approach that generates a more expansive deformation field to enhance the organ boundary alignment. We demonstrate this framework using magnetic resonance images across four different tissue contrasts, generating four atlases in separate spatial alignments.</p><p><strong>Results: </strong>When refining the template with sufficient subjects, we find a significant improvement using the Wilcoxon signed-rank test in the average Dice score across four labeled regions compared with a standard registration framework consisting of rigid, affine, and deformable transformations. These results highlight the effective alignment of eye organs and boundaries using our proposed process.</p><p><strong>Conclusions: </strong>By combining super-resolution preprocessing and deep probabilistic models, we address the challenge of generating an eye atlas to serve as a standardized reference across a largely variable population.</p>","PeriodicalId":47707,"journal":{"name":"Journal of Medical Imaging","volume":"11 6","pages":"064004"},"PeriodicalIF":1.9,"publicationDate":"2024-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11561295/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142649317","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-11-01. Epub Date: 2024-10-08. DOI: 10.1117/1.JMI.11.6.062606
Jacquemyn Xander, Bamps Kobe, Moermans Ruben, Dubois Christophe, Rega Filip, Verbrugghe Peter, Weyn Barbara, Dymarkowski Steven, Budts Werner, Van De Bruaene Alexander
Purpose: Virtual reality (VR) and augmented reality (AR) have led to significant advancements in cardiac preoperative planning, shaping the world in profound ways. A noticeable gap exists in the availability of a comprehensive multi-user, multi-device mixed reality application that can be used in a multidisciplinary team meeting.
Approach: A multi-user, multi-device mixed reality application was developed, supporting AR and VR implementations. Technical validation involved a standardized testing protocol and comparison of AR and VR measurements regarding absolute error and time. Preclinical validation engaged experts in interventional cardiology, evaluating the clinical applicability prior to clinical validation. Clinical validation included patient-specific measurements for five patients in VR compared with standard computed tomography (CT) for preoperative planning. Questionnaires were used at all stages for subjective evaluation.
Results: Technical validation, including 106 size measurements, demonstrated an absolute median error of 0.69 mm (0.25 to 1.18 mm) compared with ground truth. The time to complete the entire task was 892 ± 407 s on average, with VR measurements being faster than AR (804 ± 483 versus 957 ± 257 s, P = 0.045). On clinical validation of five preoperative patients, there was no statistically significant difference between paired CT and VR measurements (0.58 [95% CI, −1.58 to 2.74], P = 0.586). Questionnaires showcased unanimous agreement on the user-friendly nature, effectiveness, and clinical value.
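For readers unfamiliar with the paired comparison reported here, the snippet below works through the same kind of summary on dummy numbers: the mean CT-vs-VR difference, its 95% confidence interval, and a p-value. A paired t-test is shown as one common choice; the abstract does not state which paired test the authors used.

import numpy as np
from scipy import stats

ct = np.array([24.1, 30.5, 27.8, 22.3, 29.0])   # hypothetical CT measurements (mm)
vr = np.array([23.6, 29.4, 27.9, 21.5, 28.7])   # matching VR measurements (mm)

diff = ct - vr
mean_diff = diff.mean()
ci_half_width = stats.t.ppf(0.975, len(diff) - 1) * stats.sem(diff)
t_stat, p_value = stats.ttest_rel(ct, vr)

print(f"mean difference {mean_diff:.2f} mm "
      f"[95% CI {mean_diff - ci_half_width:.2f} to {mean_diff + ci_half_width:.2f}], "
      f"P = {p_value:.3f}")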
Conclusions: The mixed reality application, validated through technical, preclinical, and clinical assessments, demonstrates precision and user-friendliness. Further research of our application is needed to validate the generalizability and impact on patient outcomes.
{"title":"Augmented and virtual reality imaging for collaborative planning of structural cardiovascular interventions: a proof-of-concept and validation study.","authors":"Jacquemyn Xander, Bamps Kobe, Moermans Ruben, Dubois Christophe, Rega Filip, Verbrugghe Peter, Weyn Barbara, Dymarkowski Steven, Budts Werner, Van De Bruaene Alexander","doi":"10.1117/1.JMI.11.6.062606","DOIUrl":"10.1117/1.JMI.11.6.062606","url":null,"abstract":"<p><strong>Purpose: </strong>Virtual reality (VR) and augmented reality (AR) have led to significant advancements in cardiac preoperative planning, shaping the world in profound ways. A noticeable gap exists in the availability of a comprehensive multi-user, multi-device mixed reality application that can be used in a multidisciplinary team meeting.</p><p><strong>Approach: </strong>A multi-user, multi-device mixed reality application was developed, supporting AR and VR implementations. Technical validation involved a standardized testing protocol and comparison of AR and VR measurements regarding absolute error and time. Preclinical validation engaged experts in interventional cardiology, evaluating the clinical applicability prior to clinical validation. Clinical validation included patient-specific measurements for five patients in VR compared with standard computed tomography (CT) for preoperative planning. Questionnaires were used at all stages for subjective evaluation.</p><p><strong>Results: </strong>Technical validation, including 106 size measurements, demonstrated an absolute median error of 0.69 mm (0.25 to 1.18 mm) compared with ground truth. The time to complete the entire task was <math><mrow><mn>892</mn> <mo>±</mo> <mn>407</mn> <mtext> </mtext> <mi>s</mi></mrow> </math> on average, with VR measurements being faster than AR ( <math><mrow><mn>804</mn> <mo>±</mo> <mn>483</mn></mrow> </math> versus <math><mrow><mn>957</mn> <mo>±</mo> <mn>257</mn> <mtext> </mtext> <mi>s</mi></mrow> </math> , <math><mrow><mi>P</mi> <mo>=</mo> <mn>0.045</mn></mrow> </math> ). On clinical validation of five preoperative patients, there was no statistically significant difference between paired CT and VR measurements (0.58 [95% CI, <math><mrow><mo>-</mo> <mn>1.58</mn></mrow> </math> to 2.74], <math><mrow><mi>P</mi> <mo>=</mo> <mn>0.586</mn></mrow> </math> ). Questionnaires showcased unanimous agreement on the user-friendly nature, effectiveness, and clinical value.</p><p><strong>Conclusions: </strong>The mixed reality application, validated through technical, preclinical, and clinical assessments, demonstrates precision and user-friendliness. Further research of our application is needed to validate the generalizability and impact on patient outcomes.</p>","PeriodicalId":47707,"journal":{"name":"Journal of Medical Imaging","volume":"11 6","pages":"062606"},"PeriodicalIF":1.9,"publicationDate":"2024-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11460359/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142394282","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-11-01. Epub Date: 2024-11-08. DOI: 10.1117/1.JMI.11.6.064002
Sepideh Amiri, Reza Karimzadeh, Tomaž Vrtovec, Erik Gudmann Steuble Brandt, Henrik S Thomsen, Michael Brun Andersen, Christoph Felix Müller, Anders Bertil Rodell, Bulat Ibragimov
Purpose: Pancreatic ductal adenocarcinoma is forecast to become the second most significant cause of cancer mortality as the number of patients with cancer in the main duct of the pancreas grows. Measurement of the pancreatic duct diameter from medical images has been identified as relevant for its early diagnosis.
Approach: We propose an automated pancreatic duct centerline tracing method from computed tomography (CT) images that is based on deep reinforcement learning, which employs an artificial agent to interact with the environment and calculates rewards by combining the distances from the target and the centerline. A deep neural network is implemented to forecast step-wise values for each potential action. With the help of this mechanism, the agent can probe along the pancreatic duct centerline using the best possible navigational path. To enhance the tracing accuracy, we employ landmark-based registration, which enables the generation of a probability map of the pancreatic duct. Subsequently, we utilize a gradient-based method on the registered data to extract a probability map specifically indicating the centerline of the pancreatic duct.
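To illustrate the reward structure described above, which combines distance to the target with deviation from the centerline, here is a minimal sketch with hypothetical weights; it is not the authors' exact reward formulation.

import numpy as np

def step_reward(pos_before, pos_after, target, centerline_points,
                w_target=1.0, w_center=0.5):
    """Reward moving toward the target while staying close to the centerline."""
    d_target_before = np.linalg.norm(target - pos_before)
    d_target_after = np.linalg.norm(target - pos_after)
    progress = d_target_before - d_target_after          # >0 when moving closer
    # distance from the new position to the nearest centerline point
    d_center = np.min(np.linalg.norm(centerline_points - pos_after, axis=1))
    return w_target * progress - w_center * d_center

centerline = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [2.0, 0.0, 0.0]])
r = step_reward(np.array([0.0, 1.0, 0.0]), np.array([0.5, 0.5, 0.0]),
                target=np.array([2.0, 0.0, 0.0]), centerline_points=centerline)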
Results: Three datasets with a total of 115 CT images were used to evaluate the proposed method. Using image hold-out from the first two datasets, the method performance was 2.0, 4.0, and 2.1 mm measured in terms of the mean detection error, Hausdorff distance (HD), and root mean squared error (RMSE), respectively. Using the first two datasets for training and the third one for testing, the method accuracy was 2.2, 4.9, and 2.6 mm measured in terms of the mean detection error, HD, and RMSE, respectively.
Conclusions: We present an algorithm for automated pancreatic duct centerline tracing using deep reinforcement learning. We observe that validation on an external dataset confirms the potential for practical utilization of the presented method.
{"title":"Centerline-guided reinforcement learning model for pancreatic duct identifications.","authors":"Sepideh Amiri, Reza Karimzadeh, Tomaž Vrtovec, Erik Gudmann Steuble Brandt, Henrik S Thomsen, Michael Brun Andersen, Christoph Felix Müller, Anders Bertil Rodell, Bulat Ibragimov","doi":"10.1117/1.JMI.11.6.064002","DOIUrl":"https://doi.org/10.1117/1.JMI.11.6.064002","url":null,"abstract":"<p><strong>Purpose: </strong>Pancreatic ductal adenocarcinoma is forecast to become the second most significant cause of cancer mortality as the number of patients with cancer in the main duct of the pancreas grows, and measurement of the pancreatic duct diameter from medical images has been identified as relevant for its early diagnosis.</p><p><strong>Approach: </strong>We propose an automated pancreatic duct centerline tracing method from computed tomography (CT) images that is based on deep reinforcement learning, which employs an artificial agent to interact with the environment and calculates rewards by combining the distances from the target and the centerline. A deep neural network is implemented to forecast step-wise values for each potential action. With the help of this mechanism, the agent can probe along the pancreatic duct centerline using the best possible navigational path. To enhance the tracing accuracy, we employ landmark-based registration, which enables the generation of a probability map of the pancreatic duct. Subsequently, we utilize a gradient-based method on the registered data to extract a probability map specifically indicating the centerline of the pancreatic duct.</p><p><strong>Results: </strong>Three datasets with a total of 115 CT images were used to evaluate the proposed method. Using image hold-out from the first two datasets, the method performance was 2.0, 4.0, and 2.1 mm measured in terms of the mean detection error, Hausdorff distance (HD), and root mean squared error (RMSE), respectively. Using the first two datasets for training and the third one for testing, the method accuracy was 2.2, 4.9, and 2.6 mm measured in terms of the mean detection error, HD, and RMSE, respectively.</p><p><strong>Conclusions: </strong>We present an algorithm for automated pancreatic duct centerline tracing using deep reinforcement learning. We observe that validation on an external dataset confirms the potential for practical utilization of the presented method.</p>","PeriodicalId":47707,"journal":{"name":"Journal of Medical Imaging","volume":"11 6","pages":"064002"},"PeriodicalIF":1.9,"publicationDate":"2024-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11543826/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142630343","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-11-01. Epub Date: 2024-09-14. DOI: 10.1117/1.JMI.11.6.062605
Khushi Bhansali, Miguel A Lago, Ryan Beams, Chumin Zhao
Purpose: Visualization of medical images on a virtual reality (VR) head-mounted display (HMD) requires binocular fusion of a stereoscopic pair of graphical views. However, current image quality assessment on VR HMDs for medical applications has been primarily limited to time-consuming monocular optical bench measurement on a single eyepiece.
Approach: As an alternative to optical bench measurement to quantify the image quality on VR HMDs, we developed a WebXR test platform to perform contrast perceptual experiments that can be used for binocular image quality assessment. We obtained monocular and binocular contrast sensitivity responses (CSRs) from participants on a Meta Quest 2 VR HMD using varied interpupillary distance (IPD) configurations.
Results: The perceptual results show that contrast perception on VR HMDs is primarily affected by optical aberration of the VR HMD. As a result, monocular CSR degrades at spatial frequencies greater than 4 cycles per degree when gazing at the periphery of the display field of view, especially for mismatched IPD settings, consistent with optical bench measurements. In contrast, binocular contrast perception is dominated by the monocular view with the superior image quality as measured by contrast.
Conclusions: We developed a test platform to investigate monocular and binocular contrast perception by performing perceptual experiments. The test method can be used to evaluate monocular and/or binocular image quality on VR HMDs for potential medical applications without extensive optical bench measurements.
{"title":"Evaluation of monocular and binocular contrast perception on virtual reality head-mounted displays.","authors":"Khushi Bhansali, Miguel A Lago, Ryan Beams, Chumin Zhao","doi":"10.1117/1.JMI.11.6.062605","DOIUrl":"https://doi.org/10.1117/1.JMI.11.6.062605","url":null,"abstract":"<p><strong>Purpose: </strong>Visualization of medical images on a virtual reality (VR) head-mounted display (HMD) requires binocular fusion of a stereoscopic pair of graphical views. However, current image quality assessment on VR HMDs for medical applications has been primarily limited to time-consuming monocular optical bench measurement on a single eyepiece.</p><p><strong>Approach: </strong>As an alternative to optical bench measurement to quantify the image quality on VR HMDs, we developed a WebXR test platform to perform contrast perceptual experiments that can be used for binocular image quality assessment. We obtained monocular and binocular contrast sensitivity responses (CSRs) from participants on a Meta Quest 2 VR HMD using varied interpupillary distance (IPD) configurations.</p><p><strong>Results: </strong>The perceptual result shows that contrast perception on VR HMDs is primarily affected by optical aberration of the VR HMD. As a result, monocular CSR degrades at a high spatial frequency greater than 4 cycles per degree when gazing at the periphery of the display field of view, especially for mismatched IPD settings consistent with optical bench measurements. On the contrary, binocular contrast perception is dominated by the monocular view with superior image quality measured by the contrast.</p><p><strong>Conclusions: </strong>We developed a test platform to investigate monocular and binocular contrast perception by performing perceptual experiments. The test method can be used to evaluate monocular and/or binocular image quality on VR HMDs for potential medical applications without extensive optical bench measurements.</p>","PeriodicalId":47707,"journal":{"name":"Journal of Medical Imaging","volume":"11 6","pages":"062605"},"PeriodicalIF":1.9,"publicationDate":"2024-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11401613/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142298701","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-09-01. Epub Date: 2024-10-10. DOI: 10.1117/1.JMI.11.5.057501
Jin Zhou, Xiang Li, Dawit Demeke, Timothy A Dinh, Yingbao Yang, Andrew R Janowczyk, Jarcy Zee, Lawrence Holzman, Laura Mariani, Krishnendu Chakrabarty, Laura Barisoni, Jeffrey B Hodgin, Kyle J Lafata
Purpose: Our purpose is to develop a computer vision approach to quantify intra-arterial thickness on digital pathology images of kidney biopsies as a computational biomarker of arteriosclerosis.
Approach: The severity of the arteriosclerosis was scored (0 to 3) in 753 arteries from 33 trichrome-stained whole slide images (WSIs) of kidney biopsies, and the outer contours of the media, intima, and lumen were manually delineated by a renal pathologist. We then developed a multi-class deep learning (DL) framework for segmenting the different intra-arterial compartments (training dataset: 648 arteries from 24 WSIs; testing dataset: 105 arteries from 9 WSIs). Subsequently, we employed radial sampling and made measurements of media and intima thickness as a function of spatially encoded polar coordinates throughout the artery. Pathomic features were extracted from the measurements to collectively describe the arterial wall characteristics. The technique was first validated through numerical analysis of simulated arteries, with systematic deformations applied to study their effect on arterial thickness measurements. We then compared these computationally derived measurements with the pathologists' grading of arteriosclerosis.
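The sketch below illustrates the radial-sampling idea on a synthetic ring-shaped mask (not the authors' code): rays cast from the lumen centroid at evenly spaced polar angles count the media-mask pixels they cross, giving a thickness profile as a function of angle. The angular resolution and maximum radius are arbitrary assumptions.

import numpy as np

def radial_thickness(media_mask, centroid, n_angles=360, max_radius=200):
    """media_mask: 2D boolean array; centroid: (row, col) of the lumen center."""
    thetas = np.linspace(0.0, 2.0 * np.pi, n_angles, endpoint=False)
    radii = np.arange(1, max_radius)
    thickness = np.zeros(n_angles)
    for i, theta in enumerate(thetas):
        rows = np.round(centroid[0] + radii * np.sin(theta)).astype(int)
        cols = np.round(centroid[1] + radii * np.cos(theta)).astype(int)
        inside = ((rows >= 0) & (rows < media_mask.shape[0]) &
                  (cols >= 0) & (cols < media_mask.shape[1]))
        # count media pixels intersected by this ray (thickness in pixel units)
        thickness[i] = media_mask[rows[inside], cols[inside]].sum()
    return thetas, thickness

mask = np.zeros((256, 256), dtype=bool)
yy, xx = np.ogrid[:256, :256]
r = np.hypot(yy - 128, xx - 128)
mask[(r >= 40) & (r <= 55)] = True          # synthetic ring-shaped "media"
angles, t = radial_thickness(mask, centroid=(128, 128))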
Results: Numerical validation shows that our measurement technique adeptly captured the decreasing smoothness in the intima and media thickness as the deformation increases in the simulated arteries. Intra-arterial DL segmentations of media, intima, and lumen achieved Dice scores of 0.84, 0.78, and 0.86, respectively. Several significant associations were identified between arteriosclerosis grade and pathomic features using our technique (e.g., intima-media ratio average [τ = 0.52, p < 0.0001]) through Kendall's tau analysis.
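As a worked example of the association measure quoted above (with made-up numbers, not the study data), Kendall's tau between an ordinal grade and a continuous pathomic feature can be computed as follows.

from scipy import stats

grades = [0, 1, 1, 2, 3, 2, 0, 3]                   # pathologist grades (0 to 3)
feature = [0.8, 1.1, 1.0, 1.4, 1.9, 1.5, 0.7, 2.1]  # e.g., intima-media ratio average
tau, p = stats.kendalltau(grades, feature)
print(f"tau = {tau:.2f}, p = {p:.4f}")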
Conclusions: We developed a computer vision approach to computationally characterize intra-arterial morphology on digital pathology images and demonstrate its feasibility as a potential computational biomarker of arteriosclerosis.
{"title":"Characterization of arteriosclerosis based on computer-aided measurements of intra-arterial thickness.","authors":"Jin Zhou, Xiang Li, Dawit Demeke, Timothy A Dinh, Yingbao Yang, Andrew R Janowczyk, Jarcy Zee, Lawrence Holzman, Laura Mariani, Krishnendu Chakrabarty, Laura Barisoni, Jeffrey B Hodgin, Kyle J Lafata","doi":"10.1117/1.JMI.11.5.057501","DOIUrl":"https://doi.org/10.1117/1.JMI.11.5.057501","url":null,"abstract":"<p><strong>Purpose: </strong>Our purpose is to develop a computer vision approach to quantify intra-arterial thickness on digital pathology images of kidney biopsies as a computational biomarker of arteriosclerosis.</p><p><strong>Approach: </strong>The severity of the arteriosclerosis was scored (0 to 3) in 753 arteries from 33 trichrome-stained whole slide images (WSIs) of kidney biopsies, and the outer contours of the media, intima, and lumen were manually delineated by a renal pathologist. We then developed a multi-class deep learning (DL) framework for segmenting the different intra-arterial compartments (training dataset: 648 arteries from 24 WSIs; testing dataset: 105 arteries from 9 WSIs). Subsequently, we employed radial sampling and made measurements of media and intima thickness as a function of spatially encoded polar coordinates throughout the artery. Pathomic features were extracted from the measurements to collectively describe the arterial wall characteristics. The technique was first validated through numerical analysis of simulated arteries, with systematic deformations applied to study their effect on arterial thickness measurements. We then compared these computationally derived measurements with the pathologists' grading of arteriosclerosis.</p><p><strong>Results: </strong>Numerical validation shows that our measurement technique adeptly captured the decreasing smoothness in the intima and media thickness as the deformation increases in the simulated arteries. Intra-arterial DL segmentations of media, intima, and lumen achieved Dice scores of 0.84, 0.78, and 0.86, respectively. Several significant associations were identified between arteriosclerosis grade and pathomic features using our technique (e.g., intima-media ratio average [ <math><mrow><mi>τ</mi> <mo>=</mo> <mn>0.52</mn></mrow> </math> , <math><mrow><mi>p</mi> <mo><</mo> <mn>0.0001</mn></mrow> </math> ]) through Kendall's tau analysis.</p><p><strong>Conclusions: </strong>We developed a computer vision approach to computationally characterize intra-arterial morphology on digital pathology images and demonstrate its feasibility as a potential computational biomarker of arteriosclerosis.</p>","PeriodicalId":47707,"journal":{"name":"Journal of Medical Imaging","volume":"11 5","pages":"057501"},"PeriodicalIF":1.9,"publicationDate":"2024-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11466048/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142477764","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-09-01. Epub Date: 2024-09-05. DOI: 10.1117/1.JMI.11.5.055501
Kaiyan Li, Hua Li, Mark A Anastasio
Purpose: Recently, learning-based denoising methods that incorporate task-relevant information into the training procedure have been developed to enhance the utility of the denoised images. However, this line of research is relatively new and underdeveloped, and some fundamental issues remain unexplored. Our purpose is to yield insights into general issues related to these task-informed methods. This includes understanding the impact of denoising on objective measures of image quality (IQ) when the specified task at inference time is different from that employed for model training, a phenomenon we refer to as "task-shift."
Approach: A virtual imaging test bed comprising a stylized computational model of a chest X-ray computed tomography imaging system was employed to enable a controlled and tractable study design. A canonical, fully supervised, convolutional neural network-based denoising method was purposely adopted to understand the underlying issues that may be relevant to a variety of applications and more advanced denoising or image reconstruction methods. Signal detection and signal detection-localization tasks under signal-known-statistically with background-known-statistically conditions were considered, and several distinct types of numerical observers were employed to compute estimates of the task performance. Studies were designed to reveal how a task-informed transfer-learning approach can influence the tradeoff between conventional and task-based measures of image quality within the context of the considered tasks. In addition, the impact of task-shift on these image quality measures was assessed.
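The sketch below gives one minimal form a task-informed denoising objective can take: a pixel-wise MSE term plus a term tied to a frozen numerical-observer network. The architectures, weighting, and observer are placeholder assumptions and do not reproduce the training procedure studied in the paper.

import torch
import torch.nn as nn

denoiser = nn.Sequential(nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
                         nn.Conv2d(16, 1, 3, padding=1))
observer = nn.Sequential(nn.Flatten(), nn.Linear(64 * 64, 1))   # frozen stand-in observer
for p in observer.parameters():
    p.requires_grad_(False)

mse = nn.MSELoss()
lam = 0.1                                   # trade-off between physical IQ and task terms

noisy = torch.rand(4, 1, 64, 64)            # dummy noisy / clean image pairs
clean = torch.rand(4, 1, 64, 64)
denoised = denoiser(noisy)
task_loss = mse(observer(denoised), observer(clean))   # keep observer outputs close
loss = mse(denoised, clean) + lam * task_loss
loss.backward()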
Results: The results indicated that certain tradeoffs can be achieved such that the resulting AUC value was significantly improved and the degradation of physical IQ measures was statistically insignificant. It was also observed that introducing task-shift degrades the task performance as expected. The degradation was significant when a relatively simple task was considered for network training and observer performance on a more complex one was assessed at inference time.
Conclusions: The presented results indicate that the task-informed training method can improve the observer performance while providing control over the tradeoff between traditional and task-based measures of image quality. The behavior of a task-informed model fine-tuning procedure was demonstrated, and the impact of task-shift on task-based image quality measures was investigated.
目的最近,人们开发了基于学习的去噪方法,将任务相关信息纳入训练程序,以提高去噪图像的实用性。然而,这一研究方向相对较新,发展还不充分,一些基本问题仍未得到探索。我们的目的是深入了解与这些任务信息方法相关的一般问题。这包括了解当推理时的指定任务不同于模型训练时的指定任务时,去噪对客观图像质量(IQ)测量的影响,我们将这种现象称为 "任务偏移":虚拟成像试验台由胸部 X 射线计算机断层扫描成像系统的风格化计算模型组成,以实现可控、可操作的研究设计。特意采用了一种基于卷积神经网络的典型、完全监督去噪方法,以了解可能与各种应用和更先进的去噪或图像重建方法相关的基本问题。研究考虑了信号已知统计和背景已知统计条件下的信号检测和信号检测定位任务,并采用了几种不同类型的数字观测器来计算任务性能的估计值。研究旨在揭示在所考虑的任务背景下,基于任务的迁移学习方法如何影响图像质量的传统测量方法和基于任务的测量方法之间的权衡。此外,还评估了任务转移对这些图像质量衡量标准的影响:结果表明,可以实现某些权衡,从而显著提高 AUC 值,而物理智商指标的下降在统计上并不明显。此外,还观察到引入任务转移会降低任务性能。当考虑用相对简单的任务进行网络训练,并在推理时评估观察者在更复杂的任务上的表现时,任务性能的下降非常明显:以上结果表明,基于任务的训练方法可以提高观察者的表现,同时还能控制传统图像质量测量方法和基于任务的图像质量测量方法之间的权衡。演示了任务信息模型微调程序的行为,并研究了任务转移对基于任务的图像质量测量的影响。
{"title":"Investigating the use of signal detection information in supervised learning-based image denoising with consideration of task-shift.","authors":"Kaiyan Li, Hua Li, Mark A Anastasio","doi":"10.1117/1.JMI.11.5.055501","DOIUrl":"10.1117/1.JMI.11.5.055501","url":null,"abstract":"<p><strong>Purpose: </strong>Recently, learning-based denoising methods that incorporate task-relevant information into the training procedure have been developed to enhance the utility of the denoised images. However, this line of research is relatively new and underdeveloped, and some fundamental issues remain unexplored. Our purpose is to yield insights into general issues related to these task-informed methods. This includes understanding the impact of denoising on objective measures of image quality (IQ) when the specified task at inference time is different from that employed for model training, a phenomenon we refer to as \"task-shift.\"</p><p><strong>Approach: </strong>A virtual imaging test bed comprising a stylized computational model of a chest X-ray computed tomography imaging system was employed to enable a controlled and tractable study design. A canonical, fully supervised, convolutional neural network-based denoising method was purposely adopted to understand the underlying issues that may be relevant to a variety of applications and more advanced denoising or image reconstruction methods. Signal detection and signal detection-localization tasks under signal-known-statistically with background-known-statistically conditions were considered, and several distinct types of numerical observers were employed to compute estimates of the task performance. Studies were designed to reveal how a task-informed transfer-learning approach can influence the tradeoff between conventional and task-based measures of image quality within the context of the considered tasks. In addition, the impact of task-shift on these image quality measures was assessed.</p><p><strong>Results: </strong>The results indicated that certain tradeoffs can be achieved such that the resulting AUC value was significantly improved and the degradation of physical IQ measures was statistically insignificant. It was also observed that introducing task-shift degrades the task performance as expected. The degradation was significant when a relatively simple task was considered for network training and observer performance on a more complex one was assessed at inference time.</p><p><strong>Conclusions: </strong>The presented results indicate that the task-informed training method can improve the observer performance while providing control over the tradeoff between traditional and task-based measures of image quality. The behavior of a task-informed model fine-tuning procedure was demonstrated, and the impact of task-shift on task-based image quality measures was investigated.</p>","PeriodicalId":47707,"journal":{"name":"Journal of Medical Imaging","volume":"11 5","pages":"055501"},"PeriodicalIF":1.9,"publicationDate":"2024-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11376226/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142156370","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}