Journal of Medical Imaging: Latest Publications

LiteMIL: a computationally efficient cross-attention multiple instance learning for cancer subtyping on whole-slide images.
IF 1.7 Q3 RADIOLOGY, NUCLEAR MEDICINE & MEDICAL IMAGING Pub Date: 2026-01-01 Epub Date: 2026-01-06 DOI: 10.1117/1.JMI.13.1.017501
Haitham Kussaibi

Purpose: Accurate cancer subtyping is essential for precision medicine but challenged by the computational demands of gigapixel whole-slide images (WSIs). Although transformer-based multiple instance learning (MIL) methods achieve strong performance, their quadratic complexity limits clinical deployment. We introduce LiteMIL, a computationally efficient cross-attention MIL optimized for WSI classification.

Approach: LiteMIL employs a single learnable query with multi-head cross-attention for bag-level aggregation from extracted features. We evaluated LiteMIL against five baselines (mean pooling, max pooling, ABMIL, MAD-MIL, and TransMIL) on four TCGA datasets (breast: n = 875, kidney: n = 906, lung: n = 958, and TUPAC16: n = 821) using nested cross-validation with patient-level splitting. Systematic ablation studies evaluated multi-query variants, attention heads, dropout rates, and architectural components.
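
To make the aggregation step concrete, here is a minimal PyTorch sketch of single-query cross-attention MIL pooling in the spirit of the description above; the layer sizes, class name, and classifier head are illustrative assumptions, not the authors' released code.

```python
# Minimal sketch of single-query cross-attention MIL pooling (assumed
# dimensions and names; not the authors' implementation).
import torch
import torch.nn as nn

class CrossAttentionMILPool(nn.Module):
    """Aggregate a bag of N instance features into one bag embedding
    using a single learnable query and multi-head cross-attention."""

    def __init__(self, dim=512, heads=8, num_classes=2):
        super().__init__()
        self.query = nn.Parameter(torch.randn(1, 1, dim))  # Q = 1
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.head = nn.Linear(dim, num_classes)

    def forward(self, feats):  # feats: (B, N, dim) extracted patch features
        q = self.query.expand(feats.size(0), -1, -1)  # (B, 1, dim)
        # One query attending over N instances costs O(N), avoiding the
        # O(N^2) instance self-attention of TransMIL-style models.
        bag, _ = self.attn(q, feats, feats)
        return self.head(bag.squeeze(1))  # (B, num_classes) slide logits

# Usage: classify one bag of 1000 extracted patch features.
model = CrossAttentionMILPool()
logits = model(torch.randn(1, 1000, 512))
```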

Results: LiteMIL achieved competitive accuracy (average 83.5%), matching TransMIL, while offering substantial efficiency gains: 4.8× fewer parameters (560K versus 2.67M), 2.9× faster inference (1.6 s versus 4.6 s per fold), and 6.7× lower graphics processing unit (GPU) memory usage (1.15 GB versus 7.77 GB). LiteMIL excelled on lung (86.3% versus 85.0%) and TUPAC16 (72% versus 71.4%) and matched kidney performance (89.9% versus 89.7%). Ablation studies revealed task-dependent multi-query performance benefits: Q = 4 versus Q = 1 improved morphologically heterogeneous tasks (breast/lung +1.3% each, p < 0.05) but degraded on grading tasks (TUPAC16: −1.6%), validating single-query optimality for focused attention scenarios.

Conclusions: LiteMIL provides a resource-efficient solution for WSI classification. The cross-attention architecture matches complex transformer performance while enabling deployment on consumer GPUs. Task-dependent design insights (a single query for sparse discriminating features, multiple queries for heterogeneous patterns) guide practical implementation. The architecture's efficiency, combined with compact features, makes LiteMIL suitable for clinical integration in settings with limited computational infrastructure.

Citations: 0
Asynchronous federated learning for web-based OCT image analysis.
IF 1.7 Q3 RADIOLOGY, NUCLEAR MEDICINE & MEDICAL IMAGING Pub Date: 2026-01-01 Epub Date: 2026-01-05 DOI: 10.1117/1.JMI.13.1.014501
Hasan Md Tusfiqur Alam, Tim Maurer, Abdulrahman Mohamed Selim, Matthias Eiletz, Michael Barz, Daniel Sonntag

Purpose: Centralized machine learning often struggles with limited data access and expert involvement. We investigate decentralized approaches that preserve data privacy while enabling collaborative model training for medical imaging tasks.

Approach: We explore asynchronous federated learning (FL) using the FL with buffered asynchronous aggregation (FedBuff) algorithm for classifying optical coherence tomography (OCT) retina images. Unlike synchronous algorithms such as FedAvg, which require all clients to participate simultaneously, FedBuff supports independent client updates. We compare its performance to both centralized models and FedAvg. In addition, we develop a browser-based proof-of-concept system using modern web technologies to assess the feasibility and limitations of interactive, collaborative learning in real-world settings.
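
As a sketch of the buffered-asynchrony idea (not the FedBuff reference implementation), the server below accumulates client deltas as they arrive and applies them once the buffer fills; the buffer size, staleness weighting, and names are illustrative assumptions.

```python
# Illustrative FedBuff-style server: clients push updates whenever ready,
# and aggregation fires when the buffer is full. Weighting is our stand-in.
import numpy as np

class FedBuffServer:
    def __init__(self, model, buffer_size=10, lr=1.0):
        self.model = model          # flat parameter vector
        self.buffer = []            # weighted client deltas
        self.buffer_size = buffer_size
        self.lr = lr
        self.version = 0            # server model version counter

    def receive(self, delta, client_version):
        # No synchronization barrier: any client may report at any time.
        staleness = self.version - client_version
        weight = 1.0 / np.sqrt(1.0 + staleness)  # downweight stale updates
        self.buffer.append(weight * delta)
        if len(self.buffer) >= self.buffer_size:
            self._aggregate()

    def _aggregate(self):
        # Apply the mean of buffered updates, then clear the buffer.
        self.model += self.lr * np.mean(self.buffer, axis=0)
        self.buffer.clear()
        self.version += 1

server = FedBuffServer(model=np.zeros(4), buffer_size=2)
server.receive(np.ones(4), client_version=0)
server.receive(2 * np.ones(4), client_version=0)
print(server.model)  # [1.5 1.5 1.5 1.5]
```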

Results: FedBuff performs well in binary OCT classification tasks but shows reduced accuracy in more complex, multiclass scenarios. FedAvg achieves results comparable to centralized training, consistent with previous findings. Although FedBuff underperforms compared with FedAvg and centralized models, it still delivers acceptable accuracy in less complex settings. The browser-based prototype demonstrates the potential for accessible, user-driven FL systems but also highlights technical limitations in current web standards, especially regarding local computation and communication efficiency.

Conclusion: Asynchronous FL via FedBuff offers a promising, privacy-preserving approach for medical image classification, particularly when synchronous participation is impractical. However, its scalability to complex classification tasks remains limited. Web-based implementations have the potential to broaden access to collaborative AI tools, but limitations of the current technologies need to be further investigated.

Citations: 0
Characterizing the effects of noncontrast head CT reconstruction kernel and slice thickness parameters on the performance of an automated AI algorithm in the evaluation of ischemic stroke.
IF 1.7 Q3 RADIOLOGY, NUCLEAR MEDICINE & MEDICAL IMAGING Pub Date: 2026-01-01 Epub Date: 2026-01-09 DOI: 10.1117/1.JMI.13.1.014503
Spencer H Welland, Grace Hyun J Kim, Anil Yadav, Kambiz Nael, John M Hoffman, Matthew S Brown, Michael F McNitt-Gray, William Hsu

Purpose: There are multiple commercially available, Food and Drug Administration (FDA)-cleared, artificial intelligence (AI)-based tools automating stroke evaluation in noncontrast computed tomography (NCCT). This study assessed the impact of variations in reconstruction kernel and slice thickness on two outputs of such a system: hypodense volume and Alberta Stroke Program Early CT Score (ASPECTS).

Approach: The NCCT series image data of 67 patients imaged with a CT stroke protocol were reconstructed with four kernels (H10s-smooth, H40s-medium, H60s-sharp, and H70h-very sharp) and three slice thicknesses (1.5, 3.0, and 5.0 mm) to create 1 reference condition (H40s/5.0 mm) and 11 nonreference conditions. The 12 reconstructions per patient were processed with a commercially available FDA-cleared software package that yields total hypodense volume (mL) and ASPECTS. A mixed-effect model was used to test the difference in hypodense volume, and an ordered logistic model was used to test the difference in e-ASPECTS.
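
A minimal statsmodels sketch of the two analyses, assuming a hypothetical long-format table with columns patient, condition, hypodense_ml, and e_aspects; the study's exact model specifications may differ.

```python
# Sketch of the mixed-effect and ordered logistic analyses (assumed data
# layout and reference condition; not the study's code).
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.miscmodels.ordinal_model import OrderedModel

df = pd.read_csv("reconstruction_outputs.csv")  # hypothetical file

# Mixed-effects model: fixed effect of reconstruction condition on hypodense
# volume, with a random intercept per patient for the repeated measures.
mixed = smf.mixedlm("hypodense_ml ~ C(condition, Treatment('H40s_5.0mm'))",
                    df, groups=df["patient"]).fit()
print(mixed.summary())

# Ordered logistic model for the ordinal e-ASPECTS score (0 to 10).
exog = pd.get_dummies(df["condition"], drop_first=True).astype(float)
ordinal = OrderedModel(df["e_aspects"], exog, distr="logit").fit(method="bfgs")
print(ordinal.summary())
```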

Results: Hypodense volume differences from the reference condition ranged from −14.6 to 1.1 mL and were significant for all nonreference kernels (H10s p = 0.025, H60s p < 0.001, and H70h p < 0.001) and for thinner slices (1.5 mm p < 0.001 and 3.0 mm p = 0.002). e-ASPECTS was invariant to the nonreference kernels and slice thicknesses, with a mean difference ranging from −0.1 to 0.5. No significant differences were found for any kernel or slice thickness (all p > 0.05).

Conclusions: Automated hypodense volume measured with a commercially available, FDA-cleared software package is substantially impacted by reconstruction kernel and slice thickness. Conversely, automated ASPECTS is invariant to these reconstruction parameters.

Citations: 0
From preoperative computed tomography to postmastoidectomy mesh construction: mastoidectomy shape prediction for cochlear implant surgery.
IF 1.7 Q3 RADIOLOGY, NUCLEAR MEDICINE & MEDICAL IMAGING Pub Date: 2026-01-01 Epub Date: 2026-01-27 DOI: 10.1117/1.JMI.13.1.014004
Yike Zhang, Eduardo Davalos, Dingjie Su, Ange Lou, Jack Noble

Purpose: Cochlear implant (CI) surgery treats severe hearing loss by inserting an electrode array into the cochlea to stimulate the auditory nerve. An important step in this procedure is mastoidectomy, which removes part of the mastoid region of the temporal bone to provide surgical access. Accurate mastoidectomy shape prediction from preoperative imaging improves presurgical planning, reduces risks, and enhances surgical outcomes. Despite its importance, there are limited deep-learning-based studies regarding this topic due to the challenges of acquiring ground-truth labels. We address this gap by investigating self-supervised and weakly-supervised learning models to predict the mastoidectomy region without human annotations.

Approach: We propose a hybrid self-supervised and weakly-supervised learning framework to predict the mastoidectomy region directly from preoperative CT scans, in which the mastoid remains intact. Our self-supervised learning approach reconstructs the postmastoidectomy 3D surface from preoperative imaging, aiming to align with the corresponding intraoperative microscope views for future surgical navigation-related applications. Postoperative CT scans are used in the self-supervised learning model to assist training, despite the additional challenges they introduce, such as metal artifacts and low signal-to-noise ratios. To further improve accuracy and robustness, we introduce a Mamba-based weakly-supervised model that refines mastoidectomy shape prediction using a 3D T-distribution loss function inspired by the Student-t distribution. Weak supervision is achieved by leveraging segmentation results from the prior self-supervised framework, eliminating the manual data-labeling process.
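
The abstract does not give the exact loss, but a generic Student-t negative log-likelihood illustrates the idea: heavy tails penalize outlier voxels less than an L2 loss would. This PyTorch sketch (constants dropped, nu chosen arbitrarily) is our stand-in, not the paper's formulation.

```python
# Generic Student-t-inspired robust loss (our stand-in for the paper's
# 3D T-distribution loss).
import torch

def student_t_loss(pred, target, nu=2.0):
    """Negative log-likelihood of a Student-t residual model, constants
    dropped: log(1 + r^2 / nu) grows slower than r^2, so outlier voxels
    (e.g., metal artifacts) pull on the gradients less than under MSE."""
    r2 = (pred - target).pow(2)
    return (0.5 * (nu + 1.0)) * torch.log1p(r2 / nu).mean()

# Toy 3D volumes standing in for predicted and reference shapes.
pred = torch.randn(1, 1, 8, 8, 8, requires_grad=True)
target = torch.randn(1, 1, 8, 8, 8)
student_t_loss(pred, target).backward()
```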

Results: Our hybrid method achieves a mean Dice score of 0.72 when predicting the complex and boundary-less mastoidectomy shape, surpassing state-of-the-art approaches and demonstrating strong performance. The method provides groundwork for constructing 3D postmastoidectomy surfaces directly from the corresponding preoperative CT scans.

Conclusion: To our knowledge, this is the first work that integrates self-supervised and weakly-supervised learning for mastoidectomy shape prediction, offering a robust and efficient solution for CI surgical planning while leveraging 3D T-distribution loss in weakly-supervised medical imaging.

Citations: 0
Toward robust modeling of breast biomechanical compression: an extended study using graph neural networks.
IF 1.7 Q3 RADIOLOGY, NUCLEAR MEDICINE & MEDICAL IMAGING Pub Date: 2026-01-01 Epub Date: 2025-12-29 DOI: 10.1117/1.JMI.13.1.015001
Hadeel Awwad, Eloy García, Robert Martí

Purpose: Accurate simulation of breast tissue deformation is essential for reliable image registration between 3D imaging modalities and 2D mammograms, where compression significantly alters tissue geometry. Although finite element analysis (FEA) provides high-fidelity modeling, it is computationally intensive and not well suited for rapid simulations. To address this, the physics-based graph neural network (PhysGNN) has been introduced as a computationally efficient approximation model trained on FEA-generated deformations. We extend prior work by evaluating the performance of PhysGNN on new digital breast phantoms and assessing the impact of training on multiple phantoms.

Approach: PhysGNN was trained on both single-phantom (per-geometry) and multiphantom (multigeometry) datasets generated from incremental FEA simulations. The digital breast phantoms represent the uncompressed state, serving as input geometries for predicting compressed configurations. A leave-one-deformation-out evaluation strategy was used to assess predictive performance under compression.
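
A leave-one-deformation-out loop is straightforward; this sketch assumes the FEA simulation yields an ordered list of incremental compression states per phantom, with hypothetical train/evaluate helpers left as comments.

```python
# Sketch of leave-one-deformation-out evaluation (assumed data layout).
deformation_ids = list(range(10))  # e.g., 10 incremental FEA steps

for held_out in deformation_ids:
    train_ids = [d for d in deformation_ids if d != held_out]
    # model = train_physgnn(train_ids)    # hypothetical training helper
    # error = evaluate(model, held_out)   # predict the held-out state
    print(f"fold {held_out}: train on deformations {train_ids}")
```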

Results: Training on new digital phantoms confirmed the model's robust performance, though with some variability in prediction accuracy reflecting the diverse anatomical structures. Multiphantom training further enhanced this robustness and reduced prediction errors.

Conclusions: PhysGNN offers a computationally efficient alternative to FEA for simulating breast compression. The results showed that model performance remains robust when trained per-geometry, and further demonstrated that multigeometry training enhances predictive accuracy and robustness for the geometries included in the training set. This suggests a strong potential path toward developing reliable models for generating compressed breast volumes, which could facilitate image registration and algorithm development.

Citations: 0
Attention-driven framework to segment renal ablation zone in posttreatment CT images: a step toward ablation margin evaluation.
IF 1.7 Q3 RADIOLOGY, NUCLEAR MEDICINE & MEDICAL IMAGING Pub Date: 2026-01-01 Epub Date: 2026-01-05 DOI: 10.1117/1.JMI.13.1.014001
Maryam Rastegarpoor, Derek W Cool, Aaron Fenster

Purpose: Thermal ablation is a minimally invasive therapy used for the treatment of small renal cell carcinoma tumors. Treatment success is evaluated on postablation computed tomography (CT) to determine if the ablation zone covered the tumor with an adequate treatment margin (often 5 to 10 mm). Incorrect margin identification can lead to treatment misassessment, resulting in unnecessary additional ablation. Therefore, segmentation of the renal ablation zone (RAZ) is crucial for treatment evaluation. We aim to develop and assess an accurate deep learning workflow for delineating the RAZ from surrounding tissues in kidney CT images.

Approach: We present an advanced deep learning method using the attention-based U-Net architecture to segment the RAZ. The workflow leverages the strengths of U-Net, enhanced with attention mechanisms, to improve the network's focus on the most relevant parts of the images, resulting in an accurate segmentation.
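
For readers unfamiliar with attention gating, below is a minimal additive attention gate in the style of Attention U-Net, written in PyTorch; channel sizes are illustrative, and we assume the gating and skip features are already at the same spatial resolution.

```python
# Minimal additive attention gate (illustrative sizes; not the paper's code).
import torch
import torch.nn as nn

class AttentionGate(nn.Module):
    def __init__(self, g_ch, x_ch, inter_ch):
        super().__init__()
        self.wg = nn.Conv2d(g_ch, inter_ch, 1)   # gating signal (decoder)
        self.wx = nn.Conv2d(x_ch, inter_ch, 1)   # skip features (encoder)
        self.psi = nn.Sequential(nn.Conv2d(inter_ch, 1, 1), nn.Sigmoid())

    def forward(self, g, x):
        # Additive attention: align gating and skip features, squash to a
        # [0, 1] spatial mask, and reweight the skip connection with it.
        a = torch.relu(self.wg(g) + self.wx(x))
        return x * self.psi(a)

gate = AttentionGate(g_ch=256, x_ch=128, inter_ch=64)
g = torch.randn(1, 256, 32, 32)  # decoder feature map
x = torch.randn(1, 128, 32, 32)  # encoder skip feature map
out = gate(g, x)                 # (1, 128, 32, 32), refocused skip features
```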

Results: Our model was trained and evaluated on a dataset comprising 76 patients' annotated RAZs in CT images. Analysis demonstrated that the proposed workflow achieved an accuracy = 0.97 ± 0.02, precision = 0.74 ± 0.23, recall = 0.73 ± 0.25, DSC = 0.70 ± 0.22, Jaccard = 0.58 ± 0.22, specificity = 0.99 ± 0.01, Hausdorff distance = 6.70 ± 4.44 mm, and mean absolute boundary distance = 2.67 ± 2.22 mm.

Conclusions: We used 3D CT images with RAZs and, for the first time, addressed deep-learning-based RAZ segmentation using parallel CT images. Our framework can effectively segment RAZs, allowing clinicians to automatically determine the ablation margin, making our tool ready for clinical use. Prediction time is ∼1 s per patient, enabling clinicians to perform quick reviews, especially in time-constrained settings.

Citations: 0
Airway quantifications of bronchitis patients with photon-counting and energy-integrating computed tomography.
IF 1.7 Q3 RADIOLOGY, NUCLEAR MEDICINE & MEDICAL IMAGING Pub Date: 2026-01-01 Epub Date: 2026-02-02 DOI: 10.1117/1.JMI.13.1.013501
Fong Chi Ho, William Paul Segars, Ehsan Samei, Ehsan Abadi

Purpose: Accurate airway measurement is critical for bronchitis quantification with computed tomography (CT), yet optimal protocols and the added value of photon-counting CT (PCCT) over energy-integrating CT (EICT) for reducing bias remain unclear. We quantified biomarker accuracy across modalities and protocols and assessed strategies to reduce bias.

Approach: In a virtual imaging trial, 20 bronchitis anthropomorphic models were scanned using a validated simulator for two systems (EICT: SOMATOM Flash; PCCT: NAEOTOM Alpha) at 6.3 and 12.6 mGy. Reconstructions varied algorithm, kernel sharpness, slice thickness, and pixel size. Pi10 (square-root wall thickness at a 10-mm perimeter) and WA% (wall-area percentage) were compared against ground-truth airway dimensions obtained from the 0.1-mm-precision anatomical models prior to CT simulation. External validation used clinical PCCT (n = 22) and EICT (n = 80).
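
As a sketch of how these two biomarkers are conventionally computed from per-airway measurements (assuming wall area, lumen area, and internal perimeter in mm; the study's exact pipeline may differ): Pi10 regresses the square root of wall area against internal perimeter and evaluates the fit at a perimeter of 10 mm, while WA% is the wall fraction of the total airway cross-section.

```python
# Conventional Pi10 and WA% computations (illustrative airway values).
import numpy as np

def pi10(wall_area_mm2, perimeter_mm):
    # Linear fit of sqrt(wall area) vs. internal perimeter, evaluated at 10 mm.
    slope, intercept = np.polyfit(perimeter_mm, np.sqrt(wall_area_mm2), 1)
    return slope * 10.0 + intercept

def wall_area_percent(wall_area_mm2, lumen_area_mm2):
    # Wall area as a percentage of the total airway cross-section.
    return 100.0 * wall_area_mm2 / (wall_area_mm2 + lumen_area_mm2)

wa = np.array([8.0, 12.0, 20.0, 33.0])   # hypothetical airway wall areas
pi = np.array([6.0, 8.0, 11.0, 15.0])    # matching internal perimeters
lumen = np.array([5.0, 9.0, 18.0, 30.0]) # matching lumen areas
print(pi10(wa, pi), wall_area_percent(wa, lumen).mean())
```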

Results: Simulated airway dimensions agreed with pathological references (R = 0.89 to 0.93). PCCT had lower errors than EICT across segmented generations (p < 0.05). Under optimal parameters, PCCT improved Pi10 and WA% accuracy by 26.3% and 64.9%, respectively. Across the tested PCCT and EICT imaging protocols, improvements were associated with sharper kernels (25.8% Pi10, 33.0% WA%), thinner slices (23.9% Pi10, 49.8% WA%), smaller pixels (17.0% Pi10, 23.1% WA%), and higher dose (≤3.9%). Clinically, PCCT achieved higher maximum airway generation (8.8 ± 0.5 versus 6.0 ± 1.1) and lower variability, mirroring trends in virtual results.

Conclusions: PCCT improves the accuracy and consistency of airway biomarker quantification relative to EICT, particularly with optimized protocols. The validated virtual platform enables modality-bias assessment and protocol optimization for accurate, reproducible bronchitis measurements.

Citations: 0
LCSD-Net: a light-weight cross-attention-based semantic dual transformer for domain generalization in melanoma detection.
IF 1.7 Q3 RADIOLOGY, NUCLEAR MEDICINE & MEDICAL IMAGING Pub Date: 2026-01-01 Epub Date: 2026-01-06 DOI: 10.1117/1.JMI.13.1.014502
Rishi Agrawal, Neeraj Gupta, Anand Singh Jalal

Purpose: Research in deep learning has produced great advances in melanoma detection. However, recent literature has emphasized the tendency of certain models to rely on disease-irrelevant visual artifacts such as dark corners, dense hair, or ruler marks. Dependence on these markers leads to biased models that perform well during training but generalize poorly to heterogeneous clinical environments. To address these limitations and improve the reliability of skin lesion detection, we propose a lightweight cross-attention-based semantic dual (LCSD) transformer model.

Approach: The LCSD model extracts global-level semantic information, uses feature normalization to improve model accuracy, and employs semantic queries to improve domain generalization. Multihead attention is applied to the semantic queries to refine global features. The cross-attention between feature maps and semantic queries provides the model with a generalized encoding of the global context. The model reduces the computational complexity of attention from O(n²d) to O(nmd + m²d), which makes it suitable for real-time and mobile applications.
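
A sketch of the two-stage attention that yields the stated complexity, assuming PyTorch and m << n patch tokens; the module name and dimensions are illustrative, not the LCSD-Net code.

```python
# m learnable semantic queries cross-attend to the n patch tokens, O(nmd),
# then self-attend among themselves, O(m^2 d), avoiding O(n^2 d).
import torch
import torch.nn as nn

class SemanticQueryBlock(nn.Module):
    def __init__(self, dim=256, m=16, heads=8):
        super().__init__()
        self.queries = nn.Parameter(torch.randn(1, m, dim))
        self.cross = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.self_attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, tokens):                 # tokens: (B, n, dim)
        q = self.queries.expand(tokens.size(0), -1, -1)
        q, _ = self.cross(q, tokens, tokens)   # O(n m d) cross-attention
        q, _ = self.self_attn(q, q, q)         # O(m^2 d) query refinement
        return q                               # (B, m, dim) semantic summary

block = SemanticQueryBlock()
out = block(torch.randn(2, 4096, 256))         # n = 4096 patch tokens
```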

Results: Empirical evaluation was conducted on three challenging datasets: Derm7pt-Dermoscopic, Derm7pt-Clinical, and PAD-UFES-20. The proposed model achieved classification accuracies of 82.82%, 72.95%, and 86.21%, respectively. These results demonstrate superior performance compared with conventional transformer-based models, highlighting both improved robustness and reduced computational cost.

Conclusion: The LCSD model mitigates the influence of irrelevant visual characteristics, enhances domain generalization, and ensures better adaptability across diverse clinical scenarios. Its lightweight design further supports deployment in mobile applications, making it a reliable and efficient solution for real-world melanoma detection.

Citations: 0
Radiomic signatures from baseline CT predict chemotherapy response in unresectable colorectal liver metastases.
IF 1.7 Q3 RADIOLOGY, NUCLEAR MEDICINE & MEDICAL IMAGING Pub Date: 2026-01-01 Epub Date: 2026-01-13 DOI: 10.1117/1.JMI.13.1.014505
Mane Piliposyan, Jacob J Peoples, Mohammad Hamghalam, Ramtin Mojtahedi, Kaitlyn Kobayashi, E Claire Bunker, Natalie Gangai, Hyunseon C Kang, Yun Shin Chun, Christian Muise, Richard K G Do, Amber L Simpson

Purpose: Colorectal cancer is the third most common cancer globally, with a high mortality rate due to metastatic progression, particularly in the liver. Surgical resection remains the main curative treatment, but only a small subset of patients is eligible for surgery at diagnosis. For patients with initially unresectable colorectal liver metastases (CRLM), neoadjuvant chemotherapy can downstage tumors, potentially making surgery feasible. We investigate whether radiomic signatures-quantitative imaging biomarkers derived from baseline computed tomography (CT) scans-can noninvasively predict chemotherapy response in patients with unresectable CRLM, offering a pathway toward personalized treatment planning.

Approach: We used radiomics combined with a stacking classifier (SC) to predict treatment outcome. Baseline CT imaging data from 355 patients with initially unresectable CRLM were analyzed using two regions of interest (ROIs) separately: all tumors in the liver, and the largest tumor by volume. From each ROI, 107 radiomic features were extracted. The dataset was split into training and testing sets, and multiple machine learning models were trained and integrated via stacking to enhance prediction. Logistic regression coefficients were used to derive radiomic signatures.
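
A minimal scikit-learn sketch of this kind of stacking setup; the base learners, file name, and column names are illustrative stand-ins for the paper's exact configuration.

```python
# Stacking classifier over radiomic features, plus a feature-level logistic
# regression whose coefficients act as a radiomic signature (assumed setup).
import pandas as pd
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

df = pd.read_csv("radiomics_features.csv")  # hypothetical: features + label
X, y = df.drop(columns="response"), df["response"]
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

stack = StackingClassifier(
    estimators=[("rf", RandomForestClassifier(random_state=0)),
                ("svc", make_pipeline(StandardScaler(), SVC(probability=True)))],
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5)
stack.fit(X_tr, y_tr)
print(stack.score(X_te, y_te))  # held-out accuracy

# Logistic regression on the features themselves exposes per-feature
# coefficients, i.e., a candidate radiomic signature.
signature = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
signature.fit(X_tr, y_tr)
print(signature[-1].coef_)
```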

Results: The SC achieved strong predictive performance, with an area under the receiver operating characteristic curve of up to 0.77 for response prediction. Logistic regression identified 12 and 7 predictive features for treatment response in all tumors and the largest tumor ROIs, respectively.

Conclusion: Our findings demonstrate that radiomic features from baseline CT scans can serve as robust, interpretable biomarkers for predicting chemotherapy response, offering insights to guide personalized treatment in unresectable CRLM.

Citations: 0
ER2Net: an evidential reasoning rule-enabled neural network for reliable triple-negative breast cancer tumor segmentation in magnetic resonance imaging.
IF 1.7 Q3 RADIOLOGY, NUCLEAR MEDICINE & MEDICAL IMAGING Pub Date: 2026-01-01 Epub Date: 2026-01-29 DOI: 10.1117/1.JMI.13.1.014005
Kazi Md Farhad Mahmud, Ahmad Qasem, Joshua M Staley, Rachel Yoder, Allison Aripoli, Shane R Stecklein, Priyanka Sharma, Zhiguo Zhou

Purpose: Triple-negative breast cancer (TNBC) is an aggressive subtype with limited treatment options and high recurrence rates. Magnetic resonance imaging (MRI) is widely used for tumor assessment, but manual segmentation is labor-intensive and variable. Existing deep learning methods often lack generalizability, calibrated confidence, and robust uncertainty quantification.

Approach: We propose ER2Net, an evidential reasoning-enabled neural network for reliable TNBC tumor segmentation on MRI. ER2Net trains multiple U-Net variants with dropout to generate diverse predictions and introduces pixel-wise reliability to quantify model agreement. We then introduce two ensemble fusion techniques: weighted reliability (WR) segmentation, which leverages pixel-wise reliability to enhance sensitivity, and Bayesian fusion (BF), which integrates predictions probabilistically for robust consensus. Confidence calibration is achieved using evidential reasoning, and we further propose pixel-wise reliable confidence entropy (PWRE) as an uncertainty measure.
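
The abstract does not spell out the fusion formulas, so the NumPy sketch below is our stand-in for the general idea: weight the ensemble mean by a pixel-wise agreement score, and scale a binary entropy map by that same score to obtain a PWRE-like uncertainty.

```python
# Illustrative reliability-weighted fusion and PWRE-style entropy
# (assumed formulas; not the ER2Net definitions).
import numpy as np

def fuse(probs):
    """probs: (K, H, W) foreground probabilities from K ensemble members."""
    mean = probs.mean(axis=0)
    # Pixel-wise reliability: high where members agree (low spread);
    # std of values in [0, 1] is at most 0.5, so this lands in [0, 1].
    reliability = np.clip(1.0 - probs.std(axis=0) / 0.5, 0.0, 1.0)
    weighted = reliability * mean  # WR-style fused foreground map
    # Binary entropy of the mean map, scaled by reliability (PWRE-like).
    eps = 1e-8
    entropy = -(mean * np.log(mean + eps) + (1 - mean) * np.log(1 - mean + eps))
    return weighted > 0.5, reliability * entropy

preds = np.random.rand(5, 64, 64)  # e.g., 5 dropout-sampled predictions
mask, uncertainty = fuse(preds)
```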

Results: ER2Net improved performance compared with individual models. WR achieved IoU = 0.886, sensitivity = 0.928, precision = 0.952, and Hausdorff distance = 5.429 mm, whereas BF achieved IoU = 0.885 and sensitivity = 0.929. Reliable fusion provided the best calibration (expected calibration error = 0.00003; maximum calibration error = 0.017). PWRE produced lower variance than conventional entropy, yielding more stable uncertainty estimates.
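Expected and maximum calibration error, the two metrics quoted above, are standard reliability-diagram quantities: predictions are binned by confidence, and each bin's accuracy is compared with its mean confidence. A minimal sketch using the textbook definitions (not code from the paper):

```python
# Standard ECE/MCE over pixel-wise predictions (textbook definitions,
# not taken from the ER2Net paper).
import numpy as np

def calibration_errors(confidence, correct, n_bins=10):
    """confidence: flat array of predicted confidences in [0, 1];
    correct: flat boolean array, True where the prediction matched the label.
    Returns (ECE, MCE)."""
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    ece, mce, n = 0.0, 0.0, confidence.size
    for lo, hi in zip(bins[:-1], bins[1:]):
        mask = (confidence > lo) & (confidence <= hi)
        if not mask.any():
            continue
        gap = abs(correct[mask].mean() - confidence[mask].mean())
        ece += (mask.sum() / n) * gap   # ECE: bin-weighted average gap
        mce = max(mce, gap)             # MCE: worst single-bin gap
    return ece, mce

# Toy usage on synthetic pixels.
rng = np.random.default_rng(1)
conf = rng.uniform(size=10_000)
corr = rng.uniform(size=10_000) < conf   # roughly calibrated by construction
print(calibration_errors(conf, corr))
```

Read against the reported numbers, an ECE of 0.00003 means the bin-weighted gap between confidence and accuracy is essentially zero, while the MCE of 0.017 bounds the single worst bin.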

Conclusion: ER2Net introduces WR segmentation and BF as enhanced fusion techniques and PWRE as an uncertainty metric. Together, these advances improve segmentation accuracy, sensitivity, confidence calibration, and uncertainty estimation, paving the way for reliable MRI-based tools to support personalized treatment planning and response assessment in TNBC.

{"title":"ER<sup>2</sup>Net: an evidential reasoning rule-enabled neural network for reliable triple-negative breast cancer tumor segmentation in magnetic resonance imaging.","authors":"Kazi Md Farhad Mahmud, Ahmad Qasem, Joshua M Staley, Rachel Yoder, Allison Aripoli, Shane R Stecklein, Priyanka Sharma, Zhiguo Zhou","doi":"10.1117/1.JMI.13.1.014005","DOIUrl":"https://doi.org/10.1117/1.JMI.13.1.014005","url":null,"abstract":"<p><strong>Purpose: </strong>Triple-negative breast cancer (TNBC) is an aggressive subtype with limited treatment options and high recurrence rates. Magnetic resonance imaging (MRI) is widely used for tumor assessment, but manual segmentation is labor-intensive and variable. Existing deep learning methods often lack generalizability, calibrated confidence, and robust uncertainty quantification.</p><p><strong>Approach: </strong>We propose ER<sup>2</sup>Net, an evidential reasoning-enabled neural network for reliable TNBC tumor segmentation on MRI. ER<sup>2</sup>Net trains multiple U-Net variants with dropouts to generate diverse predictions and introduces pixel-wise reliability to quantify model agreement. We then introduce two ensemble fusion techniques: weighted reliability (WR) segmentation, which leverages pixel-wise reliability to enhance sensitivity, and Bayesian fusion (BF), which integrates predictions probabilistically for robust consensus. Confidence calibration is achieved using evidential reasoning, and we further propose pixel-wise reliable confidence entropy (PWRE) as a uncertainty measure.</p><p><strong>Results: </strong>ER<sup>2</sup>Net improved performance compared with individual models. WR achieved IoU = 0.886, sensitivity = 0.928, precision = 0.952, and Hausdorff distance = 5.429 mm, whereas BF achieved IoU = 0.885 and sensitivity = 0.929. Reliable fusion provided the best calibration [expected calibration error = 0.00003; maximum calibration error = 0.017]. PWRE produced lower variance than conventional entropy, yielding more stable uncertainty estimates.</p><p><strong>Conclusion: </strong>ER<sup>2</sup>Net introduces WR segmentation and BF as enhanced fusion techniques and PWRE as a uncertainty metric. Together, these advances improve segmentation accuracy, sensitivity, confidence calibration, and uncertainty estimation, paving the way for reliable MRI-based tools to support personalized treatment planning and response assessment in TNBC.</p>","PeriodicalId":47707,"journal":{"name":"Journal of Medical Imaging","volume":"13 1","pages":"014005"},"PeriodicalIF":1.7,"publicationDate":"2026-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12853374/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146107856","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0