
Latest publications: Medical image computing and computer-assisted intervention : MICCAI ... International Conference on Medical Image Computing and Computer-Assisted Intervention

Longitudinal Multimodal Transformer Integrating Imaging and Latent Clinical Signatures From Routine EHRs for Pulmonary Nodule Classification.
Thomas Z Li, John M Still, Kaiwen Xu, Ho Hin Lee, Leon Y Cai, Aravind R Krishnan, Riqiang Gao, Mirza S Khan, Sanja Antic, Michael Kammer, Kim L Sandler, Fabien Maldonado, Bennett A Landman, Thomas A Lasko

The accuracy of predictive models for solitary pulmonary nodule (SPN) diagnosis can be greatly increased by incorporating repeat imaging and medical context, such as electronic health records (EHRs). However, clinically routine modalities such as imaging and diagnostic codes can be asynchronous and irregularly sampled over different time scales, which poses obstacles to longitudinal multimodal learning. In this work, we propose a transformer-based multimodal strategy to integrate repeat imaging with longitudinal clinical signatures from routinely collected EHRs for SPN classification. We perform unsupervised disentanglement of latent clinical signatures and leverage time-distance scaled self-attention to jointly learn from clinical signature expressions and chest computed tomography (CT) scans. Our classifier is pretrained on 2,668 scans from a public dataset and 1,149 subjects with longitudinal chest CTs, billing codes, medications, and laboratory tests from the EHRs of our home institution. Evaluation on 227 subjects with challenging SPNs revealed a significant AUC improvement over a longitudinal multimodal baseline (0.824 vs 0.752 AUC), as well as improvements over a single cross-section multimodal scenario (0.809 AUC) and a longitudinal imaging-only scenario (0.741 AUC). This work demonstrates the significant advantages of a novel approach for co-learning longitudinal imaging and non-imaging phenotypes with transformers. Code is available at https://github.com/MASILab/lmsignatures.
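
As a rough illustration of the time-distance scaled self-attention described above, the sketch below subtracts a penalty proportional to the time gap between tokens from the attention logits. The exact scaling function and the names time_distance_attention and tau are illustrative assumptions, not the authors' implementation (see the linked repository for that).

```python
import torch
import torch.nn.functional as F

def time_distance_attention(q, k, v, times, tau=365.0):
    """Scaled dot-product attention with a time-distance penalty (sketch).

    q, k, v : (batch, tokens, dim) token embeddings (images or EHR signatures)
    times   : (batch, tokens) acquisition time of each token, e.g. in days
    tau     : hypothetical decay scale controlling how fast attention fades
    """
    d = q.size(-1)
    logits = q @ k.transpose(-2, -1) / d ** 0.5              # standard attention logits
    gap = (times.unsqueeze(-1) - times.unsqueeze(-2)).abs()  # pairwise |t_i - t_j|
    logits = logits - gap / tau                              # downweight distant-in-time pairs
    return F.softmax(logits, dim=-1) @ v
```

With a large tau this reduces to standard attention; a smaller tau makes attention increasingly local in time.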

{"title":"Longitudinal Multimodal Transformer Integrating Imaging and Latent Clinical Signatures From Routine EHRs for Pulmonary Nodule Classification.","authors":"Thomas Z Li, John M Still, Kaiwen Xu, Ho Hin Lee, Leon Y Cai, Aravind R Krishnan, Riqiang Gao, Mirza S Khan, Sanja Antic, Michael Kammer, Kim L Sandler, Fabien Maldonado, Bennett A Landman, Thomas A Lasko","doi":"10.1007/978-3-031-43895-0_61","DOIUrl":"10.1007/978-3-031-43895-0_61","url":null,"abstract":"<p><p>The accuracy of predictive models for solitary pulmonary nodule (SPN) diagnosis can be greatly increased by incorporating repeat imaging and medical context, such as electronic health records (EHRs). However, clinically routine modalities such as imaging and diagnostic codes can be asynchronous and irregularly sampled over different time scales which are obstacles to longitudinal multimodal learning. In this work, we propose a transformer-based multimodal strategy to integrate repeat imaging with longitudinal clinical signatures from routinely collected EHRs for SPN classification. We perform unsupervised disentanglement of latent clinical signatures and leverage time-distance scaled self-attention to jointly learn from clinical signatures expressions and chest computed tomography (CT) scans. Our classifier is pretrained on 2,668 scans from a public dataset and 1,149 subjects with longitudinal chest CTs, billing codes, medications, and laboratory tests from EHRs of our home institution. Evaluation on 227 subjects with challenging SPNs revealed a significant AUC improvement over a longitudinal multimodal baseline (0.824 vs 0.752 AUC), as well as improvements over a single cross-section multimodal scenario (0.809 AUC) and a longitudinal imaging-only scenario (0.741 AUC). This work demonstrates significant advantages with a novel approach for co-learning longitudinal imaging and non-imaging phenotypes with transformers. Code available at https://github.com/MASILab/lmsignatures.</p>","PeriodicalId":94280,"journal":{"name":"Medical image computing and computer-assisted intervention : MICCAI ... International Conference on Medical Image Computing and Computer-Assisted Intervention","volume":"14221 ","pages":"649-659"},"PeriodicalIF":0.0,"publicationDate":"2023-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11110542/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141081448","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Brain Anatomy-Guided MRI Analysis for Assessing Clinical Progression of Cognitive Impairment with Structural MRI.
Lintao Zhang, Jinjian Wu, Lihong Wang, Li Wang, David C Steffens, Shijun Qiu, Guy G Potter, Mingxia Liu

Brain structural MRI has been widely used with learning-based methods to assess the future progression of cognitive impairment (CI). Previous studies generally suffer from a limited amount of labeled training data, even though a huge number of MRIs exist in large-scale public databases. Even without task-specific label information, the brain anatomical structures depicted in these MRIs can intuitively be used to boost learning performance. Unfortunately, existing research seldom takes advantage of such brain anatomy priors. To this end, this paper proposes a brain anatomy-guided representation (BAR) learning framework for assessing the clinical progression of cognitive impairment with T1-weighted MRIs. The BAR consists of a pretext model and a downstream model, with a shared brain anatomy-guided encoder for MRI feature extraction. The pretext model also contains a decoder for brain tissue segmentation, while the downstream model relies on a predictor for classification. We first train the pretext model through a brain tissue segmentation task on 9,544 auxiliary T1-weighted MRIs, yielding a generalizable encoder. The downstream model with the learned encoder is further fine-tuned on target MRIs for prediction tasks. We validate the proposed BAR on two CI-related studies with a total of 391 subjects with T1-weighted MRIs. Experimental results suggest that the BAR outperforms several state-of-the-art (SOTA) methods. The source code and pre-trained models are available at https://github.com/goodaycoder/BAR.
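
The pretext/downstream split with a shared encoder can be sketched as follows. Layer sizes and module names are placeholders, not the paper's architecture; only the structure (shared encoder, segmentation decoder, classification predictor) follows the abstract.

```python
import torch.nn as nn

class BAR(nn.Module):
    """Minimal sketch of a pretext/downstream pair sharing one encoder."""
    def __init__(self, n_tissues=4, n_classes=2):
        super().__init__()
        self.encoder = nn.Sequential(                 # shared anatomy-guided encoder
            nn.Conv3d(1, 16, 3, padding=1), nn.ReLU(),
            nn.Conv3d(16, 32, 3, padding=1), nn.ReLU())
        self.decoder = nn.Conv3d(32, n_tissues, 1)    # pretext head: tissue segmentation
        self.predictor = nn.Sequential(               # downstream head: CI progression
            nn.AdaptiveAvgPool3d(1), nn.Flatten(), nn.Linear(32, n_classes))

    def forward(self, x, pretext=True):
        feats = self.encoder(x)
        return self.decoder(feats) if pretext else self.predictor(feats)
```

Training would call the model with pretext=True on the auxiliary segmentation task, then fine-tune with pretext=False on the target prediction task.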

{"title":"Brain Anatomy-Guided MRI Analysis for Assessing Clinical Progression of Cognitive Impairment with Structural MRI.","authors":"Lintao Zhang, Jinjian Wu, Lihong Wang, Li Wang, David C Steffens, Shijun Qiu, Guy G Potter, Mingxia Liu","doi":"10.1007/978-3-031-43993-3_11","DOIUrl":"10.1007/978-3-031-43993-3_11","url":null,"abstract":"<p><p>Brain structural MRI has been widely used for assessing future progression of cognitive impairment (CI) based on learning-based methods. Previous studies generally suffer from the limited number of labeled training data, while there exists a huge amount of MRIs in large-scale public databases. Even without task-specific label information, brain anatomical structures provided by these MRIs can be used to boost learning performance intuitively. Unfortunately, existing research seldom takes advantage of such brain anatomy prior. To this end, this paper proposes a brain anatomy-guided representation (BAR) learning framework for assessing the clinical progression of cognitive impairment with T1-weighted MRIs. The BAR consists of a <i>pretext model</i> and a <i>downstream model</i>, with a shared brain anatomy-guided encoder for MRI feature extraction. The pretext model also contains a decoder for brain tissue segmentation, while the downstream model relies on a predictor for classification. We first train the pretext model through a brain tissue segmentation task on 9,544 auxiliary T1-weighted MRIs, yielding a generalizable encoder. The downstream model with the learned encoder is further fine-tuned on target MRIs for prediction tasks. We validate the proposed BAR on two CI-related studies with a total of 391 subjects with T1-weighted MRIs. Experimental results suggest that the BAR outperforms several state-of-the-art (SOTA) methods. The source code and pre-trained models are available at https://github.com/goodaycoder/BAR.</p>","PeriodicalId":94280,"journal":{"name":"Medical image computing and computer-assisted intervention : MICCAI ... International Conference on Medical Image Computing and Computer-Assisted Intervention","volume":"14227 ","pages":"109-119"},"PeriodicalIF":0.0,"publicationDate":"2023-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10883230/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139935020","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Flow-based Geometric Interpolation of Fiber Orientation Distribution Functions.
Xinyu Nie, Yonggang Shi

The fiber orientation distribution function (FOD) is an advanced model for high angular resolution diffusion MRI that represents complex fiber geometry. However, the complicated mathematical structure of the FOD function poses challenges for FOD image processing tasks such as interpolation, which plays a critical role in the propagation of fiber tracts in tractography. In FOD-based tractography, linear interpolation is commonly used for numerical efficiency, but it is prone to generating false artificial information, leading to anatomically incorrect fiber tracts. To overcome this difficulty, we propose a flow-based and geometrically consistent interpolation framework that considers peak-wise rotations of FODs within the neighborhood of each location. Our method decomposes an FOD function into multiple components and uses a smooth vector field to model the flows of each peak in its neighborhood. To generate the interpolated result along the flow of each vector field, we develop a closed-form and efficient method to rotate FOD peaks in neighboring voxels and realize geometrically consistent interpolation of FOD components. By combining the interpolation results from each peak, we obtain the final interpolation of FODs. Experimental results on Human Connectome Project (HCP) data demonstrate that our method produces anatomically more meaningful FOD interpolations and significantly enhances tractography performance.
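
A minimal sketch of the peak-wise idea: interpolate one FOD peak between neighboring voxels by rotating its direction along the sphere (spherical linear interpolation) while blending its amplitude. The paper's closed-form rotation acts on full FOD components; this toy version, with hypothetical helpers slerp and interpolate_peak, only conveys the geometric intuition.

```python
import numpy as np

def slerp(u, v, t):
    """Spherical interpolation between unit vectors u and v at fraction t."""
    u, v = u / np.linalg.norm(u), v / np.linalg.norm(v)
    omega = np.arccos(np.clip(u @ v, -1.0, 1.0))
    if omega < 1e-8:                       # nearly parallel: fall back to linear
        return (1 - t) * u + t * v
    return (np.sin((1 - t) * omega) * u + np.sin(t * omega) * v) / np.sin(omega)

def interpolate_peak(dir_a, amp_a, dir_b, amp_b, t=0.5):
    """Rotate the peak direction along the sphere and blend its amplitude."""
    return slerp(dir_a, dir_b, t), (1 - t) * amp_a + t * amp_b
```

Rotating the direction (rather than averaging vectors linearly) is what keeps the interpolated peak on the unit sphere, which is the geometric consistency the abstract emphasizes.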

{"title":"Flow-based Geometric Interpolation of Fiber Orientation Distribution Functions.","authors":"Xinyu Nie, Yonggang Shi","doi":"10.1007/978-3-031-43993-3_5","DOIUrl":"10.1007/978-3-031-43993-3_5","url":null,"abstract":"<p><p>The fiber orientation distribution function (FOD) is an advanced model for high angular resolution diffusion MRI representing complex fiber geometry. However, the complicated mathematical structures of the FOD function pose challenges for FOD image processing tasks such as interpolation, which plays a critical role in the propagation of fiber tracts in tractography. In FOD-based tractography, linear interpolation is commonly used for numerical efficiency, but it is prone to generate false artificial information, leading to anatomically incorrect fiber tracts. To overcome this difficulty, we propose a flowbased and geometrically consistent interpolation framework that considers peak-wise rotations of FODs within the neighborhood of each location. Our method decomposes a FOD function into multiple components and uses a smooth vector field to model the flows of each peak in its neighborhood. To generate the interpolated result along the flow of each vector field, we develop a closed-form and efficient method to rotate FOD peaks in neighboring voxels and realize geometrically consistent interpolation of FOD components. By combining the interpolation results from each peak, we obtain the final interpolation of FODs. Experimental results on Human Connectome Project (HCP) data demonstrate that our method produces anatomically more meaningful FOD interpolations and significantly enhances tractography performance.</p>","PeriodicalId":94280,"journal":{"name":"Medical image computing and computer-assisted intervention : MICCAI ... International Conference on Medical Image Computing and Computer-Assisted Intervention","volume":"14227 ","pages":"46-55"},"PeriodicalIF":0.0,"publicationDate":"2023-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10978007/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140320351","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
CortexMorph: fast cortical thickness estimation via diffeomorphic registration using VoxelMorph.
Richard McKinley, Christian Rummel

The thickness of the cortical band is linked to various neurological and psychiatric conditions and is often estimated in MRI studies through surface-based methods such as FreeSurfer. The DiReCT method, which calculates cortical thickness using a diffeomorphic deformation of the gray-white matter interface towards the pial surface, offers an alternative to surface-based methods. Recent studies using a synthetic cortical thickness phantom have demonstrated that the combination of DiReCT and deep-learning-based segmentation is more sensitive to subvoxel cortical thinning than FreeSurfer. While anatomical segmentation of a T1-weighted image now takes seconds, existing implementations of DiReCT rely on iterative image registration methods which can take up to an hour per volume. On the other hand, learning-based deformable image registration methods like VoxelMorph have been shown to be faster than classical methods while improving registration accuracy. This paper proposes CortexMorph, a new method that employs unsupervised deep learning to directly regress the deformation field needed for DiReCT. By combining CortexMorph with a deep-learning-based segmentation model, it is possible to estimate region-wise thickness in seconds from a T1-weighted image, while maintaining the ability to detect cortical atrophy. We validate this claim on the OASIS-3 dataset and the synthetic cortical thickness phantom of Rusak et al.
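
Under DiReCT, thickness derives from the diffeomorphic deformation of the gray-white interface toward the pial surface. The sketch below shows one simplified readout, assuming thickness is approximated by the displacement magnitude at interface voxels; the actual method integrates the deformation trajectory, so treat this as illustrative only.

```python
import numpy as np

def thickness_from_displacement(disp, wm_gm_mask, spacing=(1.0, 1.0, 1.0)):
    """Approximate thickness as displacement length at the gray-white interface.

    disp       : (3, X, Y, Z) displacement field in voxel units, e.g. the
                 output of a VoxelMorph-style network
    wm_gm_mask : (X, Y, Z) boolean mask of gray-white interface voxels
    spacing    : voxel size in mm, so thickness comes out in mm
    """
    phys = disp * np.asarray(spacing).reshape(3, 1, 1, 1)  # voxels -> mm
    mag = np.sqrt((phys ** 2).sum(axis=0))                 # per-voxel norm
    return mag[wm_gm_mask]                                 # thickness samples (mm)
```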

{"title":"CortexMorph: fast cortical thickness estimation via diffeomorphic registration using VoxelMorph.","authors":"Richard McKinley, Christian Rummel","doi":"10.1007/978-3-031-43999-5_69","DOIUrl":"10.1007/978-3-031-43999-5_69","url":null,"abstract":"<p><p>The thickness of the cortical band is linked to various neurological and psychiatric conditions, and is often estimated through surface-based methods such as Freesurfer in MRI studies. The DiReCT method, which calculates cortical thickness using a diffeomorphic deformation of the gray-white matter interface towards the pial surface, offers an alternative to surface-based methods. Recent studies using a synthetic cortical thickness phantom have demonstrated that the combination of DiReCT and deep-learning-based segmentation is more sensitive to subvoxel cortical thinning than Freesurfer. While anatomical segmentation of a T1-weighted image now takes seconds, existing implementations of DiReCT rely on iterative image registration methods which can take up to an hour per volume. On the other hand, learning-based deformable image registration methods like VoxelMorph have been shown to be faster than classical methods while improving registration accuracy. This paper proposes CortexMorph, a new method that employs unsupervised deep learning to directly regress the deformation field needed for DiReCT. By combining CortexMorph with a deep-learning-based segmentation model, it is possible to estimate region-wise thickness in seconds from a T1-weighted image, while maintaining the ability to detect cortical atrophy. We validate this claim on the OASIS-3 dataset and the synthetic cortical thickness phantom of Rusak et al.</p>","PeriodicalId":94280,"journal":{"name":"Medical image computing and computer-assisted intervention : MICCAI ... International Conference on Medical Image Computing and Computer-Assisted Intervention","volume":" ","pages":"730-739"},"PeriodicalIF":0.0,"publicationDate":"2023-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7618429/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145679904","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Foundation Ark: Accruing and Reusing Knowledge for Superior and Robust Performance.
DongAo Ma, Jiaxuan Pang, Michael B Gotway, Jianming Liang

Deep learning nowadays offers expert-level and sometimes even super-expert-level performance, but achieving such performance demands massive annotated data for training (e.g., Google's proprietary CXR Foundation Model (CXR-FM) was trained on 821,544 labeled and mostly private chest X-rays (CXRs)). Numerous datasets are publicly available in medical imaging but are individually small and heterogeneous in expert labels. We envision a powerful and robust foundation model that can be trained by aggregating numerous small public datasets. To realize this vision, we have developed Ark, a framework that accrues and reuses knowledge from heterogeneous expert annotations in various datasets. As a proof of concept, we have trained two Ark models on 335,484 and 704,363 CXRs, respectively, by merging several datasets including ChestX-ray14, CheXpert, MIMIC-II, and VinDr-CXR, evaluated them on a wide range of imaging tasks covering both classification and segmentation via fine-tuning, linear-probing, and gender-bias analysis, and demonstrated our Ark's superior and robust performance over the state-of-the-art (SOTA) fully/self-supervised baselines and Google's proprietary CXR-FM. This enhanced performance is attributed to our simple yet powerful observation that aggregating numerous public datasets diversifies patient populations and accrues knowledge from diverse experts, yielding unprecedented performance yet saving annotation cost. With all code and pretrained models released at GitHub.com/JLiangLab/Ark, we hope that Ark exerts an important impact on open science: accruing and reusing knowledge from expert annotations in public datasets can potentially surpass the performance of proprietary models trained on unusually large data, inspiring many more researchers worldwide to share code and datasets to build open foundation models, accelerate open science, and democratize deep learning for medical imaging.
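
One way to accrue knowledge from heterogeneous expert labels is a shared backbone with one classification head per dataset, routing each batch to the head matching its label space. The sketch below shows only that routing idea under assumed names; Ark's full training recipe is in the released code.

```python
import torch.nn as nn

class Ark(nn.Module):
    """Sketch: one shared backbone, one head per public dataset's label space."""
    def __init__(self, backbone, feat_dim, heads):
        # heads: dict like {"chestxray14": 14, "chexpert": 13, "vindr": 28}
        # (label counts here are placeholders, not taken from the paper)
        super().__init__()
        self.backbone = backbone
        self.heads = nn.ModuleDict(
            {name: nn.Linear(feat_dim, n) for name, n in heads.items()})

    def forward(self, x, dataset):
        return self.heads[dataset](self.backbone(x))  # route to that dataset's head
```

Because each loss is computed only against the head whose dataset supplied the batch, datasets with incompatible label sets can still jointly train the shared backbone.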

{"title":"Foundation Ark: Accruing and Reusing Knowledge for Superior and Robust Performance.","authors":"DongAo Ma, Jiaxuan Pang, Michael B Gotway, Jianming Liang","doi":"10.1007/978-3-031-43907-0_62","DOIUrl":"10.1007/978-3-031-43907-0_62","url":null,"abstract":"<p><p>Deep learning nowadays offers expert-level and sometimes even super-expert-level performance, but achieving such performance demands massive annotated data for training (e.g., Google's <i>proprietary</i> CXR Foundation Model (CXR-FM) was trained on 821,544 <i>labeled</i> and mostly <i>private</i> chest X-rays (CXRs)). <i>Numerous</i> datasets are <i>publicly</i> available in medical imaging but individually <i>small</i> and <i>heterogeneous</i> in expert labels. We envision a powerful and robust foundation model that can be trained by aggregating numerous small public datasets. To realize this vision, we have developed <b>Ark</b>, a framework that <b>a</b>ccrues and <b>r</b>euses <b>k</b>nowledge from <b>heterogeneous</b> expert annotations in various datasets. As a proof of concept, we have trained two Ark models on 335,484 and 704,363 CXRs, respectively, by merging several datasets including ChestX-ray14, CheXpert, MIMIC-II, and VinDr-CXR, evaluated them on a wide range of imaging tasks covering both classification and segmentation via fine-tuning, linear-probing, and gender-bias analysis, and demonstrated our Ark's superior and robust performance over the state-of-the-art (SOTA) fully/self-supervised baselines and Google's proprietary CXR-FM. This enhanced performance is attributed to our simple yet powerful observation that aggregating numerous public datasets diversifies patient populations and accrues knowledge from diverse experts, yielding unprecedented performance yet saving annotation cost. With all codes and pretrained models released at GitHub.com/JLiangLab/Ark, we hope that Ark exerts an important impact on open science, as accruing and reusing knowledge from expert annotations in public datasets can potentially surpass the performance of proprietary models trained on unusually large data, inspiring many more researchers worldwide to share codes and datasets to build open foundation models, accelerate open science, and democratize deep learning for medical imaging.</p>","PeriodicalId":94280,"journal":{"name":"Medical image computing and computer-assisted intervention : MICCAI ... International Conference on Medical Image Computing and Computer-Assisted Intervention","volume":"14220 ","pages":"651-662"},"PeriodicalIF":0.0,"publicationDate":"2023-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11095392/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140946796","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
An Explainable Geometric-Weighted Graph Attention Network for Identifying Functional Networks Associated with Gait Impairment.
Favour Nerrise, Qingyu Zhao, Kathleen L Poston, Kilian M Pohl, Ehsan Adeli

One of the hallmark symptoms of Parkinson's Disease (PD) is the progressive loss of postural reflexes, which eventually leads to gait difficulties and balance problems. Identifying disruptions in brain function associated with gait impairment could be crucial in better understanding PD motor progression, thus advancing the development of more effective and personalized therapeutics. In this work, we present an explainable, geometric, weighted-graph attention neural network (xGW-GAT) to identify functional networks predictive of the progression of gait difficulties in individuals with PD. xGW-GAT predicts the multi-class gait impairment on the MDS-Unified PD Rating Scale (MDS-UPDRS). Our computational- and data-efficient model represents functional connectomes as symmetric positive definite (SPD) matrices on a Riemannian manifold to explicitly encode pairwise interactions of entire connectomes, based on which we learn an attention mask yielding individual- and group-level explainability. Applied to our resting-state functional MRI (rs-fMRI) dataset of individuals with PD, xGW-GAT identifies functional connectivity patterns associated with gait impairment in PD and offers interpretable explanations of functional subnetworks associated with motor impairment. Our model successfully outperforms several existing methods while simultaneously revealing clinically-relevant connectivity patterns. The source code is available at https://github.com/favour-nerrise/xGW-GAT.
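
A common way to place functional connectomes on the SPD manifold, sketched below, is to regularize the ROI correlation matrix and map it to the tangent space via the matrix logarithm. This is a standard log-Euclidean construction assumed for illustration; the paper's exact SPD handling is in its repository.

```python
import numpy as np

def connectome_to_spd_log(timeseries, eps=1e-6):
    """Map an rs-fMRI ROI time-series to a log-Euclidean SPD feature.

    timeseries : (T, R) array, T time points for R regions
    Returns the matrix logarithm of the regularized correlation matrix,
    a point in the tangent space of the SPD manifold.
    """
    c = np.corrcoef(timeseries.T)                      # (R, R) functional connectome
    c = c + eps * np.eye(c.shape[0])                   # regularize to strictly SPD
    w, v = np.linalg.eigh(c)                           # symmetric eigendecomposition
    return (v * np.log(np.clip(w, eps, None))) @ v.T   # logm via eigenvalues
```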

{"title":"An Explainable Geometric-Weighted Graph Attention Network for Identifying Functional Networks Associated with Gait Impairment.","authors":"Favour Nerrise, Qingyu Zhao, Kathleen L Poston, Kilian M Pohl, Ehsan Adeli","doi":"10.1007/978-3-031-43895-0_68","DOIUrl":"10.1007/978-3-031-43895-0_68","url":null,"abstract":"<p><p>One of the hallmark symptoms of Parkinson's Disease (PD) is the progressive loss of postural reflexes, which eventually leads to gait difficulties and balance problems. Identifying disruptions in brain function associated with gait impairment could be crucial in better understanding PD motor progression, thus advancing the development of more effective and personalized therapeutics. In this work, we present an explainable, geometric, weighted-graph attention neural network (<b>xGW-GAT</b>) to identify functional networks predictive of the progression of gait difficulties in individuals with PD. <b>xGW-GAT</b> predicts the multi-class gait impairment on the MDS-Unified PD Rating Scale (MDS-UPDRS). Our computational- and data-efficient model represents functional connectomes as symmetric positive definite (SPD) matrices on a Riemannian manifold to explicitly encode pairwise interactions of entire connectomes, based on which we learn an attention mask yielding individual- and group-level explainability. Applied to our resting-state functional MRI (rs-fMRI) dataset of individuals with PD, <b>xGW-GAT</b> identifies functional connectivity patterns associated with gait impairment in PD and offers interpretable explanations of functional subnetworks associated with motor impairment. Our model successfully outperforms several existing methods while simultaneously revealing clinically-relevant connectivity patterns. The source code is available at https://github.com/favour-nerrise/xGW-GAT.</p>","PeriodicalId":94280,"journal":{"name":"Medical image computing and computer-assisted intervention : MICCAI ... International Conference on Medical Image Computing and Computer-Assisted Intervention","volume":"14221 ","pages":"723-733"},"PeriodicalIF":0.0,"publicationDate":"2023-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10657737/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138049118","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
LSOR: Longitudinally-Consistent Self-Organized Representation Learning.
Jiahong Ouyang, Qingyu Zhao, Ehsan Adeli, Wei Peng, Greg Zaharchuk, Kilian M Pohl

Interpretability is a key issue when applying deep learning models to longitudinal brain MRIs. One way to address this issue is by visualizing the high-dimensional latent spaces generated by deep learning via self-organizing maps (SOM). SOM separates the latent space into clusters and then maps the cluster centers to a discrete (typically 2D) grid, preserving the high-dimensional relationship between clusters. However, learning SOM in a high-dimensional latent space tends to be unstable, especially in a self-supervision setting. Furthermore, the learned SOM grid does not necessarily capture clinically interesting information, such as brain age. To resolve these issues, we propose the first self-supervised SOM approach that derives a high-dimensional, interpretable representation stratified by brain age solely based on longitudinal brain MRIs (i.e., without demographic or cognitive information). Called Longitudinally-consistent Self-Organized Representation learning (LSOR), the method is stable during training as it relies on soft clustering (vs. the hard cluster assignments used by existing SOMs). Furthermore, our approach generates a latent space stratified according to brain age by aligning trajectories inferred from longitudinal MRIs to the reference vector associated with the corresponding SOM cluster. When applied to longitudinal MRIs of the Alzheimer's Disease Neuroimaging Initiative (ADNI, N=632), LSOR generates an interpretable latent space and achieves comparable or higher accuracy than state-of-the-art representations with respect to the downstream tasks of classification (static vs. progressive mild cognitive impairment) and regression (determining the ADAS-Cog score of all subjects). The code is available at https://github.com/ouyangjiahong/longitudinal-som-single-modality.
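
Soft clustering against a SOM codebook can be sketched as a softmax over negative squared distances to the grid centers, so every center receives some assignment mass (unlike hard SOM assignments). The temperature parameter here is an illustrative assumption.

```python
import torch
import torch.nn.functional as F

def soft_som_assign(z, codebook, temperature=1.0):
    """Soft cluster assignment of latent vectors to SOM grid centers.

    z        : (batch, dim) encoder outputs
    codebook : (n_cells, dim) SOM grid center vectors
    Returns (batch, n_cells) soft assignment weights (rows sum to 1).
    """
    d2 = torch.cdist(z, codebook) ** 2            # squared distances to centers
    return F.softmax(-d2 / temperature, dim=-1)   # closer centers get more weight
```

Because the assignment is differentiable everywhere, gradients reach all codebook entries, which is what makes training more stable than with hard winner-take-all updates.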

{"title":"LSOR: Longitudinally-Consistent Self-Organized Representation Learning.","authors":"Jiahong Ouyang, Qingyu Zhao, Ehsan Adeli, Wei Peng, Greg Zaharchuk, Kilian M Pohl","doi":"10.1007/978-3-031-43907-0_27","DOIUrl":"10.1007/978-3-031-43907-0_27","url":null,"abstract":"<p><p>Interpretability is a key issue when applying deep learning models to longitudinal brain MRIs. One way to address this issue is by visualizing the high-dimensional latent spaces generated by deep learning via self-organizing maps (SOM). SOM separates the latent space into clusters and then maps the cluster centers to a discrete (typically 2D) grid preserving the high-dimensional relationship between clusters. However, learning SOM in a high-dimensional latent space tends to be unstable, especially in a self-supervision setting. Furthermore, the learned SOM grid does not necessarily capture clinically interesting information, such as brain age. To resolve these issues, we propose the first self-supervised SOM approach that derives a high-dimensional, interpretable representation stratified by brain age solely based on longitudinal brain MRIs (i.e., without demographic or cognitive information). Called <b>L</b>ongitudinally-consistent <b>S</b>elf-<b>O</b>rganized <b>R</b>epresentation learning (LSOR), the method is stable during training as it relies on soft clustering (vs. the hard cluster assignments used by existing SOM). Furthermore, our approach generates a latent space stratified according to brain age by aligning trajectories inferred from longitudinal MRIs to the reference vector associated with the corresponding SOM cluster. When applied to longitudinal MRIs of the Alzheimer's Disease Neuroimaging Initiative (ADNI, <math><mi>N</mi><mspace></mspace><mo>=</mo><mspace></mspace><mn>632</mn></math>), LSOR generates an interpretable latent space and achieves comparable or higher accuracy than the state-of-the-art representations with respect to the downstream tasks of classification (static vs. progressive mild cognitive impairment) and regression (determining ADAS-Cog score of all subjects). The code is available at https://github.com/ouyangjiahong/longitudinal-som-single-modality.</p>","PeriodicalId":94280,"journal":{"name":"Medical image computing and computer-assisted intervention : MICCAI ... International Conference on Medical Image Computing and Computer-Assisted Intervention","volume":"14220 ","pages":"279-289"},"PeriodicalIF":0.0,"publicationDate":"2023-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10642576/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"92158078","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Pelphix: Surgical Phase Recognition from X-ray Images in Percutaneous Pelvic Fixation.
Benjamin D Killeen, Han Zhang, Jan Mangulabnan, Mehran Armand, Russell H Taylor, Greg Osgood, Mathias Unberath

Surgical phase recognition (SPR) is a crucial element in the digital transformation of the modern operating theater. While SPR based on video sources is well-established, the incorporation of interventional X-ray sequences has not yet been explored. This paper presents Pelphix, a first approach to SPR for X-ray-guided percutaneous pelvic fracture fixation, which models the procedure at four levels of granularity - corridor, activity, view, and frame value - simulating the pelvic fracture fixation workflow as a Markov process to provide fully annotated training data. Using added supervision from the detection of bony corridors, tools, and anatomy, we learn image representations that are fed into a transformer model to regress surgical phases at the four granularity levels. Our approach demonstrates the feasibility of X-ray-based SPR, achieving an average accuracy of 99.2% on simulated sequences and 71.7% on cadaver data across all granularity levels, with up to 84% accuracy for the target corridor in real data. This work constitutes the first step toward SPR for the X-ray domain, establishing an approach to categorizing phases in X-ray-guided surgery, simulating realistic image sequences to enable machine learning model development, and demonstrating that this approach is feasible for the analysis of real procedures. As X-ray-based SPR continues to mature, it will benefit procedures in orthopedic surgery, angiography, and interventional radiology by equipping intelligent surgical systems with situational awareness in the operating room.
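
Simulating a workflow as a Markov process amounts to sampling phase sequences from a row-stochastic transition matrix, as sketched below; the actual Pelphix simulation additionally renders annotated X-ray sequences, which this toy sampler omits. The phase names and matrix values would be supplied by the workflow model.

```python
import numpy as np

def sample_phases(transition, phases, start=0, n_steps=20, rng=None):
    """Sample a surgical-phase sequence from a Markov transition matrix.

    transition : (P, P) row-stochastic matrix; transition[i, j] = P(next=j | cur=i)
    phases     : list of P phase names (placeholder labels)
    """
    rng = rng or np.random.default_rng()
    seq, state = [phases[start]], start
    for _ in range(n_steps - 1):
        state = rng.choice(len(phases), p=transition[state])
        seq.append(phases[state])
    return seq
```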

{"title":"Pelphix: Surgical Phase Recognition from X-ray Images in Percutaneous Pelvic Fixation.","authors":"Benjamin D Killeen, Han Zhang, Jan Mangulabnan, Mehran Armand, Russell H Taylor, Greg Osgood, Mathias Unberath","doi":"10.1007/978-3-031-43996-4_13","DOIUrl":"https://doi.org/10.1007/978-3-031-43996-4_13","url":null,"abstract":"<p><p>Surgical phase recognition (SPR) is a crucial element in the digital transformation of the modern operating theater. While SPR based on video sources is well-established, incorporation of interventional X-ray sequences has not yet been explored. This paper presents Pelphix, a first approach to SPR for X-ray-guided percutaneous pelvic fracture fixation, which models the procedure at four levels of granularity - corridor, activity, view, and frame value - simulating the pelvic fracture fixation workflow as a Markov process to provide fully annotated training data. Using added supervision from detection of bony corridors, tools, and anatomy, we learn image representations that are fed into a transformer model to regress surgical phases at the four granularity levels. Our approach demonstrates the feasibility of X-ray-based SPR, achieving an average accuracy of 99.2% on simulated sequences and 71.7% in cadaver across all granularity levels, with up to 84% accuracy for the target corridor in real data. This work constitutes the first step toward SPR for the X-ray domain, establishing an approach to categorizing phases in X-ray-guided surgery, simulating realistic image sequences to enable machine learning model development, and demonstrating that this approach is feasible for the analysis of real procedures. As X-ray-based SPR continues to mature, it will benefit procedures in orthopedic surgery, angiography, and interventional radiology by equipping intelligent surgical systems with situational awareness in the operating room.</p>","PeriodicalId":94280,"journal":{"name":"Medical image computing and computer-assisted intervention : MICCAI ... International Conference on Medical Image Computing and Computer-Assisted Intervention","volume":"14228 ","pages":"133-143"},"PeriodicalIF":0.0,"publicationDate":"2023-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11016332/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140862109","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
CTFlow: Mitigating Effects of Computed Tomography Acquisition and Reconstruction with Normalizing Flows.
Leihao Wei, Anil Yadav, William Hsu

Mitigating the effects on image appearance of variations in computed tomography (CT) acquisition and reconstruction parameters is a challenging inverse problem. We present CTFlow, a normalizing flows-based method for harmonizing CT scans acquired and reconstructed using different doses and kernels to a target scan. Unlike existing state-of-the-art image harmonization approaches that only generate a single output, flow-based methods learn the explicit conditional density and output the entire spectrum of plausible reconstructions, reflecting the underlying uncertainty of the problem. We demonstrate how normalizing flows reduce variability in image quality and the performance of a machine learning algorithm for lung nodule detection. We evaluate the performance of CTFlow by 1) comparing it with other techniques on a denoising task using the AAPM-Mayo Clinical Low-Dose CT Grand Challenge dataset, and 2) demonstrating consistency in nodule detection performance across 186 real-world low-dose CT chest scans acquired at our institution. CTFlow performs better in the denoising task for both peak signal-to-noise ratio and perceptual quality metrics. Moreover, CTFlow produces more consistent predictions across all dose and kernel conditions than generative adversarial network (GAN)-based image harmonization on a lung nodule detection task. The code is available at https://github.com/hsu-lab/ctflow.
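
Normalizing flows learn an explicit conditional density from which many plausible outputs can be sampled. The building block below is a generic conditional affine coupling layer on flat feature vectors, assumed for illustration; CTFlow's actual image-scale architecture differs.

```python
import torch
import torch.nn as nn

class ConditionalAffineCoupling(nn.Module):
    """One conditional affine coupling step, the usual building block of such flows.

    Splits the input in half; one half is scaled/shifted by a network that sees
    the other half plus a conditioning vector (e.g. a dose/kernel encoding).
    """
    def __init__(self, dim, cond_dim, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim // 2 + cond_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, dim))                  # outputs scale and shift

    def forward(self, x, cond):
        x1, x2 = x.chunk(2, dim=-1)
        s, t = self.net(torch.cat([x1, cond], -1)).chunk(2, dim=-1)
        s = torch.tanh(s)                            # keep scales well-behaved
        y2 = x2 * torch.exp(s) + t
        log_det = s.sum(-1)                          # log|det J|, needed for the density
        return torch.cat([x1, y2], -1), log_det
```

Because the transform is invertible and its log-determinant is cheap, stacking such layers gives both exact likelihoods for training and diverse samples at test time.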

{"title":"CTFlow: Mitigating Effects of Computed Tomography Acquisition and Reconstruction with Normalizing Flows.","authors":"Leihao Wei, Anil Yadav, William Hsu","doi":"10.1007/978-3-031-43990-2_39","DOIUrl":"10.1007/978-3-031-43990-2_39","url":null,"abstract":"<p><p>Mitigating the effects of image appearance due to variations in computed tomography (CT) acquisition and reconstruction parameters is a challenging inverse problem. We present CTFlow, a normalizing flows-based method for harmonizing CT scans acquired and reconstructed using different doses and kernels to a target scan. Unlike existing state-of-the-art image harmonization approaches that only generate a single output, flow-based methods learn the explicit conditional density and output the entire spectrum of plausible reconstruction, reflecting the underlying uncertainty of the problem. We demonstrate how normalizing flows reduces variability in image quality and the performance of a machine learning algorithm for lung nodule detection. We evaluate the performance of CTFlow by 1) comparing it with other techniques on a denoising task using the AAPM-Mayo Clinical Low-Dose CT Grand Challenge dataset, and 2) demonstrating consistency in nodule detection performance across 186 real-world low-dose CT chest scans acquired at our institution. CTFlow performs better in the denoising task for both peak signal-to-noise ratio and perceptual quality metrics. Moreover, CTFlow produces more consistent predictions across all dose and kernel conditions than generative adversarial network (GAN)-based image harmonization on a lung nodule detection task. The code is available at https://github.com/hsu-lab/ctflow.</p>","PeriodicalId":94280,"journal":{"name":"Medical image computing and computer-assisted intervention : MICCAI ... International Conference on Medical Image Computing and Computer-Assisted Intervention","volume":"14226 ","pages":"413-422"},"PeriodicalIF":0.0,"publicationDate":"2023-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11086056/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140913633","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Implicit Anatomical Rendering for Medical Image Segmentation with Stochastic Experts.
Chenyu You, Weicheng Dai, Yifei Min, Lawrence Staib, James S Duncan

Integrating high-level semantically correlated contents and low-level anatomical features is of central importance in medical image segmentation. Towards this end, recent deep learning-based medical segmentation methods have shown great promise in better modeling such information. However, convolution operators for medical segmentation typically operate on regular grids, which inherently blur the high-frequency regions, i.e., boundary regions. In this work, we propose MORSE, a generic implicit neural rendering framework designed at an anatomical level to assist learning in medical image segmentation. Our method is motivated by the fact that implicit neural representations have been shown to be more effective in fitting complex signals and solving computer graphics problems than discrete grid-based representations. The core of our approach is to formulate medical image segmentation as a rendering problem in an end-to-end manner. Specifically, we continuously align the coarse segmentation prediction with the ambiguous coordinate-based point representations and aggregate these features to adaptively refine the boundary region. To optimize multi-scale pixel-level features in parallel, we leverage the Mixture-of-Experts (MoE) idea to design and train our MORSE with a stochastic gating mechanism. Our experiments demonstrate that MORSE works well with different medical segmentation backbones, consistently achieving competitive performance improvements in both 2D and 3D supervised medical segmentation methods. We also theoretically analyze the superiority of MORSE.
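
A stochastic gating mechanism over experts can be sketched with Gumbel-softmax gates: noisy gate logits make expert selection stochastic during training, and expert outputs are mixed by the resulting weights. This generic MoE gate is an assumption for illustration, not MORSE's exact design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class StochasticGate(nn.Module):
    """Sketch of stochastic expert gating over per-point features."""
    def __init__(self, dim, n_experts, tau=1.0):
        super().__init__()
        self.gate = nn.Linear(dim, n_experts)
        self.experts = nn.ModuleList(
            [nn.Linear(dim, dim) for _ in range(n_experts)])
        self.tau = tau

    def forward(self, x):                             # x: (points, dim)
        # Gumbel noise makes the soft expert weights stochastic during training
        w = F.gumbel_softmax(self.gate(x), tau=self.tau, hard=False)   # (points, E)
        out = torch.stack([e(x) for e in self.experts], dim=-1)       # (points, dim, E)
        return (out * w.unsqueeze(1)).sum(-1)         # mix experts by gate weights
```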

{"title":"Implicit Anatomical Rendering for Medical Image Segmentation with Stochastic Experts.","authors":"Chenyu You, Weicheng Dai, Yifei Min, Lawrence Staib, James S Duncan","doi":"10.1007/978-3-031-43898-1_54","DOIUrl":"10.1007/978-3-031-43898-1_54","url":null,"abstract":"<p><p>Integrating high-level semantically correlated contents and low-level anatomical features is of central importance in medical image segmentation. Towards this end, recent deep learning-based medical segmentation methods have shown great promise in better modeling such information. However, convolution operators for medical segmentation typically operate on regular grids, which inherently blur the high-frequency regions, <i>i.e</i>., boundary regions. In this work, we propose MORSE, a generic implicit neural rendering framework designed at an anatomical level to assist learning in medical image segmentation. Our method is motivated by the fact that implicit neural representation has been shown to be more effective in fitting complex signals and solving computer graphics problems than discrete grid-based representation. The core of our approach is to formulate medical image segmentation as a rendering problem in an end-to-end manner. Specifically, we continuously align the coarse segmentation prediction with the ambiguous coordinate-based point representations and aggregate these features to adaptively refine the boundary region. To parallelly optimize multi-scale pixel-level features, we leverage the idea from Mixture-of-Expert (MoE) to design and train our MORSE with a stochastic gating mechanism. Our experiments demonstrate that MORSE can work well with different medical segmentation backbones, consistently achieving competitive performance improvements in both 2D and 3D supervised medical segmentation methods. We also theoretically analyze the superiority of MORSE.</p>","PeriodicalId":94280,"journal":{"name":"Medical image computing and computer-assisted intervention : MICCAI ... International Conference on Medical Image Computing and Computer-Assisted Intervention","volume":"14222 ","pages":"561-571"},"PeriodicalIF":0.0,"publicationDate":"2023-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11151725/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141262863","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0