
IEEE Transactions on Medical Imaging: Latest Publications

Building a Synthetic Vascular Model: Evaluation in an Intracranial Aneurysms Detection Scenario.
Pub Date : 2024-11-06 DOI: 10.1109/TMI.2024.3492313
Rafic Nader, Florent Autrusseau, Vincent L'Allinec, Romain Bourcier

We present a fully synthetic model able to mimic the various constituents of the cerebral vascular tree, including the cerebral arteries, bifurcations, and intracranial aneurysms. This model is intended to provide a substantial dataset of brain arteries that a 3D convolutional neural network can use to efficiently detect intracranial aneurysms. Cerebral aneurysms most often occur on a particular structure of the vascular tree, the Circle of Willis. Various studies have been conducted to detect and monitor aneurysms, and those based on deep learning achieve the best performance. Specifically, in this work, we propose a fully synthetic 3D model able to mimic the brain vasculature as acquired by Time-of-Flight Magnetic Resonance Angiography (TOF-MRA). Among the various MRI modalities, the latter provides a good rendering of the blood vessels and is non-invasive. Our model is designed to simultaneously mimic the arteries' geometry, the aneurysm shape, and the background noise. The vascular tree geometry is modeled by interpolation with 3D spline functions, and the statistical properties of the background noise are estimated from angiography acquisitions and reproduced within the model. In this work, we thoroughly describe the synthetic vasculature model, build a neural network designed for aneurysm segmentation and detection, and finally carry out an in-depth evaluation of the performance gain obtained through data augmentation with the synthetic model.
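To make the geometry modeling step concrete, here is a minimal Python sketch of a spline-interpolated vessel centerline rasterized into a small volume with added noise; the control points, lumen radius, and noise level are illustrative assumptions, not the authors' parameters.

```python
# Minimal sketch (illustrative parameters, not the authors' model): a spline-
# interpolated vessel centerline rasterized into a toy TOF-like volume, with a
# simple Rician-style magnitude noise standing in for the measured statistics.
import numpy as np
from scipy.interpolate import splprep, splev

# Hand-picked control points of a synthetic artery centerline (voxel units).
pts = np.array([[4, 4, 4], [8, 6, 6], [12, 12, 8], [16, 20, 10], [18, 26, 14]], float)

# Fit a cubic interpolating spline and sample it densely.
tck, _ = splprep(pts.T, s=0.0, k=3)
centerline = np.stack(splev(np.linspace(0, 1, 200), tck), axis=1)   # (200, 3)

# Rasterize a tubular lumen of fixed radius into the volume.
vol = np.zeros((32, 32, 32), np.float32)
ijk = np.indices(vol.shape).reshape(3, -1).T                        # voxel coordinates
for p in centerline:
    inside = np.linalg.norm(ijk - p, axis=1) < 1.5                  # 1.5-voxel radius (assumed)
    vol[tuple(ijk[inside].T)] = 1.0

# Background noise: a placeholder for the statistics estimated from real MRA scans.
sigma = 0.1
noisy = np.abs(vol + np.random.normal(0, sigma, vol.shape)
               + 1j * np.random.normal(0, sigma, vol.shape))
```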

Citations: 0
FAMF-Net: Feature Alignment Mutual Attention Fusion with Region Awareness for Breast Cancer Diagnosis via Imbalanced Data.
Pub Date : 2024-11-05 DOI: 10.1109/TMI.2024.3485612
Yiyao Liu, Jinyao Li, Cheng Zhao, Yongtao Zhang, Qian Chen, Jing Qin, Lei Dong, Tianfu Wang, Wei Jiang, Baiying Lei

Automatic and accurate classification of breast cancer in multimodal ultrasound images is crucial for improving patients' diagnosis and treatment outcomes and saving medical resources. Methodologically, the fusion of multimodal ultrasound images often encounters challenges such as misalignment, limited use of complementary information, poor interpretability of the feature fusion, and imbalanced sample categories. To solve these problems, we propose a feature alignment mutual attention fusion method (FAMF-Net), which consists of a region awareness alignment (RAA) block, a mutual attention fusion (MAF) block, and a reinforcement learning-based dynamic optimization strategy (RDO). Specifically, RAA achieves region awareness through class activation mapping and performs a translation transformation to achieve feature alignment. MAF uses a mutual attention mechanism for feature interaction and fusion, mining edge and color features separately from B-mode and shear-wave elastography images, which enhances the complementarity of the features and improves interpretability. Finally, RDO uses the distribution of samples and the prediction probabilities during training as the state of reinforcement learning to dynamically optimize the weights of the loss function, thereby solving the class imbalance problem. Experimental results on our clinically obtained dataset demonstrate the effectiveness of the proposed method. Our code will be available at: https://github.com/Magnety/Multi_modal_Image.
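As an illustration of the mutual attention idea described above, the following PyTorch sketch fuses two modality feature maps by letting each attend to the other; the module name, dimensions, and fusion by concatenation are assumptions, not the published FAMF-Net code.

```python
# Minimal sketch (assumptions, not the released FAMF-Net): a mutual attention
# fusion block where B-mode and shear-wave-elastography feature maps attend to
# each other, then the two attended streams are concatenated and projected.
import torch
import torch.nn as nn

class MutualAttentionFusion(nn.Module):
    def __init__(self, dim=256, heads=8):
        super().__init__()
        # One cross-attention per direction: B-mode -> SWE and SWE -> B-mode.
        self.attn_b2s = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.attn_s2b = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.proj = nn.Linear(2 * dim, dim)

    def forward(self, feat_b, feat_s):
        # feat_b, feat_s: (B, C, H, W) feature maps from the two branches.
        B, C, H, W = feat_b.shape
        tok_b = feat_b.flatten(2).transpose(1, 2)   # (B, H*W, C) tokens
        tok_s = feat_s.flatten(2).transpose(1, 2)
        # Each modality queries the other (mutual attention).
        b_att, _ = self.attn_b2s(tok_b, tok_s, tok_s)
        s_att, _ = self.attn_s2b(tok_s, tok_b, tok_b)
        fused = self.proj(torch.cat([b_att, s_att], dim=-1))
        return fused.transpose(1, 2).reshape(B, C, H, W)

fused = MutualAttentionFusion()(torch.randn(2, 256, 14, 14), torch.randn(2, 256, 14, 14))
```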

Citations: 0
Corrections to “Contrastive Graph Pooling for Explainable Classification of Brain Networks”
Pub Date : 2024-11-04 DOI: 10.1109/TMI.2024.3465968
Jiaxing Xu, Qingtian Bian, Xinhang Li, Aihu Zhang, Yiping Ke, Miao Qiao, Wei Zhang, Wei Khang Jeremy Sim, Balázs Gulyás
{"title":"Corrections to “Contrastive Graph Pooling for Explainable Classification of Brain Networks”","authors":"Jiaxing Xu;Qingtian Bian;Xinhang Li;Aihu Zhang;Yiping Ke;Miao Qiao;Wei Zhang;Wei Khang Jeremy Sim;Balázs Gulyás","doi":"10.1109/TMI.2024.3465968","DOIUrl":"10.1109/TMI.2024.3465968","url":null,"abstract":"","PeriodicalId":94033,"journal":{"name":"IEEE transactions on medical imaging","volume":"43 11","pages":"4075-4075"},"PeriodicalIF":0.0,"publicationDate":"2024-11-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10741900","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142577333","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Multi-Center Fetal Brain Tissue Annotation (FeTA) Challenge 2022 Results.
Pub Date : 2024-10-30 DOI: 10.1109/TMI.2024.3485554
Kelly Payette, Celine Steger, Roxane Licandro, Priscille De Dumast, Hongwei Bran Li, Matthew Barkovich, Liu Li, Maik Dannecker, Chen Chen, Cheng Ouyang, Niccolo McConnell, Alina Miron, Yongmin Li, Alena Uus, Irina Grigorescu, Paula Ramirez Gilliland, Md Mahfuzur Rahman Siddiquee, Daguang Xu, Andriy Myronenko, Haoyu Wang, Ziyan Huang, Jin Ye, Mireia Alenya, Valentin Comte, Oscar Camara, Jean-Baptiste Masson, Astrid Nilsson, Charlotte Godard, Moona Mazher, Abdul Qayyum, Yibo Gao, Hangqi Zhou, Shangqi Gao, Jia Fu, Guiming Dong, Guotai Wang, ZunHyan Rieu, HyeonSik Yang, Minwoo Lee, Szymon Plotka, Michal K Grzeszczyk, Arkadiusz Sitek, Luisa Vargas Daza, Santiago Usma, Pablo Arbelaez, Wenying Lu, Wenhao Zhang, Jing Liang, Romain Valabregue, Anand A Joshi, Krishna N Nayak, Richard M Leahy, Luca Wilhelmi, Aline Dandliker, Hui Ji, Antonio G Gennari, Anton Jakovcic, Melita Klaic, Ana Adzic, Pavel Markovic, Gracia Grabaric, Gregor Kasprian, Gregor Dovjak, Milan Rados, Lana Vasung, Meritxell Bach Cuadra, Andras Jakab

Segmentation is a critical step in analyzing the developing human fetal brain. There have been vast improvements in automatic segmentation methods in the past several years, and the Fetal Brain Tissue Annotation (FeTA) Challenge 2021 helped to establish an excellent standard for fetal brain segmentation. However, FeTA 2021 was a single-center study, limiting real-world clinical applicability and acceptance. The multi-center FeTA Challenge 2022 focused on advancing the generalizability of fetal brain segmentation algorithms for magnetic resonance imaging (MRI). In FeTA 2022, the training dataset contained images and corresponding manually annotated multi-class labels from two imaging centers, and the testing data contained images from these two centers as well as two additional unseen centers. The multi-center data included different MR scanners, imaging parameters, and fetal brain super-resolution algorithms. 16 teams participated and 17 algorithms were evaluated. Here, the challenge results are presented, focusing on the generalizability of the submissions. Both in- and out-of-domain, the white matter and ventricles were segmented with the highest accuracy (top Dice scores: 0.89 and 0.87, respectively), while the most challenging structure remains the grey matter (top Dice score: 0.75) due to its anatomical complexity. The top 5 average Dice scores ranged from 0.81 to 0.82, the top 5 average 95th-percentile Hausdorff distances ranged from 2.3 to 2.5 mm, and the top 5 volumetric similarity scores ranged from 0.90 to 0.92. The FeTA Challenge 2022 successfully evaluated and advanced the generalizability of multi-class fetal brain tissue segmentation algorithms for MRI, and it continues to benchmark new algorithms.
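For reference, the three reported metrics can be computed as in the following sketch; these are simplified, assumed implementations (HD95 definitions vary slightly across toolkits), not the official challenge evaluation code.

```python
# Minimal sketch (assumed, simplified implementations) of the three FeTA metrics
# mentioned above: Dice, 95th-percentile Hausdorff distance (HD95), and
# volumetric similarity, computed per label on two integer label volumes.
import numpy as np
from scipy.ndimage import distance_transform_edt, binary_erosion

def dice(a, b):
    inter = np.logical_and(a, b).sum()
    return 2.0 * inter / (a.sum() + b.sum())

def volumetric_similarity(a, b):
    va, vb = a.sum(), b.sum()
    return 1.0 - abs(va - vb) / (va + vb)

def hd95(a, b, spacing=(1.0, 1.0, 1.0)):
    # Surface voxels = mask minus its erosion; distances taken to the other surface.
    surf_a = a & ~binary_erosion(a)
    surf_b = b & ~binary_erosion(b)
    dist_to_b = distance_transform_edt(~surf_b, sampling=spacing)
    dist_to_a = distance_transform_edt(~surf_a, sampling=spacing)
    d_ab = dist_to_b[surf_a]        # distances from A's surface to B's surface
    d_ba = dist_to_a[surf_b]
    return np.percentile(np.concatenate([d_ab, d_ba]), 95)

pred = np.random.randint(0, 3, (64, 64, 64))   # toy prediction with 3 labels
gt = np.random.randint(0, 3, (64, 64, 64))     # toy ground truth
for label in (1, 2):                           # e.g. white matter, ventricles
    p, g = pred == label, gt == label
    print(label, dice(p, g), hd95(p, g), volumetric_similarity(p, g))
```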

Citations: 0
CQformer: Learning Dynamics Across Slices in Medical Image Segmentation.
Pub Date : 2024-10-10 DOI: 10.1109/TMI.2024.3477555
Shengjie Zhang, Xin Shen, Xiang Chen, Ziqi Yu, Bohan Ren, Haibo Yang, Xiao-Yong Zhang, Yuan Zhou

Prevalent studies on deep learning-based 3D medical image segmentation capture the continuous variation across 2D slices mainly via convolution, Transformers, inter-slice interaction, and time-series models. In this work, by modeling this variation with an ordinary differential equation (ODE), we propose a cross-instance query-guided Transformer architecture (CQformer) that leverages features from preceding slices to improve the segmentation of subsequent slices. Its key components include a cross-attention mechanism in an ODE formulation, which bridges the features of contiguous 2D slices of the 3D volumetric data. In addition, a regression head is employed to shorten the gap between the bottleneck and the prediction layer. Extensive experiments on 7 datasets with various modalities (CT, MRI) and tasks (organ, tissue, and lesion) demonstrate that CQformer outperforms previous state-of-the-art segmentation algorithms on 6 datasets by 0.44%-2.45%, and achieves the second-highest performance of 88.30% on the BTCV dataset. The code will be made publicly available after acceptance.
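One way to read the ODE formulation is as an explicit Euler step whose derivative term comes from cross-attention onto the preceding slice; the sketch below illustrates that interpretation and is not the released CQformer code (all names and sizes are assumed).

```python
# Minimal sketch (my reading of the abstract, not the published architecture):
# treat the change of features from one slice to the next as an ODE and take an
# explicit Euler step driven by cross-attention onto the preceding slice.
import torch
import torch.nn as nn

class CrossSliceStep(nn.Module):
    def __init__(self, dim=128, heads=4, dt=1.0):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.mlp = nn.Sequential(nn.LayerNorm(dim), nn.Linear(dim, dim),
                                 nn.GELU(), nn.Linear(dim, dim))
        self.dt = dt

    def forward(self, h_prev, h_curr):
        # h_*: (B, N, C) token features of two contiguous slices.
        # dh/dz is approximated by attention of the current slice onto the previous one.
        ctx, _ = self.attn(h_curr, h_prev, h_prev)
        return h_curr + self.dt * self.mlp(ctx)      # Euler update: h_{k+1} = h_k + dt * f(...)

step = CrossSliceStep()
slices = [torch.randn(2, 196, 128) for _ in range(4)]   # toy per-slice tokens
h = slices[0]
for s in slices[1:]:
    h = step(h, s)   # propagate information along the slice axis
```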

Citations: 0
Non-invasive Deep-Brain Imaging with 3D Integrated Photoacoustic Tomography and Ultrasound Localization Microscopy (3D-PAULM).
Pub Date : 2024-10-09 DOI: 10.1109/TMI.2024.3477317
Yuqi Tang, Nanchao Wang, Zhijie Dong, Matthew Lowerison, Angela Del Aguila, Natalie Johnston, Tri Vu, Chenshuo Ma, Yirui Xu, Wei Yang, Pengfei Song, Junjie Yao

Photoacoustic computed tomography (PACT) is a proven technology for imaging hemodynamics in the deep brain of small animal models. PACT is inherently compatible with ultrasound (US) imaging, providing complementary contrast mechanisms. While PACT can quantify the brain's oxygen saturation of hemoglobin (sO2), US imaging can probe the blood flow based on the Doppler effect. Further, by tracking gas-filled microbubbles, ultrasound localization microscopy (ULM) can map the blood flow velocity with sub-diffraction spatial resolution. In this work, we present a 3D deep-brain imaging system that seamlessly integrates PACT and ULM into a single device, 3D-PAULM. Using a low ultrasound frequency of 4 MHz, 3D-PAULM is capable of imaging brain hemodynamic functions through the intact scalp and skull in a completely non-invasive manner. Using 3D-PAULM, we studied mouse brain function with ischemic stroke. Multi-spectral PACT, US B-mode imaging, microbubble-enhanced power Doppler (PD), and ULM were performed on the same mouse brain with intrinsic image co-registration. From the multi-modality measurements, we further quantified blood perfusion, sO2, vessel density, and flow velocity of the mouse brain, showing stroke-induced ischemia, hypoxia, and reduced blood flow. We expect 3D-PAULM to find broad applications in studying deep-brain functions in small animal models.
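For context, sO2 is commonly estimated from multi-wavelength photoacoustic amplitudes by linear spectral unmixing of oxy- and deoxyhemoglobin; the sketch below shows that standard computation with placeholder extinction coefficients, and is not the authors' processing pipeline.

```python
# Minimal sketch (standard linear spectral unmixing, not the authors' pipeline):
# estimate sO2 from photoacoustic amplitudes at two wavelengths by solving
# PA(lambda) ~ eps_HbO2(lambda) * C_HbO2 + eps_Hb(lambda) * C_Hb per pixel.
# The extinction coefficients below are illustrative placeholders, not reference values.
import numpy as np

wavelengths = [750, 850]                      # nm, assumed acquisition wavelengths
eps = np.array([[0.5, 1.4],                   # eps_HbO2, eps_Hb at 750 nm (placeholder)
                [1.1, 0.8]])                  # eps_HbO2, eps_Hb at 850 nm (placeholder)

def estimate_so2(pa_750, pa_850):
    # pa_*: photoacoustic amplitude images at the two wavelengths, same shape.
    pa = np.stack([pa_750.ravel(), pa_850.ravel()])          # (2, Npix)
    conc, *_ = np.linalg.lstsq(eps, pa, rcond=None)          # (2, Npix): [C_HbO2, C_Hb]
    c_hbo2, c_hb = np.clip(conc, 0, None)
    return (c_hbo2 / np.maximum(c_hbo2 + c_hb, 1e-9)).reshape(pa_750.shape)

so2 = estimate_so2(np.random.rand(64, 64), np.random.rand(64, 64))
```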

Citations: 0
GobletNet: Wavelet-Based High-Frequency Fusion Network for Semantic Segmentation of Electron Microscopy Images.
Pub Date : 2024-10-04 DOI: 10.1109/TMI.2024.3474028
Yanfeng Zhou, Lingrui Li, Chenlong Wang, Le Song, Ge Yang

Semantic segmentation of electron microscopy (EM) images is crucial for nanoscale analysis. With the development of deep neural networks (DNNs), semantic segmentation of EM images has achieved remarkable success. However, current EM image segmentation models are usually extensions or adaptations of natural-image or biomedical models. They do not fully explore and utilize the intrinsic characteristics of EM images. Furthermore, they are often designed only for a few specific segmentation targets and lack versatility. In this study, we quantitatively analyze the characteristics of EM images in comparison with those of natural and other biomedical images via the wavelet transform. To better utilize these characteristics, we design a high-frequency (HF) fusion network, GobletNet, which outperforms state-of-the-art models by a large margin in the semantic segmentation of EM images. We use the wavelet transform to generate HF images as extra inputs and use an extra encoding branch to extract HF information. Furthermore, we introduce a fusion-attention module (FAM) into GobletNet to facilitate better absorption and fusion of information from the raw and HF images. Extensive benchmarking on seven public EM datasets (EPFL, CREMI, SNEMI3D, UroCell, MitoEM, Nanowire and BetaSeg) demonstrates the effectiveness of our model. The code is available at https://github.com/Yanfeng-Zhou/GobletNet.
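A minimal sketch of the wavelet-based HF input generation is given below; the single-level Haar decomposition and the way the HF image is formed are assumptions for illustration, not the released GobletNet preprocessing.

```python
# Minimal sketch (assumed preprocessing, not the released GobletNet code): build a
# high-frequency (HF) companion image from wavelet detail coefficients, to be fed
# to an extra encoder branch alongside the raw EM image.
import numpy as np
import pywt

def high_frequency_image(img, wavelet="haar"):
    # Single-level 2D DWT: approximation cA plus horizontal/vertical/diagonal details.
    cA, (cH, cV, cD) = pywt.dwt2(img.astype(np.float32), wavelet)
    # Keep only the detail bands and invert the transform -> HF content at full size.
    hf = pywt.idwt2((np.zeros_like(cA), (cH, cV, cD)), wavelet)
    return hf[: img.shape[0], : img.shape[1]]    # crop possible 1-pixel padding

em = np.random.rand(256, 256)        # stand-in for an EM slice
hf = high_frequency_image(em)
model_input = np.stack([em, hf])     # raw branch + HF branch inputs
```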

Citations: 0
Geometry-Aware Attenuation Learning for Sparse-View CBCT Reconstruction.
Pub Date : 2024-10-04 DOI: 10.1109/TMI.2024.3473970
Zhentao Liu, Yu Fang, Changjian Li, Han Wu, Yuan Liu, Dinggang Shen, Zhiming Cui

Cone Beam Computed Tomography (CBCT) plays a vital role in clinical imaging. Traditional methods typically require hundreds of 2D X-ray projections to reconstruct a high-quality 3D CBCT image, leading to considerable radiation exposure. This has led to a growing interest in sparse-view CBCT reconstruction to reduce radiation doses. While recent advances, including deep learning and neural rendering algorithms, have made strides in this area, these methods either produce unsatisfactory results or suffer from the time inefficiency of per-case optimization. In this paper, we introduce a novel geometry-aware encoder-decoder framework to solve this problem. Our framework starts by encoding multi-view 2D features from the 2D X-ray projections with a 2D CNN encoder. Leveraging the geometry of CBCT scanning, it then back-projects the multi-view 2D features into 3D space to form a comprehensive volumetric feature map, which is fed to a 3D CNN decoder to recover the 3D CBCT image. Importantly, our approach respects the geometric relationship between the 3D CBCT image and its 2D X-ray projections during the feature back-projection stage, and benefits from prior knowledge learned from the data population. This ensures its adaptability to extremely sparse view inputs without per-case training, such as scenarios with only 5 or 10 X-ray projections. Extensive evaluations on two simulated datasets and one real-world dataset demonstrate the exceptional reconstruction quality and time efficiency of our method.
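The core back-projection of per-view 2D features into a volumetric feature map can be sketched as follows; the projection-matrix convention, view fusion by averaging, and all sizes are assumptions rather than the paper's exact implementation.

```python
# Minimal sketch (a generic feature back-projection, with assumed geometry
# handling): lift per-view 2D CNN features into a shared 3D feature volume by
# projecting every voxel into each view and bilinearly sampling.
import torch
import torch.nn.functional as F

def backproject(feat2d, proj, grid_xyz):
    # feat2d: (V, C, H, W) per-view features; proj: (V, 3, 4) projection matrices
    # (assumed known from the CBCT geometry); grid_xyz: (N, 3) voxel centers.
    V, C, H, W = feat2d.shape
    ones = torch.ones(grid_xyz.shape[0], 1)
    homog = torch.cat([grid_xyz, ones], dim=1)                 # (N, 4) homogeneous coords
    vols = []
    for v in range(V):
        uvw = homog @ proj[v].T                                # (N, 3) projected coords
        uv = uvw[:, :2] / uvw[:, 2:3].clamp(min=1e-6)          # perspective divide
        # Normalize pixel coordinates to [-1, 1] for grid_sample.
        uv_norm = torch.stack([2 * uv[:, 0] / (W - 1) - 1,
                               2 * uv[:, 1] / (H - 1) - 1], dim=-1)
        grid = uv_norm.view(1, 1, -1, 2)                       # (1, 1, N, 2)
        sampled = F.grid_sample(feat2d[v:v + 1], grid, align_corners=True)
        vols.append(sampled.view(C, -1))                       # (C, N)
    return torch.stack(vols).mean(0)                           # fuse views: (C, N)

# Toy call: 5 views, a 16x16x16 voxel grid in normalized world coordinates.
xyz = torch.stack(torch.meshgrid(*[torch.linspace(-1, 1, 16)] * 3, indexing="ij"), -1).reshape(-1, 3)
vol_feat = backproject(torch.randn(5, 32, 64, 64), torch.randn(5, 3, 4), xyz).view(32, 16, 16, 16)
```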

Citations: 0
A gradient-based approach to fast and accurate head motion compensation in cone-beam CT.
Pub Date : 2024-10-04 DOI: 10.1109/TMI.2024.3474250
Mareike Thies, Fabian Wagner, Noah Maul, Haijun Yu, Manuela Goldmann, Linda-Sophie Schneider, Mingxuan Gu, Siyuan Mei, Lukas Folle, Alexander Preuhs, Michael Manhart, Andreas Maier

Cone-beam computed tomography (CBCT) systems, with their flexibility, present a promising avenue for direct point-of-care medical imaging, particularly in critical scenarios such as acute stroke assessment. However, the integration of CBCT into clinical workflows faces challenges, primarily linked to the long scan duration, which results in patient motion during scanning and degrades image quality in the reconstructed volumes. This paper introduces a novel approach to CBCT motion estimation using a gradient-based optimization algorithm, which leverages generalized derivatives of the backprojection operator for cone-beam CT geometries. Building on that, a fully differentiable target function is formulated which grades the quality of the current motion estimate in reconstruction space. We drastically accelerate motion estimation, yielding a 19-fold speed-up compared to existing methods. Additionally, we investigate the architecture of the networks used for quality-metric regression and propose predicting voxel-wise quality maps, favoring autoencoder-like architectures over contracting ones. This modification improves gradient flow, leading to more accurate motion estimation. The presented method is evaluated through realistic experiments on head anatomy. It reduces the reprojection error from an initial average of 3 mm to 0.61 mm after motion compensation and consistently demonstrates superior performance compared to existing approaches. The analytic Jacobian for the backprojection operation, which is at the core of the proposed method, is made publicly available. In summary, this paper contributes to the integration of CBCT into clinical workflows by proposing a robust motion estimation approach that enhances efficiency and accuracy, addressing critical challenges in time-sensitive scenarios.
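The overall optimization loop can be sketched as below: per-projection motion parameters are updated by autograd through a differentiable reconstruction and a quality score; `reconstruct` and `quality_score` are hypothetical stand-ins, not the authors' operators.

```python
# Minimal sketch (assumed structure, not the authors' implementation): optimize
# per-projection rigid motion parameters by gradient descent on a differentiable
# image-quality score of the motion-compensated reconstruction.
import torch

def reconstruct(projections, motion):
    # Placeholder for a differentiable backprojection that applies the current
    # per-view motion estimate to the scanning geometry.
    shifted = projections + motion[:, :1, None]        # toy "motion" effect
    return shifted.mean(dim=0)                          # toy "reconstruction"

def quality_score(volume):
    # Placeholder for the learned voxel-wise quality map; here: negative total variation.
    tv = (volume[1:, :] - volume[:-1, :]).abs().mean() + (volume[:, 1:] - volume[:, :-1]).abs().mean()
    return -tv

projections = torch.randn(90, 128, 128)                 # 90 views (toy data)
motion = torch.zeros(90, 6, requires_grad=True)         # 6 rigid parameters per view
opt = torch.optim.Adam([motion], lr=1e-2)

for it in range(100):
    opt.zero_grad()
    loss = -quality_score(reconstruct(projections, motion))   # maximize quality
    loss.backward()                                           # gradients flow through the reconstruction
    opt.step()
```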

Citations: 0
AttriPrompter: Auto-Prompting with Attribute Semantics for Zero-shot Nuclei Detection via Visual-Language Pre-trained Models.
Pub Date : 2024-10-03 DOI: 10.1109/TMI.2024.3473745
Yongjian Wu, Yang Zhou, Jiya Saiyin, Bingzheng Wei, Maode Lai, Jianzhong Shou, Yan Xu

Large-scale visual-language pre-trained models (VLPMs) have demonstrated exceptional performance in downstream object detection through text prompts for natural scenes. However, their application to zero-shot nuclei detection in histopathology images remains relatively unexplored, mainly due to the significant gap between the characteristics of medical images and the web-originated text-image pairs used for pre-training. This paper investigates the potential of the object-level VLPM Grounded Language-Image Pre-training (GLIP) for zero-shot nuclei detection. Specifically, we propose an innovative auto-prompting pipeline, named AttriPrompter, comprising attribute generation, attribute augmentation, and relevance sorting, to avoid subjective manual prompt design. AttriPrompter utilizes VLPMs' text-to-image alignment to create semantically rich text prompts, which are then fed into GLIP for initial zero-shot nuclei detection. Additionally, we propose a self-trained knowledge distillation framework, where GLIP serves as the teacher and its initial predictions are used as pseudo labels, to address the challenges posed by high nuclei density, including missed detections, false positives, and overlapping instances. Our method exhibits remarkable performance in label-free nuclei detection, outperforming all existing unsupervised methods and demonstrating excellent generality. Notably, this work also highlights the astonishing potential of VLPMs pre-trained on natural image-text pairs for downstream tasks in the medical field. Code will be released at github.com/AttriPrompter.
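The attribute-driven prompt construction and relevance sorting can be sketched as follows; the attribute lists are made up, CLIP stands in for the VLPM used for ranking, and GLIP itself is not called here.

```python
# Minimal sketch (my reading of the auto-prompting idea, not the AttriPrompter
# release): compose attribute-rich text prompts for nuclei and rank them by
# image-text relevance with an off-the-shelf CLIP model, keeping the top ones.
import itertools
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

attributes = {
    "color": ["dark purple", "blue-violet"],        # assumed generated attributes
    "shape": ["round", "oval"],
    "texture": ["densely stained", "granular"],
}
candidates = [f"a {c}, {s}, {t} cell nucleus"
              for c, s, t in itertools.product(*attributes.values())]

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.new("RGB", (224, 224))                 # stand-in for a histopathology patch
inputs = processor(text=candidates, images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    scores = model(**inputs).logits_per_image[0]     # relevance of each prompt to the patch

top_prompts = [candidates[int(i)] for i in scores.argsort(descending=True)[:3]]
# top_prompts would then be fed to the detector (GLIP in the paper) as text queries.
```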

Citations: 0