Pub Date: 2025-07-14 | DOI: 10.1109/TMI.2025.3588836
Ting Jin;Xingran Xie;Qingli Li;Xinxing Li;Yan Wang
Histology analysis of the tumor micro-environment integrated with genomic assays is widely regarded as the cornerstone of cancer analysis and survival prediction. This paper jointly incorporates genomics and Whole Slide Images (WSIs) and addresses the primary challenges in multi-modality prognosis analysis: 1) high-order relevance is difficult to model from dimensionally imbalanced gigapixel WSIs and tens of thousands of genetic sequences, and 2) the lack of medical expertise and clinical knowledge hampers the effectiveness of prognosis-oriented multi-modal fusion. Given the nature of the prognosis task, statistical priors and clinical knowledge are essential for estimating the likelihood of survival over time, yet they remain under-studied. To this end, we propose a prognosis-oriented image-omics fusion framework, dubbed Clinical Stage Prompt induced Multimodal Prognosis (CiMP). Concretely, we leverage an advanced large language model (LLM) to generate descriptions from structured clinical records and use the generated clinical staging prompts to intentionally query critical prognosis-related information from each modality. In addition, we propose a Group Multi-Head Self-Attention module to capture structured group-specific features within cohorts of genomic data. Experimental results on five TCGA datasets show the superiority of our method, which achieves state-of-the-art performance compared to previous multi-modal prognostic models. Furthermore, the clinical interpretability analysis and discussion highlight its immense potential for further medical applications. Our code will be released at https://github.com/DeepMed-Lab-ECNU/CiMP/
Title: Clinical Stage Prompt Induced Multi-Modal Prognosis | IEEE Transactions on Medical Imaging, vol. 44, no. 12, pp. 5065-5076.
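The Group Multi-Head Self-Attention module described above can be illustrated with a minimal sketch: a single-head, group-wise attention in NumPy where gene tokens only attend to members of their own functional group. All names, dimensions, and the grouping itself are hypothetical, not the paper's implementation:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def group_self_attention(tokens, group_ids, dim=32, seed=0):
    """Toy group-wise self-attention: attention is computed independently
    inside each genomic group (e.g. a pathway cohort), so a token never
    attends across group boundaries."""
    rng = np.random.default_rng(seed)
    n, d = tokens.shape
    Wq, Wk, Wv = (rng.standard_normal((d, dim)) for _ in range(3))
    out = np.zeros((n, dim))
    for g in np.unique(group_ids):
        idx = np.where(group_ids == g)[0]
        q, k, v = tokens[idx] @ Wq, tokens[idx] @ Wk, tokens[idx] @ Wv
        attn = softmax(q @ k.T / np.sqrt(dim))   # attention within the group only
        out[idx] = attn @ v
    return out

genes = np.random.default_rng(1).standard_normal((6, 16))  # 6 gene tokens
groups = np.array([0, 0, 1, 1, 1, 2])                      # 3 functional groups
feats = group_self_attention(genes, groups)
print(feats.shape)  # (6, 32)
```

A multi-head variant would run several such maps in parallel and concatenate; the per-group restriction is what distinguishes this from ordinary self-attention over all genes.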
Pub Date: 2025-07-14 | DOI: 10.1109/TMI.2025.3588789
Kexin Deng;Yan Luo;Hongzhi Zuo;Yuwen Chen;Liujie Gu;Mingyuan Liu;Hengrong Lan;Jianwen Luo;Cheng Ma
Photoacoustic computed tomography (PACT) is an emerging hybrid imaging modality with potential applications in biomedicine. A major roadblock to the widespread adoption of PACT is the limited number of detectors, which gives rise to spatial aliasing and manifests as streak artifacts in the reconstructed image. A brute-force solution is to increase the number of detectors, which, however, is often undesirable due to escalated costs. In this study, we present a novel self-supervised learning approach to overcome this long-standing challenge. We found that small blocks of PACT channel data show similarity at various downsampling rates. Based on this observation, a neural network trained on downsampled data can reliably perform accurate interpolation without requiring densely sampled ground-truth data, which is typically unavailable in practice. Our method has been validated through numerical simulations, controlled phantom experiments, and ex vivo and in vivo animal tests across multiple PACT systems. We have demonstrated that our technique provides an effective and cost-efficient solution to the under-sampling issue in PACT, thereby enhancing the capabilities of this imaging technology.
Title: Self-Supervised Upsampling for Reconstructions With Generalized Enhancement in Photoacoustic Computed Tomography | IEEE Transactions on Medical Imaging, vol. 44, no. 12, pp. 5117-5127.
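The self-supervised setup above can be sketched as training-pair construction: the sparse channel data we actually have serves as the target, and a further-downsampled copy serves as the input, so no densely sampled ground truth is needed. The array shapes and sampling rate here are illustrative, not taken from the paper:

```python
import numpy as np

def make_selfsup_pairs(channel_data, rate=2):
    """Build self-supervised training pairs from already-sparse PACT channel
    data: 'input' is a further-downsampled copy, 'target' is the data we have,
    exploiting the similarity of channel-data blocks across sampling rates."""
    target = channel_data          # (n_channels, n_samples), sparse already
    inp = channel_data[::rate]     # drop channels again to mimic heavier undersampling
    return inp, target

sinogram = np.random.default_rng(0).standard_normal((128, 1024))
x, y = make_selfsup_pairs(sinogram, rate=2)
print(x.shape, y.shape)  # (64, 1024) (128, 1024)
```

A network trained to map `x` back to `y` learns 2x channel interpolation; at inference it is applied to the original sparse data to synthesize the missing detectors.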
Pub Date: 2025-07-11 | DOI: 10.1109/TMI.2025.3588167
Yuchen Yuan;Xi Wang;Jinpeng Li;Guangyong Chen;Pheng-Ann Heng
Skin lesion segmentation is vital for the early detection, diagnosis, and treatment of melanoma, yet it remains challenging due to significant variations in lesion attributes (e.g., color, size, shape), ambiguous boundaries, and noise interference. Recent advances have focused on capturing contextual information and incorporating boundary priors to handle challenging lesions. However, there has been limited exploration of the explicit analysis of the inherent patterns of skin lesions, a crucial aspect of the knowledge-driven decision-making process used by clinical experts. In this work, we introduce a novel approach called Probabilistic Attribute Learning (PAL), which leverages knowledge of lesion patterns to achieve enhanced performance on challenging lesions. Recognizing that the lesion patterns exhibited in each image can be properly depicted by disentangled attributes, we begin by explicitly estimating the distribution of each attribute as a distinct Gaussian, whose mean and variance indicate the most likely pattern of that attribute and its variation. Using Monte Carlo sampling, we iteratively draw multiple samples from these distributions to capture various potential patterns for each attribute. These samples are then merged through an effective attribute fusion technique, resulting in diverse representations that comprehensively depict the lesion class. By performing pixel-class proximity matching between each pixel-wise representation and the diverse class-wise representations, we significantly enhance the model's robustness. Extensive experiments on two public skin lesion datasets and one unified polyp lesion dataset demonstrate the effectiveness and strong generalization ability of our method. Codes are available at https://github.com/IsYuchenYuan/PAL
Title: PAL: Boosting Skin Lesion Segmentation via Probabilistic Attribute Learning | IEEE Transactions on Medical Imaging, vol. 44, no. 12, pp. 5183-5196.
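The sampling-and-matching pipeline above can be sketched numerically: draw Monte Carlo samples from per-attribute Gaussians, fuse them into class representations (simple averaging stands in for the paper's fusion technique), then score pixels by cosine proximity. Shapes and the averaging fusion are assumptions for illustration:

```python
import numpy as np

def sample_class_representations(attr_mu, attr_var, n_samples=8, seed=0):
    """Draw Monte Carlo samples from per-attribute Gaussians and fuse them
    (here by averaging over attributes) into diverse class-wise representations."""
    rng = np.random.default_rng(seed)
    # attr_mu / attr_var: (n_attributes, d) per-attribute mean and variance
    samples = rng.normal(loc=attr_mu[None], scale=np.sqrt(attr_var)[None],
                         size=(n_samples,) + attr_mu.shape)  # (S, A, d)
    return samples.mean(axis=1)                              # fuse -> (S, d)

def pixel_class_proximity(pixel_feats, class_reps):
    """Cosine proximity of each pixel feature to every sampled class rep."""
    p = pixel_feats / np.linalg.norm(pixel_feats, axis=-1, keepdims=True)
    c = class_reps / np.linalg.norm(class_reps, axis=-1, keepdims=True)
    return p @ c.T                                           # (n_pixels, S)

mu = np.zeros((3, 16)); var = np.full((3, 16), 0.1)          # 3 attributes, 16-d
reps = sample_class_representations(mu, var)                 # (8, 16)
prox = pixel_class_proximity(
    np.random.default_rng(1).standard_normal((5, 16)), reps)
print(prox.shape)  # (5, 8)
```

Matching each pixel against several sampled class representations, rather than one prototype, is what lends robustness to lesion-pattern variation.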
Pub Date: 2025-07-11 | DOI: 10.1109/TMI.2025.3588157
Xiaolong Deng;Huisi Wu
Accurate segmentation of the left ventricle in echocardiography is critical for diagnosing and treating cardiovascular diseases. However, accurate segmentation remains challenging due to the limitations of ultrasound imaging. Although numerous image and video segmentation methods have been proposed, existing methods still fail to solve this task effectively, constrained as they are by sparse annotations. To address this problem, we propose a novel semi-supervised segmentation framework named NCM-Net for echocardiography. We first propose the neighborhood correlation mining (NCM) module, which mines the correlations between query features and their spatiotemporal neighborhoods to resist the influence of noise. The module also captures cross-scale contextual correlations between pixels spatially to further refine features, thus alleviating the impact of noise on echocardiography segmentation. To further improve segmentation accuracy, we propose unreliable-pixels masked attention (UMA): by masking reliable pixels, it pays extra attention to unreliable pixels to refine the segmentation boundary. Further, we apply cross-frame boundary constraints to the final predictions to optimize their temporal consistency. Through extensive experiments on two publicly available datasets, CAMUS and EchoNet-Dynamic, we demonstrate the effectiveness of the proposed method, which achieves state-of-the-art performance and outstanding temporal consistency. Codes are available at https://github.com/dengxl0520/NCMNet
Title: Echocardiography Video Segmentation via Neighborhood Correlation Mining | IEEE Transactions on Medical Imaging, vol. 44, no. 12, pp. 5172-5182.
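The reliable/unreliable split that UMA builds on can be sketched as a confidence threshold on per-pixel class probabilities; the threshold value and array layout here are illustrative, and the actual masked-attention computation in NCM-Net is more involved:

```python
import numpy as np

def unreliable_pixel_mask(probs, tau=0.9):
    """Mark pixels whose top-class probability falls below tau as unreliable;
    masked attention would then focus on exactly these pixels when refining
    the segmentation boundary."""
    confidence = probs.max(axis=-1)   # (H, W) top-class probability
    return confidence < tau           # True = unreliable, receives extra attention

probs = np.array([[[0.95, 0.05], [0.60, 0.40]],
                  [[0.55, 0.45], [0.99, 0.01]]])  # (H, W, n_classes)
mask = unreliable_pixel_mask(probs, tau=0.9)
print(mask)  # [[False  True] [ True False]]
```

Low-confidence pixels cluster along ambiguous ultrasound boundaries, which is why attending specifically to them sharpens the contour.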
Pub Date: 2025-07-10 | DOI: 10.1109/TMI.2025.3587636
Jihun Kang;Eui Cheol Jung;Hyun Jung Koo;Dong Hyun Yang;Hojin Ha
In this study, we present enhanced physics-informed neural networks (PINNs) designed to address flow field errors in four-dimensional flow magnetic resonance imaging (4D Flow MRI). Flow field errors, which typically occur in high-velocity regions, lead to inaccurate velocity fields and underestimated flow rates. We propose incorporating flow rate constraints to ensure physical consistency across cross-sections. The framework includes optimization strategies to improve convergence, stability, and accuracy: artificial viscosity modeling, projecting conflicting gradients (PCGrad), and Euclidean norm scaling are applied to balance the loss functions during training. Performance was validated using 2D computational fluid dynamics (CFD) with synthetic errors, in-vitro 4D flow MRI mimicking the aortic valve, and in-vivo 4D flow MRI from patients with aortic regurgitation and aortic stenosis. The study demonstrates considerable improvements in flow field error correction, denoising, and super-resolution. Notably, the proposed PINNs provide accurate flow rate reconstruction in stenotic and high-velocity regions. This approach extends the applicability of 4D flow MRI by providing reliable hemodynamics in the post-processing stage.
Title: Flow-Rate-Constrained Physics-Informed Neural Networks for Flow Field Error Correction in 4-D Flow Magnetic Resonance Imaging | IEEE Transactions on Medical Imaging, vol. 44, no. 12, pp. 5155-5171.
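The flow-rate constraint above admits a simple numerical sketch: integrate the normal velocity over each cross-section to get a flow rate Q_i, then penalize disagreement between sections (mass conservation says they should match in a rigid vessel). This is one plausible form of such a loss term, not the paper's exact formulation:

```python
import numpy as np

def flow_rate_consistency_loss(velocity, areas):
    """Penalize deviation of per-cross-section flow rates from their mean,
    encouraging physically consistent flow along the vessel.
    velocity: (n_sections, n_points) normal velocity samples per section
    areas:    (n_sections, n_points) area element per sample point"""
    q = (velocity * areas).sum(axis=1)    # Q_i = sum of v * dA per section
    return np.mean((q - q.mean()) ** 2)

v = np.array([[1.0, 1.0], [0.5, 1.5], [1.2, 0.8]])  # three cross-sections
dA = np.ones_like(v)
print(flow_rate_consistency_loss(v, dA))  # 0.0 (every section has Q_i = 2.0)
```

In a PINN this term would be added to the data-fidelity and Navier-Stokes residual losses, with the balancing handled by schemes like PCGrad as the abstract notes.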
Pub Date: 2025-07-08 | DOI: 10.1109/TMI.2025.3587131
Zhouzhuo Zhang;Juncheng Yan;Yuxuan Shi;Zhiming Cui;Jun Xu;Dinggang Shen
Metal dental implants may introduce metal artifacts (MA) during CBCT imaging, causing significant interference in subsequent diagnosis. In recent years, many deep learning methods for metal artifact reduction (MAR) have been proposed. Due to the large gap between synthetic and clinical MA, supervised MAR methods may perform poorly in clinical settings, while many existing unsupervised MAR methods trained on clinical data often produce incorrect dental morphology. To alleviate these problems, we propose a new MAR method, Coupled Diffusion Models (CDM), for clinical dental CBCT images. Specifically, we train two diffusion models, one on clinical MA-degraded images and one on clinical clean images, to obtain their respective priors. During the denoising process, the variances of the noise levels are estimated from the MA images and the diffusion priors. We then develop a noise transformation module between the two diffusion models that converts the noisy MA image into a new initial value for the denoising process. These designs effectively exploit the inherent transformation between misaligned MA-degraded and clean images. Additionally, we introduce an MA-adaptive inference technique to better accommodate the varying MA degradation across different areas of an image. Experiments on our clinical dataset demonstrate that CDM outperforms the comparison methods on both objective metrics and visual quality, especially under severe MA degradation. We will publicly release our code.
Title: Coupled Diffusion Models for Metal Artifact Reduction of Clinical Dental CBCT Images | IEEE Transactions on Medical Imaging, vol. 44, no. 12, pp. 5103-5116.
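One way to read the noise-level estimation step: pick the diffusion timestep whose marginal noise standard deviation matches the level estimated from the MA-degraded image, then inject the image mid-way into the clean-image model's denoising chain. This is an SDEdit-style heuristic offered as an illustration, not necessarily the paper's exact noise transformation module; the schedule and names are assumptions:

```python
import numpy as np

def matching_timestep(sigma_hat, alphas_cumprod):
    """Pick the diffusion step whose marginal noise std, sqrt(1 - alpha_bar_t),
    best matches the noise level estimated from the MA-degraded image."""
    stds = np.sqrt(1.0 - alphas_cumprod)
    return int(np.argmin(np.abs(stds - sigma_hat)))

# Standard linear DDPM beta schedule, 1000 steps (illustrative choice).
betas = np.linspace(1e-4, 0.02, 1000)
alphas_cumprod = np.cumprod(1.0 - betas)
t0 = matching_timestep(0.5, alphas_cumprod)   # start denoising from step t0
print(t0)
```

Starting the reverse process from `t0` instead of the final step lets the clean-image prior reshape only as much structure as the artifact noise warrants.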
Pub Date: 2025-07-07 | DOI: 10.1109/TMI.2025.3586622
Tianling Lyu;Xusheng Zhang;Xinyun Zhong;Zhan Wu;Yan Xi;Wei Zhao;Yang Chen;Yuanjing Feng;Wentao Zhu
Cone-beam CT (CBCT) is extensively used in medical diagnosis and treatment. Despite its large longitudinal field of view (FoV), the horizontal FoV of CBCT systems is severely limited by the detector width. Certain commercial CBCT systems enlarge the horizontal FoV by employing an offset detector. However, this method necessitates a 360° full circular scanning trajectory, which increases the scanning time and is incompatible with certain CBCT system models. In this paper, we investigate the feasibility of large-FoV imaging under short-scan trajectories with an additional X-ray source. We propose a dual-source CBCT geometry together with two corresponding image reconstruction algorithms: the first is based on cone-parallel rebinning, and the second employs a modified Parker weighting scheme. Theoretical calculations demonstrate that the proposed geometry achieves a wider horizontal FoV than the 90% detector offset geometry (radius of 214.83 mm vs. 198.99 mm) with a significantly reduced rotation angle (less than 230° vs. 360°). As demonstrated by experiments, the proposed geometry and reconstruction algorithms obtain imaging quality within the FoV comparable to conventional CBCT imaging techniques. Implementing the proposed geometry is straightforward and does not substantially increase development costs. It possesses the capacity to expand CBCT applications even further.
Title: Dual-Source CBCT for Large FoV Imaging Under Short-Scan Trajectories | IEEE Transactions on Medical Imaging, vol. 44, no. 12, pp. 5051-5064.
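The modified Parker weighting mentioned above builds on the classic Parker weights for a fan-beam short scan over [0, pi + 2*gamma_m], which redundancy-weight rays measured twice. The textbook baseline (not the paper's modified scheme) can be written directly:

```python
import numpy as np

def parker_weight(beta, gamma, gamma_m):
    """Classic Parker weight for projection angle beta and fan angle gamma,
    with fan half-angle gamma_m; ramps up at the start of the short scan,
    is flat in the middle, and ramps down at the end."""
    if 0 <= beta < 2 * (gamma_m - gamma):
        return np.sin(np.pi / 4 * beta / (gamma_m - gamma)) ** 2
    if 2 * (gamma_m - gamma) <= beta < np.pi - 2 * gamma:
        return 1.0
    if np.pi - 2 * gamma <= beta <= np.pi + 2 * gamma_m:
        return np.sin(np.pi / 4 * (np.pi + 2 * gamma_m - beta) / (gamma_m + gamma)) ** 2
    return 0.0

gm = np.deg2rad(10)                        # 10 degree fan half-angle
print(parker_weight(np.pi / 2, 0.0, gm))   # 1.0 (flat region mid-scan)
print(parker_weight(0.0, 0.0, gm))         # 0.0 (ramp starts at zero)
```

The weights ensure each ray and its conjugate sum to unity, which is what makes short-scan filtered backprojection artifact-free; a dual-source geometry needs a modified version because two sources cover the angular range jointly.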
Pub Date: 2025-06-25 | DOI: 10.1109/TMI.2025.3579214
Jieru Yao;Guangyu Guo;Zhaohui Zheng;Qiang Xie;Longfei Han;Dingwen Zhang;Junwei Han
Nuclei instance segmentation and classification are fundamental and challenging tasks in whole slide imaging (WSI) analysis. Most dense nuclei prediction studies rely heavily on crowd-labelled data on high-resolution digital images, leading to a time-consuming, expertise-intensive paradigm. Recently, Vision-Language Models (VLMs), which learn rich cross-modal correlations from large-scale image-text pairs without tedious annotations, have been intensively investigated. Inspired by this, we build a novel framework, called PromptNu, that aims to infuse abundant nuclei knowledge into the training of the nuclei instance recognition model through vision-language contrastive learning and prompt engineering. Specifically, our approach starts with the creation of multifaceted prompts that integrate comprehensive nuclei knowledge, including visual insights from the GPT-4V model, statistical analyses, and expert insights from the pathology field. We then propose a novel prompting methodology consisting of two pivotal vision-language contrastive learning components: Prompting Nuclei Representation Learning (PNuRL) and Prompting Nuclei Dense Prediction (PNuDP), which integrate the expertise embedded in pre-trained VLMs and the multifaceted prompts into the feature extraction and prediction processes, respectively. Comprehensive experiments on six datasets covering diverse WSI scenarios demonstrate the effectiveness of our method for both nuclei instance segmentation and classification. The code is available at https://github.com/NucleiDet/PromptNu
Title: Prompting Vision-Language Model for Nuclei Instance Segmentation and Classification | IEEE Transactions on Medical Imaging, vol. 44, no. 11, pp. 4567-4578.
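The vision-language contrastive objective underpinning frameworks like this is the symmetric InfoNCE (CLIP-style) loss over matched image/prompt pairs. A minimal NumPy version, offered as background rather than the paper's PNuRL/PNuDP losses:

```python
import numpy as np

def cross_entropy(logits, labels):
    """Mean cross-entropy with integer labels, via a stable log-softmax."""
    z = logits - logits.max(axis=1, keepdims=True)
    logp = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -logp[np.arange(len(labels)), labels].mean()

def clip_contrastive_loss(img_emb, txt_emb, temperature=0.07):
    """Symmetric InfoNCE: the i-th image should score highest against the
    i-th text prompt, and vice versa."""
    img = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    txt = txt_emb / np.linalg.norm(txt_emb, axis=1, keepdims=True)
    logits = img @ txt.T / temperature        # (N, N), diagonal = matched pairs
    labels = np.arange(len(img_emb))
    return 0.5 * (cross_entropy(logits, labels) + cross_entropy(logits.T, labels))

emb = np.eye(4)                               # 4 perfectly aligned pairs
loss = clip_contrastive_loss(emb, emb)
print(loss < 1e-3)  # True: aligned pairs give near-zero loss
```

In PromptNu's setting the text side would be the multifaceted nuclei prompts and the image side nucleus-level features, but the pairing structure of the loss is the same.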
Pub Date : 2025-06-25DOI: 10.1109/TMI.2025.3580659
Zailong Chen;Yingshu Li;Zhanyu Wang;Peng Gao;Johan Barthelemy;Luping Zhou;Lei Wang
Radiology report generation using large language models has recently produced reports with more realistic styles and better language fluency. However, their clinical accuracy remains inadequate. Considering the significant imbalance between clinical phrases and general descriptions in a report, we argue that using an entire report for supervision is problematic, as it fails to emphasize the crucial clinical phrases, which require focused learning. To address this issue, we propose a multi-phased supervision method inspired by the spirit of curriculum learning, where models are trained with gradually increasing task complexity. Our approach organizes the learning process into structured phases at different levels of semantic granularity, each building on the previous one to enhance the model. During the first phase, disease labels are used to supervise the model, equipping it with the ability to identify underlying diseases. The second phase progresses to entity-relation triples, which guide the model to describe the associated clinical findings. Finally, in the third phase, we introduce conventional whole-report supervision to quickly adapt the model for report generation. Throughout the phased training, the model remains the same and consistently operates in generation mode. As experimentally demonstrated, this change in the way of supervision enhances report generation, achieving state-of-the-art performance in both language fluency and clinical accuracy. Our work underscores the importance of training-process design in radiology report generation. Our code is available at https://github.com/zailongchen/MultiP-R2Gen
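The three supervision phases described above can be sketched as a schedule that swaps only the target text while the generator itself is unchanged. This is a minimal illustrative sketch, not the MultiP-R2Gen code; the record keys and function names are assumptions.

```python
def supervision_target(record, phase):
    """Pick the supervision signal for the current curriculum phase."""
    if phase == 1:
        # Coarsest level: disease labels only.
        return " ".join(record["disease_labels"])
    if phase == 2:
        # Finer level: entity-relation triples describing clinical findings.
        return " ; ".join("{} {} {}".format(*t) for t in record["triples"])
    # Final level: conventional whole-report supervision.
    return record["report"]

def phased_training_schedule(records, phases=(1, 2, 3)):
    """Yield (input_id, target_text) pairs phase by phase.

    The model consumed by this schedule stays the same across phases and
    always operates in generation mode; only the targets grow in complexity.
    """
    for phase in phases:
        for rec in records:
            yield rec["image_id"], supervision_target(rec, phase)
```

A training loop would iterate this schedule and fine-tune the same generator on each successive target granularity.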
{"title":"Enhancing Radiology Report Generation via Multi-Phased Supervision","authors":"Zailong Chen;Yingshu Li;Zhanyu Wang;Peng Gao;Johan Barthelemy;Luping Zhou;Lei Wang","doi":"10.1109/TMI.2025.3580659","DOIUrl":"10.1109/TMI.2025.3580659","url":null,"abstract":"Radiology report generation using large language models has recently produced reports with more realistic styles and better language fluency. However, their clinical accuracy remains inadequate. Considering the significant imbalance between clinical phrases and general descriptions in a report, we argue that using an entire report for supervision is problematic as it fails to emphasize the crucial clinical phrases, which require focused learning. To address this issue, we propose a multi-phased supervision method, inspired by the spirit of curriculum learning where models are trained by gradually increasing task complexity. Our approach organizes the learning process into structured phases at different levels of semantical granularity, each building on the previous one to enhance the model. During the first phase, disease labels are used to supervise the model, equipping it with the ability to identify underlying diseases. The second phase progresses to use entity-relation triples to guide the model to describe associated clinical findings. Finally, in the third phase, we introduce conventional whole-report-based supervision to quickly adapt the model for report generation. Throughout the phased training, the model remains the same and consistently operates in the generation mode. As experimentally demonstrated, this proposed change in the way of supervision enhances report generation, achieving state-of-the-art performance in both language fluency and clinical accuracy. Our work underscores the importance of training process design in radiology report generation. 
Our code is available on <uri>https://github.com/zailongchen/MultiP-R2Gen</uri>","PeriodicalId":94033,"journal":{"name":"IEEE transactions on medical imaging","volume":"44 11","pages":"4666-4677"},"PeriodicalIF":0.0,"publicationDate":"2025-06-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144487975","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2025-06-20DOI: 10.1109/TMI.2025.3581108
Yuyang Du;Kexin Chen;Yue Zhan;Chang Han Low;Mobarakol Islam;Ziyu Guo;Yueming Jin;Guangyong Chen;Pheng Ann Heng
Visual question answering (VQA) plays a vital role in advancing surgical education. However, due to privacy concerns over patient data, training a VQA model with previously used data is restricted, making it necessary to adopt an exemplar-free continual learning (CL) approach. Previous CL studies in the surgical field neglected two critical issues: i) significant domain shifts caused by the wide range of surgical procedures collected from various sources, and ii) the data imbalance problem caused by the unequal occurrence of medical instruments and surgical procedures. This paper addresses these challenges with a multimodal large language model (LLM) and an adaptive weight assignment strategy. First, we developed a novel LLM-assisted multi-teacher CL framework (named LMT++), which harnesses the strength of a multimodal LLM as a supplementary teacher. The LLM’s strong generalization ability, together with its good understanding of the surgical domain, helps to bridge the knowledge gap arising from domain shifts and data imbalances. To incorporate the LLM into our CL framework, we further propose an approach to processing the training data that converts complex LLM embeddings into logit values used within our CL training framework. Moreover, we design an adaptive weight assignment approach that balances the generalization ability of the LLM against the domain expertise of the conventional VQA models obtained in previous training processes within the CL framework. Finally, we created a new surgical VQA dataset for model evaluation. Comprehensive experimental findings on these datasets show that our approach surpasses state-of-the-art CL methods.
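The two key mechanics above, turning LLM embeddings into per-answer logits and adaptively weighting the LLM teacher against the specialized VQA teachers, can be sketched as follows. This is one plausible reading under stated assumptions (cosine similarity to answer-candidate embeddings, a scalar blend weight `alpha`), not the LMT++ implementation.

```python
import math

def embeddings_to_logits(answer_embedding, candidate_embeddings):
    """Convert an LLM answer embedding into one logit per answer candidate,
    here via cosine similarity (an assumed, illustrative choice)."""
    def cos(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a)) or 1.0
        nb = math.sqrt(sum(x * x for x in b)) or 1.0
        return dot / (na * nb)
    return [cos(answer_embedding, c) for c in candidate_embeddings]

def blend_teachers(llm_logits, vqa_logits, alpha):
    """Adaptive weight assignment: alpha in [0, 1] trades off the generalist
    LLM teacher against the domain-expert VQA teacher for this sample."""
    return [alpha * l + (1 - alpha) * v for l, v in zip(llm_logits, vqa_logits)]
```

In a distillation setup, the blended logits would serve as the soft target for the student model, with `alpha` chosen per sample or per task rather than fixed.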
{"title":"LMT++: Adaptively Collaborating LLMs With Multi-Specialized Teachers for Continual VQA in Robotic Surgical Videos","authors":"Yuyang Du;Kexin Chen;Yue Zhan;Chang Han Low;Mobarakol Islam;Ziyu Guo;Yueming Jin;Guangyong Chen;Pheng Ann Heng","doi":"10.1109/TMI.2025.3581108","DOIUrl":"10.1109/TMI.2025.3581108","url":null,"abstract":"Visual question answering (VQA) plays a vital role in advancing surgical education. However, due to the privacy concern of patient data, training VQA model with previously used data becomes restricted, making it necessary to use the exemplar-free continual learning (CL) approach. Previous CL studies in the surgical field neglected two critical issues: i) significant domain shifts caused by the wide range of surgical procedures collected from various sources, and ii) the data imbalance problem caused by the unequal occurrence of medical instruments or surgical procedures. This paper addresses these challenges with a multimodal large language model (LLM) and an adaptive weight assignment strategy. First, we developed a novel LLM-assisted multi-teacher CL framework (named LMT++), which could harness the strength of a multimodal LLM as a supplementary teacher. The LLM’s strong generalization ability, as well as its good understanding of the surgical domain, help to address the knowledge gap arising from domain shifts and data imbalances. To incorporate the LLM in our CL framework, we further proposed an innovative approach to process the training data, which involves the conversion of complex LLM embeddings into logits value used within our CL training framework. Moreover, we design an adaptive weight assignment approach that balances the generalization ability of the LLM and the domain expertise of conventional VQA models obtained in previous model training processes within the CL framework. Finally, we created a new surgical VQA dataset for model evaluation. 
Comprehensive experimental findings on these datasets show that our approach surpasses state-of-the-art CL methods.","PeriodicalId":94033,"journal":{"name":"IEEE transactions on medical imaging","volume":"44 11","pages":"4678-4689"},"PeriodicalIF":0.0,"publicationDate":"2025-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144335331","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}