DiffBoost: Enhancing Medical Image Segmentation via Text-Guided Diffusion Model

IEEE Transactions on Medical Imaging | Pub Date: 2024-12-17 | DOI: 10.1109/TMI.2024.3519307

Zheyuan Zhang;Lanhong Yao;Bin Wang;Debesh Jha;Gorkem Durak;Elif Keles;Alpay Medetalibeyoglu;Ulas Bagci

Volume 44, issue 9, pages 3670-3682. Full text: https://ieeexplore.ieee.org/document/10804854/


Abstract

Large-scale, highly varied, high-quality data are crucial for developing robust deep-learning models for medical applications, because such data can improve generalization and reduce overfitting. However, high-quality labeled data are scarce, which presents a significant challenge. This paper addresses that challenge with DiffBoost, a controllable diffusion model for medical image synthesis. We leverage recent diffusion probabilistic models to generate realistic and diverse synthetic medical images, and we guide the synthesis with edge information of objects so that the generated images preserve the essential characteristics of the originals. The approach is designed so that synthesized samples adhere to medically relevant constraints and preserve the underlying structure of the imaging data. Because sampling from the diffusion model is random, we can generate an arbitrary number of synthetic images with diverse appearances. To validate the method, we conduct an extensive set of medical image segmentation experiments on multiple datasets, including ultrasound breast (+13.87%), CT spleen (+0.38%), and MRI prostate (+7.78%), achieving significant improvements over the baseline segmentation methods. These results demonstrate the effectiveness of DiffBoost for medical image segmentation and show the feasibility of introducing a first-ever text-guided diffusion model for general medical image segmentation tasks. With carefully designed ablation experiments, we investigate the influence of various data augmentations, hyper-parameter settings, the patch size used to generate random merging masks, and the combined effect with different network architectures. Source code and checkpoints are available at https://github.com/NUBagciLab/DiffBoost.
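The abstract names a patch size for generating random merging masks but does not say how those masks are built or applied. The sketch below shows one plausible reading, offered only as an illustration: a binary mask that is constant within each square patch selects, pixel by pixel, between a real image and a synthetic one. The function names, the `keep_prob` parameter, and the blending rule are all assumptions and are not taken from the DiffBoost paper or repository. Plain Python lists stand in for image arrays to keep the example dependency-free.

```python
import random


def random_patch_mask(height, width, patch_size, keep_prob=0.5, seed=None):
    """Return a height x width binary mask that is constant within each patch.

    Hypothetical helper: the paper's actual mask construction may differ.
    """
    rng = random.Random(seed)
    rows = -(-height // patch_size)  # ceiling division
    cols = -(-width // patch_size)
    # One Bernoulli draw per patch: 1 keeps the real pixel, 0 takes the synthetic one.
    coarse = [
        [1 if rng.random() < keep_prob else 0 for _ in range(cols)]
        for _ in range(rows)
    ]
    # Expand the coarse grid so every pixel inherits its patch's value.
    return [
        [coarse[y // patch_size][x // patch_size] for x in range(width)]
        for y in range(height)
    ]


def merge_images(real, synthetic, mask):
    """Pixel-wise selection: mask == 1 picks the real pixel, mask == 0 the synthetic one."""
    return [
        [r if m else s for r, s, m in zip(real_row, syn_row, mask_row)]
        for real_row, syn_row, mask_row in zip(real, synthetic, mask)
    ]


if __name__ == "__main__":
    real = [[1] * 8 for _ in range(8)]       # stand-in for a real image
    synthetic = [[2] * 8 for _ in range(8)]  # stand-in for a generated image
    mask = random_patch_mask(8, 8, patch_size=4, seed=0)
    for row in merge_images(real, synthetic, mask):
        print(row)
```

Under this reading, a smaller patch size mixes the two images at a finer granularity, while a patch as large as the image selects one image wholesale, which is the kind of trade-off an ablation over patch size would probe.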


