Guided synthesis of annotated lung CT images with pathologies using a multi-conditioned denoising diffusion probabilistic model (mDDPM).
Authors: Arjun Krishna, Ge Wang, Klaus Mueller
Journal: Physics in medicine and biology (Impact Factor 3.3; JCR Q2, Engineering, Biomedical)
Publication date: 2025-03-06
Publication type: Journal Article
DOI: https://doi.org/10.1088/1361-6560/adb9b3
Citation count: 0
Abstract
Objective. The training of AI models for medical image diagnostics requires highly accurate, diverse, and large training datasets with annotations and pathologies. Unfortunately, due to privacy and other constraints the amount of medical image data available for AI training remains limited, and this scarcity is exacerbated by the high overhead required for annotation. We address this challenge by introducing a new controlled framework for the generation of synthetic images complete with annotations, incorporating multiple conditional specifications as inputs.

Approach. Using lung CT as a case study, we employ a denoising diffusion probabilistic model to train an unconditional large-scale generative model. We extend this with a classifier-free sampling strategy to develop a robust generation framework. This approach enables the generation of constrained and annotated lung CT images that accurately depict anatomy, successfully deceiving experts into perceiving them as real. Most notably, we demonstrate the generalizability of our multi-conditioned sampling approach by producing images with specific pathologies, such as lung nodules at designated locations, within the constrained anatomy.

Main results. Our experiments reveal that our proposed approach can effectively produce constrained, annotated and diverse lung CT images that maintain anatomical consistency and fidelity, even for annotations not present in the training datasets. Moreover, our results highlight the superior performance of controlled generative frameworks of this nature compared to nearly every state-of-the-art image generative model when trained on comparable large medical datasets. Finally, we highlight how our approach can be extended to other medical imaging domains, further underscoring the versatility of our method.

Significance. The significance of our work lies in its robust approach for generating synthetic images with annotations, facilitating the creation of highly accurate and diverse training datasets for AI applications and its wider applicability to other imaging modalities in medical domains. Our demonstrated capability to faithfully represent anatomy and pathology in generated medical images holds significant potential for various medical imaging applications, with high promise to lead to improved diagnostic accuracy and patient care.
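The abstract does not spell out how the classifier-free sampling strategy combines several conditions. As a rough illustration only, the sketch below shows the generic classifier-free guidance rule extended additively to multiple conditions: the guided noise estimate is the unconditional prediction plus a weighted sum of the differences between each conditional prediction and the unconditional one. The function names, the toy denoiser, the additive composition, and the guidance weights are all assumptions made for this example and are not taken from the paper.

```python
import numpy as np


def guided_noise(eps_uncond, eps_conds, weights):
    """Generic multi-condition classifier-free guidance (illustrative).

    Returns eps_uncond + sum_i w_i * (eps_cond_i - eps_uncond).
    The additive composition over conditions is an assumption of this
    sketch, not necessarily the rule used by the authors.
    """
    eps_uncond = np.asarray(eps_uncond, dtype=float)
    guided = eps_uncond.copy()
    for eps_c, w in zip(eps_conds, weights):
        guided += w * (np.asarray(eps_c, dtype=float) - eps_uncond)
    return guided


def toy_denoiser(x_t, cond=None):
    """Hypothetical stand-in for a trained noise-prediction network."""
    base = 0.1 * x_t
    return base if cond is None else base + cond


# Toy 2x2 "image" with two hypothetical conditions:
# an anatomy map and a nodule mask marking one location.
x_t = np.ones((2, 2))
anatomy = np.full((2, 2), 0.5)
nodule = np.zeros((2, 2))
nodule[0, 0] = 1.0

eps = guided_noise(
    toy_denoiser(x_t),
    [toy_denoiser(x_t, anatomy), toy_denoiser(x_t, nodule)],
    weights=[2.0, 3.0],
)
```

With all weights set to zero the rule reduces to the unconditional prediction, which is why a single network trained with randomly dropped conditions can serve both roles in this kind of sampling.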
About the journal:
The development and application of theoretical, computational and experimental physics to medicine, physiology and biology. Topics covered are: therapy physics (including ionizing and non-ionizing radiation); biomedical imaging (e.g. x-ray, magnetic resonance, ultrasound, optical and nuclear imaging); image-guided interventions; image reconstruction and analysis (including kinetic modelling); artificial intelligence in biomedical physics and analysis; nanoparticles in imaging and therapy; radiobiology; radiation protection and patient dose monitoring; radiation dosimetry.