Is synthetic data generation effective in maintaining clinical biomarkers? Investigating diffusion models across diverse imaging modalities.

Frontiers in Artificial Intelligence (IF 4.7, JCR Q2, Computer Science, Artificial Intelligence). Pub Date: 2025-01-31. eCollection Date: 2024-01-01. DOI: 10.3389/frai.2024.1454441
Abdullah Hosseini, Ahmed Serag
Volume 7, article 1454441. Publication type: Journal Article. Full-text PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11826350/pdf/
Citations: 0

Abstract

Introduction: The integration of recent technologies in medical imaging has become a cornerstone of modern healthcare, facilitating detailed analysis of internal anatomy and pathology. Traditional methods, however, often grapple with data-sharing restrictions due to privacy concerns. Emerging techniques in artificial intelligence offer innovative solutions to overcome these constraints, with synthetic data generation enabling the creation of realistic medical imaging datasets, but the preservation of critical hidden medical biomarkers is an open question.

Methods: This study employs state-of-the-art Denoising Diffusion Probabilistic Models integrated with a Swin-transformer-based network to generate synthetic medical data. Three distinct areas of medical imaging - radiology, ophthalmology, and histopathology - are explored. The quality of synthetic images is evaluated through a classifier trained to identify the preservation of medical biomarkers.
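The forward (noising) half of a Denoising Diffusion Probabilistic Model can be sketched in a few lines. This is a minimal illustration of the standard closed-form noising step, not the authors' implementation: the linear schedule, its endpoint values, the 1000-step length, and the function names are all assumptions chosen for the example. In a full model, a denoising network (here, the Swin-transformer-based network the abstract mentions) would be trained to predict the added noise.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances beta_1..beta_T (assumed schedule)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def add_noise(pixels, t, alpha_bars, rng):
    """Sample x_t from q(x_t | x_0) = N(sqrt(alpha_bar_t) * x_0, (1 - alpha_bar_t) * I)."""
    signal_scale = math.sqrt(alpha_bars[t])
    noise_scale = math.sqrt(1.0 - alpha_bars[t])
    noise = [rng.gauss(0.0, 1.0) for _ in pixels]
    noisy = [signal_scale * x + noise_scale * e for x, e in zip(pixels, noise)]
    return noisy, noise


betas = linear_beta_schedule(1000)
alpha_bars = cumulative_alphas(betas)
rng = random.Random(0)  # fixed seed so the sketch is reproducible
image = [0.5, -0.25, 1.0, 0.0]  # toy "image": four pixels scaled to [-1, 1]
noisy_image, true_noise = add_noise(image, t=999, alpha_bars=alpha_bars, rng=rng)
```

At the final step almost no signal remains (alpha_bar is close to zero), so the sample is nearly pure Gaussian noise; generation runs this process in reverse using the trained network.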

Results: The diffusion model effectively preserves key medical features, such as lung markings and retinal abnormalities, producing synthetic images closely resembling real data. Classifier performance demonstrates the reliability of synthetic data for downstream tasks, with F1 scores and AUC values ranging from 0.8 to 0.99.
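For readers unfamiliar with the two reported metrics, the sketch below shows how F1 and AUC can be computed from a binary classifier's outputs. The labels, scores, and 0.5 threshold are toy values invented for illustration, not data from the study, and the abstract does not specify the exact evaluation protocol, so this should be read only as a definition of the metrics.

```python
def f1_score(y_true, y_pred):
    """F1 = 2*TP / (2*TP + FP + FN) for binary labels in {0, 1}."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 1 = biomarker present, 0 = absent.
labels = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.2, 0.6, 0.4, 0.5, 0.1]
predictions = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

f1 = f1_score(labels, predictions)
auc = roc_auc(labels, scores)
print(round(f1, 3), round(auc, 3))  # prints: 0.667 0.889
```

F1 depends on the chosen decision threshold, while AUC summarizes ranking quality across all thresholds, which is why the two are commonly reported together.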

Discussion: This work provides valuable insights into the potential of diffusion-based models for generating realistic, biomarker-preserving synthetic images across various medical imaging modalities. These findings highlight the potential of synthetic data to address challenges such as data scarcity and privacy concerns in clinical practice, research, and education.

Source journal: Frontiers in Artificial Intelligence
CiteScore: 6.10
Self-citation rate: 2.50%
Articles published: 272
Review time: 13 weeks