DIFLF: A domain-invariant features learning framework for single-source domain generalization in mammogram classification

Computer Methods and Programs in Biomedicine (IF 4.9; Zone 2, Medicine; JCR Q1, Computer Science, Interdisciplinary Applications). Pub Date: 2025-01-06. DOI: 10.1016/j.cmpb.2025.108592
Wanfang Xie, Zhenyu Liu, Litao Zhao, Meiyun Wang, Jie Tian, Jiangang Liu
Volume 261, Article 108592. Full text: https://www.sciencedirect.com/science/article/pii/S0169260725000094

Abstract

Background and Objective

Single-source domain generalization (SSDG) aims to generalize a deep learning (DL) model trained on one source dataset to multiple unseen datasets. This is important for the clinical application of DL-based models to breast cancer screening, where a model is commonly developed at one institution and then tested at others. One challenge of SSDG is to alleviate domain shift using only one source-domain dataset.

Methods

The present study proposes a domain-invariant features learning framework (DIFLF) for single-source domain generalization. Specifically, DIFLF comprises a style-augmentation module (SAM) and a content-style disentanglement module (CSDM). SAM includes two different color jitter transforms, which transform each mammogram in the source domain into two synthesized mammograms with new styles. It can thus greatly increase the feature diversity of the source domain and reduce overfitting of the trained model. CSDM includes three feature disentanglement units, which extract domain-invariant content (DIC) features by disentangling them from domain-specific style (DSS) features, reducing the influence of domain shifts that result from different feature distributions. Our code is openly available on GitHub (https://github.com/85675/DIFLF).
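
The style-augmentation idea described above can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the jitter parameters, so the brightness and contrast ranges, the function names `color_jitter` and `style_augment`, and the random stand-in image are all assumptions chosen for illustration. The sketch only shows how two differently configured jitter transforms can turn one source image into two synthesized views; CSDM is not sketched here.

```python
import numpy as np


def color_jitter(image, brightness, contrast, rng):
    """Randomly rescale the contrast and brightness of a grayscale image in [0, 1].

    `brightness` and `contrast` are half-widths of the sampling ranges around 1.0
    (hypothetical values, not taken from the paper).
    """
    b = rng.uniform(1.0 - brightness, 1.0 + brightness)
    c = rng.uniform(1.0 - contrast, 1.0 + contrast)
    mean = image.mean()
    out = (image - mean) * c + mean  # contrast: stretch intensities around the mean
    out = out * b                    # brightness: global intensity scaling
    return np.clip(out, 0.0, 1.0)    # keep the result a valid normalized image


def style_augment(image, rng):
    """Return two synthesized views of one source image, one per jitter setting."""
    view_a = color_jitter(image, brightness=0.2, contrast=0.2, rng=rng)  # mild (assumed)
    view_b = color_jitter(image, brightness=0.5, contrast=0.5, rng=rng)  # strong (assumed)
    return view_a, view_b


rng = np.random.default_rng(0)
mammogram = rng.random((64, 64))  # random stand-in for a normalized mammogram
view_a, view_b = style_augment(mammogram, rng)
```

Both views keep the anatomy (content) of the input and differ only in intensity statistics (style), which is the property a content-style disentanglement stage would then exploit.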

Results

DIFLF is trained on a private dataset (PRI1) and tested first on another private dataset (PRI2), whose feature distribution is similar to that of PRI1, and then on two public datasets (INbreast and MIAS), whose feature distributions differ greatly from that of PRI1. The experimental results show that DIFLF achieves excellent performance in classifying mammograms in the unseen target datasets PRI2, INbreast, and MIAS. The accuracy and AUC of DIFLF are 0.917 and 0.928 on PRI2, 0.882 and 0.893 on INbreast, and 0.767 and 0.710 on MIAS, respectively.
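
For readers unfamiliar with the two reported metrics, the following sketch shows how accuracy and AUC can be computed from binary labels and predicted scores. The labels, scores, and the 0.5 decision threshold are made-up illustrative values, not data or settings from the paper, and the function names are hypothetical.

```python
def accuracy(labels, scores, threshold=0.5):
    """Fraction of samples whose thresholded score matches the label."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example (illustrative values only).
labels = [0, 0, 1, 1, 0, 1]
scores = [0.1, 0.4, 0.35, 0.8, 0.6, 0.9]
acc = accuracy(labels, scores)  # 4 of 6 predictions correct
roc_auc = auc(labels, scores)   # 7 of 9 positive-negative pairs ranked correctly
```

Accuracy depends on the chosen threshold, whereas AUC depends only on how the scores rank positives against negatives, which is why the two numbers can diverge on the same dataset.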

Conclusions

DIFLF can alleviate the influence of domain shifts using only one source dataset. Moreover, DIFLF achieves excellent mammogram classification performance even on unseen datasets whose feature distributions differ greatly from that of the training dataset.