
Journal of Biomedical Informatics: Latest Articles

A federated learning framework for ethical dynamic treatment allocation across heterogeneous hospitals
IF 4.5; Zone 2 (Medicine); Q2, COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS; Pub Date: 2026-02-01; Epub Date: 2026-01-16; DOI: 10.1016/j.jbi.2026.104987
Xenia Konti, Nicoleta J. Economou-Zavlanos, Yi Shen, Giorgos Stamou, Armando Bedoya, Michael J. Pencina, Chuan Hong, Michael M. Zavlanos

Objective

In this paper, we propose an adaptive federated learning framework to learn optimal treatments for individual hospitals that possibly serve different patient populations. The proposed framework can enable the design of more efficient treatment allocation problems.

Methods

We propose a federated treatment recommendation strategy that for each hospital is formulated as a Multi-Armed Bandit (MAB) problem. The process is coordinated by a lead hospital that adaptively learns and transfers Upper Confidence Bounds (UCB) across similar hospitals and Personalized Upper Bounds across heterogeneous hospitals. We test our proposed method on a simulated clinical trial environment created using real Covid-19 data from the Duke University Health System.
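For readers unfamiliar with the bandit formulation, the following is a minimal sketch of textbook UCB1 run at a single hospital. It is not the authors' federated algorithm: the transfer of bounds between hospitals and the Personalized Upper Bounds are left out, and the function name, the Bernoulli reward model, the success probabilities and the horizon are all hypothetical.

```python
import math
import random


def ucb1(success_probs, horizon, seed=0):
    """Run textbook UCB1 on Bernoulli arms and return the pull count per arm.

    success_probs are hypothetical treatment success rates (an assumption
    made for illustration, not values from the paper).
    """
    rng = random.Random(seed)
    n_arms = len(success_probs)
    counts = [0] * n_arms      # how often each treatment was administered
    totals = [0.0] * n_arms    # summed rewards per treatment

    for t in range(1, horizon + 1):
        if t <= n_arms:
            arm = t - 1  # administer each treatment once to initialise
        else:
            # Upper confidence bound = empirical mean + exploration bonus
            bounds = [
                totals[a] / counts[a] + math.sqrt(2.0 * math.log(t) / counts[a])
                for a in range(n_arms)
            ]
            arm = max(range(n_arms), key=bounds.__getitem__)
        reward = 1.0 if rng.random() < success_probs[arm] else 0.0
        counts[arm] += 1
        totals[arm] += reward
    return counts


counts = ucb1([0.3, 0.5, 0.8], horizon=2000)
print(counts)  # pulls per treatment; most should go to the last arm
```

The exploration bonus shrinks as a treatment is tried more often, so sub-optimal treatments are administered less and less. In a federated setting, sharing bounds between sites would presumably shrink that bonus faster than any single hospital could on its own data.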

Results

Our method relies on collaboration among hospitals, which reduces the number of data samples needed per institution while protecting the privacy of individual patient data. At the same time, it ensures fairness of the learned treatments by mitigating possible biases due to differences in the patient populations treated across different hospitals. Finally, our method improves the safety of the learning procedure by reducing the number of patients who are administered sub-optimal treatments at each hospital. In the experiments, we show that our proposed method outperforms other state-of-the-art approaches in that it requires 36%–75% less patient data to learn the optimal treatment for each hospital and administers the optimal treatment to 0.95%–48.6% more patients.

Conclusion

In this paper, we propose an adaptive federated learning strategy for treatment recommendation tasks that learns optimal treatments for individual hospitals that possibly serve different patient populations, while satisfying privacy, fairness, and safety considerations.
Journal of Biomedical Informatics, Volume 174, Article 104987.
Citations: 0
RINet: synthetic data training for indirect estimation of clinical reference distributions
IF 4.5; Zone 2 (Medicine); Q2, COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS; Pub Date: 2026-02-01; Epub Date: 2026-01-08; DOI: 10.1016/j.jbi.2026.104980
Jack LeBien, Julian Velev, Abiel Roche-Lima

Background

Indirect methods for estimating clinical reference intervals (RIs) use statistical analysis to identify non-pathological sub-distributions within large datasets acquired from routine clinical testing. This approach has the potential to accelerate the estimation of precise RIs, accounting for influential variables such as age, gender, and ethnicity. Most existing methods are based on traditional statistics and hand-crafted algorithms. The investigation of supervised learning, which often outperforms traditional approaches, has been impeded by the limitations of real-world data. However, previous studies have widely used synthetic data for evaluating and benchmarking indirect methods due to several advantages over real-world data, including greater control, variability, accessibility, and the availability of exact ground-truth RIs. Synthetic data may also provide a pathway for developing data-driven solutions for indirect RI estimation.

Methods

In this study, we leveraged synthetic data to train two convolutional neural networks (CNNs) to predict the parameters of underlying reference distributions (RDs) in diverse real-world clinical datasets. While one model was trained for standard univariate data, the other was extended to bivariate data, enabling the prediction of covariance between clinical analytes. Trained models were evaluated using both real-world and synthetic test datasets and compared with four alternative algorithms.

Results

Model predictions closely matched directly estimated RIs and RDs in real-world data and known RDs in synthetic data, outperforming four alternative indirect methods: GMM, refineR, reflimR, and RINetv1. Using labeled healthy and HCV-positive groups in real data, we compared established univariate RIs with predicted multivariate reference regions (MRRs). On average, the MRRs showed 1) higher coverage of healthy patients (closer to the desired 95%) and 2) smaller regions, which reduce the likelihood of including abnormal values.
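To make the link between a predicted reference distribution and a reference interval or region concrete, here is a minimal sketch under the assumption that the reference distribution is Gaussian. The abstract does not state the distribution family or how the multivariate reference regions are constructed, so the elliptical region, the function names and the example values are hypothetical.

```python
import math
from statistics import NormalDist


def reference_interval(mean, sd, coverage=0.95):
    """Central interval of a Gaussian reference distribution."""
    tail = (1.0 - coverage) / 2.0
    dist = NormalDist(mean, sd)
    return dist.inv_cdf(tail), dist.inv_cdf(1.0 - tail)


def in_reference_region(x, y, mean, cov, coverage=0.95):
    """Check whether (x, y) lies inside a bivariate Gaussian reference ellipse.

    cov is ((var_x, cov_xy), (cov_xy, var_y)). For 2 degrees of freedom the
    chi-square quantile has the closed form -2 * ln(1 - coverage).
    """
    (a, b), (_, d) = cov
    det = a * d - b * b
    dx, dy = x - mean[0], y - mean[1]
    # Squared Mahalanobis distance using the explicit 2x2 matrix inverse
    dist2 = (d * dx * dx - 2.0 * b * dx * dy + a * dy * dy) / det
    return dist2 <= -2.0 * math.log(1.0 - coverage)


# Hypothetical analyte with predicted mean 140 and standard deviation 2.5
low, high = reference_interval(140.0, 2.5)
print(round(low, 1), round(high, 1))
```

Under this assumption, a positive covariance tilts the ellipse, so a pair of values can each fall inside its univariate interval and still lie outside the joint region.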

Conclusions

Synthetic data training is a viable approach for developing accurate indirect RI estimation models for both univariate and bivariate clinical data. This strategy could help address some limitations of real-world data, direct analyses, and univariate RIs.
Journal of Biomedical Informatics, Volume 174, Article 104980.
Citations: 0
FG-DDI: Functional group-aware graph neural networks for drug–drug interaction prediction
IF 4.5; Zone 2 (Medicine); Q2, COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS; Pub Date: 2026-02-01; Epub Date: 2026-01-12; DOI: 10.1016/j.jbi.2026.104981
Fangyu Zhou, Shahadat Uddin

Objective:

We aim to improve drug–drug interaction (DDI) prediction by explicitly injecting medicinal-chemistry knowledge of functional groups (FGs) into graph neural network (GNN) message passing, in both transductive and inductive settings. Our goal is to (i) encode FG priors in a trainable way that enhances representation quality without handcrafting features, and (ii) yield interpretable attributions that align learned weights with pharmacologically meaningful FG patterns.

Methods:

We introduce FG-DDI, a dual-view GNN that augments both intra- and inter-molecular reasoning. At the intra-molecular level, atom/bond messages are scaled by FG enrichment weights derived from detected FG motifs within each drug graph. At the inter-molecular level, a bipartite message-passing layer between a drug pair is modulated by FG–FG enrichment scores that reflect empirical co-occurrence in known DDIs. Enrichment is computed as odds ratios from corpus statistics and injected via learnable gates, ensuring differentiability and allowing data to override noisy priors. We couple this with standard supervision on interaction labels and report accuracy (ACC), AUROC, average precision (AP), and F1. Experiments use DrugBank (1706 drugs; 86 interaction types) and TwoSides (filtered triplets) under transductive and inductive splits (one unseen; both unseen). We perform ablations removing each FG term to isolate contributions and assess stability across splits.
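The enrichment step can be illustrated with a minimal sketch: an odds ratio from a 2x2 contingency table, followed by a gate that controls how strongly the prior scales a message. The pseudocount, the `gated_scale` form and the example counts are assumptions made for illustration, since the abstract gives neither the exact formula nor the gate design.

```python
import math


def odds_ratio(a, b, c, d, pseudo=0.5):
    """Odds ratio of a 2x2 table, with a Haldane-Anscombe style pseudocount.

    a: interacting pairs containing the FG pair
    b: interacting pairs without it
    c: non-interacting pairs containing the FG pair
    d: non-interacting pairs without it
    """
    a, b, c, d = (v + pseudo for v in (a, b, c, d))
    return (a * d) / (b * c)


def gated_scale(log_or, gate_logit):
    """Hypothetical gate: sigmoid(gate_logit) sets how much of the prior is used.

    A gate near 0 returns a scale near 1.0, i.e. the prior is ignored.
    """
    gate = 1.0 / (1.0 + math.exp(-gate_logit))
    return 1.0 + gate * math.tanh(log_or)


# Hypothetical corpus counts for one FG-FG pair
ratio = odds_ratio(30, 70, 10, 90)
print(ratio, gated_scale(math.log(ratio), gate_logit=0.0))
```

Because the gate logit would be a trainable parameter, training can push it negative for FG pairs whose corpus statistics are noisy. This is one way to read the abstract's statement that data can override the priors.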

Results:

Comprehensive experiments on the DrugBank and TwoSides datasets demonstrate that FG-DDI achieves superior performance compared to state-of-the-art methods. For DrugBank, accuracy improves by 0.36% in the transductive setting and by 0.46% and 1.42% in the inductive setting for the S1 and S2 partitions, respectively.

Conclusion:

By systematically integrating chemical domain knowledge into deep learning architectures, this approach enables better generalization to unseen drug combinations while maintaining computational efficiency, making it particularly valuable for real-world pharmaceutical applications where new drugs continuously enter the market.
Journal of Biomedical Informatics, Volume 174, Article 104981.
Citations: 0
Quantifying the effect of Behaviour Self-Regulation on well-being through causal analysis: A methodological framework for longitudinal health data
IF 4.5; Zone 2 (Medicine); Q2, COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS; Pub Date: 2026-02-01; Epub Date: 2026-01-10; DOI: 10.1016/j.jbi.2026.104984
Jialou Wang, Pingfan Wang, Wai Lok Woo, Kandianos Emmanouil Sakalidis, Florentina Johanna Hettinga, Angela Rodrigues, Helen Dawes, Gavin Daniel Tempest
Understanding the drivers of well-being from longitudinal behavioural data is a fundamental challenge in biomedical informatics, where traditional analyses often conflate correlation with causation. This paper presents a rigorous application of causal inference to disentangle the drivers of well-being from complex longitudinal self-report data (N=141 enrolled; N=94 analysed after applying an a priori completeness threshold of ≥20 of 28 daily entries). We introduce a novel computational metric, the Behaviour Self-Regulation Score (BSRS), to quantify both trait-like (long-term) and state-like (short-term) behavioural consistency from daily reports of physical activity and sleep. Employing causal graphical models and propensity score methods, we estimate the causal effects of these behavioural patterns, controlling for motivational and perceptual confounders. Our analysis uncovers distinct causal pathways: while long-term self-regulation (BSRS-L) has a stable positive causal effect, short-term behavioural consistency (BSRS-S) demonstrates a significantly stronger causal impact on daily well-being, despite a near-zero correlation. Furthermore, we demonstrate that features selected via our causal framework significantly improve the predictive accuracy of well-being in machine learning models compared to conventional feature selection methods. This work contributes a robust methodological framework for causal analysis of longitudinal self-report data and provides evidence that causally-informed modelling can identify more potent targets for digital health interventions.
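The abstract does not say which propensity score method is used. As one common possibility, the sketch below shows a normalised inverse-probability-weighted estimate of an average treatment effect. The function name and toy data are hypothetical, and the propensities are taken as given, whereas in practice they would be estimated from the confounders (for example with logistic regression).

```python
def ipw_ate(treated, outcome, propensity):
    """Normalised (Hajek) inverse-probability-weighted average treatment effect.

    treated:    1 if the unit received the exposure, else 0
    outcome:    observed outcome per unit
    propensity: probability of exposure given confounders, strictly in (0, 1)
    """
    w1 = [t / e for t, e in zip(treated, propensity)]
    w0 = [(1 - t) / (1.0 - e) for t, e in zip(treated, propensity)]
    mu1 = sum(w * y for w, y in zip(w1, outcome)) / sum(w1)
    mu0 = sum(w * y for w, y in zip(w0, outcome)) / sum(w0)
    return mu1 - mu0


# Toy data: with equal propensities this reduces to a difference in means
effect = ipw_ate([1, 1, 0, 0], [3.0, 5.0, 1.0, 3.0], [0.5, 0.5, 0.5, 0.5])
print(effect)  # 2.0
```

Weighting each unit by the inverse probability of the exposure it actually received balances the measured confounders across groups. This is how a causal effect can differ markedly from the raw correlation.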
Journal of Biomedical Informatics, Volume 174, Article 104984.
Citations: 0
MDD-MARF: a multimodal depression detection model based on multi-level attention mechanism and residual fusion
IF 4.5; Zone 2 (Medicine); Q2, COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS; Pub Date: 2026-01-01; Epub Date: 2025-11-29; DOI: 10.1016/j.jbi.2025.104965
Jianghai Zhou, Jike Ge, Zuqin Chen, Jie Tan, You Li

Objective

Depression is a serious mental disorder that significantly affects patients’ work ability and social functioning. With the rapid development of artificial intelligence, researchers have begun to explore automatic depression detection methods based on multimodal data. However, multimodal data are often accompanied by a large amount of noise. Existing methods usually lack sufficient feature screening after extraction and are directly applied to downstream tasks, which may limit the model’s generalization ability. In addition, current multimodal fusion strategies still face several challenges.

Methods

To address these challenges, we propose a novel multimodal depression detection model that integrates three modalities: audio, vision, and text. The model extracts depression-related key features through a multi-level attention mechanism and achieves efficient multimodal feature fusion using skip connections with a residual structure.

Results

Experiments conducted on the DAIC-WOZ dataset showed that the proposed method achieved a mean absolute error (MAE) of 3.13 and a root mean square error (RMSE) of 3.59, outperforming existing state-of-the-art models. The generalization ability of the model was further validated on the E-DAIC dataset, demonstrating its effectiveness and robustness.
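For reference, the two reported error metrics can be computed as in the sketch below. The scores are made-up placeholders and do not come from DAIC-WOZ.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error; penalises large errors more than MAE does."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


# Hypothetical true and predicted depression severity scores
truth = [10.0, 4.0, 7.0]
preds = [8.0, 5.0, 7.0]
print(mae(truth, preds), rmse(truth, preds))
```

RMSE can never be smaller than MAE on the same predictions, so a small gap between the two, as in the reported 3.13 and 3.59, indicates errors of fairly uniform size rather than a few large outliers.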

Conclusion

The proposed method provides an efficient and reliable solution for depression detection using multimodal data and multi-level attention mechanisms. The findings highlight the significant value of multimodal learning in the medical field and offer strong support for the development of AI-assisted clinical decision-making systems.
Journal of Biomedical Informatics, Volume 173, Article 104965.
Citations: 0
Fusion framework: Conditional-aware one-stage nested event extraction model
IF 4.5; Zone 2 (Medicine); Q2, COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS; Pub Date: 2026-01-01; Epub Date: 2025-12-24; DOI: 10.1016/j.jbi.2025.104972
Sen Niu, Xiaohong Han, Liu Cao, Ye Tian, Ding Yuan, Longlong Cheng
We present CA-NEE, a Conditional-Aware one-stage model for overlapping and nested biomedical event extraction. CA-NEE integrates an event-type-aware conditioning mechanism with token-pair relation modeling to jointly identify triggers, argument spans, and roles. A Conditional Layer Normalization (CLN) dynamically adapts token representations to candidate event types, and a parallel word-pair scorer predicts span boundaries and roles in a single pass. Evaluations on GENIA11 and GENIA13 show consistent gains in Trigger Classification (TC) and Argument Classification (AC) over strong baselines, particularly on complex overlapping and nested structures. These results demonstrate that CA-NEE offers an effective and efficient solution for biomedical event extraction.
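Conditional Layer Normalization is commonly formulated as ordinary layer normalization whose scale and shift are offset by linear projections of a condition vector, here an event-type embedding. The sketch below follows that common formulation in plain Python. CA-NEE's exact variant may differ, and all names, shapes and values are hypothetical.

```python
import math


def matvec(matrix, vec):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vec)) for row in matrix]


def conditional_layer_norm(x, cond, w_gamma, w_beta, gamma, beta, eps=1e-5):
    """Layer-normalise x with scale and shift conditioned on cond.

    scale = gamma + w_gamma @ cond and shift = beta + w_beta @ cond, so one
    token vector x is represented differently for each candidate event type.
    """
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    normed = [(v - mean) / math.sqrt(var + eps) for v in x]
    scale = [g + dg for g, dg in zip(gamma, matvec(w_gamma, cond))]
    shift = [b + db for b, db in zip(beta, matvec(w_beta, cond))]
    return [s * n + t for s, n, t in zip(scale, normed, shift)]


# Hypothetical 4-dimensional token vector and 2-dimensional event-type embedding
token = [1.0, 2.0, 3.0, 4.0]
event_type = [1.0, 0.0]
zeros = [[0.0, 0.0] for _ in range(4)]
print(conditional_layer_norm(token, event_type, zeros, zeros, [1.0] * 4, [0.0] * 4))
```

With zero projection matrices the function reduces to plain layer normalization, which is why such projections are often initialised at zero so that training starts from the unconditioned behaviour.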
{"title":"Fusion framework: Conditional-aware one-stage nested event extraction model","authors":"Sen Niu ,&nbsp;Xiaohong Han ,&nbsp;Liu Cao ,&nbsp;Ye Tian ,&nbsp;Ding Yuan ,&nbsp;Longlong Cheng","doi":"10.1016/j.jbi.2025.104972","DOIUrl":"10.1016/j.jbi.2025.104972","url":null,"abstract":"<div><div>We present CA-NEE, a Conditional-Aware one-stage model for overlapping and nested biomedical event extraction. CA-NEE integrates an event-type-aware conditioning mechanism with token-pair relation modeling to jointly identify triggers, argument spans, and roles. A Conditional Layer Normalization (CLN) dynamically adapts token representations to candidate event types, and a parallel word-pair scorer predicts span boundaries and roles in a single pass. Evaluations on <strong>GENIA11</strong> and <strong>GENIA13</strong> show consistent gains in Trigger Classification (TC) and Argument Classification (AC) over strong baselines, particularly on complex overlapping and nested structures. These results demonstrate that CA-NEE offers an effective and efficient solution for biomedical event extraction.</div></div>","PeriodicalId":15263,"journal":{"name":"Journal of Biomedical Informatics","volume":"173 ","pages":"Article 104972"},"PeriodicalIF":4.5,"publicationDate":"2026-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145843751","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Reviewer Acknowledgement 2025
IF 4.5 Zone 2 (Medicine) Q2 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS Pub Date: 2026-01-01 Epub Date: 2025-12-25 DOI: 10.1016/j.jbi.2025.104974
Citations: 0
Modeling temporal self and interactive evolution for biomedical hypothesis generation
IF 4.5 Zone 2 (Medicine) Q2 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS Pub Date: 2026-01-01 Epub Date: 2025-12-11 DOI: 10.1016/j.jbi.2025.104970
Hongyun Zeng, Huiwei Zhou, Weihong Yao, Hao Zhou, Yan Zhao, Zhecheng Wang

Objectives

Hypothesis generation (HG) aims to reveal meaningful hidden relationships between scientific terms in the literature in order to accelerate innovation in drug discovery, disease prognosis, and treatment. Recent studies have successfully exploited the dynamic nature of term-pair relations for HG. However, existing methods capture the evolution of term pairs by modeling the temporal meaning of the terms themselves, which makes it hard to accurately model the intricate spatio-temporal relations between term pairs.

Methods

In this paper, a Temporal Self and Interactive Evolution (TSIE) modeling method is proposed to accurately characterize the complex dynamics of term-pair relations in HG. Specifically, for each term pair, we first employ a Gated Recurrent Unit (GRU) to model its Temporal Self-Evolution (TSE) and Temporal Interactive Evolution (TIE), learning its TSE embedding (TSE_emb) and TIE embedding (TIE_emb), respectively. Then, we adopt a dual-tower Transformer to further model the temporal dependencies of both TSE_emb and TIE_emb, which are finally integrated by a gated fusion layer to infer the future connectivity of the term pair.
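The final gated fusion step can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so the element-wise sigmoid gate below is an assumption, and the names `gated_fusion`, `w_gate`, and `b_gate` are hypothetical. The GRU and dual-tower Transformer stages are not reproduced here; the two input vectors stand in for their outputs.

```python
import math

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def gated_fusion(tse_emb, tie_emb, w_gate, b_gate):
    """Blend two equal-length vectors with a learned element-wise gate.

    Assumed form (not taken from the paper):
        g_i     = sigmoid(w_gate[i] . [tse_emb; tie_emb] + b_gate[i])
        fused_i = g_i * tse_emb[i] + (1 - g_i) * tie_emb[i]
    """
    concat = list(tse_emb) + list(tie_emb)
    fused = []
    for i, (s, t) in enumerate(zip(tse_emb, tie_emb)):
        z = sum(w * c for w, c in zip(w_gate[i], concat)) + b_gate[i]
        g = sigmoid(z)
        fused.append(g * s + (1.0 - g) * t)
    return fused

# Hypothetical 2-dimensional embeddings for one term pair.
tse = [1.0, 0.0]
tie = [0.0, 1.0]
# Zero gate parameters give g = 0.5 everywhere, i.e. a plain average.
w = [[0.0] * 4 for _ in range(2)]
b = [0.0, 0.0]
fused = gated_fusion(tse, tie, w, b)
print(fused)
```

A gate of this kind lets a model weight the self-evolution and interactive-evolution signals differently per dimension instead of simply concatenating or averaging them.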

Results

Experiments on three real-world datasets (Immunotherapy, Virology, and Neurology) demonstrate that TSIE effectively captures complex evolutionary patterns for biomedical hypothesis generation and achieves state-of-the-art performance.

Conclusion

This paper proposes a novel TSIE method that learns temporal interactive difference features and enhances the model’s understanding of temporal relation inference. TSIE learns both TSE and TIE to effectively model the dynamic relationship between terms. By adopting a dual-tower Transformer encoder, TSIE can further model the temporal dependencies of TSE and TIE.
Citations: 0
Beyond manual transcripts: Exploring the potential of automatic speech recognition errors in improving Alzheimer’s disease detection
IF 4.5 Zone 2 (Medicine) Q2 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS Pub Date: 2026-01-01 Epub Date: 2025-12-18 DOI: 10.1016/j.jbi.2025.104968
Yin-Long Liu, Yuanchao Li, Rui Feng, Jiaxin Chen, Yiming Wang, Yu-Ang Chen, Nan Ding, Jiahong Yuan, Zhen-Hua Ling

Objective:

This study aims to extend the counterintuitive observation that Automatic Speech Recognition (ASR) errors can be beneficial for Alzheimer’s Disease (AD) detection. Our objective is to conduct a large-scale investigation to validate this phenomenon and, more importantly, to elucidate the specific mechanisms by which ASR errors can serve as valuable diagnostic clues for distinguishing individuals with AD from Healthy Controls (HC).

Methods:

We employed 18 ASR models, in both their original and fine-tuned versions, to generate 36 sets of transcripts from the ADReSS dataset. We also synthesized speech from both manual and ASR transcripts using a text-to-speech (TTS) model. Knowledge-based features and pre-trained embeddings were extracted and fed into two proposed AD detection models: a self-attention model and a cross-attention-based interpretability model. To uncover the underlying mechanisms, we conducted a multi-faceted set of analyses, including examinations of ASR error types, words affected by ASR errors, linguistic comparisons, attention weight analysis, and case studies.

Results:

We demonstrate that transcripts generated by certain ASR models achieve higher AD detection accuracy than gold-standard manual transcripts. This performance gain stems not from errors in general or a high Word Error Rate (WER), but from specific and asymmetric error patterns. Our analyses reveal that these patterns amplify some pre-existing linguistic deficits in AD speech (e.g., disfluencies), thereby increasing the feature-level divergence between the AD and HC groups. Furthermore, we show that these diagnostic clues are effectively preserved when speech is synthesized from ASR transcripts, holding significant implications for data augmentation strategies in AD research.
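Because the results distinguish the overall Word Error Rate from specific error patterns, it may help to recall how WER is computed. The sketch below uses the standard definition, (substitutions + deletions + insertions) divided by the number of reference words, obtained from a word-level edit distance. The function name `word_error_rate` and the example sentences are hypothetical and are not drawn from the ADReSS data.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference prefix seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)

# Invented example: the ASR output drops a filled pause and an article.
manual = "the boy is uh taking a cookie"
asr = "the boy is taking cookie"
wer = word_error_rate(manual, asr)
print(round(wer, 3))
```

WER counts every error equally, so two systems with the same WER can differ in which words they get wrong; that is why an aggregate score alone cannot capture the asymmetric error patterns described above.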

Conclusion:

The specific, asymmetric error patterns introduced by certain ASR models enhance the distinction between AD and HC groups by amplifying pathological linguistic deficits associated with AD. This work suggests a paradigm shift for clinical ASR development: optimizing models not merely for transcription accuracy, but for their downstream diagnostic utility.
Citations: 0
Domain adaptation of stable diffusion for ultrasound inpainting: a synthetic data approach for enhanced thyroid nodule segmentation
IF 4.5 Zone 2 (Medicine) Q2 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS Pub Date: 2026-01-01 Epub Date: 2025-11-30 DOI: 10.1016/j.jbi.2025.104963
Antonin Prochazka, Jan Zeman

Objective

To enhance the cross-domain generalization of thyroid-nodule segmentation models by augmenting limited ultrasound training data with synthetic images generated by a fine-tuned Stable Diffusion model.

Methods

Three public thyroid ultrasound datasets with heterogeneous acquisition characteristics were used: TN3K (training + testing), TDID, and TUCC. The denoising UNet inside Stable Diffusion v1.4 was fine-tuned on 2303 TN3K nodules and then used to synthesize realistic thyroid nodules. Using the model’s inpainting capability, the same number of synthetic nodules was inserted into original ultrasound images. The combined data were then used to train ResUNet, DeepLabV3+, and MITUNet segmentation networks with identical hyper-parameters. The performance of models trained on native data only versus native + synthetic data was quantified with the Dice similarity coefficient (Dice score) and Intersection-over-Union (IoU).
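The two evaluation metrics can be made concrete with a short sketch. It uses the standard definitions, Dice = 2|A∩B| / (|A| + |B|) and IoU = |A∩B| / |A∪B|, on flattened binary masks. The function name `dice_and_iou`, the toy masks, and the convention of returning 1.0 when both masks are empty are assumptions made for this illustration, not details from the paper.

```python
def dice_and_iou(pred, target):
    """Return (Dice, IoU) for two equal-length binary masks (sequences of 0/1)."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    inter = sum(1 for p, t in zip(pred, target) if p and t)
    pred_sum = sum(1 for p in pred if p)
    target_sum = sum(1 for t in target if t)
    union = pred_sum + target_sum - inter
    if union == 0:
        # Both masks are empty; treating this as a perfect match is an
        # assumed convention, other pipelines may skip such cases instead.
        return 1.0, 1.0
    dice = 2 * inter / (pred_sum + target_sum)
    iou = inter / union
    return dice, iou

# Toy flattened masks: 3 predicted pixels, 3 true pixels, 2 in common.
pred = [1, 1, 1, 0, 0, 0]
target = [0, 1, 1, 1, 0, 0]
dice, iou = dice_and_iou(pred, target)
print(round(dice, 3), iou)
```

For any single pair of masks the two metrics are related by Dice = 2·IoU / (1 + IoU), so Dice is never lower than IoU.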

Results

On the in-domain TN3K test set (n = 614), performance gains were modest, with the best improvement reaching +2.2 % in Dice score for DeepLabV3+. In contrast, substantial gains were observed on the external datasets. On the TDID dataset (n = 462), DeepLabV3+ improved from 38.2 % to 59.1 % Dice (+20.9 percentage points), while MITUNet and ResUNet gained 7.1 % and 6.9 %, respectively. On the TUCC dataset (n = 192), DeepLabV3+ improved by 11.4 % in Dice, MITUNet by 6.9 %, and ResUNet by 3.1 %. All improvements except those on in-domain TN3K were statistically significant (p < 0.01, paired t-test or Wilcoxon signed-rank test), confirming that synthetic images generated by Stable Diffusion enhance cross-domain segmentation robustness.

Conclusion

Augmenting ultrasound datasets with synthetic images generated by a task-specific Stable Diffusion model substantially improves the robustness of thyroid nodule segmentation across datasets acquired with different devices, at different institutions, and by different operators.
Citations: 0