
Latest publications: IEEE Journal of Biomedical and Health Informatics

Asymmetric Co-Training With Decoder-Head Decoupling for Semi-Supervised Medical Image Segmentation.
IF 6.8 Medicine (CAS Tier 2) Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2026-02-12 DOI: 10.1109/JBHI.2026.3664127
Yuxin Tian, Muhan Shi, Jianxun Li, Bin Zhang, Min Qu, Yinxue Shi, Xian Yang, Min Wang

Semi-supervised learning reduces annotation costs in medical image segmentation by leveraging abundant unlabeled data alongside scarce labels. Most models adopt an encoder-decoder architecture with a task-specific segmentation head. While co-training is effective, existing frameworks suffer from intra-network coupling (decoder-head binding) and inter-network coupling (over-aligned predictions), which reduce prediction diversity and amplify confirmation bias, particularly for small structures, ambiguous boundaries, and anatomically variable regions. We propose AsyCo, an asymmetric co-training framework with two components. (1) Asymmetric Decoder Coupling implements decoder-head decoupling by dynamically remapping encoder-decoder features to non-default heads across branches, breaking intra-network coupling and creating diverse prediction paths without additional parameters. (2) Hierarchical Consistency Regularization converts this diversity into stable supervision by aligning (i) the two branches' final outputs along their default paths (branch-output consistency), (ii) predictions from different segmentation heads evaluated on identical decoder features (inter-head consistency), and (iii) intermediate encoder-decoder representations (representation consistency). Through these mechanisms, AsyCo explicitly mitigates both intra- and inter-network coupling, improving training stability and reducing confirmation bias. Extensive experiments on three clinical benchmarks under limited-label regimes demonstrate that AsyCo consistently outperforms nine state-of-the-art semi-supervised learning methods. These results indicate that AsyCo delivers accurate and reliable segmentation with minimal annotation, thereby enhancing the reliability of medical image analysis in real-world clinical practice.
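The three consistency terms in Hierarchical Consistency Regularization can be illustrated with a minimal sketch. The function names, the uniform weights, and the MSE-based distance are assumptions for illustration only; the paper may well use other distances (e.g. Dice or KL divergence):

```python
import numpy as np

def mse(a, b):
    """Mean squared error between two prediction or feature arrays."""
    return float(np.mean((a - b) ** 2))

def hierarchical_consistency(p_branch1, p_branch2, head_pairs, feat1, feat2,
                             w_branch=1.0, w_head=1.0, w_repr=1.0):
    """Combine the three consistency terms:
    (i)   branch-output consistency: final outputs of the two branches,
    (ii)  inter-head consistency: predictions of different heads on the
          same decoder features (passed as a list of prediction pairs),
    (iii) representation consistency: intermediate representations."""
    l_branch = mse(p_branch1, p_branch2)                      # (i)
    l_head = np.mean([mse(pa, pb) for pa, pb in head_pairs])  # (ii)
    l_repr = mse(feat1, feat2)                                # (iii)
    return w_branch * l_branch + w_head * l_head + w_repr * l_repr
```

With identical inputs every term vanishes, so the loss is zero; disagreement in any term raises it.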

Citations: 0
NeuroCLIP: A Multimodal Contrastive Learning Method for rTMS-treated Methamphetamine Addiction Analysis.
IF 6.8 Medicine (CAS Tier 2) Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2026-02-11 DOI: 10.1109/JBHI.2026.3663869
Chengkai Wang, Di Wu, Yunsheng Liao, Wenyao Zheng, Ziyi Zeng, Xurong Gao, Hemmings Wu, Zhoule Zhu, Jie Yang, Lihua Zhong, Weiwei Cheng, Yun-Hsuan Chen, Mohamad Sawan

Methamphetamine dependence poses a significant global health challenge, yet its assessment and the evaluation of treatments like repetitive transcranial magnetic stimulation (rTMS) frequently depend on subjective self-reports, which may introduce uncertainties. While objective neuroimaging modalities such as electroencephalography (EEG) and functional near-infrared spectroscopy (fNIRS) offer alternatives, their individual limitations and the reliance on conventional, often hand-crafted, feature extraction can compromise the reliability of derived biomarkers. To overcome these limitations, we propose NeuroCLIP, a novel deep learning framework integrating simultaneously recorded EEG and fNIRS data through a progressive learning strategy. This approach offers a robust and trustworthy data-driven biomarker for methamphetamine addiction. Validation experiments show that NeuroCLIP significantly improves discrimination between methamphetamine-dependent individuals and healthy controls compared to models using either EEG or fNIRS alone. Furthermore, the proposed framework facilitates objective, brain-based evaluation of rTMS treatment efficacy, demonstrating measurable shifts in neural patterns towards healthy control profiles after treatment. Critically, we establish the trustworthiness of the multimodal data-driven biomarker by showing its strong correlation with psychometrically validated craving scores. These findings suggest that the biomarker derived from EEG-fNIRS data via NeuroCLIP offers enhanced robustness and reliability over single-modality approaches, providing a valuable tool for addiction neuroscience research and potentially improving clinical assessments.
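The CLIP-style objective implied by the name can be sketched as a symmetric InfoNCE loss over paired EEG/fNIRS embeddings. This is a generic illustration of multimodal contrastive alignment, not the paper's exact formulation; the temperature value and normalization choices are assumptions:

```python
import numpy as np

def clip_loss(eeg_emb, fnirs_emb, temperature=0.07):
    """Symmetric InfoNCE: matched EEG/fNIRS pairs (row i with row i)
    are pulled together, mismatched pairs pushed apart."""
    # L2-normalize both embedding sets
    e = eeg_emb / np.linalg.norm(eeg_emb, axis=1, keepdims=True)
    f = fnirs_emb / np.linalg.norm(fnirs_emb, axis=1, keepdims=True)
    logits = e @ f.T / temperature          # (N, N) similarity matrix
    labels = np.arange(len(e))

    def ce(lg):                             # cross-entropy, diagonal targets
        lg = lg - lg.max(axis=1, keepdims=True)
        logp = lg - np.log(np.exp(lg).sum(axis=1, keepdims=True))
        return -logp[labels, labels].mean()

    # average of EEG-to-fNIRS and fNIRS-to-EEG directions
    return (ce(logits) + ce(logits.T)) / 2
```

Perfectly aligned, mutually orthogonal embeddings drive the loss toward zero; shuffling the pairing makes it large.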

Citations: 0
ECG-AuxNet: A Dual-Branch Spatial-Temporal Feature Fusion Framework with Auxiliary Learning for Enhanced Cardiac Disease Diagnosis.
IF 6.8 Medicine (CAS Tier 2) Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2026-02-11 DOI: 10.1109/JBHI.2026.3664231
Ruiqi Shen, Yanan Wang, Chunge Cao, Shuaicong Hu, Jia Liu, Hongyu Wang, Gaoyan Zhong, Cuiwei Yang

Objective: Multiple limitations exist in current automated ECG analysis, including insufficient feature integration across leads, limited interpretability, poor generalization, and inadequate handling of class imbalance. To address these challenges, we develop a novel dual-branch framework that comprehensively captures spatial-temporal features for cardiac disease diagnosis.

Methods: ECG-AuxNet combines a Multi-scale Transformer Attention CNN for spatial feature extraction and a GRU network for temporal dependency modeling. A Dual-stage Cross-Attention Fusion module integrates features from both branches, while a Feature Space Reconstruction (FSR) auxiliary task is introduced as a manifold regularizer to enhance feature discrimination. The framework was evaluated on PTB-XL (15,709 ECGs) and validated in real-world clinical scenarios (SXMU-2k, 1,673 ECGs).

Results: For class-imbalanced disease recognition (NORM, CD, MI, STTC), ECG-AuxNet attained 78.34% F1-score on PTB-XL and 82.63% F1-score on SXMU-2k, outperforming 9 baseline models. FSR significantly improved feature discrimination by 11.7%, enhancing class boundary clarity and classification accuracy. Grad-CAM analysis revealed attention patterns that precisely match cardiologists' diagnostic focus areas.

Conclusion: ECG-AuxNet effectively integrates spatial-temporal features through auxiliary learning, achieving robust generalizability in cardiac disease diagnosis with interpretability aligned with clinical expertise.
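The F1-scores reported under class imbalance are conventionally macro-averaged, giving each class equal weight regardless of prevalence. Whether the paper uses macro averaging is an assumption; this sketch shows the standard computation for the four classes named in the Results:

```python
import numpy as np

def macro_f1(y_true, y_pred, classes):
    """Macro-averaged F1: per-class F1 averaged with equal weight,
    the usual choice when classes are imbalanced."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    f1s = []
    for c in classes:
        tp = np.sum((y_pred == c) & (y_true == c))
        fp = np.sum((y_pred == c) & (y_true != c))
        fn = np.sum((y_pred != c) & (y_true == c))
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); define as 0 when undefined
        f1s.append(2 * tp / denom if denom else 0.0)
    return float(np.mean(f1s))
```

Because rare classes count as much as common ones, a model that ignores a minority class is penalized directly.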

Citations: 0
PASAformer: Cerebrovascular Disease Classification with Medical Prior-Guided Adapter and Pathology-Aware Sparse Attention.
IF 6.8 Medicine (CAS Tier 2) Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2026-02-11 DOI: 10.1109/JBHI.2026.3663876
Baiming Chen, Xin Gao, Weiguo Zhang, Sue Cao, Si Li, Linhai Yan

Cerebrovascular diseases (CVDs) such as aneurysms, arteriovenous malformations, stenosis, and Moyamoya disease are major public health concerns. Accurate classification of these conditions is essential for timely intervention, yet current computer-aided methods often exhibit limited representational capacity, feature redundancy, and insufficient interpretability, restricting clinical applicability. We propose PASAformer, a Swin-Transformer-based framework for cerebrovascular disease classification on Digital Subtraction Angiography (DSA). PASAformer incorporates a Pathology-Aware Sparse Attention (PASA) module that emphasizes lesion-related regions while suppressing background redundancy. Inserted into the Swin backbone, PASA replaces dense window self-attention, improving computational efficiency while preserving the hierarchical architecture. We further employ the MiAMix data augmenter to increase sample diversity, and incorporate a CombinedAdapter encoder that injects anatomical priors from the frozen Medical Segment Anything Model (MED-SAM) into early-stage representations, strengthening discriminative power under limited supervision. To support research in this underexplored area, we curate CDSA-NEO, a proprietary DSA dataset comprising more than 1,700 static images across four major cerebrovascular disease categories, constituting the first large-scale benchmark of its kind. Furthermore, an external cohort of angiographic runs with sequential, unselected frames is used to assess robustness in realistic temporal workflows. Extensive experiments on CDSA-NEO and public vascular datasets demonstrate that PASAformer achieves competitive precision and balanced accuracy compared to representative state-of-the-art models, while providing more focused visual explanations. These results suggest that PASAformer can support automated cerebrovascular disease classification on angiography, and that CDSA-NEO provides a benchmark for future method development and evaluation.
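One common way to realize sparse attention of the kind PASA describes is top-k masking: each query keeps only its strongest keys before the softmax, suppressing background redundancy. The top-k criterion here is an assumption for illustration; the paper's pathology-aware selection rule may differ:

```python
import numpy as np

def topk_sparse_attention(q, k, v, top_k=2):
    """Each query attends only to its top_k highest-scoring keys;
    the rest are masked to -inf before the softmax."""
    scores = q @ k.T / np.sqrt(q.shape[-1])            # (Nq, Nk)
    # per-row threshold = top_k-th largest score
    thresh = np.sort(scores, axis=1)[:, -top_k][:, None]
    masked = np.where(scores >= thresh, scores, -np.inf)
    # numerically stable softmax over the surviving entries
    weights = np.exp(masked - masked.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ v, weights
```

Compared with dense window self-attention, only `top_k` weights per query are nonzero, which is where the efficiency gain comes from.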

Citations: 0
Shortening the MacArthur-Bates Communicative Developmental Inventory Using Machine Learning Based Computerized Adaptive Testing (ML-CAT).
IF 6.8 Medicine (CAS Tier 2) Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2026-02-11 DOI: 10.1109/JBHI.2025.3626073
Diana Saker, Haya Salameh, Hila Gendler-Shalev, Hagit Hel-Or

Early identification of infants and toddlers at risk for developmental disorders can improve the efficiency of early intervention programs and can reduce healthcare costs. The MacArthur-Bates Communicative Development Inventory (MB-CDI) is a standardized tool for assessing children's early lexical development. However, due to its long list of words, administration is time-consuming and often limiting. In this paper we use machine learning together with a computerized adaptive testing approach (ML-CAT) to shorten the MB-CDI by adapting the sequence of words to the subject's responses. We show that the ML-CAT can reliably predict the final score of the H-MB-CDI with as few as 10 words on average while maintaining 94% to 96% accuracy. We further show that the ML-CAT outperforms existing approaches, including fixed, non-adaptive methods as well as statistical models based on Item Response Theory (IRT). Results are also given for five different languages. Most importantly, ML-CAT is shown to outperform IRT-based methods when handling atypical talkers (outliers). The ML-CAT enables more efficient lexical development assessment, allowing for a wider and repeated screening in the community. Additionally, due to its shorter length, assessment is expected to be less of a burden on the subject or her caregiver and consequently more reliable.
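The core of any computerized adaptive test is the item-selection loop: after each answer, pick the unasked item whose outcome is currently least predictable. This toy selector is a stand-in for the learned ML-CAT policy, which the paper trains from data; the correlation-weighted probability estimate and the variance criterion are assumptions for illustration:

```python
import numpy as np

def next_item(responses, corr):
    """Pick the unasked word whose response the answers so far predict
    least confidently. `responses` is a list of (item_index, 0/1) pairs;
    `corr` is an item-item correlation matrix (hypothetical input)."""
    asked = {i for i, _ in responses}
    best, best_unc = None, -1.0
    for j in range(corr.shape[0]):
        if j in asked:
            continue
        # naive probability estimate: correlation-weighted vote of answers
        w = np.array([corr[i, j] for i, _ in responses])
        y = np.array([r for _, r in responses], dtype=float)
        p = 0.5 if w.sum() == 0 else float(np.clip((w * y).sum() / w.sum(), 0, 1))
        unc = p * (1 - p)                    # Bernoulli variance
        if unc > best_unc:
            best, best_unc = j, unc
    return best
```

Asking maximally uncertain items first is what lets an adaptive test reach a stable score estimate in roughly 10 items instead of the full word list.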

Citations: 0
Learning Optimal Spectral Clustering for Functional Brain Network Generation and Classification.
IF 6.8 Medicine (CAS Tier 2) Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2026-02-11 DOI: 10.1109/JBHI.2025.3633902
Jiacheng Hou, Zhenjie Song, Chenfei Ye, Ercan Engin Kuruoglu

Functional brain network (FBN) analysis aims to enhance the understanding of brain organization and support the diagnosis of neurological and psychiatric disorders. Prior studies have shown that FBNs exhibit small-world topology, where brain regions form functional clusters, and abnormalities in these clusters are strongly associated with disease. However, current learning-based methods either ignore this special topological structure or impose it as a post-hoc step outside the learning process, limiting both performance and interpretability. In this paper, we propose Learning Optimal Spectral Clustering (LOSC), a new framework that integrates FBN generation, clustering, and classification with a novel graph-theory-grounded loss to fully exploit the small-world topology. First, LOSC learns brain connectivity in a nonlinear spatio-spectral embedding space, guided by our proposed Rayleigh Quotient Loss (RQL), to preserve the small-world properties in generated FBNs. Then, the FBNs are partitioned into clusters of functionally synchronized regions, and both intra- and inter-cluster relations are utilized for brain network classification. Our contributions are threefold: (1) Improved brain network classification accuracy: by leveraging small-world functional clusters, LOSC achieves consistent gains of 2.0%, 3.6%, and 2.6% on the ABIDE, ADHD-200, and HCP datasets compared with state-of-the-art models, respectively; (2) Theoretical grounding: with our proposed RQL, LOSC bridges the gap between graph theory and learning-based FBN analysis; and (3) Interpretability: the discovered functional clusters align with known neuropathology and contribute to the discovery of new functional community biomarkers.
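In spectral clustering, the Rayleigh quotient of a cluster indicator against the graph Laplacian measures how much edge weight the cluster cuts; summing it over clusters gives a natural clustering loss. This sketch shows that standard quantity, assuming the unnormalized Laplacian; the paper's exact RQL formulation is not reproduced here:

```python
import numpy as np

def rayleigh_quotient_loss(adjacency, assignment):
    """Sum of Rayleigh quotients h_k^T L h_k / (h_k^T h_k) over cluster
    indicator columns h_k, where L = D - A is the unnormalized graph
    Laplacian. Low values mean tightly connected clusters with few
    cut edges, consistent with small-world organization."""
    a = (adjacency + adjacency.T) / 2                 # symmetrize
    laplacian = np.diag(a.sum(axis=1)) - a
    loss = 0.0
    for k in range(assignment.shape[1]):
        h = assignment[:, k]
        loss += float(h @ laplacian @ h) / float(h @ h)
    return loss
```

For a graph of disconnected components with indicators matching the components, every quotient is exactly zero; mixing components across clusters makes the loss positive.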

Citations: 0
RGShuffleNet: An Efficient Design for Medical Image Segmentation on Portable Devices.
IF 6.8 Medicine (CAS Tier 2) Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2026-02-11 DOI: 10.1109/JBHI.2026.3663638
Zemin Cai, Jiarui Luo, Jian-Huang Lai, Fu Chen

Medical image segmentation plays a crucial role in intelligent medical image processing systems, serving as the foundation for effective medical image analysis, particularly in assisting diagnosis and surgical planning. Over the past few years, UNet has achieved tremendous success in the field of image segmentation, with several UNet-based extension models widely applied in medical image segmentation tasks. However, the application of these models is limited to scenarios where large medical equipment can be deployed, such as hospitals. The substantial computational costs of these segmentation models pose significant challenges when deploying them on portable devices with limited hardware resources. This hinders the realization of rapid and efficient image segmentation in home-lab environments. In this paper, we present a lightweight model, RGShuffleNet, specifically designed for medical image segmentation on resource-constrained mobile devices. To reduce parameters and computational complexity, we first propose Reshaped Group Convolution, a novel convolutional method that effectively restructures the dimensions of different feature groups; modifying the feature structure enhances correlations between groups. Additionally, we introduce the MSC-Shuffle block to facilitate information flow between different feature groups. Unlike traditional Shuffle operations, which focus solely on channel correlation, the MSC-Shuffle block proposed in this paper enables information exchange between different groups in both the channel and spatial dimensions, thereby achieving superior segmentation performance. Experimental evaluations on two cardiac ultrasound image datasets and one chest CT image dataset demonstrate that RGShuffleNet achieves performance superior to various other state-of-the-art methods while maintaining lower complexity. Finally, RGShuffleNet is deployed on portable devices. The source code of the project is available at https://github.com/Zemin-Cai/RGShuffleNet.
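The "traditional Shuffle operation" the abstract contrasts against is the ShuffleNet channel shuffle, which interleaves channels across groups so group convolutions can exchange information. A minimal NumPy sketch of that baseline op (not of MSC-Shuffle itself, which additionally mixes the spatial dimension):

```python
import numpy as np

def channel_shuffle(x, groups):
    """ShuffleNet-style channel shuffle for an (N, C, H, W) tensor:
    split channels into `groups`, then interleave them so each new
    group contains one channel from every old group."""
    n, c, h, w = x.shape
    assert c % groups == 0, "channel count must be divisible by groups"
    # (N, groups, C//groups, H, W) -> swap the two channel axes -> flatten
    return (x.reshape(n, groups, c // groups, h, w)
              .transpose(0, 2, 1, 3, 4)
              .reshape(n, c, h, w))
```

For 4 channels in 2 groups, channel order [0, 1, 2, 3] becomes [0, 2, 1, 3]: each post-shuffle group now sees channels from both original groups, which is exactly the cross-group information flow that plain group convolution lacks.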

RGShuffleNet: An Efficient Design for Medical Image Segmentation on Portable Devices. Zemin Cai, Jiarui Luo, Jian-Huang Lai, Fu Chen. IEEE Journal of Biomedical and Health Informatics, DOI: 10.1109/JBHI.2026.3663638.
Citations: 0
Subject-Adaptive EEG Decoding via Filter-Bank Neural Architecture Search for BCI Applications.
IF 6.8 Tier 2 (Medicine) Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date : 2026-02-11 DOI: 10.1109/JBHI.2026.3663725
Chong Wang, Li Yang, Bingfan Yuan, Jiafan Zhang, Chen Jin, Rong Li, Junjie Bu

Individual differences pose a significant challenge in brain-computer interface (BCI) research. Designing a universally applicable network architecture is impractical due to the variability in human brain structure and function. We propose Filter-Bank Neural Architecture Search (FBNAS), an EEG decoding framework that automates network architecture design for individuals. FBNAS uses three temporal cells to process EEG signals in different frequency bands, with dilated convolution kernels in their search spaces. A multi-path NAS algorithm determines optimal architectures for multi-scale feature extraction. We benchmarked FBNAS on three EEG datasets across two BCI paradigms, comparing it to six state-of-the-art deep learning algorithms. FBNAS achieved cross-session decoding accuracies of 79.78%, 70.66%, and 68.38% on the BCIC-IV-2a, OpenBMI, and SEED datasets, respectively, outperforming the other methods. Our results show that FBNAS customizes decoding models to address individual differences, enhancing decoding performance and shifting model design from expert-driven to machine-aided. The source code can be found at https://github.com/wang1239435478/FBNAS-master.
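FBNAS places dilated convolution kernels in the search spaces of its temporal cells; dilation widens a kernel's receptive field without adding parameters, which is useful for covering slow EEG rhythms cheaply. A minimal pure-Python sketch of a "valid" 1-D dilated convolution (illustrative only, not the paper's implementation):

```python
def dilated_conv1d(signal, kernel, dilation=1):
    """'Valid' 1-D cross-correlation with a dilated kernel.

    Dilation inserts (dilation - 1) gaps between kernel taps, so a
    k-tap kernel spans (k - 1) * dilation + 1 input samples while still
    using only k weights. Toy sketch under stated assumptions; real
    EEG decoders would use a framework's optimized Conv1d instead.
    """
    span = (len(kernel) - 1) * dilation + 1  # effective kernel footprint
    out = []
    for start in range(len(signal) - span + 1):
        # Sample the input every `dilation` steps under the kernel.
        out.append(sum(kernel[k] * signal[start + k * dilation]
                       for k in range(len(kernel))))
    return out

# Kernel [1, 1] with dilation 2 sums samples two steps apart.
print(dilated_conv1d([1, 2, 3, 4, 5], [1, 1], dilation=2))  # -> [4, 6, 8]
```

Searching over the dilation rate is one cheap way an architecture search can trade temporal scale against depth, which is the multi-scale property the abstract attributes to FBNAS's temporal cells.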

Citations: 0
Tumor Contraction-Aware Multi-Sequence MRI Framework for Accurate Post-Ablation Margin Assessment in Hepatocellular Carcinoma.
IF 6.8 Tier 2 (Medicine) Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date : 2026-02-11 DOI: 10.1109/JBHI.2026.3663682
Linan Dong, Hongwei Ge, Jie Yu, Yong Luo, Jinming Hu, Shichen Yu, Ping Liang

Hepatocellular carcinoma (HCC) is a major cause of cancer-related mortality, and microwave ablation (MWA) is commonly used for patients ineligible for surgical resection. A critical challenge following MWA is the assessment of the ablative margin, which is complicated by non-diffeomorphic deformations introduced by thermal effects during the procedure. This paper proposes a Multi-sequence Distance-guided Complementary Network (MDCNet) that utilizes multi-sequence MRI to quantify the extent of tumor contraction after MWA. To account for the differential contraction responses of liver parenchyma and tumor tissue, we propose a novel distance-aware mask transformation strategy. This method explicitly models the spatial attenuation of MWA energy and approximates the influence of the liver parenchyma's linear elastic response on tumor shrinkage, thereby enhancing the spatial adaptiveness of feature weighting. To capture the distinct structural characteristics of liver tissue emphasized by different MRI sequences and to leverage their complementary information, a gated channel fusion module is introduced to dynamically integrate features from delayed-phase and T2-weighted images. To validate the practical effectiveness of our proposed method, we evaluate the ablative margins of 115 HCC patients using a fine-tuned TransMorph model that incorporates tumor contraction predictions generated by MDCNet, and compare the results with radiologists' 2D assessments. The registration method enhanced with MDCNet improved tumor deformation accuracy and achieved a higher Youden Index in detecting incomplete ablations. Moreover, MDCNet provides interpretable predictions, thereby facilitating clinical decision support.
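The distance-aware mask transformation rests on the observation that ablation energy, and hence expected tissue contraction, falls off with distance from the applicator. As an illustrative toy only — the exponential form, the `sigma` value, and the function name below are assumptions, not MDCNet's learned transformation:

```python
import math

def distance_weight(d_mm, sigma=10.0):
    """Hypothetical weight in [0, 1] for tissue d_mm millimetres from
    the ablation centre, modelling spatial attenuation of microwave
    energy as a smooth exponential decay. Illustrative sketch: MDCNet
    learns its distance-aware weighting from multi-sequence MRI rather
    than fixing a closed form like this.
    """
    return math.exp(-max(d_mm, 0.0) / sigma)

# Tissue at the centre is weighted fully; distant tissue contributes
# little, mimicking how thermal contraction is strongest near the probe.
weights = [round(distance_weight(d), 3) for d in (0, 5, 10, 20)]
print(weights)  # -> [1.0, 0.607, 0.368, 0.135]
```

Multiplying feature maps by such a radially decaying mask is one simple way to make a network's attention spatially adaptive to the ablation geometry, which is the role the abstract assigns to the learned transformation.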

Citations: 0
DAMON: Difference-Aware Medical Visual Question Answering via Multimodal Large Language Model.
IF 6.8 Tier 2 (Medicine) Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date : 2026-02-10 DOI: 10.1109/JBHI.2026.3663420
Zefan Zhang, Yanhui Li, Ruihong Zhao, Tian Bai

Difference-aware Medical Visual Question Answering (MVQA) aims to answer questions regarding disease-related content and the visual differences between the paired medical images, which is crucial for assessing disease progression and guiding further treatment planning. Although current medical Multimodal Large Language Models (MLLMs) have shown promising results in MVQA, they still exhibit poor generalization performance in difference-aware MVQA due to two key challenges. Firstly, existing difference-aware MVQA datasets are biased toward temporal variations of individual diseases, limiting their ability to model multi-disease coexistence and overlapping symptoms in real-world clinical scenarios. Secondly, disease-level semantic alignment becomes more challenging with multi-image inputs, as they introduce more redundant and interfering visual features. To address the first challenge, we introduce DAMON-QA, a large-scale difference-aware MVQA dataset designed to support visual difference analysis across multiple diseases. Leveraging this dataset, we train MLLMs and propose a Difference-Aware Medical visual questiON answering (DAMON) model. To tackle the second challenge, we further propose a Disease-driven Prompt Module (DPM) to identify the relevant diseases and guide the disease difference analysis process. Experiments on MIMIC-Diff-VQA show that our DAMON model achieves state-of-the-art (SOTA) performance. The dataset and code can be found at https://github.com/zefanZhang-cn/DAMON.
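The Disease-driven Prompt Module conditions the model on the diseases it identifies before the difference question is answered. A toy sketch of disease-conditioned prompt assembly — the function name, template, and wording are invented for illustration; DAMON's DPM guides the MLLM with learned prompts rather than a fixed string template:

```python
def build_diff_prompt(diseases, question):
    """Assemble a hypothetical disease-conditioned prompt for a
    paired-image difference question. Listing the detected findings
    first narrows the model's focus to relevant regions, reducing the
    interference from redundant visual features that multi-image
    inputs introduce. Illustrative sketch only.
    """
    findings = ", ".join(sorted(diseases)) if diseases else "no abnormality"
    return (f"Relevant findings: {findings}. "
            f"Comparing the main and reference images, answer: {question}")

print(build_diff_prompt({"pneumonia", "pleural effusion"},
                        "What has changed?"))
```

Even in this string-template form, the two-step structure — identify diseases, then ask the difference question conditioned on them — mirrors the pipeline the abstract describes.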

Citations: 0
Journal: IEEE Journal of Biomedical and Health Informatics