Latest Publications: IEEE Journal of Biomedical and Health Informatics

Simplifying Depression Diagnosis: Single-Channel EEG and Deep Learning Approaches.
IF 6.8 | CAS Zone 2 (Medicine) | Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS | Pub Date: 2026-02-02 | DOI: 10.1109/JBHI.2025.3631326
Shruthi Narayanan Vaniya, Ahsan Habib, Maia Angelova, Chandan Karmakar

Major depressive disorder (MDD) or depression is a chronic mental illness that significantly impacts individuals' well-being and is often diagnosed at advanced stages, increasing the risk of suicide. Current diagnostic practices, which rely heavily on subjective assessments and patient self-reports, are often hindered by challenges such as under-reporting and the failure to detect early, subtle symptoms. Early detection of MDD is crucial and requires monitoring vital signs in daily living conditions. The electroencephalogram (EEG) is a valuable tool for monitoring brain activity, providing critical information on MDD and its underlying neurological mechanisms. While traditional EEG systems typically involve multiple channels for recording, making them impractical for home-based monitoring, wearable sensors can effectively capture single-channel EEG data. However, generating meaningful features from these data poses challenges due to the need for specialized domain knowledge and significant computational power, which can hinder real-time processing. To address these issues, our study focuses on developing a deep learning model for the binary classification of MDD using single-channel EEG data. We focused on specific channels from various brain regions such as central, frontal, occipital, temporal, and parietal. Our study found that the channels Fp1, F8 and Cz achieved an impressive accuracy of 90% when analyzed using a Convolutional Neural Network (CNN) with leave-one-subject-out cross-validation on a public dataset. Our study highlights the potential of utilizing single-channel EEG data for reliable MDD diagnosis, providing a less intrusive and more convenient wearable solution for mental health assessment.
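
The abstract describes a leave-one-subject-out (LOSO) protocol around a CNN but not the exact architecture. The sketch below illustrates the evaluation loop only: `TinyEEGCNN`, the window length, and the synthetic data are placeholders, not the authors' model.

```python
# Minimal sketch: leave-one-subject-out (LOSO) evaluation of a small 1D CNN
# on single-channel EEG windows. Shapes and architecture are illustrative.
import numpy as np
import torch
import torch.nn as nn
from sklearn.model_selection import LeaveOneGroupOut

class TinyEEGCNN(nn.Module):
    def __init__(self, n_samples: int = 512):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=7, padding=3), nn.ReLU(), nn.MaxPool1d(4),
            nn.Conv1d(16, 32, kernel_size=5, padding=2), nn.ReLU(), nn.MaxPool1d(4),
        )
        self.classifier = nn.Linear(32 * (n_samples // 16), 2)  # MDD vs. control

    def forward(self, x):                      # x: (batch, 1, n_samples)
        z = self.features(x).flatten(1)
        return self.classifier(z)

# Synthetic stand-in: 10 subjects, 20 windows each, 512 samples per window.
X = torch.randn(200, 1, 512)
y = torch.randint(0, 2, (200,))
groups = np.repeat(np.arange(10), 20)          # subject ID for each window

for train_idx, test_idx in LeaveOneGroupOut().split(X, y, groups):
    model = TinyEEGCNN()
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(3):                         # a few full-batch epochs (demo only)
        opt.zero_grad()
        loss = loss_fn(model(X[train_idx]), y[train_idx])
        loss.backward()
        opt.step()
    with torch.no_grad():
        acc = (model(X[test_idx]).argmax(1) == y[test_idx]).float().mean()
    print(f"held-out subject {groups[test_idx][0]}: accuracy {acc.item():.2f}")
```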

Citations: 0
MEM-UNet: Morphology-Enhanced 3D Mamba UNet for Esophagus Segmentation.
IF 6.8 | CAS Zone 2 (Medicine) | Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS | Pub Date: 2026-02-02 | DOI: 10.1109/JBHI.2026.3659853
Chao-Chia Lin, Shanq-Jang Ruan, Yu-Jen Wang

Accurate segmentation of medical images, particularly for anatomical structures with irregular shapes and low contrast such as the esophagus, remains a significant challenge. To address this issue, we propose MEM-UNet, a robust 3D Mamba-based UNet framework enhanced by mathematical morphology. Our approach adapts the State Space Model (SSM) in Mamba to support three-dimensional CT volumes, establishing an effective 3D perception backbone for the UNet architecture. In addition, we incorporate Morphology-Aware Spatial-Channel Attention (MASCA) blocks into the skip connections, where Morphology-Enhanced Spatial Convolution (MESC) augments spatial representations while Squeeze-and-Excitation (SE) highlights channel-wise features. This integration effectively leverages the shape-awareness provided by morphological operations, thus improving boundary precision. To further refine segmentation, we introduce a Morphology-Enhanced Decision (MED) layer that sharpens contour boundaries and performs voxel-level classification with high precision. Extensive experiments on SegTHOR and BTCV datasets demonstrate that MEM-UNet surpasses state-of-the-art models, achieving Dice Similarity Coefficient (DSC) scores of 87.42% and 74.86% for multi-organ segmentation, and 78.94% and 67.70% for esophagus segmentation, respectively. Ablation studies confirm the effectiveness of the proposed components and highlight the benefits of integrating mathematical morphology into our pipeline. The implementation is available at https://gitfront.io/r/cheee123/DDTJhrf3LRMd/MEM-UNet/.
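
Two of the named ingredients have standard formulations that the following sketch illustrates: grayscale morphology implemented with max-pooling (so it stays differentiable) and a squeeze-and-excitation channel gate. How MEM-UNet actually wires MESC and SE together is not specified here, so `dilate3d`, `erode3d`, and `SEGate3d` are assumptions, not the released code.

```python
# Hedged sketch: differentiable 3D grayscale morphology + SE channel gating.
import torch
import torch.nn as nn
import torch.nn.functional as F

def dilate3d(x, k=3):
    """Grayscale dilation of a 5D volume (B, C, D, H, W) with a cubic window."""
    return F.max_pool3d(x, kernel_size=k, stride=1, padding=k // 2)

def erode3d(x, k=3):
    """Grayscale erosion is dilation of the negated volume."""
    return -F.max_pool3d(-x, kernel_size=k, stride=1, padding=k // 2)

class SEGate3d(nn.Module):
    """Squeeze-and-excitation: global pooling -> bottleneck MLP -> channel gate."""
    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels), nn.Sigmoid(),
        )

    def forward(self, x):                       # x: (B, C, D, H, W)
        w = self.fc(x.mean(dim=(2, 3, 4)))      # squeeze over D, H, W
        return x * w[:, :, None, None, None]    # excite channel-wise

feat = torch.randn(1, 8, 16, 32, 32)
edges = dilate3d(feat) - erode3d(feat)          # morphological gradient: boundary cue
print(SEGate3d(8)(feat + edges).shape)          # torch.Size([1, 8, 16, 32, 32])
```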

Citations: 0
Two-Stage Self-Supervised Contrastive Learning Aided Transformer for Real-Time Medical Image Segmentation.
IF 6.8 | CAS Zone 2 (Medicine) | Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS | Pub Date: 2026-02-01 | DOI: 10.1109/JBHI.2023.3340956
Abdul Qayyum, Imran Razzak, Moona Mazher, Tariq Khan, Weiping Ding, Steven Niederer

The availability of large, high-quality annotated datasets in the medical domain poses a substantial challenge in segmentation tasks. To mitigate the reliance on annotated training data, self-supervised pre-training strategies have emerged, particularly those employing contrastive learning on dense pixel-level representations. In this work, we propose to capitalize on intrinsic anatomical similarities within medical image data and develop a semantic segmentation framework through a self-supervised fusion network, targeting settings where annotated volumes are scarce. In a unified training phase, we combine segmentation loss with contrastive loss, enhancing the distinction between significant anatomical regions that adhere to the available annotations. To further improve segmentation performance, we introduce an efficient parallel transformer module that leverages multi-view, multi-scale feature fusion and depth-wise features. The proposed transformer architecture, based on multiple encoders, is trained in a self-supervised manner using contrastive loss. Initially, the transformer is trained on an unlabeled dataset. We then fine-tune one encoder using data from the first stage and another encoder using a small set of annotated segmentation masks. These encoder features are subsequently concatenated for brain tumor segmentation. The multi-encoder transformer model yields significantly better outcomes across three medical image segmentation tasks. We validated the proposed solution by evaluating it across diverse medical image segmentation challenge datasets, demonstrating its efficacy by outperforming state-of-the-art methodologies.
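
The "segmentation loss plus contrastive loss" objective can be made concrete with a standard InfoNCE term; the sketch below shows one plausible combination. The temperature, the weighting `alpha`, and the toy shapes are assumptions, not the paper's settings.

```python
# Hedged sketch of a unified objective: supervised segmentation term plus an
# InfoNCE (NT-Xent-style) contrastive term over paired embeddings.
import torch
import torch.nn.functional as F

def info_nce(z1, z2, temperature: float = 0.1):
    """z1, z2: (N, d) embeddings of two views of the same N samples.
    Positives sit on the diagonal of the cosine-similarity matrix."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / temperature          # (N, N) similarity matrix
    targets = torch.arange(z1.size(0))          # i-th row matches i-th column
    return F.cross_entropy(logits, targets)

def combined_loss(seg_logits, seg_target, z1, z2, alpha: float = 0.5):
    """Segmentation loss + alpha * contrastive loss, as in the unified phase."""
    return F.cross_entropy(seg_logits, seg_target) + alpha * info_nce(z1, z2)

# Toy shapes: 4 images, 3 classes, 64x64 masks; 128-d embeddings per image.
loss = combined_loss(torch.randn(4, 3, 64, 64),
                     torch.randint(0, 3, (4, 64, 64)),
                     torch.randn(4, 128), torch.randn(4, 128))
print(loss.item())
```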

Citations: 0
Hierarchical Temporal Attention Networks for Cancer Registry Abstraction: Leveraging Longitudinal Clinical Data With Interpretability.
IF 6.8 | CAS Zone 2 (Medicine) | Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS | Pub Date: 2026-02-01 | DOI: 10.1109/JBHI.2025.3592444
Hong-Jie Dai, Han-Hsiang Wu

Cancer registration is a vital source of information for government-driven cancer prevention and control policies. However, cancer registry abstraction is a complex and labor-intensive process, requiring the extraction of structured data from large volumes of unstructured clinical reports. To address these challenges, we propose a hierarchical temporal attention network leveraging attention mechanisms at the word, sentence, and document levels, while incorporating temporal and report type information to capture nuanced relationships within patients' longitudinal data. To ensure robust evaluation, a stratified sampling algorithm was developed to balance the training, validation, and test datasets across 23 coding tasks, mitigating potential biases. The proposed method achieved an average F1-score of 0.82, outperforming existing approaches by prioritizing task-relevant words, sentences, and reports through its attention mechanism. An ablation study confirmed the critical contributions of the proposed components. Furthermore, a prototype visualization tool was developed to present interpretability, providing cancer registrars with insights into the decision-making process by visualizing attention at multiple levels of granularity. Overall, the proposed methods combined with the interpretability-focused visualization tool, represent a significant step toward automating cancer registry abstraction from unstructured clinical text in longitudinal settings.
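
Attention at the word, sentence, and document levels is typically built by stacking one additive attention-pooling layer per level. The sketch below shows that pattern; `AttnPool` and the dimensions are illustrative, and the temporal/report-type embeddings the paper adds are omitted.

```python
# Hedged sketch of hierarchical attention pooling: words -> sentence vectors
# -> document vector, with attention weights kept for visualization.
import torch
import torch.nn as nn

class AttnPool(nn.Module):
    """Additive attention: score each element, softmax, weighted sum."""
    def __init__(self, dim: int):
        super().__init__()
        self.score = nn.Sequential(nn.Linear(dim, dim), nn.Tanh(), nn.Linear(dim, 1))

    def forward(self, h):                        # h: (batch, seq, dim)
        a = torch.softmax(self.score(h), dim=1)  # (batch, seq, 1) attention weights
        return (a * h).sum(dim=1), a             # pooled vector + weights (for viz)

dim = 64
word_pool, sent_pool = AttnPool(dim), AttnPool(dim)

# 2 documents x 5 sentences x 12 words, each word already embedded to 64-d.
words = torch.randn(2 * 5, 12, dim)
sents, word_attn = word_pool(words)                 # (10, 64) sentence vectors
docs, sent_attn = sent_pool(sents.view(2, 5, dim))  # (2, 64) document vectors
print(docs.shape, word_attn.shape, sent_attn.shape)
```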

Citations: 0
Airs-Net: Adversarial-Improved Reversible Steganography Network for CT Images in the Internet of Medical Things and Telemedicine.
IF 6.8 | CAS Zone 2 (Medicine) | Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS | Pub Date: 2026-02-01 | DOI: 10.1109/JBHI.2025.3602272
Kai Chen, Mu Nie, Jean-Louis Coatrieux, Yang Chen, Shipeng Xie

Medical imaging has evolved from an auxiliary tool for clinical examination into a primary method and intuitive basis for diagnosing disease, supporting comprehensive, full-cycle health care. The Internet of Medical Things (IoMT) allows medical equipment, intelligent terminals, medical infrastructure, and other elements of medical production to be interconnected, eliminating information silos and data fragmentation. Medical images disseminated in the IoMT contain a wide variety of sensitive patient information, so protecting patients' personal information is vital. In this work, an adversarial-improved reversible steganography network (Airs-Net) for computed tomography (CT) images in the IoMT is presented. Specifically, Airs-Net adopts a prediction-embedding strategy and mainly consists of an image restoration network, an embedded-pixel location network, and a discriminator. The image restoration network effectively restores the pixel prediction error of the restoration set in integer- and non-integer-scaled images of arbitrary size while information is concealed. The embedded-information location network automatically selects pixel locations for information embedding based on the interpolated features of the degraded image. The restored image, embedding location map, and embedding information are fed into the embedder, and the discriminator continuously optimizes the quality of the resulting secret-carrying image. Quantitative results show that Airs-Net outperforms state-of-the-art methods in both PSNR and SSIM. Furthermore, qualitative and quantitative results under specific clinical application scenarios and across multiple types of medical image information hiding demonstrate the excellent generalization and practical applicability of Airs-Net.
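
The prediction-embedding strategy generalizes classic prediction-error expansion (PEE), where a predictor's error is expanded to hide one bit and the process is exactly invertible. The toy sketch below uses the left neighbor as the predictor and skips the overflow handling and location map that a real system (or Airs-Net's learned networks) would need.

```python
# Hedged illustration of prediction-error expansion (PEE) reversible hiding.
import numpy as np

def embed(row: np.ndarray, bits):
    out = row.astype(np.int64).copy()
    for i, b in zip(range(1, len(out)), bits):
        err = int(row[i]) - int(row[i - 1])   # error vs. the ORIGINAL neighbor
        out[i] = row[i - 1] + 2 * err + b     # expand the error, hide one bit
    return out

def extract(stego: np.ndarray):
    bits, cover = [], stego.astype(np.int64).copy()
    for i in range(1, len(cover)):
        err2 = int(cover[i]) - int(cover[i - 1])   # neighbor already restored
        bits.append(err2 % 2)                      # hidden bit is the parity
        cover[i] = cover[i - 1] + (err2 - err2 % 2) // 2   # undo the expansion
    return cover, bits

row = np.array([100, 102, 101, 99, 103])
stego = embed(row, [1, 0, 1, 1])
recovered, bits = extract(stego)
assert (recovered == row).all() and bits == [1, 0, 1, 1]   # exactly reversible
```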

Citations: 0
Self-Supervised Guided Modality Disentangled Representation Learning for Multimodal Sentiment Analysis and Schizophrenia Assessment.
IF 6.8 | CAS Zone 2 (Medicine) | Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS | Pub Date: 2026-02-01 | DOI: 10.1109/JBHI.2025.3604933
Hsin-Yang Chang, An-Sheng Liu, Yi-Ting Lin, Chen-Chung Liu, Lue-En Lee, Feng-Yi Chen, Shu-Hui Hung, Li-Chen Fu

As the impact of chronic mental disorders increases, multimodal sentiment analysis (MSA) has emerged to improve diagnosis and treatment. In this paper, our approach leverages disentangled representation learning to address modality heterogeneity, with self-supervised learning as guidance. Self-supervised learning generates pseudo unimodal labels that guide modality-specific representation learning, preventing the acquisition of meaningless features. Additionally, we propose a text-centric fusion that effectively mitigates the impact of noise and redundant information and fuses the acquired disentangled representations into a comprehensive multimodal representation. We evaluate our model on three publicly available benchmark datasets for multimodal sentiment analysis and a privately collected dataset focusing on schizophrenia counseling. The experimental results demonstrate state-of-the-art performance across various metrics on the benchmark datasets, surpassing related work. Furthermore, our learning algorithm shows promising performance in real-world applications, outperforming our previous work and achieving significant progress in schizophrenia assessment.
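
A text-centric fusion can be realized as cross-attention in which text tokens form the query and the other modalities supply keys and values, so noisy non-text streams are filtered through the text view. The sketch below shows that pattern under assumed dimensions; the paper's exact fusion design may differ.

```python
# Hedged sketch of a text-centric cross-attention fusion block.
import torch
import torch.nn as nn

class TextCentricFusion(nn.Module):
    def __init__(self, dim: int = 128, heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, text, audio, video):
        ctx = torch.cat([audio, video], dim=1)        # (B, La+Lv, dim)
        fused, _ = self.attn(query=text, key=ctx, value=ctx)
        return self.norm(text + fused)                # residual: text stays central

B, dim = 2, 128
text = torch.randn(B, 20, dim)                        # text token features
audio, video = torch.randn(B, 50, dim), torch.randn(B, 30, dim)
print(TextCentricFusion(dim)(text, audio, video).shape)   # torch.Size([2, 20, 128])
```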

Citations: 0
Pathology-Guided AI System for Accurate Segmentation and Diagnosis of Cervical Spondylosis.
IF 6.8 | CAS Zone 2 (Medicine) | Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS | Pub Date: 2026-02-01 | DOI: 10.1109/JBHI.2025.3598469
Qi Zhang, Xiuyuan Chen, Ziyi He, Lianming Wu, Kun Wang, Jianqi Sun, Hongxing Shen

Cervical spondylosis, a complex and prevalent condition, demands precise and efficient diagnostic techniques for accurate assessment. While MRI offers detailed visualization of cervical spine anatomy, manual interpretation remains labor-intensive and prone to error. To address this, we developed an innovative AI-assisted Expert-based Diagnosis System that automates both segmentation and diagnosis of cervical spondylosis using MRI. Leveraging multi-center datasets of cervical MRI images from patients with cervical spondylosis, our system features a pathology-guided segmentation model capable of accurately segmenting key cervical anatomical structures. The segmentation is followed by an expert-based diagnostic framework that automates the calculation of critical clinical indicators. Our segmentation model achieved an average Dice coefficient exceeding 0.90 across four cervical spinal anatomies and demonstrated enhanced accuracy in herniation areas. Diagnostic evaluation further showcased the system's precision, with the lowest mean absolute errors (MAE) for the C2-C7 Cobb angle and the Maximum Spinal Cord Compression (MSCC) coefficient. In addition, our method delivered high accuracy, precision, recall, and F1 scores in herniation localization, K-line status assessment, T2 hyperintensity detection, and Kang grading. Comparative analysis and external validation demonstrate that our system outperforms existing methods, establishing a new benchmark for segmentation and diagnostic tasks for cervical spondylosis.
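
One of the automated indicators, MSCC, is commonly defined as the smallest cord diameter relative to the mean of reference diameters above and below the compression. The sketch below reads that quantity off a toy cord mask; the slice selection and reference choice are simplifications, not the paper's pipeline.

```python
# Hedged sketch: MSCC from a binary spinal-cord segmentation, using the common
# definition MSCC = (1 - di / ((da + db) / 2)) * 100, with di the smallest
# anteroposterior (AP) diameter and da/db references above/below it.
import numpy as np

def ap_diameters(cord_mask: np.ndarray) -> np.ndarray:
    """cord_mask: (slices, H, W) binary mask; AP diameter per axial slice,
    taken as the number of occupied rows (AP axis) in that slice."""
    return cord_mask.any(axis=2).sum(axis=1).astype(float)

def mscc(diams: np.ndarray) -> float:
    i = int(np.argmin(diams))                       # most compressed slice
    da = diams[:i].max() if i > 0 else diams[i]     # reference above (simplified)
    db = diams[i + 1:].max() if i < len(diams) - 1 else diams[i]  # below
    return float((1.0 - diams[i] / ((da + db) / 2.0)) * 100.0)

# Toy volume: a 10-slice "cord" whose AP extent narrows in the middle.
mask = np.zeros((10, 32, 32), dtype=bool)
for s, half in enumerate([6, 6, 6, 5, 3, 2, 4, 6, 6, 6]):
    mask[s, 16 - half:16 + half, 14:18] = True
print(f"MSCC = {mscc(ap_diameters(mask)):.1f}%")    # (1 - 4/12)*100 = 66.7%
```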

Citations: 0
SPA: Leveraging the SAM With Spatial Priors Adapter for Enhanced Medical Image Segmentation.
IF 6.8 | CAS Zone 2 (Medicine) | Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS | Pub Date: 2026-02-01 | DOI: 10.1109/JBHI.2025.3526174
Jihong Hu, Yinhao Li, Rahul Kumar Jain, Lanfen Lin, Yen-Wei Chen

The Segment Anything Model (SAM) has gained renown for its success in image segmentation, benefiting significantly from its pretraining on extensive datasets and its interactive prompt-based segmentation approach. Although highly effective in natural (real-world) image segmentation tasks, the SAM model encounters significant challenges in medical imaging due to the inherent differences between these two domains. To address these challenges, we propose the Spatial Prior Adapter (SPA) scheme, a parameter-efficient fine-tuning strategy that enhances SAM's adaptability to medical imaging tasks. SPA introduces two novel modules: the Spatial Prior Module (SPM), which captures localized spatial features through convolutional layers, and the Feature Communication Module (FCM), which integrates these features into SAM's image encoder via cross-attention mechanisms. Furthermore, we develop a Multiscale Feature Fusion Module (MSFFM) to enhance SAM's end-to-end segmentation capabilities by effectively aggregating multiscale contextual information. These lightweight modules require minimal computational resources while significantly boosting segmentation performance. Our approach demonstrates superior performance in both prompt-based and end-to-end segmentation scenarios through extensive experiments on publicly available medical imaging datasets. These results highlight the potential of the proposed method to bridge the gap between foundation models and domain-specific medical imaging tasks. This advancement paves the way for more effective AI-assisted medical diagnostic systems.
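
The FCM's cross-attention injection can be sketched as encoder tokens querying convolutional prior tokens, with a zero-initialized residual gate so the frozen backbone features are preserved at the start of fine-tuning. Sizes, the injection point, and the gating are assumptions for illustration, not SPA's released design.

```python
# Hedged sketch of SPM (conv spatial priors) + FCM (cross-attention injection).
import torch
import torch.nn as nn

class SpatialPrior(nn.Module):
    """SPM-style branch: convolutions produce a grid of local-feature tokens."""
    def __init__(self, dim: int = 256):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3, dim // 4, 3, stride=4, padding=1), nn.GELU(),
            nn.Conv2d(dim // 4, dim, 3, stride=4, padding=1),
        )

    def forward(self, img):                         # img: (B, 3, H, W)
        f = self.conv(img)                          # (B, dim, H/16, W/16)
        return f.flatten(2).transpose(1, 2)         # (B, N_prior, dim)

class FeatureCommunication(nn.Module):
    """FCM-style block: encoder tokens attend to spatial-prior tokens."""
    def __init__(self, dim: int = 256, heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.gamma = nn.Parameter(torch.zeros(1))   # start as identity: safe to train

    def forward(self, tokens, priors):
        upd, _ = self.attn(query=tokens, key=priors, value=priors)
        return tokens + self.gamma * upd            # gated residual injection

img = torch.randn(1, 3, 256, 256)
tokens = torch.randn(1, 16 * 16, 256)               # stand-in for SAM encoder tokens
priors = SpatialPrior()(img)
print(FeatureCommunication()(tokens, priors).shape)  # torch.Size([1, 256, 256])
```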

Citations: 0
DBANet: Dual Boundary Awareness With Confidence-Guided Pseudo Labeling for Medical Image Segmentation.
IF 6.8 | CAS Zone 2 (Medicine) | Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS | Pub Date: 2026-02-01 | DOI: 10.1109/JBHI.2025.3592873
Zhonghua Chen, Haitao Cao, Lauri Kettunen, Hongkai Wang

Accurate medical image segmentation is crucial for clinical diagnosis and treatment planning. However, class imbalance and boundary vagueness in medical images make it challenging to achieve accurate and reliable results; 3D multi-organ segmentation is particularly complex. These challenges are further exacerbated in semi-supervised learning settings with limited labeled data. Existing methods rarely incorporate boundary information effectively to alleviate class imbalance, leading to biased predictions and suboptimal segmentation accuracy. To address these limitations, we propose DBANet, a dual-model framework integrating three key modules. The Confidence-Guided Pseudo-Label Fusion (CPF) module enhances pseudo-label reliability by selecting high-confidence logits, improving training stability under limited annotation. The Boundary Distribution Awareness (BDA) module dynamically adjusts class weights based on boundary distributions, alleviating class imbalance and enhancing segmentation performance. Additionally, the Boundary Vagueness Awareness (BVA) module further refines boundary delineation by prioritizing regions with blurred boundaries. Experiments on two benchmark datasets validate the effectiveness of DBANet. On the Synapse dataset, DBANet achieves average Dice score improvements of 3.56%, 2.17%, and 5.12% under 10%, 20%, and 40% labeled-data settings, respectively. Similarly, on the WORD dataset, DBANet achieves average Dice score improvements of 1.72%, 0.97%, and 0.65% under 2%, 5%, and 10% labeled-data settings, respectively. These results highlight the potential of boundary-aware adaptive weighting for advancing semi-supervised medical image segmentation.
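
Confidence-guided fusion of two models' pseudo-labels can be sketched as a per-voxel choice of the more confident prediction plus a confidence threshold on the resulting labels. The rule and threshold below are illustrative, not DBANet's exact CPF design.

```python
# Hedged sketch: fuse pseudo-labels from a dual-model setup by per-voxel
# confidence, then mask low-confidence voxels out of the unsupervised loss.
import torch
import torch.nn.functional as F

def fuse_pseudo_labels(logits_a, logits_b, tau: float = 0.9):
    """logits_*: (B, C, D, H, W) from the two models on unlabeled volumes.
    Returns fused hard labels and a mask of trusted voxels."""
    prob_a, prob_b = logits_a.softmax(1), logits_b.softmax(1)
    conf_a, lbl_a = prob_a.max(1)                 # per-voxel confidence + label
    conf_b, lbl_b = prob_b.max(1)
    use_a = conf_a >= conf_b                      # pick the more confident model
    labels = torch.where(use_a, lbl_a, lbl_b)
    mask = torch.where(use_a, conf_a, conf_b) >= tau
    return labels, mask

def masked_pseudo_loss(student_logits, labels, mask):
    loss = F.cross_entropy(student_logits, labels, reduction="none")
    return (loss * mask).sum() / mask.sum().clamp(min=1)

la, lb = torch.randn(2, 4, 8, 16, 16), torch.randn(2, 4, 8, 16, 16)
labels, mask = fuse_pseudo_labels(la, lb)
print(masked_pseudo_loss(torch.randn(2, 4, 8, 16, 16), labels, mask).item())
```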

Citations: 0
CalDiff: Calibrating Uncertainty and Accessing Reliability of Diffusion Models for Trustworthy Lesion Segmentation.
IF 6.8 | CAS Zone 2 (Medicine) | Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS | Pub Date: 2026-02-01 | DOI: 10.1109/JBHI.2025.3624331
Xinxin Wang, Mingrui Yang, Sercan Tosun, Kunio Nakamura, Shuo Li, Xiaojuan Li

Low reliability has consistently been a challenge in applying deep learning models to high-risk decision-making scenarios. In medical image segmentation, multiple expert annotations can be consulted to reduce subjective bias and reach a consensus, thereby enhancing segmentation accuracy and reliability. To develop a reliable lesion segmentation model, we propose CalDiff, a novel framework that leverages the uncertainty in multiple annotations, captures real-world diagnostic variability, and provides more informative predictions. To harness the superior generative ability of diffusion models, a dual step-wise and sequence-aware calibration mechanism is proposed that builds on their sequential nature. We evaluate the calibrated model through a comprehensive quantitative and visual analysis, addressing the previously overlooked challenge of assessing uncertainty calibration and model reliability in scenarios with multiple annotations and multiple predictions. Experimental results on two lesion segmentation datasets demonstrate that CalDiff produces uncertainty maps that reflect low-confidence areas, which in turn indicate the model's false predictions. Because uncertainty is calibrated during the training phase, the uncertain areas produced by our model correlate closely with the areas where it errs at inference. In summary, the uncertainty captured by CalDiff can serve as a powerful indicator that helps mitigate the risks of adopting the model's outputs, allowing clinicians to prioritize reviewing areas or slices with higher uncertainty and enhancing the model's reliability and trustworthiness in clinical practice.
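
Given several stochastic segmentation samples (for example, repeated reverse-diffusion runs), a pixel-wise uncertainty map falls out of the per-pixel mean and its entropy. The sketch below shows that computation; `sample_segmentation` is a hypothetical stand-in for one diffusion sampling pass, not CalDiff's sampler.

```python
# Hedged sketch: turn K stochastic segmentation samples into a prediction and
# a pixel-wise uncertainty (binary entropy) map for clinician review.
import torch

def uncertainty_map(samples: torch.Tensor, eps: float = 1e-6):
    """samples: (K, H, W) binary/probabilistic foreground maps from K runs."""
    p = samples.float().mean(0).clamp(eps, 1 - eps)        # pixel-wise mean
    entropy = -(p * p.log() + (1 - p) * (1 - p).log())     # binary entropy
    return p, entropy

def sample_segmentation(h: int = 64, w: int = 64) -> torch.Tensor:
    return (torch.rand(h, w) > 0.5).float()                # placeholder sampler

samples = torch.stack([sample_segmentation() for _ in range(8)])
pred, unc = uncertainty_map(samples)
flag = unc > 0.5                                           # pixels worth reviewing
print(pred.shape, float(unc.max()), int(flag.sum()))
```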

Citations: 0