Pub Date : 2026-02-01 DOI: 10.1109/JBHI.2025.3592024
Tong Jin, Jin Liu, Diandian Wang, Kun Wang, Chenlong Miao, Yikun Zhang, Dianlin Hu, Zhan Wu, Yang Chen
In computed tomography (CT), metal artifacts pose a persistent challenge to achieving high-quality imaging. Despite advancements in metal artifact reduction (MAR) techniques, many existing approaches have not fully leveraged the intrinsic a priori knowledge related to metal artifacts, improved model interpretability, or addressed the complex texture of CT images effectively. To address these limitations, we propose a novel and interpretable framework, the wavelet-inspired oriented adaptive dictionary network (WOADNet). WOADNet builds on sparse coding with orientational information in the wavelet domain. By exploring the discriminative features of artifacts and anatomical tissues, we adopt a high-precision filter parameterization strategy that incorporates multiangle rotations. Furthermore, we integrate a reweighted sparse constraint framework into the convolutional dictionary learning process and employ a cross-space, multiscale attention mechanism to construct an adaptive convolutional dictionary unit for the artifact feature encoder. This innovative design allows for flexible adjustment of weights and convolutional representations, resulting in significant image quality improvements. The experimental results using synthetic and clinical datasets demonstrate that WOADNet outperforms both traditional and state-of-the-art MAR methods in terms of suppressing artifacts.
Title: WOADNet: A Wavelet-Inspired Orientational Adaptive Dictionary Network for CT Metal Artifact Reduction. Journal: IEEE Journal of Biomedical and Health Informatics, pp. 1452-1465.
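The reweighted sparse constraint mentioned in the WOADNet abstract can be illustrated with the classic reweighted L1 shrinkage step. This is only a minimal sketch: the abstract does not give the network's actual update rule, so the weight formula `1 / (|c| + eps)` and the names `soft_threshold`, `reweighted_shrink`, `lam`, and `eps` are assumptions chosen for illustration, not the authors' implementation.

```python
def soft_threshold(value, threshold):
    """Shrink a coefficient toward zero by `threshold` (proximal step of an L1 penalty)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def reweighted_shrink(coeffs, lam=0.1, eps=1e-3):
    """One reweighted L1 step (hypothetical helper).

    Small coefficients receive large weights and are shrunk harder;
    large coefficients receive small weights and are mostly preserved.
    """
    weights = [1.0 / (abs(c) + eps) for c in coeffs]
    return [soft_threshold(c, lam * w) for c, w in zip(coeffs, weights)]


if __name__ == "__main__":
    print(reweighted_shrink([2.0, 0.05, -1.0]))
```

In this sketch the small coefficient is set to zero while the two large ones are only slightly reduced, which is the behaviour that makes reweighting attractive for separating sparse artifact features from noise.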
Pub Date : 2026-02-01 DOI: 10.1109/JBHI.2025.3594219
Jiaheng Wang, Zhenyu Wang, Tianheng Xu, Ang Li, Yuan Si, Ting Zhou, Xi Zhao, Honglin Hu
In recent years, the diverse applications of electroencephalography (EEG)-based affective brain-computer interfaces (aBCIs) have been extensively explored. However, due to adverse factors like noise and physiological variability, the recognition capability of aBCIs can unforeseeably suffer abrupt declines. Since the timing of these aBCI failures is unknown, placing trust in aBCIs without scrutiny can lead to undesirable consequences. To alleviate this issue, we propose an algorithm for estimating the reliability of aBCIs (primarily those based on Graph Convolutional Networks), synchronously delivering a probabilistic confidence score upon aBCI decision completion, thereby reflecting the aBCI's real-time recognition capabilities. Methodologically, we use the Maximum Softmax Probability (MSP) from EEG recognition networks as confidence scores and leverage the Scaling Operator to calibrate them. Then, the Projection Operator is employed to address confidence estimation biases caused by noise and subject variability. For the numerical concentration of MSP, we provide fresh insights into its causes and propose corresponding solutions. The derivation of the estimator from the Maximum Entropy Principle is also substantiated for robust theoretical underpinnings. Finally, we confirm theoretically that the estimator does not compromise BCI performance. In experiments conducted on the public datasets SEED and SEED-IV, the proposed algorithm demonstrates superior performance in estimating aBCI reliability compared to other benchmarks, and commendable adaptability to new subjects. This research has the potential to lead to more trustworthy aBCIs and advance their broader application in complex real-world scenarios.
Title: Enhancing the Reliability of Affective Brain-Computer Interfaces by Using Specifically Designed Confidence Estimator. Journal: IEEE Journal of Biomedical and Health Informatics, pp. 1073-1086.
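The idea of using the Maximum Softmax Probability as a confidence score, and then calibrating it, can be sketched as follows. The abstract does not define the Scaling Operator or the Projection Operator, so this example substitutes plain temperature scaling as a stand-in; the function names and the `temperature` parameter are assumptions made for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax after dividing the logits by a positive temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def msp_confidence(logits, temperature=1.0):
    """Return (predicted class index, maximum softmax probability)."""
    probs = softmax(logits, temperature)
    best = max(range(len(probs)), key=probs.__getitem__)
    return best, probs[best]


if __name__ == "__main__":
    logits = [2.0, 1.0, 0.1]  # hypothetical network outputs for three emotion classes
    print(msp_confidence(logits, temperature=1.0))
    print(msp_confidence(logits, temperature=2.0))
```

In this sketch, dividing by a positive temperature changes the confidence value but not the predicted class, which is one simple way a calibration step can leave recognition performance untouched.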
Pub Date : 2026-02-01 DOI: 10.1109/JBHI.2024.3432195
Andy Yiu-Chau Tam, Ye-Jiao Mao, Derek Ka-Hei Lai, Andy Chi-Ho Chan, Daphne Sze Ki Cheung, William Kearns, Duo Wai-Chi Wong, James Chung-Wai Cheung
The accuracy of sleep posture assessment in standard polysomnography might be compromised by the unfamiliar sleep lab environment. In this work, we aim to develop a depth camera-based sleep posture monitoring and classification system for home or community usage and tailor a deep learning model that can account for blanket interference. Our model included a joint coordinate estimation network (JCE) and a sleep posture classification network (SPC). SaccpaNet (Separable Atrous Convolution-based Cascade Pyramid Attention Network) was developed using a pyramidal structure of residual separable atrous convolution units to reduce computational cost and enlarge the receptive field. The Saccpa attention unit served as the core of JCE and SPC, while different backbones for SPC were also evaluated. The model was cross-modally pretrained on RGB images from the COCO whole body dataset and then trained/tested using depth image data collected from 150 participants performing seven sleep postures across four blanket conditions. In addition, we applied a data augmentation technique that used intra-class mix-up to synthesize blanket conditions, and an overlaid flip-cut to synthesize partially covered blanket conditions, for a robustness test that we referred to as the Post-hoc Data Augmentation Robustness Test (PhD-ART). Our model achieved an average precision of estimated joint coordinates (in terms of PCK@0.1) of 0.652 and demonstrated adequate robustness. The overall classification accuracy of sleep postures (F1-score) was 0.885 and 0.940, for 7- and 6-class classification, respectively. Our system was resistant to blanket interference, with a spread difference of 2.5%.
Title: SaccpaNet: A Separable Atrous Convolution-Based Cascade Pyramid Attention Network to Estimate Body Landmarks Using Cross-Modal Knowledge Transfer for Under-Blanket Sleep Posture Classification. Journal: IEEE Journal of Biomedical and Health Informatics, pp. 1593-1604.
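The PCK@0.1 figure reported for SaccpaNet is a Percentage of Correct Keypoints score: a predicted joint counts as correct when it lies within 0.1 times a reference length of the ground truth. The abstract does not say which reference length was used, so the sketch below takes it as an explicit `ref_length` argument; that parameter and the sample coordinates are assumptions.

```python
import math


def pck(pred, truth, ref_length, alpha=0.1):
    """Fraction of keypoints whose Euclidean error is at most alpha * ref_length."""
    if len(pred) != len(truth) or not pred:
        raise ValueError("pred and truth must be non-empty and of equal length")
    threshold = alpha * ref_length
    hits = sum(
        1
        for (px, py), (tx, ty) in zip(pred, truth)
        if math.hypot(px - tx, py - ty) <= threshold
    )
    return hits / len(pred)


if __name__ == "__main__":
    predicted = [(0, 0), (10, 10), (50, 50)]     # hypothetical joint estimates (pixels)
    ground_truth = [(3, 4), (10, 30), (50, 50)]  # hypothetical annotations (pixels)
    print(pck(predicted, ground_truth, ref_length=100))
```

With a reference length of 100 pixels the threshold is 10 pixels, so the first and third joints count as correct and the second does not.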
Pub Date : 2026-02-01 DOI: 10.1109/JBHI.2025.3592897
Yifei Zhao, Xiaoying Wang, Junping Yin
Accurate and efficient Video Polyp Segmentation (VPS) is vital for the early detection of colorectal cancer and the effective treatment of polyps. However, achieving this remains highly challenging due to the inherent difficulty in modeling the spatial-temporal relationships within colonoscopy videos. Existing methods that directly associate video frames frequently fail to account for variations in polyp or background motion, leading to excessive noise and reduced segmentation accuracy. Conversely, approaches that rely on optical flow models to estimate motion and align frames incur significant computational overhead. To address these limitations, we propose a novel VPS framework, termed Deformable Alignment and Local Attention (DALA). In this framework, we first construct a shared encoder to jointly encode the feature representations of paired video frames. Subsequently, we introduce a Multi-Scale Frame Alignment (MSFA) module based on deformable convolution to estimate the motion between reference and anchor frames. The multi-scale architecture is designed to accommodate the scale variations of polyps arising from differing viewing angles and speeds during colonoscopy. Furthermore, Local Attention (LA) is employed to selectively aggregate the aligned features, yielding more precise spatial-temporal feature representations. Extensive experiments conducted on the challenging SUN-SEG dataset and PolypGen dataset demonstrate that DALA achieves superior performance compared to state-of-the-art models.
Title: Efficient Video Polyp Segmentation by Deformable Alignment and Local Attention. Journal: IEEE Journal of Biomedical and Health Informatics, pp. 1534-1543.
The thickness of the diaphragm serves as a crucial biometric indicator, particularly in assessing rehabilitation and respiratory dysfunction. However, measuring diaphragm thickness from ultrasound images mainly depends on manual delineation of the fascia, which is subjective, time-consuming, and sensitive to the inherent speckle noise. In this study, we introduce an edge-aware diffusion segmentation model (ESADiff), which incorporates prior structural knowledge of the fascia to improve the accuracy and reliability of diaphragm thickness measurements in ultrasound imaging. We first apply a diffusion model, guided by annotations, to learn the image features while preserving edge details through an iterative denoising process. Specifically, we design an anisotropic edge-sensitive annotation refinement module that corrects inaccurate labels by integrating Hessian geometric priors with a backtracking shortest-path connection algorithm, further enhancing model accuracy. Moreover, a curvature-aware deformable convolution and edge-prior ranking loss function are proposed to leverage the shape prior knowledge of the fascia, allowing the model to selectively focus on relevant linear structures while mitigating the influence of noise on feature extraction. We evaluated the proposed model on an in-house diaphragm ultrasound dataset, a public calf muscle dataset, and an internal tongue muscle dataset to demonstrate robust generalization. Extensive experimental results demonstrate that our method achieves finer fascia segmentation and significantly improves the accuracy of thickness measurements compared to other state-of-the-art techniques, highlighting its potential for clinical applications.
Title: Edge-Aware Diffusion Segmentation Model With Hessian Priors for Automated Diaphragm Thickness Measurement in Ultrasound Imaging. Authors: Chen-Long Miao, Yikang He, Baike Shi, Zhongkai Bian, Wenxue Yu, Yang Chen, Guang-Quan Zhou. DOI: 10.1109/JBHI.2025.3601567. Publication date: 2026-02-01. Journal: IEEE Journal of Biomedical and Health Informatics, pp. 1544-1554.
Pub Date : 2026-02-01 DOI: 10.1109/JBHI.2025.3634072
Usman Anwar, Adeel Hussain, Mandar Gogate, Kia Dashtipour, Tughrul Arslan, Amir Hussain, Peter Lomax
The detection of listening effort or cognitive load (CL) has been a major research challenge in recent years. Most conventional techniques utilise physiological or audio-visual sensors and are privacy-invasive and computationally complex. The challenges of synchronization, data alignment and accessibility limitations potentially increase the noise and error probability, compromising the accuracy of CL estimates. This innovative work presents a multi-modal, non-invasive and privacy-preserving approach that combines Radio Frequency (RF) and pupillometry sensing to address these challenges. Custom RF sensors are first designed and developed to capture blood flow changes in specific brain regions with high spatial resolution. Next, multi-modal fusion with pupillometry sensing is proposed and shown to offer a robust assessment of cognitive and listening effort through pupil size and pupil dilation. Our novel approach evaluates RF sensing to estimate CL from cerebral blood flow variations, utilizing pupillometry as a baseline. A first-of-its-kind, multi-modal dataset is collected as a new benchmark resource in a controlled environment, with participants asked to comprehend target speech at varying background noise levels. The framework is statistically evaluated using intraclass correlation for pupillometry data (average ICC > 0.95). The correlation between pupillometry and RF data is established through Pearson's correlation (average PCC > 0.79). Further, CL is classified into high and low categories based on RF data using K-means clustering. Future work involves integrating RF sensors with glasses to estimate listening effort for hearing-aid users and utilising RF measurements to optimize speech enhancement based on an individual's listening effort and the complexity of the acoustic environment.
Title: Multimodal Cognitive Load Estimation With Radio Frequency Sensing and Pupillometry in Complex Auditory Environments. Journal: IEEE Journal of Biomedical and Health Informatics, pp. 1605-1617.
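The Pearson correlation coefficient (PCC) used to relate pupillometry and RF measurements can be computed directly from its definition. The sketch below is generic: the two sample series are made-up numbers, not data from the study, and the function name `pearson` is an assumption.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equally long numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    pupil_dilation = [3.1, 3.4, 3.9, 4.2, 4.8]  # hypothetical values per noise level
    rf_response = [0.9, 1.1, 1.6, 1.7, 2.3]     # hypothetical values per noise level
    print(round(pearson(pupil_dilation, rf_response), 3))
```

A value near 1 indicates that the two signals rise and fall together across conditions, which is the kind of agreement the reported average PCC summarizes.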
Pub Date : 2026-02-01 DOI: 10.1109/JBHI.2025.3591844
Yukai Huang, Ningbo Zhao, Dongmin Huang, Yonglong Ye, Zi Luo, Hongzhou Lu, Min He, Wenjin Wang
The diagnosis of peripheral artery disease (PAD) typically relies on specialized equipment such as ultrasound. The delayed PAD detection associated with these approaches may lead to amputation and even death. To achieve rapid and ubiquitous PAD screening, we propose a novel concept of camera-based plantar perfusion imaging (CPPI) for PAD diagnosis and severity classification. Specifically, we performed a simulation trial that used an RGB camera to record the plantar video of 20 subjects, with a cuff applying different pressures to the left leg to simulate different degrees of lower limb blockage. We generated the plantar perfusion maps using remote photoplethysmography imaging and proposed a multi-view perfusion (MVP) feature set to represent the perfusion maps for PAD classification. The experimental results show that the Pearson correlation coefficients between MVP and Doppler ultrasound (clinical reference) features were larger than 0.9. The MVP features combined with a Support Vector Machine achieved 91.47% accuracy in distinguishing the normal and obstructed states, and 76.48% accuracy in differentiating four different degrees of vascular obstruction. The clinical benchmark demonstrated the potential of CPPI as a rapid, sensitive, and easy-to-use diagnostic tool for PAD, suitable for large-scale screening in home or community settings.
Title: Plantar Perfusion Imaging for Peripheral Arterial Disease Screening: A Proof-of-Concept Study. Journal: IEEE Journal of Biomedical and Health Informatics, pp. 1520-1533.
Knee osteoarthritis (KOA) is a prevalent musculoskeletal disorder, often diagnosed using X-rays due to its cost-effectiveness. While Magnetic Resonance Imaging (MRI) provides superior soft tissue visualization and serves as a valuable supplementary diagnostic tool, its high cost and limited accessibility significantly restrict its widespread use. To explore the feasibility of bridging this imaging gap, we conducted a feasibility study leveraging a diffusion-based model that uses an X-ray image as conditional input, alongside target depth and additional patient-specific feature information, to generate corresponding MRI sequences. Our findings demonstrate that the MRI volumes generated by our approach are not only visually closer to real MRI scans compared with other methods but also achieve the highest quantitative performance in terms of Peak Signal-to-Noise Ratio (PSNR) and Structural Similarity Index (SSIM). Furthermore, by increasing the number of inference steps to interpolate between slice depths, we enhance the continuity of the generated volume, achieving higher adjacent slice correlation coefficients. Through ablation studies, we further validate that integrating supplemental patient-specific information, beyond what X-rays alone can provide, enhances the accuracy and clinical relevance of the generated MRI, which underscores the potential of leveraging external patient-specific information to improve the performance of the MRI generation.
Title: Feasibility Study of a Diffusion-Based Model for Cross-Modal Generation of Knee MRI From X-Ray: Integrating External Radiographic Feature Information. Authors: Zhe Wang, Yung Hsin Chen, Aladine Chetouani, Fabian Bauer, Yuhua Ru, Fang Chen, Liping Zhang, Rachid Jennane, Mohamed Jarraya. DOI: 10.1109/JBHI.2025.3593487. Publication date: 2026-02-01. Journal: IEEE Journal of Biomedical and Health Informatics, pp. 1328-1338.
Pub Date : 2026-02-01 DOI: 10.1109/JBHI.2025.3637164
Hang Wang, Mingxing Duan, Yuhuan Lu, Bin Pu, Yue Qin, Shuihua Wang, Kenli Li
Fetal anatomical structure segmentation in ultrasound images is essential for biometric measurement and disease diagnosis. However, current methods focus on a specific plane or a few structures, whereas obstetricians diagnose by considering multiple structures from different planes. In addition, existing methods struggle with segmenting fuzzy regions, which leads to performance degradation. We propose a real-time segmentation method called Class-aware Multi-structure Instance Segmentation (CMIS), designed to segment 19 key structures in 3 fetal brain planes to support brain-disease diagnosis. We extract instance information and generate class-aware attention for each class instead of dense instances to save computing resources and provide more informative details. Then we implement cross-layer and multi-scale fusion to obtain detailed prototypes. Finally, we fuse global attention with local prototypes cropped by boxes to generate masks and randomly perturb the boxes during training to enhance robustness. Moreover, we propose a new fuzzy region-based constraint loss to address the challenge of structures with varying scales and fuzzy boundaries. Extensive experiments on a fetal brain dataset demonstrate that CMIS outperforms 13 competing baselines, with an mDice of 83.41$\pm$0.03% at 37 FPS. CMIS also excels in external experiments on a fetal heart ultrasound dataset, achieving an mDice of 85.73$\pm$0.02%. These results demonstrate the effectiveness of CMIS in segmenting complex anatomical structures in ultrasound and its potential for real-time clinical applications. CMIS is limited to 2D normal standard planes ($\geq$19 weeks). Thus, its generalization to abnormal cases and broader datasets remains to be investigated.
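The mDice figures above average the Dice coefficient over structure classes. The abstract does not describe the exact protocol, so the following is only a sketch of the standard definition. It assumes label maps are flat lists of integer class IDs, that two empty masks count as a perfect match, and that the reported spread is a sample standard deviation over repeated runs; all names are hypothetical.

```python
import statistics


def dice(pred, target):
    """Dice coefficient between two binary masks given as flat 0/1 lists."""
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    if total == 0:
        return 1.0  # assumption: two empty masks count as a perfect match
    return 2.0 * intersection / total


def mean_dice(pred_labels, target_labels, class_ids):
    """Average Dice over the listed classes for flat integer label maps."""
    scores = []
    for class_id in class_ids:
        pred = [int(p == class_id) for p in pred_labels]
        target = [int(t == class_id) for t in target_labels]
        scores.append(dice(pred, target))
    return sum(scores) / len(scores)


def mean_and_std(values):
    """Mean and sample standard deviation, e.g. of mDice over repeated runs."""
    return statistics.mean(values), statistics.stdev(values)


# Toy label maps with background 0 and two structure classes, 1 and 2.
prediction = [1, 1, 2, 2, 0, 0]
ground_truth = [1, 2, 2, 2, 0, 0]
toy_mdice = mean_dice(prediction, ground_truth, class_ids=[1, 2])
toy_mean, toy_std = mean_and_std([0.8, 0.9])
```

Averaging per class rather than per pixel gives small structures the same weight as large ones, which matters when structures vary widely in scale.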
CMIS: A Class-Aware Multi-Structure Instance Segmentation Model for Fetal Brain Ultrasound Images With Fuzzy Region-Based Constraints. Hang Wang, Mingxing Duan, Yuhuan Lu, Bin Pu, Yue Qin, Shuihua Wang, Kenli Li. IEEE Journal of Biomedical and Health Informatics, pp. 1049-1059, 2026-02-01. DOI: 10.1109/JBHI.2025.3637164
The Internet of Medical Things (IoMT) has transformed traditional healthcare systems by enabling real-time monitoring, remote diagnostics, and data-driven treatment. However, security and privacy remain significant concerns for IoMT adoption due to the sensitive nature of medical data. Therefore, we propose an integrated framework leveraging blockchain and explainable artificial intelligence (XAI) to enable secure, intelligent, and transparent management of IoMT data. First, the traceability and tamper resistance of blockchain are used to secure IoMT data transactions, which are formulated as a two-stage Stackelberg game. A dual-chain architecture ensures the security and privacy of the transactions: the main chain manages regular IoMT data transactions, while the side chain handles data trading activities aimed at resale. Simultaneously, perceptual hashing is used to confirm data rights, which protects the rights and interests of each participant in the transaction as fully as possible. Subsequently, medical time-series data is modeled using bidirectional simple recurrent units to detect anomalies and cyberthreats accurately while overcoming vanishing gradients. Lastly, an adversarial sample generation method based on local interpretable model-agnostic explanations (LIME) is provided to evaluate, secure, and improve the anomaly detection model, as well as to make it more explainable and resilient to possible adversarial attacks. Simulation results illustrate the high performance of the integrated secure data management framework leveraging blockchain and XAI compared with the benchmarks.
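The abstract does not say which perceptual hash is used for data rights confirmation. As an illustration of the general idea only, the sketch below uses the simple average hash (aHash) as a stand-in. It assumes the data has already been reduced to a small grid of numeric values; the function names are hypothetical. Near-duplicate data should produce hashes with a small Hamming distance, which is how a resold copy could be matched to its original.

```python
def average_hash(grid):
    """Return a bit string: '1' where a value exceeds the grid mean, else '0'."""
    flat = [value for row in grid for value in row]
    if not flat:
        raise ValueError("grid must not be empty")
    mean = sum(flat) / len(flat)
    return "".join("1" if value > mean else "0" for value in flat)


def hamming_distance(hash_a, hash_b):
    """Number of positions at which two equal-length bit strings differ."""
    if len(hash_a) != len(hash_b):
        raise ValueError("hashes must have the same length")
    return sum(a != b for a, b in zip(hash_a, hash_b))


# Toy 2x2 grids: a uniformly shifted copy keeps the same hash,
# while an inverted pattern differs in every bit.
original = [[10, 200], [220, 30]]
shifted_copy = [[20, 210], [230, 40]]
inverted = [[200, 10], [30, 220]]

hash_original = average_hash(original)
distance_to_copy = hamming_distance(hash_original, average_hash(shifted_copy))
distance_to_inverted = hamming_distance(hash_original, average_hash(inverted))
```

Unlike a cryptographic hash, this kind of hash is designed to change little under small perturbations, so ownership checks compare distances against a threshold rather than testing for exact equality.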
XAI Driven Intelligent IoMT Secure Data Management Framework. Wei Liu, Feng Zhao, Lewis Nkenyereye, Shalli Rani, Keqin Li, Jianhui Lv. IEEE Journal of Biomedical and Health Informatics, pp. 935-946, 2026-02-01. DOI: 10.1109/JBHI.2024.3408215