Pub Date: 2026-01-12 | DOI: 10.1109/TBME.2026.3653267
Haoyu Dong, Hanxue Gu, Yaqian Chen, Jichen Yang, Yuwen Chen, Maciej A Mazurowski
Segment Anything Model (SAM) has gained significant attention because of its ability to segment a variety of objects in images upon providing a prompt. The recently developed SAM 2 has extended this ability to video segmentation, and by treating the third spatial dimension of a 3D image as the time dimension of a video, it opens an opportunity to apply SAM 2 to 3D medical images. In this paper, we extensively evaluate SAM 2's ability to segment both 2D and 3D medical images using 80 prompt strategies across 21 medical imaging datasets, including 2D modalities (X-ray and ultrasound), 3D modalities (magnetic resonance imaging, computed tomography, and positron emission tomography), and surgical videos. We find that in the 2D setting, SAM 2 performs similarly to SAM, while in the 3D setting we observe that: (1) selecting the first mask is more effective than choosing the one with the highest confidence, (2) prompting the slice in which the object appears largest is the most cost-effective strategy when only one slice is prompted, (3) box prompts result in higher performance than point prompts at a slightly higher annotation cost, (4) bidirectional propagation outperforms front-to-end propagation, (5) interactive annotation is rarely effective, and (6) SAM 2, without fine-tuning, achieves a 3D IoU from 0.32 with a single point prompt to 0.51 with a ground-truth mask on one slice, and exceeds 0.8 on certain datasets when using box or ground-truth prompts, a level that begins to approach clinical usefulness. These findings demonstrate that SAM 2's ability to segment 3D medical images can be improved with our proposed strategies over the default ones, providing practical guidance for using SAM 2 for prompt-based 3D medical image segmentation.
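For reference, the 3D IoU figure reported above is plain volumetric intersection-over-union between the predicted and ground-truth masks. A minimal NumPy sketch on toy masks (not the paper's evaluation code):

```python
import numpy as np

def iou_3d(pred: np.ndarray, gt: np.ndarray) -> float:
    """Volumetric IoU between two boolean 3D masks of equal shape."""
    pred = pred.astype(bool)
    gt = gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return float(inter) / float(union) if union > 0 else 1.0

# Toy volume: two overlapping 2x2x2 cubes inside a 4x4x4 grid.
a = np.zeros((4, 4, 4), bool); a[0:2, 0:2, 0:2] = True   # 8 voxels
b = np.zeros((4, 4, 4), bool); b[1:3, 0:2, 0:2] = True   # 8 voxels, 4 shared
print(iou_3d(a, b))  # 4 / 12 ≈ 0.333
```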
Segment Anything Model 2: An Application to 2D and 3D Medical Images. IEEE Transactions on Biomedical Engineering.
Quantitative viscoelasticity imaging via shear wave elastography (SWE) remains challenging due to complex wave physics and the limitations of conventional reconstruction methods. To address this, we present SW-VEI-Net, a physics-informed neural network (PINN) that simultaneously reconstructs the shear elastic modulus and viscous modulus by integrating viscoelastic wave equations into a dual-network architecture. The framework employs a dual-loss function to balance data fidelity and physics-based regularization, significantly reducing reliance on empirical data while improving interpretability. Extensive validation on tissue-mimicking phantoms, a rat liver fibrosis model, and clinical cases demonstrates that SW-VEI-Net outperforms state-of-the-art SWE methods. Compared to SWENet (a PINN-based method using a linear elastic model), SW-VEI-Net not only enables simultaneous assessment of shear elastic and viscous moduli, but also achieves higher accuracy in shear elastic modulus reconstruction. Furthermore, when benchmarked against the dispersion fitting (DF) method (based on a viscoelastic model), SW-VEI-Net produces comparable viscoelastic parameter maps while exhibiting enhanced robustness and consistency. For liver fibrosis staging, SW-VEI-Net achieves AUC values of 0.85 (≥F2) and 0.91 (F4) based on elastic modulus classification, surpassing both SWENet (0.84, 0.85) and DF (0.78, 0.88). Additional validation in healthy volunteers shows strong agreement with a commercial ultrasound system. By synergizing deep learning with fundamental wave physics, this study represents a significant advancement in SWE, offering substantial clinical potential for early detection of hepatic fibrosis and malignant lesions through precise viscoelastic biomarker mapping.
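For context, the "viscoelastic wave equations" embedded in the PINN are commonly written in Kelvin-Voigt form, relating shear displacement $u$ to the shear elastic modulus $\mu$, shear viscosity $\eta$, and tissue density $\rho$; whether SW-VEI-Net uses exactly this formulation is an assumption:

```latex
\rho \frac{\partial^2 u}{\partial t^2}
  = \mu \nabla^2 u + \eta \, \frac{\partial}{\partial t} \nabla^2 u
```

In a PINN of this kind, the residual of such an equation evaluated on measured displacement fields typically serves as the physics-regularization term alongside the data-fidelity loss.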
SW-VEI-Net: A Physics-Informed Deep Neural Network for Shear Wave Viscoelasticity Imaging. Haoming Lin, Zhongjun Ma, Yunxiang Wang, Muqing Lin, Shuming Xu, Mian Chen, Minhua Lu, Siping Chen, Xin Chen. Pub Date: 2026-01-12 | DOI: 10.1109/TBME.2026.3652121
Pub Date: 2026-01-12 | DOI: 10.1109/TBME.2026.3651584
Larissa C Jansen, Richard G P Lopata, Hans-Martin Schwab
Pulse wave velocity (PWV) is an indirect measure of vessel stiffness that has the potential to serve as a meaningful parameter for risk stratification of vascular diseases such as abdominal aortic aneurysms (AAAs). However, assessing the PWV and pulse wave patterns in the complete abdominal aorta using ultrasound-based pulse wave imaging (PWI) is challenging due to the limited field of view (FOV) and contrast of a single ultrasound (US) probe. Hence, an approach is required that can accurately capture distension of aortas with different levels of stiffness in a large FOV. Therefore, we propose PWI based on dual probe, bistatic US. Single and dual probe ultrasound simulations were performed using finite element models of pressure waves propagating in aortas with different stiffness levels. Next, the approach was tested on an aorta- and AAA-mimicking phantom in a mock circulation setup. The simulation results show that the FOV, image quality, and PWV-estimation accuracy improve when using the dual probe approach (accuracy range: 94.9-99.8%; R² range: 0.92-0.98) compared to conventional US (accuracy range: 12.6-93.9%; R² range: 0.52-0.91). The approach was successfully extended to the phantom study, which demonstrated the expected wave patterns within a larger FOV. For the non-dilated phantom, dual probe PWI improved the R² value (monostatic: 0.95; bistatic: 0.96) compared to single probe PWI (0.85). The proposed method is promising for PWV estimation in less compliant vessels with high wave speeds.
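The PWV and R² values above come from relating pulse wave arrival time to position along the vessel wall. A minimal sketch of one common estimator (linear regression of arrival time on position; the paper's exact wave-tracking pipeline is not described in the abstract):

```python
import numpy as np

def estimate_pwv(positions_m, arrival_times_s):
    """Fit arrival time vs. axial position; PWV is the inverse slope.
    Returns (pwv_m_per_s, r_squared) of the linear fit."""
    x = np.asarray(positions_m, float)
    t = np.asarray(arrival_times_s, float)
    slope, intercept = np.polyfit(x, t, 1)   # t ≈ slope * x + intercept
    t_fit = slope * x + intercept
    ss_res = np.sum((t - t_fit) ** 2)
    ss_tot = np.sum((t - t.mean()) ** 2)
    return 1.0 / slope, 1.0 - ss_res / ss_tot

# Synthetic wave travelling at 5 m/s across a 60 mm FOV.
x = np.linspace(0.0, 0.06, 7)
t = x / 5.0
pwv, r2 = estimate_pwv(x, t)
print(round(pwv, 3), round(r2, 3))  # 5.0 1.0
```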
Feasibility of dual probe pulse wave imaging of the abdominal aorta.
Multimodal signal fusion is a cornerstone of biomedical engineering and intelligent sensing, enabling holistic analysis of heterogeneous sources such as electroencephalography (EEG), peripheral signals, speech, and imaging data. However, integrating diverse modalities in a computationally efficient and biologically plausible manner remains a significant challenge. Transformer-based fusion architectures rely on global cross-attention to integrate multimodal information but incur high computational costs. In contrast, fully connected layers driven by spike-timing-dependent plasticity (STDP) adopt local learning rules, which restrict their ability to autonomously form efficient sparse topologies for complex multimodal tasks. To address these issues, we propose a novel end-to-end framework, the Multimodal Spiking Neural Network (MSNN), featuring a fusion module grounded in the Generalized Distributive Law (GDL). This principled mechanism provides an efficient and interpretable means of integrating heterogeneous biomedical and sensory signals. The MSNN further incorporates structure-adaptive leaky integrate-and-fire (SALIF) neurons, enabling dynamic optimization of sparse connectivity to enhance fusion efficiency. The proposed MSNN is validated on a range of datasets, demonstrating strong versatility: it achieves binary classification accuracies of 92.29% (valence) and 91.08% (arousal) on the DEAP dataset for affective state decoding and 99.77% on the WESAD dataset for stress detection, while delivering state-of-the-art performance on standard pattern recognition tasks (MNIST & TIDIGITS: 99.01%) and event-driven neuromorphic datasets (MNIST-DVS & N-TIDIGITS: 99.98%). These results demonstrate that the MSNN offers an effective and energy-efficient solution for multimodal sensor fusion in biomedical and intelligent sensing applications.
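For orientation, a plain (non-adaptive) leaky integrate-and-fire update is sketched below; the structure-adaptive part of the SALIF neuron is not specified in the abstract, so only the standard LIF core is shown, with assumed parameter values:

```python
import numpy as np

def lif_step(v, i_in, tau=20.0, v_th=1.0, v_reset=0.0, dt=1.0):
    """One Euler step of a leaky integrate-and-fire neuron.
    Returns (new_membrane_potential, spike_flag)."""
    v = v + (dt / tau) * (-v + i_in)   # leak toward rest, driven by input
    spike = v >= v_th
    if spike:
        v = v_reset                    # hard reset after firing
    return v, spike

# Constant supra-threshold drive eventually makes the neuron fire.
v, spikes = 0.0, []
for _ in range(100):
    v, s = lif_step(v, i_in=1.5)
    spikes.append(s)
print(any(spikes))  # True
```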
Multimodal Spiking Neural Network With Generalized Distributive Law for Biosignal and Sensory Fusion. Zenan Huang, Bingrui Guo, Hailing Xu, Haojie Ruan, Donghui Guo. Pub Date: 2026-01-12 | DOI: 10.1109/TBME.2026.3653109
Pub Date: 2026-01-12 | DOI: 10.1109/TBME.2026.3652428
S M Mahim, Md Emamul Hossen, Manojit Pramanik
Ultrasound (US)-guided needle tracking is a critical procedure for various clinical diagnoses and treatment planning, highlighting the need for improved visualization methods to enhance accuracy. While deep learning (DL) techniques have been employed to boost needle visibility in US images, they often rely heavily on manual annotations or simulated datasets, which can introduce biases and limit real-world applicability. Photoacoustic (PA) imaging, known for its high-contrast capabilities, offers a promising solution by providing superior needle visualization compared to conventional US images. In this work, we present FocFormer-UNet, a DL network that leverages PA images of the needle as ground truth for training, eliminating the need for manual annotations. This approach significantly improves needle localization accuracy in US images, reducing the reliance on time-consuming manual labeling. FocFormer-UNet achieves excellent needle localization accuracy, demonstrated by a modified Hausdorff distance of 1.43 ± 1.23 and a targeting error of 1.22 ± 1.14 on a human clinical dataset, indicating minimal deviation from actual needle positions. Our method offers robust needle tracking across diverse US systems, improving the precision and reliability of US-guided needle insertion procedures. It holds great promise for advancing AI-driven clinical support tools in medical imaging. Source code: https://github.com/DeeplearningBILAB/FocFormer-UNet. Datasets and checkpoints are available on the Open Science Framework (OSF): https://osf.io/yxt9v/.
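The modified Hausdorff distance used as the localization metric above is the larger of the two directed mean nearest-neighbour distances between point sets; a sketch under the assumption that needle annotations are 2D point sets:

```python
import numpy as np

def modified_hausdorff(a: np.ndarray, b: np.ndarray) -> float:
    """Modified Hausdorff distance between point sets of shape (N, 2) and (M, 2):
    max of the two directed mean nearest-neighbour distances."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)  # (N, M) pairwise
    return max(d.min(axis=1).mean(), d.min(axis=0).mean())

# Two needle centrelines offset vertically by 1 pixel.
needle = np.array([[i, 0.0] for i in range(5)])
shifted = needle + np.array([0.0, 1.0])
print(modified_hausdorff(needle, shifted))  # 1.0
```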
FocFormer-UNet: UNet With Focal Modulation and Transformers for Ultrasound Needle Tracking Using Photoacoustic Ground Truth.
Pub Date: 2026-01-12 | DOI: 10.1109/TBME.2026.3651411
Yuanming Hu, Boyuan Li, Shuang Xu, Christina R Inscoe, Donald A Tyndall, Yueh Z Lee, Jianping Lu, Otto Zhou
Objective: To design a dual-energy cone beam computed tomography (DE-CBCT) scanner with reduced scatter and cone beam artifacts.
Methods: The scanner, designed for maxillofacial imaging, comprises a carbon nanotube (CNT) X-ray source array with multiple focal spots ("sources") and an energy-integrating flat panel detector (FPD). The X-ray photons from each focal spot were narrowly collimated in the axial direction and were filtered by interlaced low- and high-energy spectral filters. Two sets of projection images were acquired by sequentially activating the X-ray beams from each source in one gantry rotation. The projections were processed using a one-step inversion algorithm. An anthropomorphic head phantom, a Defrise phantom, and a water-equivalent phantom containing calcium and iodine inserts were used to compare the performance of the new dual-energy multisource CBCT (DE-MS-CBCT) with a conventional DE-CBCT using the same air kerma.
Results: The DE-MS-CBCT eliminated the cone beam artifacts, reduced the degree of cupping artifacts from 14.53% to 2.94%, and lowered the mean relative error of water density from 15.3% to 1.7%, while the accuracies for iodine and calcium densities were comparable. The contrast-to-noise ratios (CNR) of the calcium and iodine inserts against the solid water increased by 4.8%-53.4%.
Conclusion: The DE-MS-CBCT reduces scatter and cone beam artifacts, increases the image CNR, and enhances the accuracy of material quantification without increasing X-ray exposure compared to the conventional DE-CBCT.
Significance: The results demonstrate a new DE-CBCT method with improved image quality and accuracy of material quantification without the need for an energy-sensitive detector or kV switching.
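The CNR metric quoted in the Results can be reproduced from insert and background ROIs. Several CNR definitions are in use, so the pooled-standard-deviation form below is an assumption rather than the paper's exact formula:

```python
import numpy as np

def cnr(insert_roi: np.ndarray, background_roi: np.ndarray) -> float:
    """Contrast-to-noise ratio between an insert ROI and a background ROI:
    absolute mean difference over the pooled standard deviation."""
    mu_i, mu_b = insert_roi.mean(), background_roi.mean()
    var_i, var_b = insert_roi.var(), background_roi.var()
    return abs(mu_i - mu_b) / np.sqrt(var_i + var_b)

# Synthetic ROIs: 20-unit contrast with 5-unit noise in each region.
rng = np.random.default_rng(0)
insert = rng.normal(100.0, 5.0, 10_000)
background = rng.normal(80.0, 5.0, 10_000)
print(round(cnr(insert, background), 1))  # ≈ 2.8
```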
A Dual-Energy CBCT With Reduced Scatter and Cone Beam Artifacts Using an X-Ray Source Array and Interlaced Spectral Filters.
Objective: Cardiotocography (CTG) is commonly used to monitor fetal heart rate (FHR) and assess fetal well-being during labor. However, its effectiveness in reducing adverse outcomes remains limited due to low sensitivity and high false-positive rates. This study aims to develop an interpretable deep learning model that fuses FHR time series with tabular clinical features to improve prediction of fetal compromise (umbilical artery pH < 7.05).
Methods: We introduce Fusion ResNet, a novel architecture combining residual convolutional networks for FHR signal processing with a parallel neural network for tabular features. The model was trained and internally validated on a private dataset of 9,887 FHR recordings. External validation was performed on the open-access CTU-UHB dataset comprising 552 recordings. Model interpretability was evaluated using Shapley Additive Explanations (SHAP) and Gradient-Weighted Class Activation Mapping (Grad-CAM).
Results: Fusion ResNet achieved a mean area under the ROC curve (AUC) of 0.77 during internal cross-validation and a state-of-the-art AUC of 0.84 on the CTU-UHB dataset, outperforming existing deep learning approaches. SHAP analysis identified key clinical features contributing to predictions, while Grad-CAM highlighted salient FHR patterns linked to fetal compromise.
Conclusion: The proposed model enhances predictive accuracy while providing clinically meaningful explanations, enabling more transparent and reliable CTG interpretation.
Significance: This work demonstrates the potential of interpretable deep learning to improve fetal monitoring by integrating multimodal data, supporting timely and informed decision-making in obstetric care.
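The AUC values reported above reduce to a rank statistic: the probability that a randomly chosen compromised case scores higher than a non-compromised one. A minimal sketch (not the study's evaluation code):

```python
import numpy as np

def auc(scores_pos, scores_neg):
    """AUC as the Mann-Whitney U statistic: probability that a random
    positive scores higher than a random negative (ties count one half)."""
    pos = np.asarray(scores_pos, float)[:, None]
    neg = np.asarray(scores_neg, float)[None, :]
    wins = (pos > neg).sum() + 0.5 * (pos == neg).sum()
    return wins / (pos.size * neg.size)

# Perfectly separated scores give AUC = 1.
print(auc([0.9, 0.8], [0.1, 0.2]))  # 1.0
```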
Fusing Tabular Features and Deep Learning for Fetal Heart Rate Analysis: A Clinically Interpretable Model for Fetal Compromise Detection. Lochana Mendis, Debjyoti Karmakar, Marimuthu Palaniswami, Fiona Brownfoot, Emerson Keenan. Pub Date: 2026-01-12 | DOI: 10.1109/TBME.2026.3652309
Pub Date: 2026-01-05 | DOI: 10.1109/TBME.2026.3651219
Guo-Xuan Xu, Chien Chen, Chih-Chung Huang
Accurate muscle anisotropy assessment is crucial for understanding muscle mechanics and diagnosing pathologies. Shear wave (SW) elastography struggles with the varying fiber orientations in pennate muscles. Rotating the ultrasound probe offers a solution but is cumbersome in clinical practice. This study presents a tilted supersonic push (TSP) method with elliptical analytical inversion to overcome this limitation. The TSP method can generate multi-angle (0°-15°) SWs within a single scan plane, creating an elliptical shear wave velocity (SWV) distribution that enables calculation of fiber-aligned and perpendicular SWVs without probe rotation. The TSP method's accuracy was validated through ex vivo experiments on porcine muscles and in vivo studies on human gastrocnemius muscles. Results consistently demonstrated accurate SWV measurements, even in the presence of significant pennate angles. For instance, in ex vivo porcine muscles with a 25° pennate angle, TSP yielded corrected longitudinal and transverse SWVs of 3.33 m/s and 2.15 m/s, respectively, consistent with reference values (without pennate angle) obtained via the traditional rotation method. Similarly, in vivo measurements on human gastrocnemius muscle showed longitudinal and transverse SWVs of 2.55 m/s and 1.21 m/s in a relaxed state, increasing to 4.07 m/s and 1.70 m/s during stretching. These findings highlight the method's ability to capture dynamic changes in muscle stiffness. The TSP method provides a clinically viable and robust approach for comprehensive muscle anisotropy assessment, especially in complex pennate muscles. This technique simplifies the measurement process and offers potential for improved diagnosis and management of musculoskeletal disorders.
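The elliptical SWV distribution described above is commonly modeled as an ellipse in slowness, which can be inverted for the fiber-aligned and perpendicular speeds by least squares; the paper's closed-form analytical inversion may differ, so treat this as an illustrative model only:

```python
import numpy as np

def fit_elliptical_swv(angles_rad, swv_m_per_s):
    """Least-squares fit of the elliptical SWV model
        1/v(theta)^2 = cos^2(theta)/v_L^2 + sin^2(theta)/v_T^2
    returning (v_longitudinal, v_transverse)."""
    th = np.asarray(angles_rad, float)
    y = 1.0 / np.asarray(swv_m_per_s, float) ** 2
    A = np.stack([np.cos(th) ** 2, np.sin(th) ** 2], axis=1)
    (a, b), *_ = np.linalg.lstsq(A, y, rcond=None)
    return 1.0 / np.sqrt(a), 1.0 / np.sqrt(b)

# Synthetic muscle: v_L = 3.3 m/s along fibers, v_T = 2.1 m/s across,
# sampled over the 0°-15° push-angle range reported above.
th = np.deg2rad(np.linspace(0, 15, 6))
v = 1.0 / np.sqrt(np.cos(th) ** 2 / 3.3 ** 2 + np.sin(th) ** 2 / 2.1 ** 2)
vL, vT = fit_elliptical_swv(th, v)
print(round(vL, 2), round(vT, 2))  # 3.3 2.1
```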
{"title":"Shear Wave Anisotropic Imaging for Pennate Muscle Assessment Using a Tilted Supersonic Push with Elliptical Analytical Inversion.","authors":"Guo-Xuan Xu, Chien Chen, Chih-Chung Huang","doi":"10.1109/TBME.2026.3651219","DOIUrl":"https://doi.org/10.1109/TBME.2026.3651219","url":null,"abstract":"<p><p>Accurate muscle anisotropy assessment is crucial for understanding muscle mechanics and diagnosing pathologies. Shear wave (SW) elastography struggles with the varying fiber orientations in pennate muscles. Rotating the ultrasound probe offers a solution but is cumbersome in clinical practice. This study presents a tilted supersonic push (TSP) method with elliptical analytical inversion to overcome this limitation. The TSP method can generate multi-angle (0°-15°) SWs within a single scan plane, creating an elliptical shear wave velocity (SWV) distribution that enables calculation of fiber-aligned and perpendicular SWVs without probe rotation. The TSP method's accuracy was validated through ex vivo experiments on porcine muscles and in vivo studies on human gastrocnemius muscles. Results consistently demonstrated accurate SWV measurements, even in the presence of significant pennate angles. For instance, in ex vivo porcine muscles with a 25° pennate angle, the TSP method yielded corrected longitudinal and transverse SWVs of 3.33 m/s and 2.15 m/s, respectively, consistent with pennate-angle-free reference values obtained via the traditional probe-rotation method. Similarly, in vivo measurements on human gastrocnemius muscle showed longitudinal and transverse SWVs of 2.55 m/s and 1.21 m/s in a relaxed state, increasing to 4.07 m/s and 1.70 m/s during stretching. These findings highlight the method's ability to capture dynamic changes in muscle stiffness. The TSP method provides a clinically viable and robust approach for comprehensive muscle anisotropy assessment, especially in complex pennate muscles. 
This technique simplifies the measurement process and offers potential for improved diagnosis and management of musculoskeletal disorders.</p>","PeriodicalId":13245,"journal":{"name":"IEEE Transactions on Biomedical Engineering","volume":"PP ","pages":""},"PeriodicalIF":4.5,"publicationDate":"2026-01-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145905804","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
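The elliptical analytical inversion described above amounts to fitting an ellipse to shear wave velocities measured at several propagation angles. A minimal sketch, assuming the standard polar-ellipse SWV model 1/v(θ)² = cos²θ/c_L² + sin²θ/c_T², where θ is the propagation angle relative to the fiber axis, c_L the fiber-aligned SWV, and c_T the perpendicular SWV; the function name and the linear least-squares formulation are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def fit_ellipse_swv(angles_deg, swv):
    """Least-squares fit of an assumed elliptical SWV model.

    Model: 1/v(theta)^2 = cos^2(theta)/c_L^2 + sin^2(theta)/c_T^2,
    which is linear in u = 1/c_L^2 and w = 1/c_T^2, so a single
    linear solve recovers both semi-axes. Returns (c_L, c_T).
    """
    th = np.deg2rad(np.asarray(angles_deg, dtype=float))
    y = 1.0 / np.asarray(swv, dtype=float) ** 2        # observed 1/v^2
    A = np.column_stack([np.cos(th) ** 2, np.sin(th) ** 2])
    (u, w), *_ = np.linalg.lstsq(A, y, rcond=None)     # u = 1/c_L^2, w = 1/c_T^2
    return 1.0 / np.sqrt(u), 1.0 / np.sqrt(w)
```

Because the model is linear in the inverse-squared semi-axes, one `lstsq` call suffices; with a narrow angle span such as the 0°-15° tilts mentioned above, the transverse-axis estimate is naturally more noise-sensitive than with full probe rotation, which is why multi-angle coverage within the scan plane matters.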
Pub Date : 2026-01-01DOI: 10.1109/TBME.2025.3579491
Yang Li, Wei Liu, Tianzhi Feng, Fu Li, Chennan Wu, Boxun Fu, Zhifu Zhao, Xiaotian Wang, Guangming Shi
As a type of multi-dimensional sequential data, electroencephalogram (EEG) signals exhibit spatial and temporal dependencies that warrant further investigation. Thus, in this paper, we propose a novel spatial-temporal progressive attention model (STPAM) to improve EEG classification in rapid serial visual presentation (RSVP) tasks. STPAM employs a progressive approach using three sequential spatial experts to learn brain region topology and mitigate interference from irrelevant areas. Each expert refines EEG electrode selection, guiding subsequent experts to focus on significant spatial information, thus enhancing signals from key regions. Subsequently, based on the above spatially-enhanced features, three temporal experts progressively capture temporal dependencies by focusing attention on crucial EEG time slices. In addition to the above EEG classification method, we build a novel Infrared RSVP Dataset (IRED), the first based on dim infrared images with small targets, and conduct extensive experiments on it. Experimental results demonstrate that STPAM outperforms all baselines by 2.02% and 1.17% on the public dataset and the IRED dataset, respectively.
{"title":"Spatio-Temporal Progressive Attention Model for EEG Classification in Rapid Serial Visual Presentation Task.","authors":"Yang Li, Wei Liu, Tianzhi Feng, Fu Li, Chennan Wu, Boxun Fu, Zhifu Zhao, Xiaotian Wang, Guangming Shi","doi":"10.1109/TBME.2025.3579491","DOIUrl":"10.1109/TBME.2025.3579491","url":null,"abstract":"<p><p>As a type of multi-dimensional sequential data, electroencephalogram (EEG) signals exhibit spatial and temporal dependencies that warrant further investigation. Thus, in this paper, we propose a novel spatial-temporal progressive attention model (STPAM) to improve EEG classification in rapid serial visual presentation (RSVP) tasks. STPAM employs a progressive approach using three sequential spatial experts to learn brain region topology and mitigate interference from irrelevant areas. Each expert refines EEG electrode selection, guiding subsequent experts to focus on significant spatial information, thus enhancing signals from key regions. Subsequently, based on the above spatially-enhanced features, three temporal experts progressively capture temporal dependencies by focusing attention on crucial EEG time slices. In addition to the above EEG classification method, we build a novel Infrared RSVP Dataset (IRED), the first based on dim infrared images with small targets, and conduct extensive experiments on it. 
Experimental results demonstrate that STPAM outperforms all baselines by 2.02% and 1.17% on the public dataset and the IRED dataset, respectively.</p>","PeriodicalId":13245,"journal":{"name":"IEEE Transactions on Biomedical Engineering","volume":"PP ","pages":"191-207"},"PeriodicalIF":4.5,"publicationDate":"2026-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144283749","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
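The progressive selection performed by the spatial experts (each expert scores electrodes and passes only the most informative subset to the next) can be illustrated with a toy sketch. Here a random linear probe stands in for a learned attention head, and the function name, keep fraction, and scoring rule are all hypothetical; this is not the STPAM architecture, only the narrowing mechanism it describes:

```python
import numpy as np

def progressive_electrode_selection(x, n_experts=3, keep_frac=0.5, seed=0):
    """Toy progressive channel selection on one EEG trial.

    x: array of shape (channels, time). Each 'expert' scores the currently
    active channels (here with a random linear probe as a stand-in for a
    learned attention head), keeps the top fraction, and hands the reduced
    channel set to the next expert. Returns the surviving channel indices.
    """
    rng = np.random.default_rng(seed)
    active = np.arange(x.shape[0])
    for _ in range(n_experts):
        w = rng.standard_normal(x.shape[1])            # stand-in scoring weights
        scores = np.abs(x[active] @ w)                 # attention-like channel scores
        k = max(1, int(len(active) * keep_frac))
        active = active[np.argsort(scores)[::-1][:k]]  # keep the top-k channels
    return active
```

With 64 channels, three experts, and a keep fraction of 0.5, the active set shrinks 64 → 32 → 16 → 8, mirroring how later experts attend only to spatial information the earlier experts deemed significant.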
Pub Date : 2026-01-01DOI: 10.1109/TBME.2025.3576841
Jingke Song, Jianjun Zhang, Jun Wei, Chenglei Liu, Xiankun Zhao, Cunjin Ai
To address the mismatch between current ankle rehabilitation robots and natural human motion, which affects rehabilitation efficacy, this paper uses screw theory and motion capture experiments to identify the instantaneous finite helical motion axis (IFHA) of the human ankle joint. It determines the distribution law of the IFHA and twist pitch (TP) of the ankle, and designs a human-machine motion-compatible rope-driven ankle rehabilitation robot that meets human ankle rehabilitation needs. Firstly, human ankle motion trajectories are captured using the VICON system and an IMU, and the experimental data are processed according to screw theory to obtain the distribution law of the IFHA and the range of the TP. Secondly, the ankle joint's motion characteristics from the experiment inform the constraint characteristics of the rehabilitation mechanism, which are then mapped into a novel parallel rope-driven ankle rehabilitation robot to meet rehabilitation needs. Thirdly, the kinematic model of the novel mechanism is established, and its kinematic performance and singular configurations are analyzed based on the motion/force transmission index, guiding the optimization of the driving-rope layout and mechanism scale parameters. Finally, an experimental platform is built to validate the human-machine motion compatibility, safety, comfort, and effectiveness of the rehabilitation robot.
{"title":"Development and Kinematics Optimization of a Human-Compatible Rope-Driven Ankle Rehabilitation Robot Based on Foot-Ankle IFHA Identification.","authors":"Jingke Song, Jianjun Zhang, Jun Wei, Chenglei Liu, Xiankun Zhao, Cunjin Ai","doi":"10.1109/TBME.2025.3576841","DOIUrl":"https://doi.org/10.1109/TBME.2025.3576841","url":null,"abstract":"<p><p>To address the mismatch between current ankle rehabilitation robots and natural human motion, which affects rehabilitation efficacy, this paper uses screw theory and motion capture experiments to identify the instantaneous finite helical motion axis (IFHA) of the human ankle joint. It determines the distribution law of the IFHA and twist pitch (TP) of the ankle, and designs a human-machine motion compatible rope-driven ankle joint rehabilitation robot that meets the needs of human ankle joint rehabilitation. Firstly, human ankle motion trajectories are captured using the VICON system and IMU, and the experimental data are processed according to screw theory to obtain the distribution law of the IFHA and the range of TP. Secondly, the ankle joint's motion characteristics from the experiment inform the constraint characteristics of the rehabilitation mechanism, which are then mapped into a novel parallel rope-driven ankle rehabilitation robot to meet rehabilitation needs. Thirdly, the kinematic model of the novel mechanism is established, and its kinematic performance and singular configurations are analyzed based on the motion/force transmission index, guiding the optimization of the driving rope layout and mechanism scale parameters. 
Finally, an experimental platform is built to validate the human-machine motion compatibility, safety, comfort, and effectiveness of the rehabilitation robot.</p>","PeriodicalId":13245,"journal":{"name":"IEEE Transactions on Biomedical Engineering","volume":"73 1","pages":"40-53"},"PeriodicalIF":4.5,"publicationDate":"2026-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145855793","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
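The finite helical axis identification above rests on standard screw-theory formulas: given a rigid transform (R, t) between two ankle poses, the rotation angle, axis direction, axis location, and pitch all follow in closed form. A minimal sketch of that textbook computation (the function name and the least-squares axis-point step are illustrative; the paper's VICON/IMU processing pipeline is not reproduced here):

```python
import numpy as np

def helical_axis(R, t):
    """Extract the finite helical axis of a rigid transform (R, t).

    Returns (n, q, phi, pitch): unit axis direction n, a point q on the
    axis, rotation angle phi in radians, and pitch (axial translation per
    radian). Assumes 0 < phi < pi so the axis is well defined.
    """
    phi = np.arccos(np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0))
    # axis direction from the skew-symmetric part of R
    n = np.array([R[2, 1] - R[1, 2], R[0, 2] - R[2, 0], R[1, 0] - R[0, 1]])
    n = n / (2.0 * np.sin(phi))
    d = float(n @ t)                       # translation component along the axis
    # point on the axis: solve (I - R) q = t - d*n; (I - R) is singular along
    # n, so take the minimum-norm least-squares solution
    q, *_ = np.linalg.lstsq(np.eye(3) - R, t - d * n, rcond=None)
    return n, q, phi, d / phi
```

The pitch returned here corresponds roughly to the twist pitch (TP) the paper analyzes: the axial translation per radian of rotation about the IFHA, so a pure hinge-like ankle motion would give a pitch near zero.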