Attention-aware network with lightness embedding and Hybrid Guided Embedding for laparoscopic image desmoking
Pub Date: 2026-01-01 | DOI: 10.1016/j.compmedimag.2025.102691
Ziteng Liu, Chenghong Zhang, Dongdong He, Chenyang Yang, Hao Liu, Wenpeng Gao, Yili Fu
Surgical smoke removal is crucial for enhancing laparoscopic image quality in computer-assisted surgery. While existing methods utilize estimated smoke distribution to address non-homogeneous characteristics, most treat this information merely as prior input and often suffer from over-desmoking artifacts. To address these limitations, this study introduces a desmoking network that reconstructs smoke-free images by explicitly utilizing smoke distribution information. The network comprises two key modules: the Smoke Attention Estimator (SAE) and the Hybrid Guided Embedding (HGE). The SAE generates a smoke attention map via a channel-aware position embedding with a lightness prior to improve accuracy. The HGE takes the predicted smoke attention map from the SAE as input and employs convolutional layers along with a novel field transformation method to generate residual terms. By combining these residual terms with the original image, the HGE preserves fine details in smoke-free regions, thereby preventing over-desmoking. Experimental results show that the proposed method achieves improvements of at least 3.71% in Peak Signal-to-Noise Ratio (PSNR) and 18.75% in Learned Perceptual Image Patch Similarity over state-of-the-art methods on the synthetic dataset, while attaining the lowest Perception-based Image Quality Evaluator score (24.55) on the Cholec80 dataset. The method runs at around 174 frames per second, indicating strong real-time processing capability, and achieves over 40 dB PSNR in smoke-free regions, excelling in both color restoration and detail preservation. This work is available at https://homepage.hit.edu.cn/wpgao?lang=en.
{"title":"Attention-aware network with lightness embedding and Hybrid Guided Embedding for laparoscopic image desmoking","authors":"Ziteng Liu , Chenghong Zhang , Dongdong He , Chenyang Yang , Hao Liu , Wenpeng Gao , Yili Fu","doi":"10.1016/j.compmedimag.2025.102691","DOIUrl":"10.1016/j.compmedimag.2025.102691","url":null,"abstract":"<div><div>Surgical smoke removal is crucial for enhancing laparoscopic image quality in computer-assisted surgery. While existing methods utilize estimated smoke distribution to address non-homogeneous characteristics, most treat this information merely as prior input and often suffer from over-desmoking artifacts. To address these limitations, this study introduces a desmoking network that reconstructs smoke-free images by explicitly utilizing smoke distribution information. The network comprises two key modules: the Smoke Attention Estimator (SAE) and the Hybrid Guided Embedding (HGE). The SAE generates a smoke attention map via a channel-aware position embedding with lightness prior to improve accuracy. The HGE takes the predicted smoke attention map from the SAE as input and employs convolutional layers along with a novel field transformation method to generate residual terms. By combining these residual terms with the original image, the HGE preserves fine details in smoke-free regions, thereby preventing over-desmoking. Experimental results reveal that the proposed method achieves improvements of at least 3.71% in Peak Signal-to-Noise Ratio (PSNR) and 18.75% in Learned Perceptual Image Patch Similarity compared to state-of-the-art methods on the synthetic dataset, while attaining the lowest Perception-based Image Quality Evaluator score (24.55) on the Cholec80 dataset. It operates at around 174 frames per second, indicating strong real-time processing capability. The network achieves over 40 dB in PSNR for smoke-free regions, excelling in both color restoration and detail preservation. This work is available at <span><span>https://homepage.hit.edu.cn/wpgao?lang=en</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":50631,"journal":{"name":"Computerized Medical Imaging and Graphics","volume":"127 ","pages":"Article 102691"},"PeriodicalIF":4.9,"publicationDate":"2026-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145866672","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Defect-adaptive landmark detection in pelvis CT images via personalized structure-aware learning
Pub Date: 2026-01-01 | DOI: 10.1016/j.compmedimag.2025.102693
Xirui Zhao, Deqiang Xiao, Teng Zhang, Jingfan Fan, Danni Ai, Tianyu Fu, Yucong Lin, Long Shao, Hong Song, Junqiang Wang, Jian Yang
Accurate localization of anatomical landmarks from pelvic CT images is crucial for preoperative planning in orthopedic procedures. However, existing automatic methods often underperform when facing defective bone structures, which are common in clinical scenarios involving trauma, resection, or severe degeneration. To address this challenge, we propose DADNet, a defect-adaptive detection network that incorporates personalized structural priors to achieve accurate and robust landmark detection in defective pelvis CT images. DADNet first constructs a structure-aware soft prior map that encodes the spatial distribution of landmarks based on the individual bone anatomy. This prior map, which highlights landmark-related regions, is generated via a dedicated convolutional module followed by logarithmic transformation. Guided by this soft prior, we extract local patches around the candidate regions and perform landmark regression using a patch-based context-aware detection network. To further enhance detection robustness in defective regions, we introduce a bone-aware detection loss that modulates the prediction confidence based on bone structures. The modulation weight is dynamically adjusted during training via a sigmoid scheduler, enabling progressive adaptation from coarse to fine structural constraints. We evaluate DADNet on both public and private datasets featuring varying degrees of pelvic defects. Our approach achieves an average detection error of 1.252 ± 0.075 mm on severely defective cases, significantly outperforming existing methods. The proposed framework demonstrates strong adaptability to anatomical variability and structural incompleteness, offering a promising tool for accurate and robust landmark detection in challenging clinical cases.
{"title":"Defect-adaptive landmark detection in pelvis CT images via personalized structure-aware learning","authors":"Xirui Zhao , Deqiang Xiao , Teng Zhang , Jingfan Fan , Danni Ai , Tianyu Fu , Yucong Lin , Long Shao , Hong Song , Junqiang Wang , Jian Yang","doi":"10.1016/j.compmedimag.2025.102693","DOIUrl":"10.1016/j.compmedimag.2025.102693","url":null,"abstract":"<div><div>Accurate localization of anatomical landmarks from pelvic CT images is crucial for preoperative planning in orthopedic procedures. However, existing automatic methods often underperform when facing defective bone structures, which are common in clinical scenarios involving trauma, resection, or severe degeneration. To address this challenge, we propose DADNet, a defect-adaptive detection network that incorporates personalized structural priors to achieve accurate and robust landmark detection in defective pelvis CT images. DADNet first constructs a structure-aware soft prior map that encodes the spatial distribution of landmarks based on the individual bone anatomy. This prior map, which highlights landmark-related regions, is generated via a dedicated convolutional module followed by logarithmic transformation. Guided by this soft prior, we extract local patches around the candidate regions and performs landmark regression using a patch-based context-aware detection network. To further enhance detection robustness in defective regions, we introduce a bone-aware detection loss that modulates the prediction confidence based on bone structures. The modulation weight is dynamically adjusted during training via a sigmoid scheduler, enabling progressive adaptation from coarse to fine structural constraints. We evaluate DADNet on both public and private datasets featuring varying degrees of pelvic defects. Our approach achieves an average detection error of 1.252 ± 0.075 mm on severely defective cases, significantly outperforming existing methods. The proposed framework demonstrates strong adaptability to anatomical variability and structural incompleteness, offering a promising tool for accurate and robust landmark detection in challenging clinical cases.</div></div>","PeriodicalId":50631,"journal":{"name":"Computerized Medical Imaging and Graphics","volume":"127 ","pages":"Article 102693"},"PeriodicalIF":4.9,"publicationDate":"2026-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145884165","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Spectral-X: Latent prior enhanced spectral CT restoration with mamba-assisted X-net
Pub Date: 2026-01-01 | DOI: 10.1016/j.compmedimag.2025.102696
Yikun Zhang, Jiashun Wang, Xi Wang, Xu Ji, Kai Chen, Jian Yang, Yinsheng Li, Yang Chen
Compared with conventional computed tomography (CT), spectral CT can simultaneously visualize internal structures and characterize the material composition of scanned objects by acquiring data at different energy spectra. Photon-counting CT (PCCT) and multi-source CT (MSCT) are two promising implementations of spectral CT. Meanwhile, radiation exposure remains a long-standing concern in CT imaging, as excessive X-ray exposure can lead to genetic and cellular damage. For PCCT and MSCT, the radiation dose can be reduced by lowering the tube current and adopting complementary limited-view scanning, respectively. To mitigate the noise and artifacts induced by low-dose acquisition protocols, this paper proposes a Mamba-assisted X-Net leveraging latent priors for spectral CT, termed Spectral-X. First, considering the intrinsic characteristics of spectral CT, Spectral-X exploits the latent representation of the enhanced full-spectrum prior image to facilitate the restoration of multi-energy CT (MECT). Second, Spectral-X employs an X-shaped network with feature fusion blocks to adaptively capture and leverage multi-scale prior information in the latent space. Third, Spectral-X integrates a novel all-around Mamba mechanism that can efficiently model long-range dependencies, thereby enhancing the performance of the image restoration backbone network. Spectral-X is evaluated on both PCCT denoising and limited-view MSCT restoration tasks, and the experimental results demonstrate that Spectral-X achieves state-of-the-art performance in noise suppression, artifact removal, and structural restoration.
{"title":"Spectral-X: Latent prior enhanced spectral CT restoration with mamba-assisted X-net","authors":"Yikun Zhang , Jiashun Wang , Xi Wang , Xu Ji , Kai Chen , Jian Yang , Yinsheng Li , Yang Chen","doi":"10.1016/j.compmedimag.2025.102696","DOIUrl":"10.1016/j.compmedimag.2025.102696","url":null,"abstract":"<div><div>Compared with conventional computed tomography (CT), spectral CT can simultaneously visualize internal structures and characterize the material composition of scanned objects by acquiring data at different energy spectra. Photon-counting CT (PCCT) and multi-source CT (MSCT) are two promising implementations of spectral CT. Besides, radiation exposure remains a long-standing concern in CT imaging, as excessive X-ray exposure may lead to genetic and cellular damage. For PCCT and MSCT, the radiation dose can be reduced by lowering the tube current and adopting complementary limited-view scanning, respectively. To mitigate the noise and artifacts induced by low-dose acquisition protocols, this paper proposes a Mamba-assisted X-Net leveraging latent priors for spectral CT, termed Spectral-X. First, considering the intrinsic characteristics of spectral CT, Spectral-X exploits the latent representation of the enhanced full-spectrum prior image to facilitate the restoration of multi-energy CT (MECT). Second, Spectral-X employs an X-shaped network with feature fusion blocks to adaptively capture and leverage multi-scale prior information in the latent space. Third, Spectral-X integrates a novel all-around Mamba mechanism that can efficiently model long-range dependencies, thereby enhancing the performance of the image restoration backbone network. Spectral-X is evaluated on both PCCT denoising and limited-view MSCT restoration tasks, and the experimental results demonstrate that Spectral-X achieves state-of-the-art performance in noise suppression, artifact removal, and structural restoration.</div></div>","PeriodicalId":50631,"journal":{"name":"Computerized Medical Imaging and Graphics","volume":"127 ","pages":"Article 102696"},"PeriodicalIF":4.9,"publicationDate":"2026-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145884164","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Temporally-aware diffusion model for brain progression modelling with bidirectional temporal regularisation
Pub Date: 2026-01-01 | DOI: 10.1016/j.compmedimag.2025.102688
Mattia Litrico, Francesco Guarnera, Mario Valerio Giuffrida, Daniele Ravì, Sebastiano Battiato
Generating realistic MRIs that accurately predict future changes in brain structure is an invaluable tool for clinicians in assessing clinical outcomes and analysing disease progression at the patient level. However, existing methods present some limitations: (i) some approaches fail to explicitly capture the relationship between structural changes and time intervals, especially when trained on age-imbalanced datasets; (ii) others rely only on scan interpolation, which has limited clinical utility, as these approaches generate intermediate images between timepoints rather than future pathological progression; and (iii) most approaches rely on 2D slice-based architectures, thereby disregarding full 3D anatomical context, which is essential for accurate longitudinal predictions. We propose a 3D Temporally-Aware Diffusion Model (TADM-3D), which accurately predicts brain progression on MRI volumes. To better model the relationship between time interval and brain changes, TADM-3D uses a pre-trained Brain-Age Estimator (BAE) that guides the diffusion model to generate MRIs that accurately reflect the expected age difference between baseline and generated follow-up scans. Additionally, to further improve the temporal awareness of TADM-3D, we propose the Back-In-Time Regularisation (BITR), training TADM-3D to predict bidirectionally, from baseline to follow-up (forward) as well as from follow-up to baseline (backward). Although predicting past scans has limited clinical application, this regularisation helps the model generate temporally more accurate scans. We train and evaluate TADM-3D on the OASIS-3 dataset and validate generalisation performance on an external test set from the NACC dataset. The code is available at https://github.com/MattiaLitrico/TADM-3D.
{"title":"Temporally-aware diffusion model for brain progression modelling with bidirectional temporal regularisation","authors":"Mattia Litrico , Francesco Guarnera , Mario Valerio Giuffrida , Daniele Ravì , Sebastiano Battiato","doi":"10.1016/j.compmedimag.2025.102688","DOIUrl":"10.1016/j.compmedimag.2025.102688","url":null,"abstract":"<div><div>Generating realistic MRIs to accurately predict future changes in the structure of brain is an invaluable tool for clinicians in assessing clinical outcomes and analysing the disease progression at the patient level. However, current existing methods present some limitations: (i) some approaches fail to explicitly capture the relationship between structural changes and time intervals, especially when trained on age-imbalanced datasets; (ii) others rely only on scan interpolation, which lack clinical utility, as they generate intermediate images between timepoints rather than future pathological progression; and (iii) most approaches rely on 2D slice-based architectures, thereby disregarding full 3D anatomical context, which is essential for accurate longitudinal predictions. We propose a 3D Temporally-Aware Diffusion Model (TADM-3D), which accurately predicts brain progression on MRI volumes. To better model the relationship between time interval and brain changes, TADM-3D uses a pre-trained <em>Brain-Age Estimator</em> (BAE) that guides the diffusion model in the generation of MRIs that accurately reflect the expected age difference between baseline and generated follow-up scans. Additionally, to further improve the temporal awareness of TADM-3D, we propose the <em>Back-In-Time Regularisation</em> (BITR), by training TADM-3D to predict bidirectionally from the baseline to follow-up (forward), as well as from the follow-up to baseline (backward). Although predicting past scans has limited clinical applications, this regularisation helps the model generate temporally more accurate scans. We train and evaluate TADM-3D on the OASIS-3 dataset, and we validate the generalisation performance on an external test set from the NACC dataset. The code is available at <span><span>https://github.com/MattiaLitrico/TADM-3D</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":50631,"journal":{"name":"Computerized Medical Imaging and Graphics","volume":"127 ","pages":"Article 102688"},"PeriodicalIF":4.9,"publicationDate":"2026-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145866649","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Wholistic report generation for Breast ultrasound using LangChain
Pub Date: 2026-01-01 | DOI: 10.1016/j.compmedimag.2025.102697
Jaeyoung Huh, Hye Shin Ahn, Hyun Jeong Park, Jong Chul Ye
Breast ultrasound (BUS) is a vital imaging technique for detecting and characterizing breast abnormalities. Generating comprehensive BUS reports typically requires integrating multiple image views and patient information, which can be time-consuming for clinicians. This study explores the feasibility of a modular, AI-assisted framework to support BUS report generation, focusing on system integration. We developed a suite of classification networks for image analysis, coordinated via LangChain with Large Language Models (LLMs), to generate structured and clinically meaningful reports. A Retrieval-Augmented Generation (RAG) component allows the framework to incorporate prior patient information, enabling context-aware and personalized report generation. The system demonstrates the practical integration of existing image-analysis models and language-generation tools within a clinical workflow. Experimental evaluations show that the integrated framework produces consistent and clinically interpretable reports, which align well with radiologists' assessments. These results suggest that the proposed approach provides a feasible, modular, and extensible solution for semi-automated BUS report generation, offering a foundation for further refinement and potential clinical deployment.
{"title":"Wholistic report generation for Breast ultrasound using LangChain","authors":"Jaeyoung Huh , Hye Shin Ahn , Hyun Jeong Park , Jong Chul Ye","doi":"10.1016/j.compmedimag.2025.102697","DOIUrl":"10.1016/j.compmedimag.2025.102697","url":null,"abstract":"<div><div>Breast ultrasound (BUS) is a vital imaging technique for detecting and characterizing breast abnormalities. Generating comprehensive BUS reports typically requires integrating multiple image views and patient information, which can be time-consuming for clinicians. This study explores the feasibility of a modular, AI-assisted framework to support BUS report generation, focusing on system integration. We developed a suite of classification networks for image analysis, coordinated via LangChain with Large Language Models (LLMs), to generate structured and clinically meaningful reports. A Retrieval-Augmented Generation (RAG) component allows the framework to incorporate prior patient information, enabling context-aware and personalized report generation. The system demonstrates the practical integration of existing image-analysis models and language-generation tools within a clinical workflow. Experimental evaluations show that the integrated framework produces consistent and clinically interpretable reports, which align well with radiologists’ assessments. These results suggest that the proposed approach provides a feasible, modular, and extensible solution for semi-automated BUS report generation, offering a foundation for further refinement and potential clinical deployment.</div></div>","PeriodicalId":50631,"journal":{"name":"Computerized Medical Imaging and Graphics","volume":"127 ","pages":"Article 102697"},"PeriodicalIF":4.9,"publicationDate":"2026-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145884163","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Computerized medical imaging and graphics best paper award 2024
Pub Date: 2026-01-01 | Epub Date: 2025-12-04 | DOI: 10.1016/j.compmedimag.2025.102683
Xiaoyin Xu, Stephen TC Wong
{"title":"Computerized medical imaging and graphics best paper award 2024.","authors":"Xiaoyin Xu, Stephen Tc Wong","doi":"10.1016/j.compmedimag.2025.102683","DOIUrl":"https://doi.org/10.1016/j.compmedimag.2025.102683","url":null,"abstract":"","PeriodicalId":50631,"journal":{"name":"Computerized Medical Imaging and Graphics","volume":"127 ","pages":"102683"},"PeriodicalIF":4.9,"publicationDate":"2026-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146020430","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Research on X-ray coronary artery branches instance segmentation and matching task
Pub Date: 2025-12-26 | DOI: 10.1016/j.compmedimag.2025.102681
Xiaodong Zhou, Huibin Wang
In the task of 3D reconstruction of X-ray coronary arteries, matching vessel branches across different viewpoints is challenging. In this study, the task is reformulated as vessel-branch instance segmentation followed by matching branches of the same color, and an instance segmentation network (YOLO-CAVBIS) is proposed specifically for deformed and dynamic vessels. First, since the left and right coronary artery branches are not easy to distinguish, a coronary artery classification dataset is produced and the left and right coronary arteries are classified using the YOLOv8-cls classification model; the classified images are then fed into two parallel YOLO-CAVBIS networks for coronary artery branch instance segmentation. Finally, branches of the same color in different viewpoints are matched. Experimental results show that the coronary artery classification model reaches 100% accuracy, while the proposed left and right coronary branch instance segmentation models reach mAP50 scores of 98.4% and 99.4%, respectively. In extracting features of deformed and dynamic vessels, the proposed YOLO-CAVBIS network demonstrates greater specificity and superiority than other instance segmentation networks, and can serve as a baseline model for the task of coronary artery branch instance segmentation. Code repository: https://gitee.com/zaleman/ca_instance_segmentation, https://github.com/zaleman/ca_instance_segmentation.
{"title":"Research on X-ray coronary artery branches instance segmentation and matching task","authors":"Xiaodong Zhou , Huibin Wang","doi":"10.1016/j.compmedimag.2025.102681","DOIUrl":"10.1016/j.compmedimag.2025.102681","url":null,"abstract":"<div><div>In the task of 3D reconstruction of X-ray coronary artery, matching vessel branches in different viewpoints is a challenging task. In this study, this task is transformed into the process of vessel branches instance segmentation and then matching branches of the same color, and an instance segmentation network (YOLO-CAVBIS) is proposed specifically for deformed and dynamic vessels. Firstly, since the left and right coronary artery branches are not easy to distinguish, a coronary artery classification dataset is produced and the left and right coronary artery arteries are classified using the YOLOv8-cls classification model, and then the classified images are fed into two parallel YOLO-CAVBIS networks for coronary artery branches instance segmentation. Finally, the branches with the same color of branches in different viewpoints are matched. The experimental results show that the accuracy of the coronary artery classification model can reach 100%, and the mAP50 of the proposed left coronary branches instance segmentation model reaches 98.4%, and the mAP50 of the proposed right coronary branches instance segmentation model reaches 99.4%. In terms of extracting deformation and dynamic vascular features, our proposed YOLO-CAVBIS network demonstrates greater specificity and superiority compared to other instance segmentation networks, and can be used as a baseline model for the task of coronary artery branches instance segmentation. Code repository: <span><span>https://gitee.com/zaleman/ca_instance_segmentation</span><svg><path></path></svg></span>, <span><span>https://github.com/zaleman/ca_instance_segmentation</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":50631,"journal":{"name":"Computerized Medical Imaging and Graphics","volume":"128 ","pages":"Article 102681"},"PeriodicalIF":4.9,"publicationDate":"2025-12-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145928056","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Semi-supervised medical image classification via feature-level multi-scale consistency and adversarial training
Pub Date: 2025-12-26 | DOI: 10.1016/j.compmedimag.2025.102695
Li Shiyan, Wang Shuqin, Gu Xin, Sun Debing
In recent years, semi-supervised learning (SSL) has attracted increasing attention in medical image analysis, showing great potential in scenarios with limited annotations. However, existing consistency regularization methods suffer from several limitations: overly uniform constraints at the output layer, lack of interaction within adversarial strategies, and reliance on external sample pools for sample estimation, which together lead to insufficient use of feature-level information and unstable training. To address these challenges, this paper proposes a novel semi-supervised framework, termed Feature-level multi-scale Consistency and Adversarial Training (FCAT). A multi-scale feature-level consistency mechanism is introduced to capture hierarchical structural representations through cross-level feature fusion, enabling robust feature alignment without relying on external sample pools. To overcome the limitation of unidirectional adversarial training, a bidirectional feature perturbation strategy is designed under a teacher–student collaboration scheme, where both models generate perturbations from their own gradients and enforce mutual consistency. In addition, an intrinsic evaluation mechanism based on entropy and complementary confidence is developed to rank unlabeled samples according to their information content, guiding the training process toward informative hard samples while reducing overfitting to trivial ones. Experiments on the balanced Pneumonia Chest X-ray and NCT-CRC-HE histopathology datasets, as well as the imbalanced ISIC 2019 dermoscopic skin lesion dataset, demonstrate that our FCAT achieves competitive performance and strong generalization across diverse imaging modalities and data distributions.
{"title":"Semi-supervised medical image classification via feature-level multi-scale consistency and adversarial training","authors":"Li Shiyan, Wang Shuqin, Gu Xin, Sun Debing","doi":"10.1016/j.compmedimag.2025.102695","DOIUrl":"10.1016/j.compmedimag.2025.102695","url":null,"abstract":"<div><div>In recent years, semi-supervised learning (SSL) has attracted increasing attention in medical image analysis, showing great potential in scenarios with limited annotations. However, existing consistency regularization methods suffer from several limitations: overly uniform constraints at the output layer, lack of interaction within adversarial strategies, and reliance on external sample pools for sample estimation, which together lead to insufficient use of feature-level information and unstable training. To address these challenges, this paper proposes a novel semi-supervised framework, termed Feature-level multi-scale Consistency and Adversarial Training (FCAT). A multi-scale feature-level consistency mechanism is introduced to capture hierarchical structural representations through cross-level feature fusion, enabling robust feature alignment without relying on external sample pools. To overcome the limitation of unidirectional adversarial training, a bidirectional feature perturbation strategy is designed under a teacher–student collaboration scheme, where both models generate perturbations from their own gradients and enforce mutual consistency. In addition, an intrinsic evaluation mechanism based on entropy and complementary confidence is developed to rank unlabeled samples according to their information content, guiding the training process toward informative hard samples while reducing overfitting to trivial ones. Experiments on the balanced Pneumonia Chest X-ray and NCT-CRC-HE histopathology datasets, as well as the imbalanced ISIC 2019 dermoscopic skin lesion dataset, demonstrate that our FCAT achieves competitive performance and strong generalization across diverse imaging modalities and data distributions.</div></div>","PeriodicalId":50631,"journal":{"name":"Computerized Medical Imaging and Graphics","volume":"128 ","pages":"Article 102695"},"PeriodicalIF":4.9,"publicationDate":"2025-12-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145886479","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
UltraBoneUDF: Self-supervised bone surface reconstruction from ultrasound based on neural unsigned distance functions
Pub Date: 2025-12-19 | DOI: 10.1016/j.compmedimag.2025.102690
Luohong Wu, Matthias Seibold, Nicola A. Cavalcanti, Giuseppe Loggia, Lisa Reissner, Bastian Sigrist, Jonas Hein, Lilian Calvet, Arnd Viehöfer, Philipp Fürnstahl
Background:
Bone surface reconstruction is an essential component of computer-assisted orthopedic surgery (CAOS), forming the foundation for both preoperative planning and intraoperative guidance. Compared to traditional imaging modalities such as computed tomography (CT) and magnetic resonance imaging (MRI), ultrasound, an emerging CAOS technology, provides a radiation-free, cost-effective, and portable alternative. While ultrasound offers new opportunities in CAOS, technical shortcomings continue to hinder its translation into surgery. In particular, due to the inherent limitations of ultrasound imaging, B-mode ultrasound typically captures only partial bone surfaces. The inter- and intra-operator variability in ultrasound scanning further increases the complexity of the data. Existing reconstruction methods struggle with such challenging data, leading to increased reconstruction errors and artifacts, such as holes and inflated structures. Effective techniques for accurately reconstructing open bone surfaces from real-world 3D ultrasound volumes remain lacking.
Methods:
We propose UltraBoneUDF, a self-supervised framework that reconstructs open bone surfaces by learning unsigned distance functions (UDFs) from 3D ultrasound data. In addition, we present a novel loss function based on local tangent plane optimization that substantially improves surface reconstruction quality. UltraBoneUDF and competing models are benchmarked on three open-source datasets and further evaluated through ablation studies.
Results:
Qualitative results demonstrate the limitations of state-of-the-art methods. Quantitatively, UltraBoneUDF achieves comparable or lower bi-directional Chamfer distance across three datasets with fewer parameters: 1.60 mm on the UltraBones100k dataset (≈25.5% improvement), 0.21 mm on the OpenBoneCT dataset, and 0.18 mm on the ClosedBoneCT dataset.
Conclusion:
UltraBoneUDF represents a promising solution for open bone surface reconstruction from 3D ultrasound volumes, with the potential to advance downstream applications in CAOS.
{"title":"UltraBoneUDF: Self-supervised bone surface reconstruction from ultrasound based on neural unsigned distance functions","authors":"Luohong Wu , Matthias Seibold , Nicola A. Cavalcanti , Giuseppe Loggia , Lisa Reissner , Bastian Sigrist , Jonas Hein , Lilian Calvet , Arnd Viehöfer , Philipp Fürnstahl","doi":"10.1016/j.compmedimag.2025.102690","DOIUrl":"10.1016/j.compmedimag.2025.102690","url":null,"abstract":"<div><h3>Background:</h3><div>Bone surface reconstruction is an essential component of computer-assisted orthopedic surgery (CAOS), forming the foundation for both preoperative planning and intraoperative guidance. Compared to traditional imaging modalities such as computed tomography (CT) and magnetic resonance imaging (MRI), ultrasound, an emerging CAOS technology, provides a radiation-free, cost-effective, and portable alternative. While ultrasound offers new opportunities in CAOS, technical shortcomings continue to hinder its translation into surgery. In particular, due to the inherent limitations of ultrasound imaging, B-mode ultrasound typically captures only partial bone surfaces. The inter- and intra-operator variability in ultrasound scanning further increases the complexity of the data. Existing reconstruction methods struggle with such challenging data, leading to increased reconstruction errors and artifacts, such as holes and inflated structures. Effective techniques for accurately reconstructing open bone surfaces from real-world 3D ultrasound volumes remain lacking.</div></div><div><h3>Methods:</h3><div>We propose UltraBoneUDF, a self-supervised framework specifically designed for reconstructing open bone surfaces from ultrasound data. It learns unsigned distance functions (UDFs) from 3D ultrasound data. In addition, we present a novel loss function based on local tangent plane optimization that substantially improves surface reconstruction quality. UltraBoneUDF and competing models are benchmarked on three open-source datasets and further evaluated through ablation studies.</div></div><div><h3>Results:</h3><div>Qualitative results demonstrate the limitations of the state-of-the-art methods. Quantitatively, UltraBoneUDF achieves comparable or lower bi-directional Chamfer distance across three datasets with fewer parameters: 1.60 mm on the UltraBones100k dataset (<span><math><mrow><mo>≈</mo><mn>25</mn><mo>.</mo><mn>5</mn><mtext>%</mtext></mrow></math></span> improvement), 0.21 mm on the OpenBoneCT dataset, and 0.18 mm on the ClosedBoneCT dataset.</div></div><div><h3>Conclusion:</h3><div>UltraBoneUDF represents a promising solution for open bone surface reconstruction from 3D ultrasound volumes, with the potential to advance downstream applications in CAOS.</div></div>","PeriodicalId":50631,"journal":{"name":"Computerized Medical Imaging and Graphics","volume":"127 ","pages":"Article 102690"},"PeriodicalIF":4.9,"publicationDate":"2025-12-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145840092","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}