Pub Date: 2025-12-15 | DOI: 10.1007/s11548-025-03558-z
Florian Heemeyer, Leonardo E Guido Lopez, Miguel E Jáuregui Abularach, Beatriz Sanz Verdejo, Quentin Boehler, Oliver Brinkmann, José L Merino, Bradley J Nelson
Purpose: Robotic systems for catheter ablation have been in clinical use for many years. While their impact on the clinical outcome and procedure times is well studied, aspects like usability and operator workload have received limited attention in the literature. Reduced workload and stress levels benefit the operator's mental and physical health, and can also lower the risk of errors and ultimately improve patient safety. The aim of this study is to investigate the workload and usability of remote magnetic navigation compared to conventional manual navigation.
Methods: We performed a user study with eight electrophysiologists. Each participant performed identical in-vitro navigation tasks replicating those found in pulmonary vein isolation using both manual and magnetic navigation. Magnetic navigation experiments were performed using the Navion, a mobile electromagnetic navigation system.
Results: Magnetic navigation significantly improved usability (p < 0.02) and workload (p < 0.01) compared to manual navigation, measured using the System Usability Scale (magnetic: 85.6 ± 9.3 vs. manual: 75.0 ± 17.8) and NASA Task Load Index (magnetic: 72.4 ± 13.5 vs. manual: 45.8 ± 16.7). Additionally, task completion times were shorter (p < 0.01) with magnetic navigation (284.6 ± 80.7 s) compared to manual navigation (411.0 ± 123.7 s).
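The abstract does not state which statistical test produced these p-values. As a purely illustrative sketch, paired usability scores from the eight participants could be compared with a Wilcoxon signed-rank test; all values below are hypothetical placeholders, not the study's data.

```python
# Hypothetical paired SUS scores for 8 operators under manual vs. magnetic navigation.
# The choice of a Wilcoxon signed-rank test is an assumption, not taken from the paper.
import numpy as np
from scipy.stats import wilcoxon

sus_manual   = np.array([62.5, 85.0, 70.0, 92.5, 55.0, 77.5, 80.0, 77.5])  # placeholder data
sus_magnetic = np.array([87.5, 95.0, 77.5, 92.5, 72.5, 85.0, 90.0, 85.0])  # placeholder data

stat, p_value = wilcoxon(sus_magnetic, sus_manual)  # paired, non-parametric comparison
print(f"Wilcoxon statistic = {stat:.1f}, p = {p_value:.3f}")
```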
Conclusion: The findings of this study suggest that remote magnetic navigation using the Navion significantly improves operator experiences in terms of workload and usability, reinforcing the case for wider adoption of well-designed robotic systems in cardiac electrophysiology labs.
{"title":"Investigating workload and usability of remote magnetic navigation for catheter ablation.","authors":"Florian Heemeyer, Leonardo E Guido Lopez, Miguel E Jáuregui Abularach, Beatriz Sanz Verdejo, Quentin Boehler, Oliver Brinkmann, José L Merino, Bradley J Nelson","doi":"10.1007/s11548-025-03558-z","DOIUrl":"https://doi.org/10.1007/s11548-025-03558-z","url":null,"abstract":"<p><strong>Purpose: </strong>Robotic systems for catheter ablation have been in clinical use for many years. While their impact on the clinical outcome and procedure times is well studied, aspects like usability and operator workload have received limited attention in the literature. Reduced workload and stress levels benefit the operator's mental and physical health, and can also lower the risk of errors and ultimately improve patient safety. The aim of this study is to investigate the workload and usability of remote magnetic navigation compared to conventional manual navigation.</p><p><strong>Methods: </strong>We performed a user study with eight electrophysiologists. Each participant performed identical in-vitro navigation tasks replicating those found in pulmonary vein isolation using both manual and magnetic navigation. Magnetic navigation experiments were performed using the Navion, a mobile electromagnetic navigation system.</p><p><strong>Results: </strong>Magnetic navigation significantly improved usability (p < 0.02) and workload (p < 0.01) compared to manual navigation, measured using the System Usability Scale (magnetic: 85.6 ± 9.3 vs. manual: 75.0 ± 17.8) and NASA Task Load Index (magnetic: 72.4 ± 13.5 vs. manual: 45.8 ± 16.7). Additionally, task completion times were shorter (p < 0.01) with magnetic navigation (284.6 ± 80.7 s) compared to manual navigation (411.0 ± 123.7 s).</p><p><strong>Conclusion: </strong>The findings of this study suggest that remote magnetic navigation using the Navion significantly improves operator experiences in terms of workload and usability, reinforcing the case for wider adoption of well-designed robotic systems in cardiac electrophysiology labs.</p>","PeriodicalId":51251,"journal":{"name":"International Journal of Computer Assisted Radiology and Surgery","volume":" ","pages":""},"PeriodicalIF":2.3,"publicationDate":"2025-12-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145764582","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-12-13 | DOI: 10.1007/s11548-025-03557-0
Minghui Zhang, Yun Gu
Purpose: Shape modeling of volumetric medical images plays a crucial role in quantitative analysis and surgical planning for computer-aided diagnosis. However, automatic shape reconstruction from deep learning models often suffers from limited image resolution and the lack of shape prior constraints. This study aims to address these challenges by developing a method that enables reliable and accurate anatomical shape modeling in the continuous space.
Methods: We present the Reliable Shape Interaction with Implicit Template (ReShapeIT) network, which represents anatomical structures using continuous implicit fields rather than discrete voxel grids. The approach combines a category-specific implicit template field with a deformation network to encode anatomical shapes from training shapes. In addition, a Template Interaction Module (TIM) is designed to refine test cases by aligning learned template shapes with instance-specific latent codes.
Results: We evaluated ReShapeIT on three anatomical datasets: Liver, Pancreas, and Lung Lobe. The proposed method outperforms state-of-the-art approaches in 3D shape reconstruction, achieving Chamfer Distance/Earth Mover's Distance scores of 0.225/0.318 for Liver, 0.125/0.067 for Pancreas, and 0.414/0.098 for Lung Lobe.
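For reference, the Chamfer Distance reported here is the symmetric mean nearest-neighbor distance between two point sets. The NumPy/SciPy sketch below is a generic formulation, not the authors' implementation, and conventions vary (some papers average rather than sum the two terms).

```python
import numpy as np
from scipy.spatial.distance import cdist

def chamfer_distance(p: np.ndarray, q: np.ndarray) -> float:
    """Symmetric Chamfer Distance between point sets p (N, 3) and q (M, 3)."""
    d = cdist(p, q)                                    # (N, M) pairwise Euclidean distances
    return d.min(axis=1).mean() + d.min(axis=0).mean()

# Toy usage with random point clouds standing in for reconstructed and reference shapes.
rng = np.random.default_rng(0)
pred, gt = rng.random((500, 3)), rng.random((600, 3))
print(f"Chamfer Distance: {chamfer_distance(pred, gt):.4f}")
```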
Conclusion: ReShapeIT provides a reliable and generalizable solution for implicit anatomical shape modeling by leveraging shared template priors and instance-level deformations. The implementation is publicly available at https://github.com/EndoluminalSurgicalVision-IMR/ReShapeIT.
{"title":"Reshapeit: reliable shape interaction with implicit template for medical anatomy reconstruction.","authors":"Minghui Zhang, Yun Gu","doi":"10.1007/s11548-025-03557-0","DOIUrl":"https://doi.org/10.1007/s11548-025-03557-0","url":null,"abstract":"<p><strong>Purpose: </strong>Shape modeling of volumetric medical images plays a crucial role in quantitative analysis and surgical planning for computer-aided diagnosis. However, automatic shape reconstruction from deep learning models often suffers from limited image resolution and the lack of shape prior constraints. This study aims to address these challenges by developing a method that enables reliable and accurate anatomical shape modeling in the continuous space.</p><p><strong>Methods: </strong>We present the Reliable Shape Interaction with Implicit Template (ReShapeIT) network, which represents anatomical structures using continuous implicit fields rather than discrete voxel grids. The approach combines a category-specific implicit template field with a deformation network to encode anatomical shapes from training shapes. In addition, a Template Interaction Module (TIM) is designed to refine test cases by aligning learned template shapes with instance-specific latent codes.</p><p><strong>Results: </strong>We evaluated ReShapeIT on three anatomical datasets-Liver, Pancreas, and Lung Lobe. The proposed method outperforms state-of-the-art approaches in 3D shape reconstruction, achieving Chamfer Distance/Earth Mover's Distance scores of 0.225/0.318 for Liver, 0.125/0.067 for Pancreas, and 0.414/0.098 for Lung Lobe.</p><p><strong>Conclusion: </strong>ReShapeIT provides a reliable and generalizable solution for implicit anatomical shape modeling by leveraging shared template priors and instance-level deformations. The implementation is publicly available at: https://github.com/EndoluminalSurgicalVision-IMR/ReShapeIT .</p>","PeriodicalId":51251,"journal":{"name":"International Journal of Computer Assisted Radiology and Surgery","volume":" ","pages":""},"PeriodicalIF":2.3,"publicationDate":"2025-12-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145745328","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-12-08 | DOI: 10.1007/s11548-025-03539-2
Melda Yeghaian, Stefano Trebeschi, Marina Herrero-Huertas, Francisco Javier Mendoza Ferradás, Paula Bos, Maarten J A van Alphen, Marcel A J van Gerven, Regina G H Beets-Tan, Zuhir Bodalal, Lilly-Ann van der Velden
Purpose: Accurate prediction of treatment outcomes is crucial for personalized treatment in head and neck squamous cell carcinoma (HNSCC). Beyond one-year survival, assessing long-term enteral nutrition dependence is essential for optimizing patient counseling and resource allocation. This preliminary study aimed to predict one-year survival and feeding tube dependence in surgically treated HNSCC patients using classical machine learning.
Methods: This proof-of-principle retrospective study included 558 surgically treated HNSCC patients. Baseline clinical data, routine blood markers, and MRI-based radiomic features were collected before treatment. Additional postsurgical treatments within one year were also recorded. Random forest classifiers were trained to predict one-year survival and feeding tube dependence. Model explainability was assessed using Shapley Additive exPlanation (SHAP) values.
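A minimal sketch of the described evaluation setup, using scikit-learn's random forest with tenfold stratified cross-validation and ROC AUC scoring; the feature table below is synthetic stand-in data, and its dimensions are assumptions, not the study's.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Synthetic stand-in for baseline clinical / blood marker / radiomic feature tables.
X, y = make_classification(n_samples=558, n_features=40, n_informative=10, random_state=0)

clf = RandomForestClassifier(n_estimators=500, random_state=0)
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
auc = cross_val_score(clf, X, y, cv=cv, scoring="roc_auc")
print(f"AUC = {auc.mean():.2f} ± {auc.std():.2f}")
# Explainability could then be probed with shap.TreeExplainer on a fitted model.
```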
Results: Using tenfold stratified cross-validation, clinical data showed the highest predictive performance for survival (AUC = 0.75 ± 0.10; p < 0.001). Blood (AUC = 0.67 ± 0.17; p = 0.001) and imaging (AUC = 0.68 ± 0.16; p = 0.26) showed moderate performance, and multimodal integration did not improve predictions (AUC = 0.68 ± 0.16; p = 0.38). For feeding tube dependence, all modalities had low predictive power (AUC ≤ 0.66; p > 0.05). However, postsurgical treatment information outperformed all other modalities (AUC = 0.67 ± 0.07; p = 0.002), but had the lowest predictive value for survival (AUC = 0.57 ± 0.11; p = 0.08).
Conclusion: Clinical data appeared to be the strongest predictor of one-year survival in surgically treated HNSCC, although overall predictive performance was moderate. Postsurgical treatment information played a key role in predicting tube feeding dependence. While multimodal integration did not enhance overall model performance, it showed modest gains for weaker individual modalities, suggesting potential complementarity that warrants further investigation.
{"title":"Machine learning-based treatment outcome prediction in head and neck cancer using integrated noninvasive diagnostics.","authors":"Melda Yeghaian, Stefano Trebeschi, Marina Herrero-Huertas, Francisco Javier Mendoza Ferradás, Paula Bos, Maarten J A van Alphen, Marcel A J van Gerven, Regina G H Beets-Tan, Zuhir Bodalal, Lilly-Ann van der Velden","doi":"10.1007/s11548-025-03539-2","DOIUrl":"https://doi.org/10.1007/s11548-025-03539-2","url":null,"abstract":"<p><strong>Purpose: </strong>Accurate prediction of treatment outcomes is crucial for personalized treatment in head and neck squamous cell carcinoma (HNSCC). Beyond one-year survival, assessing long-term enteral nutrition dependence is essential for optimizing patient counseling and resource allocation. This preliminary study aimed to predict one-year survival and feeding tube dependence in surgically treated HNSCC patients using classical machine learning.</p><p><strong>Methods: </strong>This proof-of-principle retrospective study included 558 surgically treated HNSCC patients. Baseline clinical data, routine blood markers, and MRI-based radiomic features were collected before treatment. Additional postsurgical treatments within one year were also recorded. Random forest classifiers were trained to predict one-year survival and feeding tube dependence. Model explainability was assessed using Shapley Additive exPlanation (SHAP) values.</p><p><strong>Results: </strong>Using tenfold stratified cross-validation, clinical data showed the highest predictive performance for survival (AUC = 0.75 ± 0.10; p < 0.001). Blood (AUC = 0.67 ± 0.17; p = 0.001) and imaging (AUC = 0.68 ± 0.16; p = 0.26) showed moderate performance, and multimodal integration did not improve predictions (AUC = 0.68 ± 0.16; p = 0.38). For feeding tube dependence, all modalities had low predictive power (AUC ≤ 0.66; p > 0.05). However, postsurgical treatment information outperformed all other modalities (AUC = 0.67 ± 0.07; p = 0.002), but had the lowest predictive value for survival (AUC = 0.57 ± 0.11; p = 0.08).</p><p><strong>Conclusion: </strong>Clinical data appeared to be the strongest predictor of one-year survival in surgically treated HNSCC, although overall predictive performance was moderate. Postsurgical treatment information played a key role in predicting tube feeding dependence. While multimodal integration did not enhance overall model performance, it showed modest gains for weaker individual modalities, suggesting potential complementarity that warrants further investigation.</p>","PeriodicalId":51251,"journal":{"name":"International Journal of Computer Assisted Radiology and Surgery","volume":" ","pages":""},"PeriodicalIF":2.3,"publicationDate":"2025-12-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145702875","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-12-04 | DOI: 10.1007/s11548-025-03548-1
A Marasi, D Milesi, D Aquino, F M Doniselli, R Pascuzzo, M Grisoli, A Redaelli, E De Momi
Purpose: Accurate prediction of overall survival (OS) in glioblastoma patients is critical for advancing personalized treatments and improving clinical trial design. Conventional radiomics approaches rely on manually engineered features, which limit their ability to capture complex, high-dimensional imaging patterns. This study employs a deep learning architecture to process MRI data for automated glioma segmentation and feature extraction, leveraging high-level representations from the encoder's latent space.
Methods: Multimodal MRI data from the BraTS2020 dataset and a proprietary dataset from Fondazione IRCCS Istituto Neurologico Carlo Besta (Milan, Italy) were processed independently using a U-Net-like model pre-trained on BraTS2018 and fine-tuned on BraTS2020. Features extracted from the encoder's latent space represented hierarchical imaging patterns. These features were combined with a clinical variable (patient age) and reduced via principal component analysis (PCA) to enhance computational efficiency. Machine learning classifiers, including random forest, XGBoost, and a fully connected neural network, were trained on the reduced feature vectors for OS classification.
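As an illustration of this pipeline, encoder latent features could be concatenated with age and reduced with PCA before classification. The sketch below uses random stand-in arrays; the latent dimension, number of components, classifier settings, and three-class labeling are assumptions, not the paper's configuration.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 512))        # stand-in for encoder latent features
age = rng.uniform(20, 80, size=(200, 1))    # clinical variable appended to each feature vector
y = rng.integers(0, 3, size=200)            # OS class labels (e.g. short/mid/long survivor)

X = np.hstack([latent, age])
model = make_pipeline(StandardScaler(),
                      PCA(n_components=32),
                      MLPClassifier(max_iter=1000, random_state=0))
model.fit(X, y)
print(model.predict(X[:5]))
```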
Results: In the four-modality BraTS4CH setting, the multi-layer perceptron achieved the best performance (F1 = 0.71, AUC = 0.74, accuracy = 0.71). When limited to two modalities on BraTS2020 (BraTS2CH), the MLP again led (F1 = 0.67, AUC = 0.70, accuracy = 0.67). On the IRCCS Besta two-modality cohort (Besta2CH), XGBoost produced the highest F1-score and accuracy (F1 = 0.65, accuracy = 0.66), while the MLP obtained the top AUC (0.70). These results are competitive with, and in some metrics exceed, state-of-the-art reports, demonstrating the robustness and scalability of our automated framework relative to traditional radiomics and AI-driven approaches.
Conclusion: Integrating encoder-derived features from multimodal MRI data with clinical variables offers a scalable and effective approach for OS prediction in glioblastoma patients. This study demonstrates the potential of deep learning to address traditional radiomics limitations, paving the way for more precise and personalized prognostic tools.
{"title":"Glioblastoma survival prediction through MRI and clinical data integration with transfer learning.","authors":"A Marasi, D Milesi, D Aquino, F M Doniselli, R Pascuzzo, M Grisoli, A Redaelli, E De Momi","doi":"10.1007/s11548-025-03548-1","DOIUrl":"https://doi.org/10.1007/s11548-025-03548-1","url":null,"abstract":"<p><strong>Purpose: </strong>Accurate prediction of overall survival (OS) in glioblastoma patients is critical for advancing personalized treatments and improving clinical trial design. Conventional radiomics approaches rely on manually engineered features, which limit their ability to capture complex, high-dimensional imaging patterns. This study employs a deep learning architecture to process MRI data for automated glioma segmentation and feature extraction, leveraging high-level representations from the encoder's latent space.</p><p><strong>Methods: </strong>Multimodal MRI data from the BraTS2020 dataset and a proprietary dataset from Fondazione IRCCS Istituto Neurologico Carlo Besta (Milan, Italy) were processed independently using a U-Net-like model pre-trained on BraTS2018 and fine-tuned on BraTS2020. Features extracted from the encoder's latent space represented hierarchical imaging patterns. These features were combined with clinical variable (patient's age) and reduced via principal component analysis (PCA) to enhance computational efficiency. Machine learning classifiers-including random forest, XGBoost, and a fully connected neural network-were trained on the reduced feature vectors for OS classification.</p><p><strong>Results: </strong>In the four-modality BraTS4CH setting, the multi-layer perceptron achieved the best performance (F1 = 0.71, AUC = 0.74, accuracy = 0.71). When limited to two modalities on BraTS2020 (BraTS2CH), MLP again led (F1 = 0.67, AUC = 0.70, accuracy = 0.67). On the IRCCS Besta two-modality cohort (Besta2CH), XGBoost produced the highest F1-score and accuracy (F1 = 0.65, accuracy = 0.66), while MLP obtained the top AUC (0.70). These results are competitive with-and in some metrics exceed-state-of-the-art reports, demonstrating the robustness and scalability of our automated framework relative to traditional radiomics and AI-driven approaches.</p><p><strong>Conclusion: </strong>Integrating encoder-derived features from multimodal MRI data with clinical variables offers a scalable and effective approach for OS prediction in glioblastoma patients. This study demonstrates the potential of deep learning to address traditional radiomics limitations, paving the way for more precise and personalized prognostic tools.</p>","PeriodicalId":51251,"journal":{"name":"International Journal of Computer Assisted Radiology and Surgery","volume":" ","pages":""},"PeriodicalIF":2.3,"publicationDate":"2025-12-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145670828","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-12-03 | DOI: 10.1007/s11548-025-03552-5
Nuno S Rodrigues, Pedro Morais, Lukas R Buschle, Estevão Lima, João L Vilaça
Purpose: Minimally invasive surgical approaches are currently the standard of care for men with prostate cancer, offering higher rates of erectile function preservation. With these laparoscopic techniques, an increasing amount of data and information is available. Adaptive systems can play an important role, acting as an intelligent information filter and ensuring that the available information remains useful for the procedure rather than overwhelming for the surgeon. Standardizing and structuring the surgical workflow are key requirements for such smart assistants to recognize the different surgical steps through contextual information about the environment. This work aims to provide a detailed characterization of a laparoscopic radical prostatectomy procedure, focusing on the formalization of medical expert knowledge via surgical process modeling.
Methods: Data were acquired manually through online and offline observation and through discussion with medical experts. A total of 14 procedures were observed. Both manual laparoscopic radical prostatectomy and robot-assisted laparoscopic prostatectomy were studied. The derived surgical process model (SPM) covers only the intraoperative part of the procedure, with constant feedback from the endoscopic camera. For surgery observation, a dedicated Excel template was developed.
Results: The final model is represented in a descriptive and numerical format, combining task descriptions with a workflow diagram arrangement for ease of interpretation. Practical applications of the generated surgical process model are exemplified by the creation of activation trees for surgical phase identification. Anatomical structures are reported for each phase, distinguishing between visible and inferable ones. Additionally, the surgeons involved, the surgical instruments used, and the actions performed in each phase are identified. A total of 11 phases were identified and characterized. The average surgery duration is 87 min.
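To illustrate how such a phase characterization could feed a context-aware assistant, the sketch below encodes phase signatures (expected visible structures and instruments) and matches observed cues against them. The phase names, cues, and matching rule are invented for illustration and are not the content of the authors' model.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class PhaseSignature:
    name: str
    structures: frozenset   # anatomical structures expected to be visible in this phase
    instruments: frozenset  # instruments expected to be in use in this phase

# Hypothetical signatures; the real SPM defines 11 phases with their own cues.
PHASES = [
    PhaseSignature("bladder_neck_dissection",
                   frozenset({"bladder", "prostate"}), frozenset({"scissors", "grasper"})),
    PhaseSignature("vesicourethral_anastomosis",
                   frozenset({"urethra", "bladder"}), frozenset({"needle_driver"})),
]

def best_matching_phase(visible: set, in_use: set) -> str:
    """Return the phase whose signature overlaps most with the observed cues."""
    return max(PHASES, key=lambda p: len(p.structures & visible) + len(p.instruments & in_use)).name

print(best_matching_phase({"bladder", "prostate"}, {"scissors"}))
```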
Conclusion: The generated surgical process model is a first step toward the development of a context-aware surgical assistant and can potentially be used as a roadmap by other research teams, operating room managers and surgical teams.
{"title":"In-depth characterization of a laparoscopic radical prostatectomy procedure based on surgical process modeling.","authors":"Nuno S Rodrigues, Pedro Morais, Lukas R Buschle, Estevão Lima, João L Vilaça","doi":"10.1007/s11548-025-03552-5","DOIUrl":"https://doi.org/10.1007/s11548-025-03552-5","url":null,"abstract":"<p><strong>Purpose: </strong>Minimally invasive surgical approaches are currently the standard of care for men with prostate cancer, presenting higher rates of erectile function preservation. With these laparoscopic techniques, there is an increasing amount of data and information available. Adaptive systems can play an important role, acting as an intelligent information filter, assuring that all the available information can become useful for the procedure and not overwhelming for the surgeon. Standardizing and structuring the surgical workflow are key requirements for such smart assistants to recognize the different surgical steps through context information about the environment. This work aims to do a detailed characterization of a laparoscopic radical prostatectomy procedure, focusing on the formalization of medical expert knowledge, via surgical process modeling.</p><p><strong>Methods: </strong>Data were acquired manually, via online and offline observation, and discussion with medical experts. A total of 14 procedures were observed. Both manual laparoscopic radical prostatectomy and robot-assisted laparoscopic prostatectomy were studied. The derived SPM focuses only on the intraoperatory part of the procedure, with constant feedback from the endoscopic camera. For surgery observation, a dedicated Excel template was developed.</p><p><strong>Results: </strong>The final model is represented in a descriptive and numerical format, combining task description with a workflow diagram arrangement for ease of interpretation. Practical applications of the generated surgical process model are exemplified with the creation of activation trees for surgical phase identification. Anatomical structures are reported for each phase, distinguishing between visible and inferable ones. Additionally, the surgeons involved are identified, surgical instruments, and actions performed in each phase. A total of 11 phases were identified and characterized. Average surgery duration is 87 min.</p><p><strong>Conclusion: </strong>The generated surgical process model is a first step toward the development of a context-aware surgical assistant and can potentially be used as a roadmap by other research teams, operating room managers and surgical teams.</p>","PeriodicalId":51251,"journal":{"name":"International Journal of Computer Assisted Radiology and Surgery","volume":" ","pages":""},"PeriodicalIF":2.3,"publicationDate":"2025-12-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145670873","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-12-02 | DOI: 10.1007/s11548-025-03555-2
Sreeram Kamabattula, Kai Chen, Kiran Bhattacharyya
Purpose: Surgical video review is essential for minimally invasive surgical training, but manual annotation of surgical steps is time-consuming and limits scalability. We propose a weakly supervised pre-training framework that leverages unannotated or heterogeneously labeled surgical videos to improve automated surgical step recognition.
Methods: We evaluate three types of weak labels derived from unannotated datasets: (1) surgical phases from the same or other procedures, (2) surgical steps from different procedure types, and (3) intraoperative time progression. Using datasets from four robotic-assisted procedures (sleeve gastrectomy, hysterectomy, cholecystectomy, and radical prostatectomy), we simulate real-world annotation scarcity by varying the proportion of available step annotations (α ∈ {0.25, 0.5, 0.75, 1.0}). We benchmark the performance of a 2D CNN model trained with and without weak label pre-training.
Results: Pre-training with surgical phase labels, particularly from the same procedure type (PHASE-WITHIN), consistently improved step recognition performance, with gains of up to 6.4 F1-score points over standard ImageNet-based models under limited annotation conditions (α = 0.25 on SLG). Cross-procedure step pre-training was beneficial for some procedures, and time-based labels provided moderate gains depending on procedure structure. Label efficiency analysis shows the baseline model would require labeling an additional 30-60 videos at α = 0.25 to match the performance achieved by the best weak-pretraining strategy across procedures.
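A small sketch of how the annotation-scarcity simulation could be set up: keep only an α fraction of step-annotated videos for supervised fine-tuning, and use the remainder only with weak labels. The function, video identifiers, and counts below are illustrative assumptions, not the authors' data splits.

```python
import random

def split_by_label_fraction(video_ids, alpha, seed=0):
    """Keep an alpha fraction of videos with step annotations for supervised fine-tuning;
    the rest contribute only weak labels (phases, steps from other procedures, or time)."""
    rng = random.Random(seed)
    ids = sorted(video_ids)
    rng.shuffle(ids)
    k = max(1, round(alpha * len(ids)))
    return ids[:k], ids[k:]           # (step-labeled subset, weak-label-only subset)

labeled, weak_only = split_by_label_fraction([f"slg_{i:03d}" for i in range(120)], alpha=0.25)
print(len(labeled), len(weak_only))   # e.g. 30 videos fine-tuned with steps, 90 for pre-training only
```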
Conclusion: Weakly supervised pre-training offers a practical strategy to improve surgical step recognition when annotated data is scarce. This approach can support scalable feedback and assessment in surgical training workflows where comprehensive annotations are infeasible.
{"title":"Weakly supervised pre-training for surgical step recognition using unannotated and heterogeneously labeled videos.","authors":"Sreeram Kamabattula, Kai Chen, Kiran Bhattacharyya","doi":"10.1007/s11548-025-03555-2","DOIUrl":"https://doi.org/10.1007/s11548-025-03555-2","url":null,"abstract":"<p><strong>Purpose: </strong>Surgical video review is essential for minimally invasive surgical training, but manual annotation of surgical steps is time-consuming and limits scalability. We propose a weakly supervised pre-training framework that leverages unannotated or heterogeneously labeled surgical videos to improve automated surgical step recognition.</p><p><strong>Methods: </strong>We evaluate three types of weak labels derived from unannotated datasets: (1) surgical phases from the same or other procedures, (2) surgical steps from different procedure types, and (3) intraoperative time progression. Using datasets from four robotic-assisted procedures (sleeve gastrectomy, hysterectomy, cholecystectomy, and radical prostatectomy), we simulate real-world annotation scarcity by varying the proportion of available step annotations ( <math><mi>α</mi></math> <math><mo>∈</mo></math> 0.25, 0.5, 0.75, 1.0). We benchmark the performance of a 2D CNN model trained with and without weak label pre-training.</p><p><strong>Results: </strong>Pre-training with surgical phase labels-particularly from the same procedure type (PHASE-WITHIN)-consistently improved step recognition performance, with gains up to 6.4 f1-score points over standard ImageNet-based models under limited annotation conditions ( <math><mi>α</mi></math> = 0.25 on SLG). Cross-procedure step pre-training was beneficial for some procedures, and time-based labels provided moderate gains depending on procedure structure. Label efficiency analysis shows the baseline model would require labeling an additional 30-60 videos at <math><mi>α</mi></math> = 0.25 to match the performance achieved by the best weak-pretraining strategy across procedures.</p><p><strong>Conclusion: </strong>Weakly supervised pre-training offers a practical strategy to improve surgical step recognition when annotated data is scarce. This approach can support scalable feedback and assessment in surgical training workflows where comprehensive annotations are infeasible.</p>","PeriodicalId":51251,"journal":{"name":"International Journal of Computer Assisted Radiology and Surgery","volume":" ","pages":""},"PeriodicalIF":2.3,"publicationDate":"2025-12-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145656095","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Purpose: Radiofrequency ablation for liver cancer has advanced rapidly. For accurate ultrasound-guided soft-tissue puncture surgery, it is necessary to fuse intraoperative ultrasound images with preoperative computed tomography images. However, it is difficult for conventional methods to estimate the alignment and fuse the images accurately. To address this issue, the present study proposes an algorithm for registering cross-source point clouds based not on the surface but on the geometric features of the vascular point cloud.
Methods: We developed a fusion system that performs cross-source point cloud registration between ultrasound and computed tomography images, extracting the nodes, skeleton, and geometric features of the vascular point cloud. The system completes the fusion process in an average of 14.5 s after acquiring the vascular point clouds via ultrasound.
Results: Experiments were conducted to fuse liver images using a dummy model and healthy participants, respectively. The results show that the proposed method achieved a registration error within 1.4 mm and significantly decreased the target registration error compared with other methods in the liver dummy model registration experiment. Furthermore, the proposed method achieved an average RMSE within 2.23 mm on human liver vascular skeletons.
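For context, the reported RMSE and target registration error can be computed from corresponding points once a rigid transform has been estimated. The NumPy sketch below is a generic evaluation, not the authors' code, and the point sets are toy placeholders.

```python
import numpy as np

def registration_errors(R, t, src, dst):
    """RMSE and per-point target registration error between transformed source points
    (e.g. vascular skeleton nodes from ultrasound) and their CT counterparts."""
    moved = src @ R.T + t                      # apply the estimated rigid transform
    per_point = np.linalg.norm(moved - dst, axis=1)
    return np.sqrt((per_point ** 2).mean()), per_point

# Toy usage with an identity transform and jittered correspondences (units: mm).
rng = np.random.default_rng(0)
src = rng.random((50, 3)) * 100.0
dst = src + rng.normal(0.0, 1.0, src.shape)
rmse, tre = registration_errors(np.eye(3), np.zeros(3), src, dst)
print(f"RMSE = {rmse:.2f} mm, mean TRE = {tre.mean():.2f} mm")
```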
Conclusion: Because the registration method based on vascular feature point clouds enables rapid and accurate fusion of ultrasound and computed tomography images, it is suitable for application in real puncture surgery for radiofrequency ablation of the liver. In future work, we will evaluate the proposed method in patients.
{"title":"Point cloud registration algorithm using liver vascular skeleton feature with computed tomography and ultrasonography image fusion.","authors":"Satoshi Miura, Masayuki Nakayama, Kexin Xu, Zhang Bo, Ryoko Kuromatsu, Masahito Nakano, Yu Noda, Takumi Kawaguchi","doi":"10.1007/s11548-025-03496-w","DOIUrl":"10.1007/s11548-025-03496-w","url":null,"abstract":"<p><strong>Purpose: </strong>Radiofrequency ablation for liver cancer has advanced rapidly. For accurate ultrasound-guided soft-tissue puncture surgery, it is necessary to fuse intraoperative ultrasound images with preoperative computed tomography images. However, the conventional method is difficult to estimate and fuse images accurately. To address this issue, the present study proposes an algorithm for registering cross-source point clouds based on not surface but the geometric features of the vascular point cloud.</p><p><strong>Methods: </strong>We developed a fusion system that performs cross-source point cloud registration between ultrasound and computed tomography images, extracting the node, skeleton, and geomatic feature of the vascular point cloud. The system completes the fusion process in an average of 14.5 s after acquiring the vascular point clouds via ultrasound.</p><p><strong>Results: </strong>The experiments were conducted to fuse liver images by the dummy model and the healthy participants, respectively. The results show the proposed method achieved a registration error within 1.4 mm and decreased the target registration error significantly compared to other methods in a liver dummy model registration experiment. Furthermore, the proposed method achieved the averaged RMSE within 2.23 mm in a human liver vascular skeleton.</p><p><strong>Conclusion: </strong>The study concluded that because the registration method using vascular feature point cloud could realize the rapid and accurate fusion between ultrasound and computed tomography images, the method is useful to apply the real puncture surgery for radiofrequency ablation for liver. In future work, we will evaluate the proposed method by the patients.</p>","PeriodicalId":51251,"journal":{"name":"International Journal of Computer Assisted Radiology and Surgery","volume":" ","pages":"2469-2478"},"PeriodicalIF":2.3,"publicationDate":"2025-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12689734/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144977838","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Purpose: For diffusion MRI (dMRI) parameter estimation, machine-learning approaches have shown promising results so far, including synthetic Q-space learning (synQSL), which trains a regressor on synthetic data only. In this study, we aimed to develop a new method, named synthetic X-Q space learning (synXQSL), to improve robustness, and we investigated its basic characteristics.
Methods: For the training data, local parameter patterns of 3 × 3 voxels were synthesized as a linear combination of six bases, with parameters estimated at the center voxel. We prepared three types of local patterns by choosing the number of bases: flat, linear, and quadratic. Then, at each location of the 3 × 3 voxels, signal values of the diffusion-weighted image were computed from the signal model equation for diffusional kurtosis imaging together with Rician noise simulation. A multi-layer perceptron was used for parameter estimation and was trained for each parameter with various noise levels. The noise level is controlled by a noise ratio, defined as the standard deviation of the Rician noise distribution normalized by the average b = 0 signal value. Experiments for visual and quantitative validation were performed with synthetic data, a digital phantom, and clinical breast datasets in comparison with previous methods.
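A minimal sketch of the Rician noise simulation described here, with sigma set by the noise ratio times the average b = 0 signal; the array shapes and values are illustrative, not the study's parameters.

```python
import numpy as np

def add_rician_noise(signal, b0_mean, noise_ratio, seed=None):
    """Rician noise: the magnitude of a complex signal whose real and imaginary parts
    receive Gaussian noise with sigma = noise_ratio * average b=0 signal."""
    rng = np.random.default_rng(seed)
    sigma = noise_ratio * b0_mean
    real = signal + rng.normal(0.0, sigma, signal.shape)
    imag = rng.normal(0.0, sigma, signal.shape)
    return np.sqrt(real ** 2 + imag ** 2)

# Toy usage: a 3 x 3 patch of diffusion-weighted signal values at one b-value.
clean = np.full((3, 3), 800.0)
noisy = add_rician_noise(clean, b0_mean=1000.0, noise_ratio=0.05, seed=0)
print(noisy.round(1))
```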
Results: On synthetic datasets, synXQSL outperformed synQSL in parameter estimation for noisy data. The digital phantom experiments showed that different combinations of synXQSL bases yield different results and that the quadratic pattern could be a reasonable choice. The clinical data experiments indicate that synXQSL suppresses noise in the estimated parameter maps and consequently yields higher contrast.
Conclusion: The basic characteristics of synXQSL were investigated using various types of datasets. The results indicate that synXQSL, with an appropriate choice of bases in training data synthesis, has the potential to improve dMRI parameter estimation in noisy datasets.
{"title":"Synthetic X-Q space learning for diffusion MRI parameter estimation: a pilot study in breast DKI.","authors":"Yoshitaka Masutani, Kousei Konya, Erina Kato, Naoko Mori, Hideki Ota, Shunji Mugikura, Kei Takase, Yuki Ichinoseki","doi":"10.1007/s11548-025-03550-7","DOIUrl":"10.1007/s11548-025-03550-7","url":null,"abstract":"<p><strong>Purpose: </strong>For diffusion MRI (dMRI) parameter estimation, machine-learning approaches have shown promising results so far including the synthetic Q-space learning (synQSL) based on regressor training with only synthetic data. In this study, we aimed at the development of a new method named synthetic X-Q space learning (synXQSL) to improve robustness and investigated the basic characteristics.</p><p><strong>Methods: </strong>For training data, local parameter patterns of 3 × 3 voxels were synthesized by a linear combination of six bases, in which parameters are estimated at the center voxel. We prepared three types of local patterns by choosing the number of bases: flat, linear and quadratic. Then, at each location of 3 × 3 voxels, signal values of the diffusion-weighted image were computed by the signal model equation for diffusional kurtosis imaging and Rician noise simulation. The multi-layer perceptron was used for parameter estimation and was trained for each parameter with various noise levels. The level is controlled by a noise ratio defined as a fraction of the standard deviation in the Rician noise distribution normalized by the average b = 0 signal values. Experiments for visual and quantitative validation were performed with synthetic data, a digital phantom and clinical breast datasets in comparison with the previous methods.</p><p><strong>Results: </strong>By using synthetic datasets, synXQSL outperformed synQSL in the parameter estimation of noisy data sets. Through the digital phantom experiments, the combination of synXQSL bases yields different results and a quadratic pattern could be the reasonable choice. The clinical data experiments indicate that synXQSL suppresses noises in estimated parameter maps and consequently brings higher contrast.</p><p><strong>Conclusion: </strong>The basic characteristics of synXQSL were investigated by using various types of datasets. The results indicate that synXQSL with the appropriate choice of bases in training data synthesis has the potential to improve dMRI parameters in noisy datasets.</p>","PeriodicalId":51251,"journal":{"name":"International Journal of Computer Assisted Radiology and Surgery","volume":" ","pages":"2423-2435"},"PeriodicalIF":2.3,"publicationDate":"2025-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12689713/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145589737","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Purpose: Carotid plaque is an early manifestation of carotid atherosclerosis, and its accurate segmentation helps to assess cardiovascular disease risk. However, existing carotid artery segmentation algorithms struggle to accurately capture the structural features of morphologically diverse plaques and make little effective use of multilayer features.
Methods: To address these problems, this paper proposes a multi-scale hybrid attention hierarchical fusion U-network (MHAHF-UNet) for segmenting ambiguous plaques in carotid artery images, with the goal of improving segmentation accuracy for images with complex structures. The network first introduces the median-enhanced orthogonal convolution module (MEOConv), which effectively suppresses noise interference in ultrasound images while preserving multi-scale feature perception by combining a median-enhanced ternary channel mechanism with a depth-orthogonal convolution spatial mechanism. Second, it adopts a multi-fusion group convolutional gating module, which integrates shallow detail features and deep semantic features through an adaptive group-convolution control strategy and flexibly regulates the transfer weights of features at different levels.
Results: Experiments show that the MHAHF-UNet model achieves a Dice coefficient of 82.46 ± 0.31% and an IoU of 71.45 ± 0.37% in the carotid artery segmentation task.
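For reference, the two reported overlap metrics can be computed from binary masks as follows; this is a generic sketch, not the paper's evaluation code, and the toy masks are placeholders.

```python
import numpy as np

def dice_and_iou(pred: np.ndarray, target: np.ndarray, eps: float = 1e-8):
    """Dice coefficient and IoU between two binary segmentation masks."""
    pred, target = pred.astype(bool), target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    dice = (2.0 * inter + eps) / (pred.sum() + target.sum() + eps)
    iou = (inter + eps) / (union + eps)
    return dice, iou

# Toy usage on two overlapping plaque masks.
a = np.zeros((64, 64), dtype=bool); a[10:40, 10:40] = True
b = np.zeros((64, 64), dtype=bool); b[15:45, 15:45] = True
print(dice_and_iou(a, b))
```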
Conclusion: The model is expected to provide strong support for the prevention and treatment of cardiovascular diseases.
{"title":"MHAHF-UNet: a multi-scale hybrid attention hierarchy fusion network for carotid artery segmentation.","authors":"Changshuo Jiang, Lin Gao, Wei Li, Maoyang Zou, Qingxiao Zheng, Xuhua Qiao","doi":"10.1007/s11548-025-03449-3","DOIUrl":"10.1007/s11548-025-03449-3","url":null,"abstract":"<p><strong>Purpose: </strong>Carotid plaque is an early manifestation of carotid atherosclerosis, and its accurate segmentation helps to assess cardiovascular disease risk. However, existing carotid artery segmentation algorithms are difficult to accurately capture the structural features of morphologically diverse plaques and lack effective utilization of multilayer features.</p><p><strong>Methods: </strong>In order to solve the above problems, this paper proposes a multi-scale hybrid attention hierarchical fusion U-network structure (MHAHF-UNet) for segmenting ambiguous plaques in carotid artery images in order to improve the segmentation accuracy for complex structured images. The structure firstly introduces the median-enhanced orthogonal convolution module (MEOConv), which not only effectively suppresses the noise interference in ultrasound images, but also maintains the ability to perceive multi-scale features by combining the median-enhanced ternary channel mechanism and the depth-orthogonal convolution space mechanism. Secondly, it adopts the multi-fusion group convolutional gating module, which realizes the effective integration of shallow detailed features and deep semantic features through the adaptive control strategy of group convolution, and is able to flexibly regulate the transfer weights of features at different levels.</p><p><strong>Results: </strong>Experiments show that the MHAHF-UNet model achieves a Dice coefficient of <math><mrow><mn>82.46</mn> <mo>±</mo> <mn>0.31</mn> <mo>%</mo></mrow> </math> and an IOU of <math><mrow><mn>71.45</mn> <mo>±</mo> <mn>0.37</mn> <mo>%</mo></mrow> </math> in the carotid artery segmentation task.</p><p><strong>Conclusion: </strong>The model is expected to provide strong support for the prevention and treatment of cardiovascular diseases.</p>","PeriodicalId":51251,"journal":{"name":"International Journal of Computer Assisted Radiology and Surgery","volume":" ","pages":"2541-2551"},"PeriodicalIF":2.3,"publicationDate":"2025-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144318640","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-12-01 | Epub Date: 2025-06-26 | DOI: 10.1007/s11548-025-03462-6
Vincent K Schenk, Markus A Küper, Maximilian M Menger, Steven C Herath, Tina Histing, Christof K Audretsch
Purpose: The incidence of acetabular and pelvic fractures is rising significantly. Pelvic ring fractures rank as the sixth most common fractures in adults, with the majority occurring in the elderly. Due to complications related to surgical approaches, with rates of up to 31%, there is an increasing demand for minimally invasive surgical techniques. Augmented Reality (AR) has the potential to facilitate spatial orientation by a sophisticated user interface. The aim of this study was to develop an AR-based, radiation-free navigation system for pelvic fractures.
Methods: The Microsoft® HoloLens 2 was used as the AR headset. The Unity® game engine was used for programming. Pelvic models from Sawbones® served as the test models. Segmentation was performed using Slicer3D by Slicer Corporation. The symphysis and both anterior superior iliac spines were defined as anatomical reference points. Ten pelvic models were used for testing. A preoperatively defined drill trajectory was displayed to the surgeon. A total of 20 S1 screws and 19 S2 screws were placed using only AR navigation, without visual access to the pelvic model. Screw placement was verified using CT.
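The abstract does not specify the algorithm behind the landmark matching; a common choice for three paired anatomical landmarks (symphysis and both anterior superior iliac spines) is a least-squares rigid fit (Kabsch/Umeyama), sketched below as an assumption rather than the authors' method.

```python
import numpy as np

def rigid_landmark_fit(src, dst):
    """Least-squares rotation R and translation t mapping src landmarks onto dst.
    src, dst: (N, 3) arrays of corresponding points, N >= 3 (here: symphysis + both ASIS)."""
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])  # guard against reflection
    R = Vt.T @ D @ U.T
    t = dst_c - R @ src_c
    return R, t

# Toy usage: landmarks picked on the CT model (src) and on the physical pelvis via AR (dst), in mm.
src = np.array([[0.0, 0.0, 0.0], [120.0, 80.0, 10.0], [-120.0, 80.0, 10.0]])
dst = src + np.array([5.0, -3.0, 2.0])     # translated copy for illustration
R, t = rigid_landmark_fit(src, dst)
print(np.round(t, 2))
```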
Results: The matching process took an average of 3 min and 28 s. Eighteen out of 20 (90%) S1 screws and 3 out of 20 (15%) S2 screws were placed correctly. In most cases, no perforation occurred. The mean procedure time was 7 min for S1 screws and 5 min for S2 screws.
Conclusion: Proper drilling was achieved by displaying the trajectories via AR, particularly for S1 screws, where a slightly wider drilling corridor was aimed for compared to S2 screws. No registration scan was necessary with our matching method. No intraoperative radiation was required.
{"title":"Augmented reality in pelvic surgery: using an AR-headset as intraoperative radiation-free navigation tool.","authors":"Vincent K Schenk, Markus A Küper, Maximilian M Menger, Steven C Herath, Tina Histing, Christof K Audretsch","doi":"10.1007/s11548-025-03462-6","DOIUrl":"10.1007/s11548-025-03462-6","url":null,"abstract":"<p><strong>Purpose: </strong>The incidence of acetabular and pelvic fractures is rising significantly. Pelvic ring fractures rank as the sixth most common fractures in adults, with the majority occurring in the elderly. Due to complications related to surgical approaches, with rates of up to 31%, there is an increasing demand for minimally invasive surgical techniques. Augmented Reality (AR) has the potential to facilitate spatial orientation by a sophisticated user interface. The aim of this study was to develop an AR-based, radiation-free navigation system for pelvic fractures.</p><p><strong>Methods: </strong>The Microsoft® HoloLens 2 was used as the AR headset. The Unity® game engine was used for programming. Pelvic models from Sawbones® served as the model. Segmentation was performed using Slicer3D by Slicer Corporation. The symphysis and both anterior superior iliac spines were defined as anatomical reference points. Ten pelvic models were used for testing. A preoperatively defined drill trajectory was displayed to the surgeon. A total of 20 S1 screws and 19 S2 screws were placed using only AR navigation without visual access to the pelvic model. Screw placement was controlled using CT.</p><p><strong>Results: </strong>The matching process took an average of 3 min and 28 s. 18 out of 20 (90%) S1 screws and 3 out of 20 (15%) S2 screws were placed correctly. In most cases, no perforation occurred. The mean procedure time was 7 min for S1 screws and 5 min for S2 screws.</p><p><strong>Conclusion: </strong>Proper drilling was achieved by displaying the trajectories via AR, particularly for S1 screws, where a slightly wider drilling corridor was aimed for compared to S2 screws. No registration scan was necessary with our matching method. No intraoperative radiation was required.</p>","PeriodicalId":51251,"journal":{"name":"International Journal of Computer Assisted Radiology and Surgery","volume":" ","pages":"2553-2563"},"PeriodicalIF":2.3,"publicationDate":"2025-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12689752/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144499154","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}