Purpose: Depth estimation is a powerful tool for navigation in laparoscopic surgery. Previous methods utilize predicted depth maps and the relative poses of the camera to accomplish self-supervised depth estimation. However, the smooth surfaces of organs with textureless regions and the laparoscope's complex rotations make depth and pose estimation difficult in laparoscopic scenes. Therefore, we propose a novel and effective self-supervised monocular depth estimation method with self-attention-guided pose estimation and a joint depth-pose loss function for laparoscopic images.
Methods: We extract feature maps and calculate the minimum re-projection error as a feature-metric loss to establish constraints based on feature maps with more meaningful representations. Moreover, we introduce a self-attention block into the pose estimation network to predict the rotations and translations of the relative poses. In addition, we minimize the difference between predicted relative poses as a pose loss. We combine all of these losses into a joint depth-pose loss.
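For illustration, the loss design described in this Methods paragraph can be sketched in a few lines of PyTorch. This is a minimal sketch, not the authors' implementation: the warping of source features into the target view is assumed to happen elsewhere, the forward/backward-consistency reading of the pose loss is an assumption, and the loss weights are placeholders.

```python
# Minimal sketch (PyTorch), assuming source-frame features have already been
# warped into the target view using the predicted depth and poses.
import torch

def feature_metric_loss(target_feat, warped_src_feats):
    """Per-pixel minimum re-projection error computed on feature maps.

    target_feat:      (B, C, H, W) features of the target frame.
    warped_src_feats: list of (B, C, H, W) source-frame features warped
                      into the target view.
    """
    errors = [(target_feat - w).abs().mean(dim=1, keepdim=True)    # (B, 1, H, W)
              for w in warped_src_feats]
    per_pixel_min, _ = torch.min(torch.cat(errors, dim=1), dim=1)  # min over source frames
    return per_pixel_min.mean()

def pose_consistency_loss(T_fwd, T_bwd):
    """Penalize disagreement between the forward pose and the inverse of the
    backward pose (one plausible reading of 'difference between predicted
    relative poses'); T_fwd, T_bwd are (B, 4, 4) homogeneous transforms."""
    identity = torch.eye(4, device=T_fwd.device).expand_as(T_fwd)
    return ((T_fwd @ T_bwd) - identity).abs().mean()

def joint_depth_pose_loss(target_feat, warped_src_feats, T_fwd, T_bwd,
                          w_feat=1.0, w_pose=0.1):  # weights are placeholder assumptions
    return (w_feat * feature_metric_loss(target_feat, warped_src_feats)
            + w_pose * pose_consistency_loss(T_fwd, T_bwd))
```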
Results: The proposed method is extensively evaluated on the SCARED and Hamlyn datasets. Quantitative results show that the proposed method achieves improvements of about 18.07% and 14.00% in the absolute relative error on the SCARED and Hamlyn datasets, respectively, when combining all of the proposed components for depth estimation. The qualitative results show that the proposed method produces smooth depth maps with low error in various laparoscopic scenes. The proposed method also exhibits a favorable trade-off between computational efficiency and performance.
Conclusion: This study considers the characteristics of laparoscopic datasets and presents a simple yet effective self-supervised monocular depth estimation method. We propose a joint depth-pose loss function based on the extracted features for depth estimation on laparoscopic images, with pose estimation guided by a self-attention block. The experimental results show that all of the proposed components contribute to the proposed method. Furthermore, the proposed method strikes a favorable balance between computational efficiency and performance.
{"title":"Enhanced self-supervised monocular depth estimation with self-attention and joint depth-pose loss for laparoscopic images.","authors":"Wenda Li, Yuichiro Hayashi, Masahiro Oda, Takayuki Kitasaka, Kazunari Misawa, Kensaku Mori","doi":"10.1007/s11548-025-03332-1","DOIUrl":"https://doi.org/10.1007/s11548-025-03332-1","url":null,"abstract":"<p><strong>Purpose: </strong>Depth estimation is a powerful tool for navigation in laparoscopic surgery. Previous methods utilize predicted depth maps and the relative poses of the camera to accomplish self-supervised depth estimation. However, the smooth surfaces of organs with textureless regions and the laparoscope's complex rotations make depth and pose estimation difficult in laparoscopic scenes. Therefore, we propose a novel and effective self-supervised monocular depth estimation method with self-attention-guided pose estimation and a joint depth-pose loss function for laparoscopic images.</p><p><strong>Methods: </strong>We extract feature maps and calculate the minimum re-projection error as a feature-metric loss to establish constraints based on feature maps with more meaningful representations. Moreover, we introduce the self-attention block in the pose estimation network to predict rotations and translations of the relative poses. In addition, we minimize the difference between predicted relative poses as the pose loss. We combine all of the losses as a joint depth-pose loss.</p><p><strong>Results: </strong>The proposed method is extensively evaluated using SCARED and Hamlyn datasets. Quantitative results show that the proposed method achieves improvements of about 18.07 <math><mo>%</mo></math> and 14.00 <math><mo>%</mo></math> in the absolute relative error when combining all of the proposed components for depth estimation on SCARED and Hamlyn datasets. The qualitative results show that the proposed method produces smooth depth maps with low error in various laparoscopic scenes. The proposed method also exhibits a trade-off between computational efficiency and performance.</p><p><strong>Conclusion: </strong>This study considers the characteristics of laparoscopic datasets and presents a simple yet effective self-supervised monocular depth estimation. We propose a joint depth-pose loss function based on the extracted feature for depth estimation on laparoscopic images guided by a self-attention block. The experimental results prove that all of the proposed components contribute to the proposed method. Furthermore, the proposed method strikes an efficient balance between computational efficiency and performance.</p>","PeriodicalId":51251,"journal":{"name":"International Journal of Computer Assisted Radiology and Surgery","volume":" ","pages":""},"PeriodicalIF":2.3,"publicationDate":"2025-02-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143531055","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-02-24, DOI: 10.1007/s11548-025-03333-0
Yu Li, Da Chang, Die Luo, Jin Huang, Lan Dong, Du Wang, Liye Mei, Cheng Lei
Purpose: In laparoscopic surgery, accurate 3D reconstruction from endoscopic video is crucial for effective image-guided techniques. Current methods for monocular depth estimation (MDE) face challenges in complex surgical scenes, including limited training data, specular reflections, and varying illumination conditions.
Methods: We propose SfMDiffusion, a novel diffusion-based self-supervised framework for MDE. Our approach combines: (1) a denoising diffusion process guided by pseudo-ground-truth depth maps, (2) knowledge distillation from a pre-trained teacher model, and (3) discriminative priors to enhance estimation robustness. Our design enables accurate depth estimation without requiring ground-truth depth data during training.
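As an illustration of how the three components above might combine in a single training step, the following is a minimal PyTorch sketch under stated assumptions (a toy cosine noise schedule, placeholder `student`/`teacher` callables, and an assumed 0.5 distillation weight); it is not the SfMDiffusion implementation.

```python
# Minimal sketch (PyTorch); `student(noisy, image, t)` predicts the added noise
# and `teacher(image)` is a frozen pre-trained depth network -- both are
# placeholder callables, and the noise schedule and weights are assumptions.
import torch
import torch.nn.functional as F

def sfm_diffusion_style_loss(student, teacher, image, pseudo_depth, t):
    """One training step combining a pseudo-GT-guided denoising objective with
    knowledge distillation; t is a tensor in (0, 1), broadcastable to the depth map."""
    noise = torch.randn_like(pseudo_depth)
    alpha_bar = torch.cos(t * torch.pi / 2) ** 2                      # toy cosine schedule
    noisy = alpha_bar.sqrt() * pseudo_depth + (1 - alpha_bar).sqrt() * noise
    pred_noise = student(noisy, image, t)                             # denoising step
    l_diffusion = F.mse_loss(pred_noise, noise)

    with torch.no_grad():
        teacher_depth = teacher(image)                                # frozen teacher
    student_depth = (noisy - (1 - alpha_bar).sqrt() * pred_noise) / alpha_bar.sqrt()
    l_distill = F.l1_loss(student_depth, teacher_depth)               # distillation term

    return l_diffusion + 0.5 * l_distill                              # 0.5 is an assumption
```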
Results: Experiments on the SCARED and Hamlyn datasets demonstrate that SfMDiffusion achieves superior performance: an absolute relative error (Abs Rel) of 0.049, a squared relative error (Sq Rel) of 0.366, and a root mean square error (RMSE) of 4.305 on the SCARED dataset, and an Abs Rel of 0.067, Sq Rel of 0.800, and RMSE of 7.465 on the Hamlyn dataset.
Conclusion: SfMDiffusion provides an innovative approach for 3D reconstruction in image-guided surgical techniques. Future work will focus on computational optimization and validation across diverse surgical scenarios. Our code is available at https://github.com/Skylanding/SfM-Diffusion .
{"title":"SfMDiffusion: self-supervised monocular depth estimation in endoscopy based on diffusion models.","authors":"Yu Li, Da Chang, Die Luo, Jin Huang, Lan Dong, Du Wang, Liye Mei, Cheng Lei","doi":"10.1007/s11548-025-03333-0","DOIUrl":"https://doi.org/10.1007/s11548-025-03333-0","url":null,"abstract":"<p><strong>Purpose: </strong>In laparoscopic surgery, accurate 3D reconstruction from endoscopic video is crucial for effective image-guided techniques. Current methods for monocular depth estimation (MDE) face challenges in complex surgical scenes, including limited training data, specular reflections, and varying illumination conditions.</p><p><strong>Methods: </strong>We propose SfMDiffusion, a novel diffusion-based self-supervised framework for MDE. Our approach combines: (1) a denoising diffusion process guided by pseudo-ground-truth depth maps, (2) knowledge distillation from a pre-trained teacher model, and (3) discriminative priors to enhance estimation robustness. Our design enables accurate depth estimation without requiring ground-truth depth data during training.</p><p><strong>Results: </strong>Experiments on the SCARED and Hamlyn datasets demonstrate that SfMDiffusion achieves superior performance: an Absolute relative error (Abs Rel) of 0.049, a Squared relative error (Sq Rel) of 0.366, and a Root Mean Square Error (RMSE) of 4.305 on SCARED dataset, and Abs Rel of 0.067, Sq Rel of 0.800, and RMSE of 7.465 on Hamlyn dataset.</p><p><strong>Conclusion: </strong>SfMDiffusion provides an innovative approach for 3D reconstruction in image-guided surgical techniques. Future work will focus on computational optimization and validation across diverse surgical scenarios. Our code is available at https://github.com/Skylanding/SfM-Diffusion .</p>","PeriodicalId":51251,"journal":{"name":"International Journal of Computer Assisted Radiology and Surgery","volume":" ","pages":""},"PeriodicalIF":2.3,"publicationDate":"2025-02-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143494324","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-02-22, DOI: 10.1007/s11548-024-03252-6
Qin An, Hirohisa Oda, Yuichiro Hayashi, Takayuki Kitasaka, Hiroo Uchida, Akinari Hinoki, Kojiro Suzuki, Aitaro Takimoto, Masahiro Oda, Kensaku Mori
Purpose: The paper introduces a novel two-step network based on semi-supervised learning for intestine segmentation from CT volumes. The intestine folds within the abdomen into complex spatial structures and is in contact with neighboring organs, which makes accurate segmentation and pixel-level labeling difficult. We propose a multi-dimensional consistency learning method to mitigate the insufficient intestine segmentation caused by these complex structures and the limited labeled dataset.
Methods: We designed a two-stage model to segment the intestine. In stage 1, a 2D Swin U-Net is trained using labeled data to generate pseudo-labels for unlabeled data. In stage 2, a 3D U-Net is trained using labeled and unlabeled data to create the final segmentation model. The model comprises two networks from different dimensions, capturing more comprehensive representations of the intestine and potentially enhancing the model's performance in intestine segmentation.
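A minimal sketch of the two stages, with placeholder model, optimizer, and batch names (not the authors' code), could look as follows.

```python
# Minimal sketch (PyTorch); model, optimizer, and batch names are placeholders.
import torch

@torch.no_grad()
def make_pseudo_labels(swin_unet_2d, unlabeled_volume):
    """Stage 1: run the trained 2D network slice by slice over a (D, H, W) CT
    volume and stack the argmax predictions into a 3D pseudo-label."""
    swin_unet_2d.eval()
    slices = []
    for z in range(unlabeled_volume.shape[0]):
        logits = swin_unet_2d(unlabeled_volume[z][None, None])  # (1, C, H, W)
        slices.append(logits.argmax(dim=1)[0])                  # (H, W)
    return torch.stack(slices, dim=0)                           # (D, H, W)

def stage2_step(unet_3d, optimizer, labeled_batch, pseudo_batch, w_pseudo=0.5):
    """Stage 2: one training step of the 3D network on labeled and
    pseudo-labeled volumes (the 0.5 weight on pseudo-labels is an assumption)."""
    ce = torch.nn.CrossEntropyLoss()
    vol_l, gt_l = labeled_batch    # (B, 1, D, H, W), (B, D, H, W)
    vol_u, pl_u = pseudo_batch
    loss = ce(unet_3d(vol_l), gt_l) + w_pseudo * ce(unet_3d(vol_u), pl_u)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```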
Results: We used 59 CT volumes to validate the effectiveness of our method. Each experiment was repeated three times, and the average was taken as the final result. Compared to the baseline method, our method improved the Dice score by 3.25% and the recall rate by 6.84%.
Conclusion: The proposed method is based on semi-supervised learning and involves training both a 2D Swin U-Net and a 3D U-Net. The method mitigates the impact of limited labeled data and maintains consistency between the multi-dimensional outputs of the two networks to improve segmentation accuracy. Compared to previous methods, our method demonstrates superior segmentation performance.
{"title":"Multi-dimensional consistency learning between 2D Swin U-Net and 3D U-Net for intestine segmentation from CT volume.","authors":"Qin An, Hirohisa Oda, Yuichiro Hayashi, Takayuki Kitasaka, Hiroo Uchida, Akinari Hinoki, Kojiro Suzuki, Aitaro Takimoto, Masahiro Oda, Kensaku Mori","doi":"10.1007/s11548-024-03252-6","DOIUrl":"https://doi.org/10.1007/s11548-024-03252-6","url":null,"abstract":"<p><strong>Purpose: </strong>The paper introduces a novel two-step network based on semi-supervised learning for intestine segmentation from CT volumes. The intestine folds in the abdomen with complex spatial structures and contact with neighboring organs that bring difficulty for accurate segmentation and labeling at the pixel level. We propose a multi-dimensional consistency learning method to reduce the insufficient intestine segmentation results caused by complex structures and the limited labeled dataset.</p><p><strong>Methods: </strong>We designed a two-stage model to segment the intestine. In stage 1, a 2D Swin U-Net is trained using labeled data to generate pseudo-labels for unlabeled data. In stage 2, a 3D U-Net is trained using labeled and unlabeled data to create the final segmentation model. The model comprises two networks from different dimensions, capturing more comprehensive representations of the intestine and potentially enhancing the model's performance in intestine segmentation.</p><p><strong>Results: </strong>We used 59 CT volumes to validate the effectiveness of our method. The experiment was repeated three times getting the average as the final result. Compared to the baseline method, our method improved 3.25% Dice score and 6.84% recall rate.</p><p><strong>Conclusion: </strong>The proposed method is based on semi-supervised learning and involves training both 2D Swin U-Net and 3D U-Net. The method mitigates the impact of limited labeled data and maintains consistncy of multi-dimensional outputs from the two networks to improve the segmentation accuracy. Compared to previous methods, our method demonstrates superior segmentation performance.</p>","PeriodicalId":51251,"journal":{"name":"International Journal of Computer Assisted Radiology and Surgery","volume":" ","pages":""},"PeriodicalIF":2.3,"publicationDate":"2025-02-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143477169","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-02-20, DOI: 10.1007/s11548-025-03335-y
Mohamed Harmanani, Paul F R Wilson, Minh Nguyen Nhat To, Mahdi Gilany, Amoon Jamzad, Fahimeh Fooladgar, Brian Wodlinger, Purang Abolmaesumi, Parvin Mousavi
Purpose: While deep learning methods have shown great promise in improving the effectiveness of prostate cancer (PCa) diagnosis by detecting suspicious lesions from trans-rectal ultrasound (TRUS), they must overcome multiple simultaneous challenges. There is high heterogeneity in tissue appearance, significant class imbalance in favor of benign examples, and scarcity in the number and quality of ground truth annotations available to train models. Failure to address even a single one of these problems can result in unacceptable clinical outcomes.
Methods: We propose TRUSWorthy, a carefully designed, tuned, and integrated system for reliable PCa detection. Our pipeline integrates self-supervised learning, multiple-instance learning aggregation using transformers, random-undersampled boosting, and ensembling, which address label scarcity, weak labels, class imbalance, and overconfidence, respectively. We train and rigorously evaluate our method using a large, multi-center dataset of micro-ultrasound data.
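For illustration, the random-undersampled boosting/ensembling ingredient can be sketched generically with scikit-learn; the self-supervised backbone and transformer MIL aggregation that would produce the feature matrix are omitted, and all names here are placeholders rather than the TRUSWorthy pipeline itself.

```python
# Generic sketch (scikit-learn); assumes a feature matrix X (e.g. produced by
# upstream self-supervised + MIL stages) and binary labels y where cancer (1)
# is the minority class.  All names are placeholders.
import numpy as np
from sklearn.linear_model import LogisticRegression

def undersample(X, y, rng):
    """Randomly drop majority-class (benign) samples to balance the classes."""
    pos, neg = np.where(y == 1)[0], np.where(y == 0)[0]
    keep_neg = rng.choice(neg, size=len(pos), replace=False)
    idx = np.concatenate([pos, keep_neg])
    return X[idx], y[idx]

def train_undersampled_ensemble(X, y, n_members=10, seed=0):
    """Each ensemble member sees a different balanced subset of the data."""
    rng = np.random.default_rng(seed)
    return [LogisticRegression(max_iter=1000).fit(*undersample(X, y, rng))
            for _ in range(n_members)]

def ensemble_predict(members, X):
    """Average member probabilities; their spread can serve as an uncertainty cue."""
    probs = np.stack([m.predict_proba(X)[:, 1] for m in members])
    return probs.mean(axis=0), probs.std(axis=0)
```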
Results: Our method outperforms previous state-of-the-art deep learning methods in terms of accuracy and uncertainty calibration, with AUROC and balanced accuracy scores of 79.9% and 71.5%, respectively. On the top 20% of predictions with the highest confidence, we can achieve a balanced accuracy of up to 91%.
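The retention-based figure quoted above can be computed as in the following short sketch (illustrative only; assumes binary labels and predicted probabilities in [0, 1]).

```python
# Illustrative sketch: balanced accuracy on the most confident fraction of
# predictions (assumes binary labels and predicted probabilities in [0, 1]).
import numpy as np
from sklearn.metrics import balanced_accuracy_score

def balanced_accuracy_at_retention(y_true, probs, retain=0.2):
    probs = np.asarray(probs)
    confidence = np.abs(probs - 0.5)                  # distance from the decision boundary
    k = max(1, int(retain * len(probs)))
    keep = np.argsort(-confidence)[:k]                # top-k most confident cases
    preds = (probs[keep] >= 0.5).astype(int)
    return balanced_accuracy_score(np.asarray(y_true)[keep], preds)
```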
Conclusion: The success of TRUSWorthy demonstrates the potential of integrated deep learning solutions to meet clinical needs in a highly challenging deployment setting, and is a significant step toward creating a trustworthy system for computer-assisted PCa diagnosis.
{"title":"TRUSWorthy: toward clinically applicable deep learning for confident detection of prostate cancer in micro-ultrasound.","authors":"Mohamed Harmanani, Paul F R Wilson, Minh Nguyen Nhat To, Mahdi Gilany, Amoon Jamzad, Fahimeh Fooladgar, Brian Wodlinger, Purang Abolmaesumi, Parvin Mousavi","doi":"10.1007/s11548-025-03335-y","DOIUrl":"https://doi.org/10.1007/s11548-025-03335-y","url":null,"abstract":"<p><strong>Purpose: </strong>While deep learning methods have shown great promise in improving the effectiveness of prostate cancer (PCa) diagnosis by detecting suspicious lesions from trans-rectal ultrasound (TRUS), they must overcome multiple simultaneous challenges. There is high heterogeneity in tissue appearance, significant class imbalance in favor of benign examples, and scarcity in the number and quality of ground truth annotations available to train models. Failure to address even a single one of these problems can result in unacceptable clinical outcomes.</p><p><strong>Methods: </strong>We propose TRUSWorthy, a carefully designed, tuned, and integrated system for reliable PCa detection. Our pipeline integrates self-supervised learning, multiple-instance learning aggregation using transformers, random-undersampled boosting and ensembling: These address label scarcity, weak labels, class imbalance, and overconfidence, respectively. We train and rigorously evaluate our method using a large, multi-center dataset of micro-ultrasound data.</p><p><strong>Results: </strong>Our method outperforms previous state-of-the-art deep learning methods in terms of accuracy and uncertainty calibration, with AUROC and balanced accuracy scores of 79.9% and 71.5%, respectively. On the top 20% of predictions with the highest confidence, we can achieve a balanced accuracy of up to 91%.</p><p><strong>Conclusion: </strong>The success of TRUSWorthy demonstrates the potential of integrated deep learning solutions to meet clinical needs in a highly challenging deployment setting, and is a significant step toward creating a trustworthy system for computer-assisted PCa diagnosis.</p>","PeriodicalId":51251,"journal":{"name":"International Journal of Computer Assisted Radiology and Surgery","volume":" ","pages":""},"PeriodicalIF":2.3,"publicationDate":"2025-02-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143460333","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-02-19, DOI: 10.1007/s11548-025-03324-1
George R Nahass, Nicolas Kaplan, Isabel Scharf, Devansh Saini, Naji Bou Zeid, Sobhi Kazmouz, Linping Zhao, Lee W T Alkureishi
Purpose: The fibula-free flap (FFF) is a valuable reconstructive technique in maxillofacial surgery; however, the assessment of osteotomy accuracy remains challenging. We devised two novel methodologies for comparing planned and postoperative osteotomies in FFF reconstructions that minimize user input while still generalizing to other operations involving the analysis of osteotomies.
Methods: Our approaches leverage basic mathematics to derive both quantitative and qualitative insights about the relationship of the postoperative osteotomy to the planned model. We have coined our methods 'analysis by a shared reference angle' and 'Euler angle analysis.'
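One plausible way to realize an Euler-angle comparison between a planned and a postoperative osteotomy is sketched below with SciPy; the frame construction from a plane normal and two in-plane landmarks, and the 'xyz' angle convention, are assumptions for illustration rather than the paper's exact procedure.

```python
# Minimal sketch: expressing the deviation between a planned and a postoperative
# osteotomy orientation as Euler angles (illustrative; the frame construction
# and the 'xyz' convention are assumptions, not the paper's exact method).
import numpy as np
from scipy.spatial.transform import Rotation

def plane_frame(normal, in_plane_point_a, in_plane_point_b):
    """Build an orthonormal frame from an osteotomy plane normal and two
    landmark points lying in the plane."""
    z = normal / np.linalg.norm(normal)
    x = in_plane_point_b - in_plane_point_a
    x = x - np.dot(x, z) * z          # project the landmark direction into the plane
    x = x / np.linalg.norm(x)
    y = np.cross(z, x)
    return np.column_stack([x, y, z])  # 3x3 rotation matrix

def euler_deviation(frame_planned, frame_postop, order="xyz"):
    """Rotation taking the planned frame to the postoperative frame,
    reported as Euler angles in degrees."""
    relative = frame_postop @ frame_planned.T
    return Rotation.from_matrix(relative).as_euler(order, degrees=True)
```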
Results: In addition to describing our algorithm and the clinical utility, we present a thorough validation of both methods. Our algorithm is highly repeatable in an intraobserver repeatability test and provides information about the overall accuracy as well as geometric specifics of the deviation from the planned reconstruction.
Conclusion: Our algorithm is a novel and robust method for assessing the osteotomy accuracy of FFF reconstructions. This approach has no reliance on the overall position of the reconstruction, which is valuable due to the multiple factors that may influence the outcome of FFF reconstructions. Additionally, while our approach relies on anatomical features for landmark selections, the flexibility in our approach makes it applicable to evaluate any operation involving osteotomies.
{"title":"Mathematical methods for assessing the accuracy of pre-planned and guided surgical osteotomies.","authors":"George R Nahass, Nicolas Kaplan, Isabel Scharf, Devansh Saini, Naji Bou Zeid, Sobhi Kazmouz, Linping Zhao, Lee W T Alkureishi","doi":"10.1007/s11548-025-03324-1","DOIUrl":"https://doi.org/10.1007/s11548-025-03324-1","url":null,"abstract":"<p><strong>Purpose: </strong>The fibula-free flap (FFF) is a valuable reconstructive technique in maxillofacial surgery; however, the assessment of osteotomy accuracy remains challenging. We devised two novel methodologies to compare planned and postoperative osteotomies in FFF reconstructions that minimized user input but would still generalize to other operations involving the analysis of osteotomies.</p><p><strong>Methods: </strong>Our approaches leverage basic mathematics to derive both quantitative and qualitative insights about the relationship of the postoperative osteotomy to the planned model. We have coined our methods 'analysis by a shared reference angle' and 'Euler angle analysis.'</p><p><strong>Results: </strong>In addition to describing our algorithm and the clinical utility, we present a thorough validation of both methods. Our algorithm is highly repeatable in an intraobserver repeatability test and provides information about the overall accuracy as well as geometric specifics of the deviation from the planned reconstruction.</p><p><strong>Conclusion: </strong>Our algorithm is a novel and robust method for assessing the osteotomy accuracy of FFF reconstructions. This approach has no reliance on the overall position of the reconstruction, which is valuable due to the multiple factors that may influence the outcome of FFF reconstructions. Additionally, while our approach relies on anatomical features for landmark selections, the flexibility in our approach makes it applicable to evaluate any operation involving osteotomies.</p>","PeriodicalId":51251,"journal":{"name":"International Journal of Computer Assisted Radiology and Surgery","volume":" ","pages":""},"PeriodicalIF":2.3,"publicationDate":"2025-02-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143450396","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-02-17, DOI: 10.1007/s11548-025-03331-2
Hicham Messaoudi, Marwan Abbas, Bogdan Badic, Douraied Ben Salem, Ahror Belaid, Pierre-Henri Conze
Purpose: Liver resection is a complex procedure requiring precise removal of tumors while preserving viable tissue. This study proposes a novel approach for automated liver resection planning, using segmentations of the liver, vessels, and tumors from CT scans to predict the future liver remnant (FLR), aiming to improve pre-operative planning accuracy and patient outcomes.
Methods: This study evaluates deep convolutional and Transformer-based networks under various computational setups. Using different combinations of anatomical and pathological delineation masks, we assess the contribution of each structure. The method is initially tested with ground-truth masks for feasibility and later validated with predicted masks from a deep learning model.
Results: The experimental results highlight the crucial importance of incorporating anatomical and pathological masks for accurate FLR delineation. Among the tested configurations, the best performing model achieves an average Dice score of approximately 0.86, aligning closely with the inter-observer variability reported in the literature. Additionally, the model achieves an average symmetric surface distance of 0.95 mm, demonstrating its precision in capturing fine-grained structural details critical for pre-operative planning.
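The two metrics quoted here (Dice score and average symmetric surface distance) can be computed for binary masks as in the following sketch; this is illustrative NumPy/SciPy code, not the evaluation code used in the study.

```python
# Minimal sketch of the two metrics quoted above, for binary masks on a voxel
# grid; `spacing` carries the units (e.g. mm).  Illustrative only.
import numpy as np
from scipy.ndimage import binary_erosion, distance_transform_edt

def dice(a, b):
    a, b = a.astype(bool), b.astype(bool)
    return 2.0 * np.logical_and(a, b).sum() / (a.sum() + b.sum())

def assd(a, b, spacing=1.0):
    """Average symmetric surface distance in the same unit as `spacing`."""
    a, b = a.astype(bool), b.astype(bool)
    surf_a = a & ~binary_erosion(a)
    surf_b = b & ~binary_erosion(b)
    dist_to_b = distance_transform_edt(~surf_b, sampling=spacing)
    dist_to_a = distance_transform_edt(~surf_a, sampling=spacing)
    d_ab, d_ba = dist_to_b[surf_a], dist_to_a[surf_b]
    return (d_ab.sum() + d_ba.sum()) / (len(d_ab) + len(d_ba))
```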
Conclusion: This study highlights the potential of fully automated FLR segmentation pipelines in liver pre-operative planning. Our approach holds promise for reducing the time and variability associated with manual delineation. Such a method can support better decision-making in liver resection planning by providing accurate and consistent segmentation results. Future studies should explore its seamless integration into clinical workflows.
{"title":"Automatic future remnant segmentation in liver resection planning.","authors":"Hicham Messaoudi, Marwan Abbas, Bogdan Badic, Douraied Ben Salem, Ahror Belaid, Pierre-Henri Conze","doi":"10.1007/s11548-025-03331-2","DOIUrl":"https://doi.org/10.1007/s11548-025-03331-2","url":null,"abstract":"<p><strong>Purpose: </strong>Liver resection is a complex procedure requiring precise removal of tumors while preserving viable tissue. This study proposes a novel approach for automated liver resection planning, using segmentations of the liver, vessels, and tumors from CT scans to predict the future liver remnant (FLR), aiming to improve pre-operative planning accuracy and patient outcomes.</p><p><strong>Methods: </strong>This study evaluates deep convolutional and Transformer-based networks under various computational setups. Using different combinations of anatomical and pathological delineation masks, we assess the contribution of each structure. The method is initially tested with ground-truth masks for feasibility and later validated with predicted masks from a deep learning model.</p><p><strong>Results: </strong>The experimental results highlight the crucial importance of incorporating anatomical and pathological masks for accurate FLR delineation. Among the tested configurations, the best performing model achieves an average Dice score of approximately 0.86, aligning closely with the inter-observer variability reported in the literature. Additionally, the model achieves an average symmetric surface distance of 0.95 mm, demonstrating its precision in capturing fine-grained structural details critical for pre-operative planning.</p><p><strong>Conclusion: </strong>This study highlights the potential for fully-automated FLR segmentation pipelines in liver pre-operative planning. Our approach holds promise for developing a solution to reduce the time and variability associated with manual delineation. Such method can provide better decision-making in liver resection planning by providing accurate and consistent segmentation results. Future studies should explore its seamless integration into clinical workflows.</p>","PeriodicalId":51251,"journal":{"name":"International Journal of Computer Assisted Radiology and Surgery","volume":" ","pages":""},"PeriodicalIF":2.3,"publicationDate":"2025-02-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143442785","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-02-15, DOI: 10.1007/s11548-024-03290-0
Fan Wu, Xiangfeng Lin, Yuying Chen, Mengqian Ge, Ting Pan, Jingjing Shi, Linlin Mao, Gang Pan, You Peng, Li Zhou, Haitao Zheng, Dingcun Luo, Yu Zhang
Objective: BRAFV600E is the most common mutation found in thyroid cancer and is particularly associated with papillary thyroid carcinoma (PTC). Currently, genetic mutation detection relies on invasive procedures. This study aimed to extract radiomic features and utilize deep transfer learning (DTL) from ultrasound images to develop a noninvasive artificial intelligence model for identifying BRAFV600E mutations.
Materials and methods: Regions of interest (ROI) were manually annotated in the ultrasound images, and radiomic and DTL features were extracted. These were used in a joint DTL-radiomics (DTLR) model. Fourteen DTL models were employed, and feature selection was performed using the LASSO regression. Eight machine learning methods were used to construct predictive models. Model performance was primarily evaluated using area under the curve (AUC), accuracy, sensitivity and specificity. The interpretability of the model was visualized using gradient-weighted class activation maps (Grad-CAM).
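A generic sketch of the modelling chain named above (LASSO-based feature selection on a combined radiomics + DTL feature matrix, followed by one of the candidate classifiers evaluated by AUC) is given below; the alpha value, the choice of random forest as the classifier, and the data split are assumptions, not the study's settings.

```python
# Generic sketch (scikit-learn): LASSO feature selection, then a classifier
# evaluated by AUC.  X concatenates radiomic and DTL features (assumption);
# alpha, the classifier, and the split are illustrative choices.
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

def dtlr_pipeline(X, y, alpha=0.01, seed=0):
    X_tr, X_va, y_tr, y_va = train_test_split(X, y, test_size=0.3,
                                              stratify=y, random_state=seed)
    lasso = Lasso(alpha=alpha).fit(X_tr, y_tr)
    selected = np.flatnonzero(lasso.coef_)          # features with nonzero LASSO weight
    clf = RandomForestClassifier(random_state=seed).fit(X_tr[:, selected], y_tr)
    auc = roc_auc_score(y_va, clf.predict_proba(X_va[:, selected])[:, 1])
    return selected, clf, auc
```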
Results: Sole reliance on radiomics for identification of BRAFV600E mutations had limited capability, but the optimal DTLR model, combined with ResNet152, effectively identified BRAFV600E mutations. In the validation set, the AUC, accuracy, sensitivity and specificity were 0.833, 80.6%, 76.2% and 81.7%, respectively. The AUC of the DTLR model was higher than that of the DTL and radiomics models. Visualization using the ResNet152-based DTLR model revealed its ability to capture and learn ultrasound image features related to BRAFV600E mutations.
Conclusion: The ResNet152-based DTLR model demonstrated significant value in identifying BRAFV600E mutations in patients with PTC using ultrasound images. Grad-CAM has the potential to objectively stratify BRAF mutations visually. The findings of this study require further collaboration among more centers and the inclusion of additional data for validation.
{"title":"Breaking barriers: noninvasive AI model for BRAF<sup>V600E</sup> mutation identification.","authors":"Fan Wu, Xiangfeng Lin, Yuying Chen, Mengqian Ge, Ting Pan, Jingjing Shi, Linlin Mao, Gang Pan, You Peng, Li Zhou, Haitao Zheng, Dingcun Luo, Yu Zhang","doi":"10.1007/s11548-024-03290-0","DOIUrl":"https://doi.org/10.1007/s11548-024-03290-0","url":null,"abstract":"<p><strong>Objective: </strong>BRAF<sup>V600E</sup> is the most common mutation found in thyroid cancer and is particularly associated with papillary thyroid carcinoma (PTC). Currently, genetic mutation detection relies on invasive procedures. This study aimed to extract radiomic features and utilize deep transfer learning (DTL) from ultrasound images to develop a noninvasive artificial intelligence model for identifying BRAF<sup>V600E</sup> mutations.</p><p><strong>Materials and methods: </strong>Regions of interest (ROI) were manually annotated in the ultrasound images, and radiomic and DTL features were extracted. These were used in a joint DTL-radiomics (DTLR) model. Fourteen DTL models were employed, and feature selection was performed using the LASSO regression. Eight machine learning methods were used to construct predictive models. Model performance was primarily evaluated using area under the curve (AUC), accuracy, sensitivity and specificity. The interpretability of the model was visualized using gradient-weighted class activation maps (Grad-CAM).</p><p><strong>Results: </strong>Sole reliance on radiomics for identification of BRAF<sup>V600E</sup> mutations had limited capability, but the optimal DTLR model, combined with ResNet152, effectively identified BRAF<sup>V600E</sup> mutations. In the validation set, the AUC, accuracy, sensitivity and specificity were 0.833, 80.6%, 76.2% and 81.7%, respectively. The AUC of the DTLR model was higher than that of the DTL and radiomics models. Visualization using the ResNet152-based DTLR model revealed its ability to capture and learn ultrasound image features related to BRAF<sup>V600E</sup> mutations.</p><p><strong>Conclusion: </strong>The ResNet152-based DTLR model demonstrated significant value in identifying BRAF<sup>V600E</sup> mutations in patients with PTC using ultrasound images. Grad-CAM has the potential to objectively stratify BRAF mutations visually. The findings of this study require further collaboration among more centers and the inclusion of additional data for validation.</p>","PeriodicalId":51251,"journal":{"name":"International Journal of Computer Assisted Radiology and Surgery","volume":" ","pages":""},"PeriodicalIF":2.3,"publicationDate":"2025-02-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143426641","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Purpose: Statistical shape models (SSMs) are widely used for morphological assessment of anatomical structures. However, a key limitation is the lack of a clear relationship between the model's shape coefficients and clinically relevant anatomical parameters. To address this limitation, this paper proposes a novel deep learning-based anatomically parameterized SSM (DL-ANATSSM) by introducing a nonlinear relationship between anatomical parameters and bone shape information.
Methods: Our approach utilizes a multilayer perceptron model trained on a synthetic femoral bone population to learn the nonlinear mapping between anatomical measurements and shape parameters. The trained model is then fine-tuned on a real bone dataset. We compare the performance of DL-ANATSSM with a linear ANATSSM generated using least-squares regression for baseline evaluation.
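For illustration, the nonlinear mapping and its linear baseline can be sketched as follows; the layer sizes, training settings, and helper names are assumptions, not the paper's architecture.

```python
# Minimal sketch: a small MLP from anatomical measurements to SSM shape
# coefficients, next to the least-squares linear baseline.  Sizes and training
# settings are illustrative assumptions.
import numpy as np
import torch
import torch.nn as nn

def fit_linear_baseline(anat, coeffs):
    """Least-squares mapping  coeffs ~ [anat, 1] @ W  (linear baseline)."""
    A = np.hstack([anat, np.ones((len(anat), 1))])
    W, *_ = np.linalg.lstsq(A, coeffs, rcond=None)
    return W

def fit_mlp(anat, coeffs, hidden=64, epochs=500, lr=1e-3):
    """Nonlinear mapping from anatomical parameters to shape coefficients."""
    x = torch.as_tensor(anat, dtype=torch.float32)
    y = torch.as_tensor(coeffs, dtype=torch.float32)
    mlp = nn.Sequential(nn.Linear(x.shape[1], hidden), nn.ReLU(),
                        nn.Linear(hidden, hidden), nn.ReLU(),
                        nn.Linear(hidden, y.shape[1]))
    opt = torch.optim.Adam(mlp.parameters(), lr=lr)
    for _ in range(epochs):
        opt.zero_grad()
        loss = nn.functional.mse_loss(mlp(x), y)
        loss.backward()
        opt.step()
    return mlp  # fine-tuning on real bones would rerun this loop, e.g. with a smaller lr
```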
Results: When applied to a previously unseen femoral bone dataset, DL-ANATSSM demonstrated superior performance in predicting 3D bone shape based on anatomical parameters compared to the linear baseline model. The impact of fine-tuning was also investigated, with results indicating improved model performance after this process.
Conclusion: The proposed DL-ANATSSM is therefore a more precise and interpretable SSM, which is directly controlled by clinically relevant parameters. The proposed method holds promise for applications in both morphometry analysis and patient-specific 3D model generation without preoperative images.
{"title":"Leveraging deep learning for nonlinear shape representation in anatomically parameterized statistical shape models.","authors":"Behnaz Gheflati, Morteza Mirzaei, Sunil Rottoo, Hassan Rivaz","doi":"10.1007/s11548-025-03330-3","DOIUrl":"https://doi.org/10.1007/s11548-025-03330-3","url":null,"abstract":"<p><strong>Purpose: </strong>Statistical shape models (SSMs) are widely used for morphological assessment of anatomical structures. However, a key limitation is the need for a clear relationship between the model's shape coefficients and clinically relevant anatomical parameters. To address this limitation, this paper proposes a novel deep learning-based anatomically parameterized SSM (DL-ANAT<sub>SSM</sub>) by introducing a nonlinear relationship between anatomical parameters and bone shape information.</p><p><strong>Methods: </strong>Our approach utilizes a multilayer perceptron model trained on a synthetic femoral bone population to learn the nonlinear mapping between anatomical measurements and shape parameters. The trained model is then fine-tuned on a real bone dataset. We compare the performance of DL-ANAT<sub>SSM</sub> with a linear ANAT<sub>SSM</sub> generated using least-squares regression for baseline evaluation.</p><p><strong>Results: </strong>When applied to a previously unseen femoral bone dataset, DL-ANAT<sub>SSM</sub> demonstrated superior performance in predicting 3D bone shape based on anatomical parameters compared to the linear baseline model. The impact of fine-tuning was also investigated, with results indicating improved model performance after this process.</p><p><strong>Conclusion: </strong>The proposed DL-ANAT<sub>SSM</sub> is therefore a more precise and interpretable SSM, which is directly controlled by clinically relevant parameters. The proposed method holds promise for applications in both morphometry analysis and patient-specific 3D model generation without preoperative images.</p>","PeriodicalId":51251,"journal":{"name":"International Journal of Computer Assisted Radiology and Surgery","volume":" ","pages":""},"PeriodicalIF":2.3,"publicationDate":"2025-02-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143426645","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-02-11, DOI: 10.1007/s11548-024-03281-1
Luyang Zhang, Yuichiro Hayashi, Masahiro Oda, Kensaku Mori
Purpose: Deep-learning-based supervised CT segmentation relies on fully and densely labeled data, the labeling process of which is time-consuming. In this study, our proposed method aims to improve segmentation performance on CT volumes with limited annotated data by considering category-wise difficulties and distribution.
Methods: We propose a novel confidence-difficulty weight (CDifW) allocation method that considers confidence levels to balance training across categories; these weights influence both the loss function and the volume-mixing process for pseudo-label generation. Additionally, we introduce a novel Double-Mix Pseudo-label Framework (DMPF), which strategically selects categories for image blending based on the distribution of voxel counts per category and the segmentation-difficulty weight. DMPF is designed to enhance the segmentation performance of categories that are challenging to segment.
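One generic way to turn per-category confidence and voxel counts into weights and a mixing-category picker is sketched below; the formulas are illustrative assumptions and not the paper's exact CDifW/DMPF definitions.

```python
# Generic sketch: a per-category weight that grows for low-confidence
# (difficult) categories, and a category picker biased toward rare, difficult
# categories for volume mixing.  Formulas are illustrative assumptions.
import numpy as np

def difficulty_weights(mean_confidence, eps=1e-6):
    """mean_confidence[c]: average softmax confidence of category c on the
    training set; lower confidence -> higher weight."""
    w = 1.0 / (np.asarray(mean_confidence, dtype=float) + eps)
    return w / w.sum()

def pick_mix_categories(voxel_counts, weights, n_pick=2, seed=0):
    """Sample categories to blend between volumes, favouring categories that
    are both rare (few voxels) and difficult (high weight)."""
    rng = np.random.default_rng(seed)
    counts = np.asarray(voxel_counts, dtype=float)
    rarity = 1.0 / (counts / counts.sum() + 1e-6)
    score = rarity * np.asarray(weights, dtype=float)
    p = score / score.sum()
    return rng.choice(len(counts), size=n_pick, replace=False, p=p)
```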
Results: Our approach was tested on two commonly used datasets: a Congenital Heart Disease (CHD) dataset and a Beyond-the-Cranial-Vault (BTCV) Abdomen dataset. Compared to the SOTA methods, our approach achieved an improvement of 5.1% and 7.0% in Dice score for the segmentation of difficult-to-segment categories on 5% of the labeled data in CHD and 40% of the labeled data in BTCV, respectively.
Conclusion: Our method improves segmentation performance in difficult categories within CT volumes by category-wise weights and weight-based mixture augmentation. Our method was validated across multiple datasets and is significant for advancing semi-supervised segmentation tasks in health care. The code is available at https://github.com/MoriLabNU/Double-Mix .
{"title":"Double-mix pseudo-label framework: enhancing semi-supervised segmentation on category-imbalanced CT volumes.","authors":"Luyang Zhang, Yuichiro Hayashi, Masahiro Oda, Kensaku Mori","doi":"10.1007/s11548-024-03281-1","DOIUrl":"https://doi.org/10.1007/s11548-024-03281-1","url":null,"abstract":"<p><strong>Purpose: </strong>Deep-learning-based supervised CT segmentation relies on fully and densely labeled data, the labeling process of which is time-consuming. In this study, our proposed method aims to improve segmentation performance on CT volumes with limited annotated data by considering category-wise difficulties and distribution.</p><p><strong>Methods: </strong>We propose a novel confidence-difficulty weight (CDifW) allocation method that considers confidence levels, balancing the training across different categories, influencing the loss function and volume-mixing process for pseudo-label generation. Additionally, we introduce a novel Double-Mix Pseudo-label Framework (DMPF), which strategically selects categories for image blending based on the distribution of voxel-counts per category and the weight of segmentation difficulty. DMPF is designed to enhance the segmentation performance of categories that are challenging to segment.</p><p><strong>Result: </strong>Our approach was tested on two commonly used datasets: a Congenital Heart Disease (CHD) dataset and a Beyond-the-Cranial-Vault (BTCV) Abdomen dataset. Compared to the SOTA methods, our approach achieved an improvement of 5.1% and 7.0% in Dice score for the segmentation of difficult-to-segment categories on 5% of the labeled data in CHD and 40% of the labeled data in BTCV, respectively.</p><p><strong>Conclusion: </strong>Our method improves segmentation performance in difficult categories within CT volumes by category-wise weights and weight-based mixture augmentation. Our method was validated across multiple datasets and is significant for advancing semi-supervised segmentation tasks in health care. The code is available at https://github.com/MoriLabNU/Double-Mix .</p>","PeriodicalId":51251,"journal":{"name":"International Journal of Computer Assisted Radiology and Surgery","volume":" ","pages":""},"PeriodicalIF":2.3,"publicationDate":"2025-02-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143392507","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Purpose: Orbital decompression surgery, which expands the volume of the orbit by removing sections of the orbital walls with a drill and saw, is an important treatment option for thyroid-associated ophthalmopathy. However, it is often limited by physical factors such as the narrow operating space and the instability of manually held surgical instruments, which prevent surgeons from accurately executing the surgical plan.
Methods: To overcome these limitations, we designed a surgical robot comprising a position adjustment mechanism, a remote center of motion mechanism, and an end-effector with a rapid surgical instrument assembly mechanism. Additionally, to guide the surgical robot in precisely executing the preoperative surgical plan, we constructed a surgical navigation system comprising preoperative surgical planning and intraoperative optical navigation subsystems. We developed an internally complementary orbital surgical robot system in which the navigation system, the optical tracker, and the surgical robot with its motion control system serve as the decision-making, perception, and execution layers, respectively.
Results: Precision measurement experiments revealed that the absolute and repeated pose accuracies of the surgical robot satisfied the design requirements. Animal experiments verified that the precision of the osteotomy and bone-drilling operations of the orbital surgical robot system meets the clinical technical requirements.
Conclusion: The developed orbital surgical robotic system for orbital decompression surgery could perform routine operations such as drilling and sawing on the orbital bone with assistance and supervision from surgeons. The feasibility and reliability of the orbital surgical robot system were comprehensively verified through accuracy measurements and animal experiments.
{"title":"Development and validation of a surgical robot system for orbital decompression surgery.","authors":"Yanping Lin, Shiqi Peng, Siqi Jiao, Yi Wang, Yinwei Li, Huifang Zhou","doi":"10.1007/s11548-025-03322-3","DOIUrl":"https://doi.org/10.1007/s11548-025-03322-3","url":null,"abstract":"<p><strong>Purpose: </strong>Orbital decompression surgery, which expands the volume of the orbit by removing sections of the orbital walls with a drill and saw, is an important treatment option for thyroid-associated ophthalmopathy. However, it is often limited by physical factors such as a narrow operating space and instability of the manual holding of surgical instruments, which constrains doctors from accurately executing surgical planning.</p><p><strong>Methods: </strong>To overcome these limitations, we designed a surgical robot comprising position adjustment, remote center of motion, and end-effector with a rapid surgical instrument assembly mechanisms. Additionally, to guide surgical robots in precisely performing preoperative surgical planning, we constructed a surgical navigation system comprising preoperative surgical planning and intraoperative optical navigation subsystems. An internally complementary orbital surgical robot system in which the navigation system, optical tracker, and surgical robot and its motion control system serve as the decision-making, perception, and execution layers of the system, respectively, was developed.</p><p><strong>Results: </strong>The results of precision measurement experiments revealed that the absolute and repeated pose accuracies of the surgical robot satisfied the design requirements. As verified by animal experiments, the precision of osteotomy and bone drilling operation of orbital surgical robot system can meet the clinical technical indicators.</p><p><strong>Conclusion: </strong>The developed orbital surgical robotic system for orbital decompression surgery could perform routine operations such as drilling and sawing on the orbital bone with assistance and supervision from surgeons. The feasibility and reliability of the orbital surgical robot system were comprehensively verified through accuracy measurements and animal experiments.</p>","PeriodicalId":51251,"journal":{"name":"International Journal of Computer Assisted Radiology and Surgery","volume":" ","pages":""},"PeriodicalIF":2.3,"publicationDate":"2025-02-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143392506","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}