Pub Date: 2025-01-08 | DOI: 10.1007/s11548-024-03316-7
Comparison of active learning algorithms in classifying head computed tomography reports using bidirectional encoder representations from transformers.
Tomohiro Wataya, Azusa Miura, Takahisa Sakisuka, Masahiro Fujiwara, Hisashi Tanaka, Yu Hiraoka, Junya Sato, Miyuki Tomiyama, Daiki Nishigaki, Kosuke Kita, Yuki Suzuki, Shoji Kido, Noriyuki Tomiyama

Purpose: Systems equipped with natural language processing (NLP) can reduce radiological findings missed by physicians, but annotation costs are a burden in their development. This study aimed to compare the effects of active learning (AL) algorithms in NLP for estimating the significance of head computed tomography (CT) reports using bidirectional encoder representations from transformers (BERT).
Methods: A total of 3728 head CT reports annotated with five categories of importance were used, and UTH-BERT was adopted as the pre-trained BERT model. We assumed that 64% (2385 reports) of the data were initially in the unlabeled data pool (UDP), while the labeled dataset (LD) used to train the model was empty. Twenty-five reports at a time were repeatedly selected from the UDP and added to the LD, based on seven metrics: random sampling (RS: control), four uncertainty sampling (US) methods (least confidence (LC), margin sampling (MS), ratio of confidence (RC), and entropy sampling (ES)), and two distance-based sampling (DS) methods (cosine distance (CD) and Euclidean distance (ED)). The transition in model accuracy was evaluated on the test dataset.
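To make the four US metrics concrete, the scores can be computed from a classifier's softmax outputs as in the following minimal sketch (the function names, sign conventions, and batch-selection helper are ours, not taken from the paper):

```python
import numpy as np

def uncertainty_scores(probs: np.ndarray) -> dict:
    """Uncertainty-sampling scores from softmax outputs of shape
    (n_samples, n_classes). Higher score = more uncertain, so the most
    uncertain reports are annotated first."""
    sorted_p = np.sort(probs, axis=1)[:, ::-1]   # per-row descending
    top1, top2 = sorted_p[:, 0], sorted_p[:, 1]
    return {
        "LC": 1.0 - top1,                        # least confidence
        "MS": -(top1 - top2),                    # margin sampling (small margin = uncertain)
        "RC": top2 / top1,                       # ratio of confidence (near 1 = uncertain)
        "ES": -(probs * np.log(probs + 1e-12)).sum(axis=1),  # entropy sampling
    }

def select_batch(probs: np.ndarray, metric: str, k: int = 25) -> np.ndarray:
    """Indices of the k most uncertain unlabeled reports (k = 25 per AL round)."""
    return np.argsort(uncertainty_scores(probs)[metric])[::-1][:k]
```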
Results: The accuracy of the models with US was significantly higher than with RS when the LD contained < 1800 reports, whereas the DS methods were significantly lower than RS. Among the US methods, MS and RC outperformed the others. With the US methods, the amount of labeled data required decreased by 15.4-40.5%, with RC the most efficient. In addition, with the US methods, data from minor categories tended to be added to the LD earlier than with RS and DS.
Conclusions: In the classification task for the importance of head CT reports, US methods, especially RC and MS, can lead to effective fine-tuning of BERT models and reduce category imbalance. AL can contribute to other studies on larger datasets by making annotation more efficient.
{"title":"Comparison of active learning algorithms in classifying head computed tomography reports using bidirectional encoder representations from transformers.","authors":"Tomohiro Wataya, Azusa Miura, Takahisa Sakisuka, Masahiro Fujiwara, Hisashi Tanaka, Yu Hiraoka, Junya Sato, Miyuki Tomiyama, Daiki Nishigaki, Kosuke Kita, Yuki Suzuki, Shoji Kido, Noriyuki Tomiyama","doi":"10.1007/s11548-024-03316-7","DOIUrl":"https://doi.org/10.1007/s11548-024-03316-7","url":null,"abstract":"<p><strong>Purpose: </strong>Systems equipped with natural language (NLP) processing can reduce missed radiological findings by physicians, but the annotation costs are burden in the development. This study aimed to compare the effects of active learning (AL) algorithms in NLP for estimating the significance of head computed tomography (CT) reports using bidirectional encoder representations from transformers (BERT).</p><p><strong>Methods: </strong>A total of 3728 head CT reports annotated with five categories of importance were used and UTH-BERT was adopted as the pre-trained BERT model. We assumed that 64% (2385 reports) of the data were initially in the unlabeled data pool (UDP), while the labeled data set (LD) used to train the model was empty. Twenty-five reports were repeatedly selected from the UDP and added to the LD, based on seven metrices: random sampling (RS: control), four uncertainty sampling (US) methods (least confidence (LC), margin sampling (MS), ratio of confidence (RC), and entropy sampling (ES)), and two distance-based sampling (DS) methods (cosine distance (CD) and Euclidian distance (ED)). The transition of accuracy of the model was evaluated using the test dataset.</p><p><strong>Results: </strong>The accuracy of the models with US was significantly higher than RS when reports in LD were < 1800, whereas DS methods were significantly lower than RS. Among the US methods, MS and RC were even better than the others. With the US methods, the required labeled data decreased by 15.4-40.5%, and most efficient in RC. In addition, in the US methods, data for minor categories tended to be added to LD earlier than RS and DS.</p><p><strong>Conclusions: </strong>In the classification task for the importance of head CT reports, US methods, especially RC and MS can lead to the effective fine-tuning of BERT models and reduce the imbalance of categories. AL can contribute to other studies on larger datasets by providing effective annotation.</p>","PeriodicalId":51251,"journal":{"name":"International Journal of Computer Assisted Radiology and Surgery","volume":" ","pages":""},"PeriodicalIF":2.3,"publicationDate":"2025-01-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142958535","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-01-08 | DOI: 10.1007/s11548-024-03313-w
Diagnosing Helicobacter pylori using autoencoders and limited annotations through anomalous staining patterns in IHC whole slide images.
Pau Cano, Eva Musulen, Debora Gil
Purpose: This work addresses the detection of Helicobacter pylori (H. pylori) in histological images with immunohistochemical staining. This analysis is a time-consuming task, currently done by an expert pathologist who visually inspects the samples. Given the effort required to localize the pathogen in images, only a limited number of annotations may be available in an initial setting. Our goal is to design an approach that, using a limited set of annotations, obtains results good enough to serve as a support tool.
Methods: We propose to use autoencoders to learn the latent patterns of healthy patches and formulate a specific measure of the reconstruction error of the image in HSV space. ROC analysis is used to set the optimal threshold of this measure and the percentage of positive patches in a sample that determines the presence of H. pylori.
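A minimal sketch of the two measures described above, assuming RGB patches scaled to [0, 1]; the specific HSV statistic (hue shift) and tolerance are our illustrative choices, not the paper's published definition:

```python
import numpy as np
from matplotlib.colors import rgb_to_hsv
from sklearn.metrics import roc_curve

def staining_error(patch_rgb: np.ndarray, recon_rgb: np.ndarray, hue_tol: float = 0.05) -> float:
    """Fraction of pixels whose hue shifts by more than hue_tol after
    autoencoding: anomalously stained pixels reconstruct poorly because the
    autoencoder has only seen healthy patches."""
    hue_in = rgb_to_hsv(patch_rgb)[..., 0]
    hue_out = rgb_to_hsv(recon_rgb)[..., 0]
    return float((np.abs(hue_in - hue_out) > hue_tol).mean())

def optimal_threshold(scores: np.ndarray, labels: np.ndarray) -> float:
    """ROC operating point maximizing Youden's J = TPR - FPR."""
    fpr, tpr, thr = roc_curve(labels, scores)
    return float(thr[np.argmax(tpr - fpr)])
```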
Results: Our method was tested on an in-house database of 245 whole slide images (WSI), with 117 cases without H. pylori and varying bacterial density in the remaining ones. The database has 1211 annotated patches, of which only 163 are positive. This dataset of positive annotations was used to train a baseline thresholding method and an SVM using the features of pre-trained ResNet-18 and ViT models. A 10-fold cross-validation shows that our method performs best, with 91% accuracy, 86% sensitivity, 96% specificity and 0.97 AUC in the diagnosis of H. pylori.
Conclusion: Unlike classification approaches, our shallow autoencoder with threshold adaptation for the detection of anomalous staining is able to achieve competitive results with a limited set of annotated data. This initial approach is good enough to be used as a guide for fast annotation of infected patches.
{"title":"Diagnosing Helicobacter pylori using autoencoders and limited annotations through anomalous staining patterns in IHC whole slide images.","authors":"Pau Cano, Eva Musulen, Debora Gil","doi":"10.1007/s11548-024-03313-w","DOIUrl":"https://doi.org/10.1007/s11548-024-03313-w","url":null,"abstract":"<p><strong>Purpose: </strong>This work addresses the detection of Helicobacter pylori (H. pylori) in histological images with immunohistochemical staining. This analysis is a time-demanding task, currently done by an expert pathologist that visually inspects the samples. Given the effort required to localize the pathogen in images, a limited number of annotations might be available in an initial setting. Our goal is to design an approach that, using a limited set of annotations, is capable of obtaining results good enough to be used as a support tool.</p><p><strong>Methods: </strong>We propose to use autoencoders to learn the latent patterns of healthy patches and formulate a specific measure of the reconstruction error of the image in HSV space. ROC analysis is used to set the optimal threshold of this measure and the percentage of positive patches in a sample that determines the presence of H. pylori.</p><p><strong>Results: </strong>Our method has been tested on an own database of 245 whole slide images (WSI) having 117 cases without H. pylori and different density of the bacteria in the remaining ones. The database has 1211 annotated patches, with only 163 positive patches. This dataset of positive annotations was used to train a baseline thresholding and an SVM using the features of a pre-trained RedNet-18 and ViT models. A 10-fold cross-validation shows that our method has better performance with 91% accuracy, 86% sensitivity, 96% specificity and 0.97 AUC in the diagnosis of H. pylori .</p><p><strong>Conclusion: </strong>Unlike classification approaches, our shallow autoencoder with threshold adaptation for the detection of anomalous staining is able to achieve competitive results with a limited set of annotated data. This initial approach is good enough to be used as a guide for fast annotation of infected patches.</p>","PeriodicalId":51251,"journal":{"name":"International Journal of Computer Assisted Radiology and Surgery","volume":" ","pages":""},"PeriodicalIF":2.3,"publicationDate":"2025-01-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142958543","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-01-07 | DOI: 10.1007/s11548-024-03280-2
3D freehand ultrasound reconstruction by reference-based point cloud registration.
Christoph Großbröhmer, Lasse Hansen, Jürgen Lichtenstein, Ludger Tüshaus, Mattias P Heinrich
Purpose: This study aims to address the challenging estimation of trajectories from freehand ultrasound examinations by registering automatically generated surface points. Current approaches to inter-sweep point cloud registration can be improved by incorporating heatmap predictions, but practical challenges such as label sparsity or only partially overlapping coverage of target structures arise under realistic examination conditions.
Methods: We propose a pipeline comprising three stages: (1) utilizing a Free Point Transformer for coarse pre-registration, (2) introducing HeatReg for further refinement using support point clouds, and (3) employing instance optimization to enhance predicted displacements. Key techniques include expanding point sets with support points derived from prior knowledge and leveraging gradient keypoints. We evaluate our method on a large set of 42 forearm ultrasound sweeps with optical ground-truth tracking and investigate multiple ablations.
Results: The proposed pipeline effectively registers freehand intra-patient ultrasound sweeps. Combining the Free Point Transformer with support-point-enhanced HeatReg outperforms the FPT baseline by a mean directed surface distance of 0.96 mm (40%). Subsequent refinement using Adam instance optimization and DiVRoC further improves registration accuracy and trajectory estimation.
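For reference, the reported surface metric can be reimplemented in a few lines; this is a sketch of the standard definition, not the authors' evaluation code:

```python
import numpy as np
from scipy.spatial import cKDTree

def mean_directed_surface_distance(pred_pts: np.ndarray, gt_pts: np.ndarray) -> float:
    """Average distance (in the units of the point coordinates, here mm) from
    each predicted surface point to its nearest ground-truth surface point."""
    dists, _ = cKDTree(gt_pts).query(pred_pts)
    return float(dists.mean())
```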
Conclusion: The proposed techniques enable and improve the application of point cloud registration as a basis for freehand ultrasound reconstruction. Our results demonstrate significant theoretical and practical advantages of heatmap incorporation and multi-stage model predictions.
{"title":"3d freehand ultrasound reconstruction by reference-based point cloud registration.","authors":"Christoph Großbröhmer, Lasse Hansen, Jürgen Lichtenstein, Ludger Tüshaus, Mattias P Heinrich","doi":"10.1007/s11548-024-03280-2","DOIUrl":"https://doi.org/10.1007/s11548-024-03280-2","url":null,"abstract":"<p><strong>Purpose: </strong>This study aims to address the challenging estimation of trajectories from freehand ultrasound examinations by means of registration of automatically generated surface points. Current approaches to inter-sweep point cloud registration can be improved by incorporating heatmap predictions, but practical challenges such as label-sparsity or only partially overlapping coverage of target structures arise when applying realistic examination conditions.</p><p><strong>Methods: </strong>We propose a pipeline comprising three stages: (1) Utilizing a Free Point Transformer for coarse pre-registration, (2) Introducing HeatReg for further refinement using support point clouds, and (3) Employing instance optimization to enhance predicted displacements. Key techniques include expanding point sets with support points derived from prior knowledge and leverage of gradient keypoints. We evaluate our method on a large set of 42 forearm ultrasound sweeps with optical ground-truth tracking and investigate multiple ablations.</p><p><strong>Results: </strong>The proposed pipeline effectively registers free-hand intra-patient ultrasound sweeps. Combining Free Point Transformer with support-point enhanced HeatReg outperforms the FPT baseline by a mean directed surface distance of 0.96 mm (40%). Subsequent refinement using Adam instance optimization and DiVRoC further improves registration accuracy and trajectory estimation.</p><p><strong>Conclusion: </strong>The proposed techniques enable and improve the application of point cloud registration as a basis for freehand ultrasound reconstruction. Our results demonstrate significant theoretical and practical advantages of heatmap incorporation and multi-stage model predictions.</p>","PeriodicalId":51251,"journal":{"name":"International Journal of Computer Assisted Radiology and Surgery","volume":" ","pages":""},"PeriodicalIF":2.3,"publicationDate":"2025-01-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142958530","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-01-07 | DOI: 10.1007/s11548-024-03310-z
Sparse keypoint segmentation of lung fissures: efficient geometric deep learning for abstracting volumetric images.
Paul Kaftan, Mattias P Heinrich, Lasse Hansen, Volker Rasche, Hans A Kestler, Alexander Bigalke
Purpose: Lung fissure segmentation on CT images often relies on 3D convolutional neural networks (CNNs). However, 3D-CNNs are inefficient for detecting thin structures like the fissures, which make up a tiny fraction of the entire image volume. We propose to make lung fissure segmentation more efficient by using geometric deep learning (GDL) on sparse point clouds.
Methods: We abstract image data with sparse keypoint (KP) clouds. We train GDL models to segment the point cloud, comparing three major model paradigms (PointNets, graph convolutional networks (GCNs), and PointTransformers). From the sparse point segmentations, 3D meshes of the objects are reconstructed to obtain a dense surface. The state-of-the-art Poisson surface reconstruction (PSR) accounts for most of the runtime in our pipeline. Therefore, we propose an efficient point-cloud-to-mesh autoencoder (PC-AE) that deforms a template mesh to fit a point cloud in a single forward pass. Our pipeline is evaluated extensively and compared to the 3D-CNN gold standard nnU-Net on diverse clinical and pathological data.
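One standard way to abstract a dense volume into a sparse, well-spread keypoint cloud is farthest point sampling; whether this paper uses FPS or a different keypoint detector is not stated here, so the sketch below only illustrates the abstraction step:

```python
import numpy as np

def farthest_point_sampling(points: np.ndarray, k: int) -> np.ndarray:
    """Greedily pick k indices from an (n, 3) cloud so that each new point is
    as far as possible from all points already chosen."""
    n = points.shape[0]
    chosen = np.zeros(k, dtype=int)            # first keypoint: index 0
    dist = np.full(n, np.inf)
    for i in range(1, k):
        dist = np.minimum(dist, np.linalg.norm(points - points[chosen[i - 1]], axis=1))
        chosen[i] = int(np.argmax(dist))
    return chosen
```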
Results: GCNs yield the best trade-off between inference time and accuracy, being 21× faster with only 1.4× increased error over the nnU-Net. Our PC-AE also achieves a favorable trade-off, being 3× faster at 1.5× the error compared to the PSR.
Conclusion: We present a KP-based fissure segmentation pipeline that is more efficient than 3D-CNNs and can greatly speed up large-scale analyses. A novel PC-AE for efficient mesh reconstruction from sparse point clouds is introduced, showing promise beyond fissure segmentation. Source code is available at https://github.com/kaftanski/fissure-segmentation-IJCARS.
{"title":"Sparse keypoint segmentation of lung fissures: efficient geometric deep learning for abstracting volumetric images.","authors":"Paul Kaftan, Mattias P Heinrich, Lasse Hansen, Volker Rasche, Hans A Kestler, Alexander Bigalke","doi":"10.1007/s11548-024-03310-z","DOIUrl":"https://doi.org/10.1007/s11548-024-03310-z","url":null,"abstract":"<p><strong>Purpose: </strong>Lung fissure segmentation on CT images often relies on 3D convolutional neural networks (CNNs). However, 3D-CNNs are inefficient for detecting thin structures like the fissures, which make up a tiny fraction of the entire image volume. We propose to make lung fissure segmentation more efficient by using geometric deep learning (GDL) on sparse point clouds.</p><p><strong>Methods: </strong>We abstract image data with sparse keypoint (KP) clouds. We train GDL models to segment the point cloud, comparing three major paradigms of models (PointNets, graph convolutional networks (GCNs), and PointTransformers). From the sparse point segmentations, 3D meshes of the objects are reconstructed to obtain a dense surface. The state-of-the-art Poisson surface reconstruction (PSR) makes up most of the time in our pipeline. Therefore, we propose an efficient point cloud to mesh autoencoder (PC-AE) that deforms a template mesh to fit a point cloud in a single forward pass. Our pipeline is evaluated extensively and compared to the 3D-CNN gold standard nnU-Net on diverse clinical and pathological data.</p><p><strong>Results: </strong>GCNs yield the best trade-off between inference time and accuracy, being <math><mrow><mn>21</mn> <mo>×</mo></mrow> </math> faster with only <math><mrow><mn>1.4</mn> <mo>×</mo></mrow> </math> increased error over the nnU-Net. Our PC-AE also achieves a favorable trade-off, being <math><mrow><mn>3</mn> <mo>×</mo></mrow> </math> faster at <math><mrow><mn>1.5</mn> <mo>×</mo></mrow> </math> the error compared to the PSR.</p><p><strong>Conclusion: </strong>We present a KP-based fissure segmentation pipeline that is more efficient than 3D-CNNs and can greatly speed up large-scale analyses. A novel PC-AE for efficient mesh reconstruction from sparse point clouds is introduced, showing promise not only for fissure segmentation. Source code is available on https://github.com/kaftanski/fissure-segmentation-IJCARS.</p>","PeriodicalId":51251,"journal":{"name":"International Journal of Computer Assisted Radiology and Surgery","volume":" ","pages":""},"PeriodicalIF":2.3,"publicationDate":"2025-01-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142958381","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-01-05 | DOI: 10.1007/s11548-024-03302-z
3D CT to 2D X-ray image registration for improved visualization of tibial vessels in endovascular procedures.
Moujan Saderi, Jaykumar H Patel, Calder D Sheagren, Judit Csőre, Trisha L Roy, Graham A Wright
Purpose: During endovascular revascularization interventions for peripheral arterial disease, the standard modality of X-ray fluoroscopy (XRF) used for image guidance is limited in visualizing distal segments of infrapopliteal vessels. To enhance visualization of arteries, an image registration technique was developed to align pre-acquired computed tomography (CT) angiography images and to create fusion images highlighting arteries of interest.
Methods: X-ray image metadata capturing the position of the X-ray gantry initializes a multiscale iterative optimization process, which uses a local-variance-masked normalized cross-correlation loss to rigidly align a digitally reconstructed radiograph (DRR) of the CT dataset with the target X-ray, using the edges of the fibula and tibia as the basis for alignment. A precomputed library of DRRs improves run-time, and the six-degree-of-freedom rigid registration problem is divided into three smaller sub-problems to improve convergence. The method was tested on a dataset of paired cone-beam CT (CBCT) and XRF images of ex vivo limbs, and registration accuracy at the artery midline was evaluated.
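The masked similarity term can be sketched as follows, assuming single-channel images normalized to [0, 1]; the 5x5 variance window and threshold are our illustrative choices:

```python
import numpy as np
from scipy.ndimage import uniform_filter

def masked_ncc(drr: np.ndarray, xray: np.ndarray, var_thresh: float = 1e-3) -> float:
    """Normalized cross-correlation restricted to textured pixels (bone edges),
    selected by the local variance of the X-ray; returns 1.0 for a perfect match."""
    local_mean = uniform_filter(xray, size=5)
    local_var = uniform_filter(xray ** 2, size=5) - local_mean ** 2
    mask = local_var > var_thresh
    a, b = drr[mask], xray[mask]
    a = (a - a.mean()) / (a.std() + 1e-8)
    b = (b - b.mean()) / (b.std() + 1e-8)
    return float((a * b).mean())
```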
Results: On a dataset of CBCTs from 4 different limbs and a total of 17 XRF images, successful registration was achieved in 13 cases, with the remainder suffering from input image quality issues. The method produced average misalignments of less than 1 mm in horizontal projection distance along the artery midline, with an average run-time of 16 s.
Conclusion: The sub-mm spatial accuracy of artery overlays is sufficient for the clinical use case of identifying guidewire deviations from the path of the artery, for early detection of guidewire-induced perforations. The semiautomatic workflow and average run-time of the algorithm make it feasible for integration into clinical workflows.
{"title":"3D CT to 2D X-ray image registration for improved visualization of tibial vessels in endovascular procedures.","authors":"Moujan Saderi, Jaykumar H Patel, Calder D Sheagren, Judit Csőre, Trisha L Roy, Graham A Wright","doi":"10.1007/s11548-024-03302-z","DOIUrl":"https://doi.org/10.1007/s11548-024-03302-z","url":null,"abstract":"<p><strong>Purpose: </strong>During endovascular revascularization interventions for peripheral arterial disease, the standard modality of X-ray fluoroscopy (XRF) used for image guidance is limited in visualizing distal segments of infrapopliteal vessels. To enhance visualization of arteries, an image registration technique was developed to align pre-acquired computed tomography (CT) angiography images and to create fusion images highlighting arteries of interest.</p><p><strong>Methods: </strong>X-ray image metadata capturing the position of the X-ray gantry initializes a multiscale iterative optimization process, which uses a local-variance masked normalized cross-correlation loss to rigidly align a digitally reconstructed radiograph (DRR) of the CT dataset with the target X-ray, using the edges of the fibula and tibia as the basis for alignment. A precomputed library of DRRs is used to improve run-time, and the six-degree-of-freedom optimization problem of rigid registration is divided into three smaller sub-problems to improve convergence. The method was tested on a dataset of paired cone-beam CT (CBCT) and XRF images of ex vivo limbs, and registration accuracy at the midline of the artery was evaluated.</p><p><strong>Results: </strong>On a dataset of CBCTs from 4 different limbs and a total of 17 XRF images, successful registration was achieved in 13 cases, with the remainder suffering from input image quality issues. The method produced average misalignments of less than 1 mm in horizontal projection distance along the artery midline, with an average run-time of 16 s.</p><p><strong>Conclusion: </strong>The sub-mm spatial accuracy of artery overlays is sufficient for the clinical use case of identifying guidewire deviations from the path of the artery, for early detection of guidewire-induced perforations. The semiautomatic workflow and average run-time of the algorithm make it feasible for integration into clinical workflows.</p>","PeriodicalId":51251,"journal":{"name":"International Journal of Computer Assisted Radiology and Surgery","volume":" ","pages":""},"PeriodicalIF":2.3,"publicationDate":"2025-01-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142928132","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-01-04 | DOI: 10.1007/s11548-024-03311-y
Toward structured abdominal examination training using augmented reality.
Lovis Schwenderling, Laura Isabel Hanke, Undine Holst, Florentine Huettl, Fabian Joeres, Tobias Huber, Christian Hansen
Purpose: Structured abdominal examination is an essential part of the medical curriculum and surgical training, requiring a blend of theory and practice from trainees. Current training methods, however, often do not provide adequate engagement, fail to address individual learning needs or do not cover rare diseases.
Methods: In this work, an application for structured Abdominal Examination Training using Augmented Reality (AETAR) is presented. Required theoretical knowledge is displayed step by step via virtual indicators directly on the associated body regions. Exercises help trainees build a routine for performing the examination. AETAR was evaluated in an exploratory user study with medical students (n=12) and teaching surgeons (n=2).
Results: Learning with AETAR was described as fun and beneficial. Usability (SUS=73) and rated suitability for teaching were promising. All students improved in a knowledge test and felt more confident with the abdominal examination. Shortcomings were identified in the area of interaction, especially in teaching examination-specific movements.
Conclusion: AETAR represents a first approach to structured abdominal examination training using augmented reality. The application demonstrates the potential to improve educational outcomes for medical students and provides an important foundation for future research and development in digital medical education.
{"title":"Toward structured abdominal examination training using augmented reality.","authors":"Lovis Schwenderling, Laura Isabel Hanke, Undine Holst, Florentine Huettl, Fabian Joeres, Tobias Huber, Christian Hansen","doi":"10.1007/s11548-024-03311-y","DOIUrl":"https://doi.org/10.1007/s11548-024-03311-y","url":null,"abstract":"<p><strong>Purpose: </strong>Structured abdominal examination is an essential part of the medical curriculum and surgical training, requiring a blend of theory and practice from trainees. Current training methods, however, often do not provide adequate engagement, fail to address individual learning needs or do not cover rare diseases.</p><p><strong>Methods: </strong>In this work, an application for structured Abdominal Examination Training using Augmented Reality (AETAR) is presented. Required theoretical knowledge is displayed step by step via virtual indicators directly on the associated body regions. Exercises facilitate building up the routine in performing the examination. AETAR was evaluated in an exploratory user study with medical students (n=12) and teaching surgeons (n=2).</p><p><strong>Results: </strong>Learning with AETAR was described as fun and beneficial. Usability (SUS=73) and rated suitability for teaching were promising. All students improved in a knowledge test and felt more confident with the abdominal examination. Shortcomings were identified in the area of interaction, especially in teaching examination-specific movements.</p><p><strong>Conclusion: </strong>AETAR represents a first approach to structured abdominal examination training using augmented reality. The application demonstrates the potential to improve educational outcomes for medical students and provides an important foundation for future research and development in digital medical education.</p>","PeriodicalId":51251,"journal":{"name":"International Journal of Computer Assisted Radiology and Surgery","volume":" ","pages":""},"PeriodicalIF":2.3,"publicationDate":"2025-01-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142928249","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-01-01 | Epub Date: 2024-09-20 | DOI: 10.1007/s11548-024-03275-z
Robotic navigation with deep reinforcement learning in transthoracic echocardiography.
Yuuki Shida, Souto Kumagai, Hiroyasu Iwata
Purpose: The search for heart components in robotic transthoracic echocardiography is a time-consuming process. This paper proposes an optimized robotic navigation system for heart components using deep reinforcement learning to achieve an efficient and effective search technique for heart components.
Method: The proposed method introduces (i) an optimized search behavior generation algorithm that avoids multiple local solutions and searches for the optimal solution, and (ii) an optimized path generation algorithm that minimizes the search path, thereby realizing short search times.
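The abstract gives no implementation details, but the search-behavior idea can be illustrated with a toy tabular Q-learning loop over discretized probe poses; everything below (state/action sizes, reward stub, hyperparameters) is hypothetical, and the paper itself uses deep reinforcement learning rather than a lookup table:

```python
import numpy as np

rng = np.random.default_rng(0)
n_states, n_actions = 100, 4                 # toy discretization of probe poses/moves
Q = np.zeros((n_states, n_actions))
alpha, gamma, eps = 0.1, 0.95, 0.2           # learning rate, discount, exploration

def step(state: int, action: int):
    """Environment stub: in the real system the reward would come from
    image-based confidence in the mitral valve view."""
    next_state = (state + action - 1) % n_states
    return next_state, 1.0 if next_state == 42 else 0.0   # 42: hypothetical optimal pose

for episode in range(500):
    s = int(rng.integers(n_states))
    for _ in range(50):
        a = int(rng.integers(n_actions)) if rng.random() < eps else int(np.argmax(Q[s]))
        s2, r = step(s, a)
        Q[s, a] += alpha * (r + gamma * Q[s2].max() - Q[s, a])  # Q-learning update
        s = s2
```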
Results: The mitral valve search with the proposed method reaches the optimal solution with a probability of 74.4%, the average loss in mitral valve confidence when the search stops at a local solution is 16.3%, and the inspection time along the generated path is 48.6 s on average, which is 56.6% of the time cost of the conventional method.
Conclusion: The results indicate that the proposed method improves search efficiency: the optimal location is found in most cases, and even when a local solution rather than the optimal one is reached, the loss of confidence in the mitral valve is low. This suggests that the proposed method enables accurate and quick robotic navigation to heart components.
{"title":"Robotic navigation with deep reinforcement learning in transthoracic echocardiography.","authors":"Yuuki Shida, Souto Kumagai, Hiroyasu Iwata","doi":"10.1007/s11548-024-03275-z","DOIUrl":"10.1007/s11548-024-03275-z","url":null,"abstract":"<p><strong>Purpose: </strong>The search for heart components in robotic transthoracic echocardiography is a time-consuming process. This paper proposes an optimized robotic navigation system for heart components using deep reinforcement learning to achieve an efficient and effective search technique for heart components.</p><p><strong>Method: </strong>The proposed method introduces (i) an optimized search behavior generation algorithm that avoids multiple local solutions and searches for the optimal solution and (ii) an optimized path generation algorithm that minimizes the search path, thereby realizing short search times.</p><p><strong>Results: </strong>The mitral valve search with the proposed method reaches the optimal solution with a probability of 74.4%, the mitral valve confidence loss rate when the local solution stops is 16.3% on average, and the inspection time with the generated path is 48.6 s on average, which is 56.6% of the time cost of the conventional method.</p><p><strong>Conclusion: </strong>The results indicate that the proposed method improves the search efficiency, and the optimal location can be searched in many cases with the proposed method, and the loss rate of the confidence in the mitral valve was low even when a local solution rather than the optimal solution was reached. It is suggested that the proposed method enables accurate and quick robotic navigation to find heart components.</p>","PeriodicalId":51251,"journal":{"name":"International Journal of Computer Assisted Radiology and Surgery","volume":" ","pages":"191-202"},"PeriodicalIF":2.3,"publicationDate":"2025-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11757869/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142300392","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-01-01 | Epub Date: 2024-06-07 | DOI: 10.1007/s11548-024-03131-0
Beyond the visible: preliminary evaluation of the first wearable augmented reality assistance system for pancreatic surgery.
Hamraz Javaheri, Omid Ghamarnejad, Ragnar Bade, Paul Lukowicz, Jakob Karolus, Gregor Alexander Stavrou
Purpose: The retroperitoneal nature of the pancreas, marked by minimal intraoperative organ shifts and deformations, makes augmented reality (AR)-based systems highly promising for pancreatic surgery. This study presents preliminary data from a prospective study aiming to develop the first wearable AR assistance system, ARAS, for pancreatic surgery and to evaluate its usability, accuracy, and effectiveness in enhancing the perioperative outcomes of patients.
Methods: We developed ARAS as a two-phase system for a wearable AR device to aid surgeons in planning and operation. This system was used to visualize and register patient-specific 3D anatomical models during the surgery. The location and precision of the registered 3D anatomy were evaluated by assessing the arterial pulse and employing Doppler and duplex ultrasonography. The usability, accuracy, and effectiveness of ARAS were assessed using a five-point Likert scale questionnaire.
Results: Perioperative outcomes of five patients who underwent various pancreatic resections with ARAS are presented. Surgeons rated ARAS as excellent for preoperative planning. All structures were accurately identified without any noteworthy errors. Only tumor identification decreased after the preparation phase, especially in patients who underwent pancreaticoduodenectomy, because of the extensive mobilization of peripancreatic structures. No perioperative complications related to ARAS were observed.
Conclusions: ARAS shows promise in enhancing surgical precision during pancreatic procedures. Its efficacy in preoperative planning and intraoperative vascular identification positions it as a valuable tool for pancreatic surgery and a potential educational resource for future surgical residents.
{"title":"Beyond the visible: preliminary evaluation of the first wearable augmented reality assistance system for pancreatic surgery.","authors":"Hamraz Javaheri, Omid Ghamarnejad, Ragnar Bade, Paul Lukowicz, Jakob Karolus, Gregor Alexander Stavrou","doi":"10.1007/s11548-024-03131-0","DOIUrl":"10.1007/s11548-024-03131-0","url":null,"abstract":"<p><strong>Purpose: </strong>The retroperitoneal nature of the pancreas, marked by minimal intraoperative organ shifts and deformations, makes augmented reality (AR)-based systems highly promising for pancreatic surgery. This study presents preliminary data from a prospective study aiming to develop the first wearable AR assistance system, ARAS, for pancreatic surgery and evaluating its usability, accuracy, and effectiveness in enhancing the perioperative outcomes of patients.</p><p><strong>Methods: </strong>We developed ARAS as a two-phase system for a wearable AR device to aid surgeons in planning and operation. This system was used to visualize and register patient-specific 3D anatomical models during the surgery. The location and precision of the registered 3D anatomy were evaluated by assessing the arterial pulse and employing Doppler and duplex ultrasonography. The usability, accuracy, and effectiveness of ARAS were assessed using a five-point Likert scale questionnaire.</p><p><strong>Results: </strong>Perioperative outcomes of five patients underwent various pancreatic resections with ARAS are presented. Surgeons rated ARAS as excellent for preoperative planning. All structures were accurately identified without any noteworthy errors. Only tumor identification decreased after the preparation phase, especially in patients who underwent pancreaticoduodenectomy because of the extensive mobilization of peripancreatic structures. No perioperative complications related to ARAS were observed.</p><p><strong>Conclusions: </strong>ARAS shows promise in enhancing surgical precision during pancreatic procedures. Its efficacy in preoperative planning and intraoperative vascular identification positions it as a valuable tool for pancreatic surgery and a potential educational resource for future surgical residents.</p>","PeriodicalId":51251,"journal":{"name":"International Journal of Computer Assisted Radiology and Surgery","volume":" ","pages":"117-129"},"PeriodicalIF":2.3,"publicationDate":"2025-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11757645/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141288907","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-01-01 | Epub Date: 2024-09-06 | DOI: 10.1007/s11548-024-03255-3
Global registration of kidneys in 3D ultrasound and CT images.
William Ndzimbong, Nicolas Thome, Cyril Fourniol, Yvonne Keeza, Benoît Sauer, Jacques Marescaux, Daniel George, Alexandre Hostettler, Toby Collins
Purpose: Automatic registration between abdominal ultrasound (US) and computed tomography (CT) images is needed to enhance interventional guidance of renal procedures, but it remains an open research challenge. We propose a novel method that doesn't require an initial registration estimate (a global method) and also handles registration ambiguity caused by the organ's natural symmetry. Combined with a registration refinement algorithm, this method achieves robust and accurate kidney registration while avoiding manual initialization.
Methods: We propose solving global registration in a three-step approach: (1) automatic anatomical landmark localization, where two deep neural networks (DNNs) localize a set of landmarks in each modality; (2) registration hypothesis generation, where potential registrations are computed from the landmarks with a deterministic variant of RANSAC. Due to the kidney's strong bilateral symmetry, there are usually two compatible solutions. Finally, in step (3), the correct solution is determined automatically, using a DNN classifier that resolves the geometric ambiguity. The registration may then be iteratively improved with a registration refinement method. Results are presented with state-of-the-art surface-based refinement: Bayesian coherent point drift (BCPD).
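A minimal sketch of step (2), assuming the two landmark sets are index-aligned; the exhaustive triplet sweep (as a deterministic RANSAC variant) and the 10 mm inlier radius are our illustrative choices:

```python
import numpy as np
from itertools import combinations

def rigid_from_triplet(src: np.ndarray, dst: np.ndarray):
    """Closed-form least-squares rigid fit (Kabsch) from matched points."""
    c_s, c_d = src.mean(0), dst.mean(0)
    U, _, Vt = np.linalg.svd((src - c_s).T @ (dst - c_d))
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])  # avoid reflections
    R = Vt.T @ D @ U.T
    return R, c_d - R @ c_s

def hypothesis_generation(us_lm: np.ndarray, ct_lm: np.ndarray, inlier_mm: float = 10.0):
    """Fit a rigid transform on every landmark triplet and rank hypotheses by
    inlier count; the kidney's bilateral symmetry typically leaves two
    plausible candidates for the classifier in step (3) to disambiguate."""
    hypos = []
    for idx in combinations(range(len(us_lm)), 3):
        R, t = rigid_from_triplet(us_lm[list(idx)], ct_lm[list(idx)])
        residuals = np.linalg.norm(us_lm @ R.T + t - ct_lm, axis=1)
        hypos.append(((residuals < inlier_mm).sum(), R, t))
    hypos.sort(key=lambda h: -h[0])
    return hypos[:2]
```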
Results: This automatic global registration approach gives better results than various competitive state-of-the-art methods, which, additionally, require organ segmentation. The results obtained on 59 pairs of 3D US/CT kidney images show that the proposed method, combined with BCPD refinement, achieves a target registration error (TRE) of an internal kidney landmark (the renal pelvis) of 5.78 mm and an average nearest neighbor surface distance (nndist) of 2.42 mm.
Conclusion: This work presents the first approach for automatic kidney registration in US and CT images that does not require an initial manual registration estimate to be known a priori. The results show that a fully automatic registration approach with performance comparable to manual methods is feasible.
{"title":"Global registration of kidneys in 3D ultrasound and CT images.","authors":"William Ndzimbong, Nicolas Thome, Cyril Fourniol, Yvonne Keeza, Benoît Sauer, Jacques Marescaux, Daniel George, Alexandre Hostettler, Toby Collins","doi":"10.1007/s11548-024-03255-3","DOIUrl":"10.1007/s11548-024-03255-3","url":null,"abstract":"<p><strong>Purpose: </strong>Automatic registration between abdominal ultrasound (US) and computed tomography (CT) images is needed to enhance interventional guidance of renal procedures, but it remains an open research challenge. We propose a novel method that doesn't require an initial registration estimate (a global method) and also handles registration ambiguity caused by the organ's natural symmetry. Combined with a registration refinement algorithm, this method achieves robust and accurate kidney registration while avoiding manual initialization.</p><p><strong>Methods: </strong>We propose solving global registration in a three-step approach: (1) Automatic anatomical landmark localization, where 2 deep neural networks (DNNs) localize a set of landmarks in each modality. (2) Registration hypothesis generation, where potential registrations are computed from the landmarks with a deterministic variant of RANSAC. Due to the Kidney's strong bilateral symmetry, there are usually 2 compatible solutions. Finally, in Step (3), the correct solution is determined automatically, using a DNN classifier that resolves the geometric ambiguity. The registration may then be iteratively improved with a registration refinement method. Results are presented with state-of-the-art surface-based refinement-Bayesian coherent point drift (BCPD).</p><p><strong>Results: </strong>This automatic global registration approach gives better results than various competitive state-of-the-art methods, which, additionally, require organ segmentation. The results obtained on 59 pairs of 3D US/CT kidney images show that the proposed method, combined with BCPD refinement, achieves a target registration error (TRE) of an internal kidney landmark (the renal pelvis) of 5.78 mm and an average nearest neighbor surface distance (nndist) of 2.42 mm.</p><p><strong>Conclusion: </strong>This work presents the first approach for automatic kidney registration in US and CT images, which doesn't require an initial manual registration estimate to be known a priori. The results show a fully automatic registration approach with performances comparable to manual methods is feasible.</p>","PeriodicalId":51251,"journal":{"name":"International Journal of Computer Assisted Radiology and Surgery","volume":" ","pages":"65-75"},"PeriodicalIF":2.3,"publicationDate":"2025-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142146830","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-01-01 | Epub Date: 2024-05-13 | DOI: 10.1007/s11548-024-03153-8
Robust prostate disease classification using transformers with discrete representations.
Ainkaran Santhirasekaram, Mathias Winkler, Andrea Rockall, Ben Glocker
Purpose: Automated prostate disease classification on multi-parametric MRI has recently shown promising results with the use of convolutional neural networks (CNNs). The vision transformer (ViT) is a convolution-free architecture which exploits only the self-attention mechanism and has surpassed CNNs in some natural imaging classification tasks. However, these models are not very robust to textural shifts in the input space. In MRI, we often have to deal with textural shift arising from varying acquisition protocols. Here, we focus on the ability of models to generalise well to new magnet strengths for MRI.
Method: We propose a new framework to improve the robustness of vision transformer-based models for disease classification by constructing discrete representations of the data using vector quantisation. We sample a subset of the discrete representations to form the input into a transformer-based model. We use cross-attention in our transformer model to combine the discrete representations of T2-weighted and apparent diffusion coefficient (ADC) images.
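The discretization step corresponds to a standard vector-quantisation lookup; the sketch below uses the usual straight-through gradient trick and is not necessarily the paper's exact VQ variant:

```python
import torch

def vector_quantise(z: torch.Tensor, codebook: torch.Tensor):
    """Map continuous features z (n, d) to their nearest entries of a learned
    codebook (k, d), yielding the discrete tokens fed to the transformer."""
    idx = torch.cdist(z, codebook).argmin(dim=1)   # nearest code per feature
    z_q = codebook[idx]
    z_q = z + (z_q - z).detach()                   # straight-through estimator
    return z_q, idx
```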
Results: We analyse the robustness of our model by training on a 1.5 T scanner and testing on a 3 T scanner, and vice versa. Our approach achieves SOTA performance for classification of lesions on prostate MRI and outperforms various other CNN- and transformer-based models in terms of robustness to domain shift and perturbations in the input space.
Conclusion: We develop a method to improve the robustness of transformer-based disease classification of prostate lesions on MRI using discrete representations of the T2-weighted and ADC images.
{"title":"Robust prostate disease classification using transformers with discrete representations.","authors":"Ainkaran Santhirasekaram, Mathias Winkler, Andrea Rockall, Ben Glocker","doi":"10.1007/s11548-024-03153-8","DOIUrl":"10.1007/s11548-024-03153-8","url":null,"abstract":"<p><strong>Purpose: </strong>Automated prostate disease classification on multi-parametric MRI has recently shown promising results with the use of convolutional neural networks (CNNs). The vision transformer (ViT) is a convolutional free architecture which only exploits the self-attention mechanism and has surpassed CNNs in some natural imaging classification tasks. However, these models are not very robust to textural shifts in the input space. In MRI, we often have to deal with textural shift arising from varying acquisition protocols. Here, we focus on the ability of models to generalise well to new magnet strengths for MRI.</p><p><strong>Method: </strong>We propose a new framework to improve the robustness of vision transformer-based models for disease classification by constructing discrete representations of the data using vector quantisation. We sample a subset of the discrete representations to form the input into a transformer-based model. We use cross-attention in our transformer model to combine the discrete representations of T2-weighted and apparent diffusion coefficient (ADC) images.</p><p><strong>Results: </strong>We analyse the robustness of our model by training on a 1.5 T scanner and test on a 3 T scanner and vice versa. Our approach achieves SOTA performance for classification of lesions on prostate MRI and outperforms various other CNN and transformer-based models in terms of robustness to domain shift and perturbations in the input space.</p><p><strong>Conclusion: </strong>We develop a method to improve the robustness of transformer-based disease classification of prostate lesions on MRI using discrete representations of the T2-weighted and ADC images.</p>","PeriodicalId":51251,"journal":{"name":"International Journal of Computer Assisted Radiology and Surgery","volume":" ","pages":"11-20"},"PeriodicalIF":2.3,"publicationDate":"2025-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11759462/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140916593","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}