Ang Li; Jiayi Han; Yongjian Zhao; Max Q.-H. Meng; Li Liu
Title: RL-USRegi: Autonomous Ultrasound Registration for Radiation-Free Spinal Surgical Navigation Using Reinforcement Learning
Journal: IEEE Transactions on Automation Science and Engineering, vol. 22, pp. 12668-12678
Publication date: 2025-02-21
Publication type: Journal Article
DOI: 10.1109/TASE.2025.3544413
URL: https://ieeexplore.ieee.org/document/10897959/
Impact factor: 7.9 (JCR Q1, Automation & Control Systems)
Citations: 0
Abstract
Registration of intraoperative ultrasound (iUS) with preoperative CT is an important yet challenging task in radiation-free spinal surgical navigation. Thickness response artifacts in US images make accurate extraction of bone boundaries difficult. Furthermore, US-CT registration typically requires detecting and matching high-quality landmarks at the initial stage. This can be accomplished by surgeons who have undergone extensive training in localizing standard spinal US views, enabling them to identify key vertebral landmarks for subsequent precise registration. In this paper, we propose a fully automated iUS registration method that uses a limited number of spinal US views as observations. Specifically, three-dimensional vertebral meshes segmented from the preoperative CT images are superimposed on the US images and then fed to a reinforcement learning (RL) agent for sequential decision-making. The proposed method achieves fully automatic US-CT registration without relying on a prespecified initialization, by training the agent to approach bone surfaces on several randomly selected 2D US views. The instability of RL-based iUS registration is primarily attributable to the difficulty of correlating long-range information within the neural network. To address this issue, we propose a Field of View Separation (FoVS) module, which employs separate encoders for US and mesh images followed by cross-attention aggregation, facilitating information flow between non-adjacent pixels. This design enables pretraining of feature extraction on the distinct encoders and the application of a supplementary loss for more precise feature matching, thereby significantly improving the learning capability and stability of the network.
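The cross-attention aggregation mentioned above can be illustrated with a generic scaled dot-product attention step. This is only a minimal sketch under stated assumptions: the token lists `us_tokens` and `mesh_tokens`, the function names, and the omission of learned query/key/value projections are all hypothetical simplifications, and the code is not the FoVS module itself. It shows how each US feature can draw on every mesh feature regardless of spatial distance.

```python
import math

def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention(queries, keys, values):
    """Generic scaled dot-product cross-attention (no learned projections).

    queries: feature vectors from one encoder (assumed here: US tokens)
    keys, values: feature vectors from the other encoder (assumed: mesh tokens)
    Returns one aggregated vector per query.
    """
    d = len(queries[0])
    dim_v = len(values[0])
    out = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        # Weighted sum of all values: every key position can contribute.
        out.append([sum(w * v[j] for w, v in zip(weights, values)) for j in range(dim_v)])
    return out

# Hypothetical toy features; real encoders would produce many more tokens.
us_tokens = [[1.0, 0.0], [0.0, 1.0]]
mesh_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
fused = cross_attention(us_tokens, mesh_tokens, mesh_tokens)
```

Because the attention weights are computed over all mesh tokens, a US token can be influenced by a mesh token far away in the image, which is the property the abstract attributes to cross-attention aggregation.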
Furthermore, a refinement module is introduced to correct the results of the RL registration, which improves the stability of the registration process. Comprehensive experiments are conducted to assess the efficacy of each module, action, and auxiliary task. The results demonstrate that the associated modules and auxiliary tasks enhance the performance of the RL agent. The registration achieved an angular error of $8.83 \pm 4.69$ degrees and a translational error of $3.34 \pm 1.42$ mm, achieving state-of-the-art (SOTA) results. Notably, fine-tuning the model prior to the surgical phase can significantly reduce the registration error, which is promising for clinical translation.

Note to Practitioners: The objective of this study is to address the registration of iUS images with preoperative CT scans in spine surgery. Current 2D/3D image registration methods have several limitations. First, they often exhibit reduced accuracy and require large quantities of high-quality images. Second, they lack effective mechanisms for rectifying errors identified after registration. This paper proposes a fully automated registration framework based on RL, which incorporates image rendering and mesh clipping to enable continuous adjustment of the pose of the 3D data, thereby facilitating 2D/3D registration. The framework exploits the distinctive attributes of iUS images and incorporates a refinement module that evaluates registration accuracy, making it possible to rectify registration errors. The proposed framework was tested on both sheep lumbar subjects and human lumbar phantoms, demonstrating the highest level of performance to date and indicating its potential for integration into surgical navigation systems.
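For readers unfamiliar with how angular and translational registration errors of this kind are commonly computed, the sketch below shows one standard convention: the geodesic angle between the estimated and reference rotation matrices, and the Euclidean distance between the translation vectors. The abstract does not state the exact error definition used, so this convention, the function names, and the toy poses are all assumptions made for illustration.

```python
import math

def rotation_z(deg):
    """3x3 rotation matrix about the z axis, as nested lists (helper for the demo)."""
    a = math.radians(deg)
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def angular_error_deg(r_est, r_ref):
    """Geodesic angle (degrees) between two 3x3 rotation matrices.

    Uses trace(R_est^T R_ref) = 1 + 2*cos(theta); the trace equals the
    sum of elementwise products of the two matrices.
    """
    trace = sum(r_est[i][j] * r_ref[i][j] for i in range(3) for j in range(3))
    # Clamp to [-1, 1] to guard against floating-point round-off.
    cos_theta = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_theta))

def translational_error(t_est, t_ref):
    """Euclidean distance between two translation vectors (same units as input)."""
    return math.dist(t_est, t_ref)

# Hypothetical poses: estimate is off by 10 degrees about z and 5 mm in-plane.
ang = angular_error_deg(rotation_z(10.0), rotation_z(0.0))
trans = translational_error((3.0, 4.0, 0.0), (0.0, 0.0, 0.0))
```

Under this convention, a mean and standard deviation such as those quoted in the abstract would be obtained by evaluating both functions over all test registrations and aggregating the results.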
About the journal:
The IEEE Transactions on Automation Science and Engineering (T-ASE) publishes fundamental papers on Automation, emphasizing scientific results that advance efficiency, quality, productivity, and reliability. T-ASE encourages interdisciplinary approaches from computer science, control systems, electrical engineering, mathematics, mechanical engineering, operations research, and other fields. T-ASE welcomes results relevant to industries such as agriculture, biotechnology, healthcare, home automation, maintenance, manufacturing, pharmaceuticals, retail, security, service, supply chains, and transportation. T-ASE addresses a research community willing to integrate knowledge across disciplines and industries. For this purpose, each paper includes a Note to Practitioners that summarizes how its results can be applied or how they might be extended to apply in practice.