RL-USRegi: Autonomous Ultrasound Registration for Radiation-Free Spinal Surgical Navigation Using Reinforcement Learning

IEEE Transactions on Automation Science and Engineering (IF 7.9; Zone 2, Computer Science; JCR Q1, Automation & Control Systems); Pub Date: 2025-02-21; DOI: 10.1109/TASE.2025.3544413

Ang Li;Jiayi Han;Yongjian Zhao;Max Q.-H. Meng;Li Liu

Published in IEEE Transactions on Automation Science and Engineering, vol. 22, pp. 12668-12678 (journal article, not open access). IEEE Xplore: https://ieeexplore.ieee.org/document/10897959/

Citations: 0

Abstract

Registration of intraoperative ultrasound (iUS) with preoperative CT represents a significant yet challenging task in the context of radiation-free spinal surgical navigation. The presence of thickness response artifacts in US images poses a considerable obstacle to the accurate extraction of bone boundaries. Furthermore, US-CT registration typically necessitates the detection and correspondence of high-quality landmarks at the initial stage. This can be accomplished by surgeons who have undergone extensive training in the localization of standard spinal US views, enabling them to identify key vertebral landmarks for subsequent precise registration. In this paper, we propose a fully automated iUS registration method that employs a limited number of spinal US views as observation objects. Specifically, three-dimensional vertebral meshes segmented from the preoperative CT images are superimposed on the US images and then fed to the reinforcement learning (RL) agent for sequential decision-making. The proposed method achieves fully automatic US-CT registration without relying on prespecified initialization. This is achieved by training the agent to approach bone surfaces on several randomly selected 2D US views. The instability of RL-based iUS registration is primarily attributable to the difficulty of correlating long-range information within the neural network. To address this issue, we propose a Field of View Separation (FoVS) module. The proposed approach employs separate encoders for US and mesh images, followed by cross-attention aggregation, which facilitates information flow between non-adjacent pixels. This approach enables pretraining of feature extraction on distinct encoders and the application of supplementary loss for enhanced feature matching precision, thereby significantly improving the learning capability and stability of the network. 
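The cross-attention aggregation mentioned above can be illustrated with a minimal sketch. It assumes plain scaled dot-product attention in which tokens from one encoder (here called `us_tokens`) query tokens from the other (`mesh_tokens`). The learned projections, token shapes, and auxiliary loss of the actual FoVS module are not given in the abstract, so all names and values below are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cross_attention(queries, keys, values):
    """Scaled dot-product cross-attention without learned projections.

    queries: token features from one encoder (assumed: US image tokens).
    keys, values: token features from the other encoder (assumed: mesh tokens).
    Returns one aggregated value vector per query token.
    """
    scale = math.sqrt(len(keys[0]))
    dim = len(values[0])
    outputs = []
    for q in queries:
        # Every query attends to every key, regardless of spatial distance.
        weights = softmax([dot(q, k) / scale for k in keys])
        outputs.append(
            [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
        )
    return outputs


# Hypothetical toy features, for illustration only.
us_tokens = [[1.0, 0.0], [0.0, 1.0]]
mesh_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
fused = cross_attention(us_tokens, mesh_tokens, mesh_tokens)
```

Because each query attends to every key, information can flow between tokens that are far apart in the image, which is the property the abstract attributes to cross-attention aggregation.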
Furthermore, a refinement module is introduced to correct the results of the RL registration, which improves the stability of the registration process. To ascertain the efficacy of each module, action, and auxiliary task, comprehensive experiments are conducted. The results demonstrate that the performance of the RL agent is enhanced by the associated modules and auxiliary tasks. The registration exhibited an angular error of $8.83 \pm 4.69$ degrees and a translational error of $3.34 \pm 1.42$ mm, achieving state-of-the-art (SOTA) results. It is noteworthy that fine-tuning the model prior to the surgical phase can significantly reduce the registration error, which is a promising outcome for its clinical translation.

Note to Practitioners: The objective of this study is to address the issue of image registration using iUS in conjunction with preoperative CT scans in the context of spine surgery. The current 2D/3D image registration methods are constrained by several limitations. Firstly, they often exhibit reduced accuracy and require high-quality images in substantial quantities. Secondly, there is a lack of effective mechanisms to rectify errors identified after the registration process. This paper proposes a fully automated registration framework based on RL, which incorporates image rendering and mesh clipping to enable continuous adjustment of the pose of 3D data, thereby facilitating 2D/3D registration. The framework employs the distinctive attributes of iUS images and incorporates a refinement module to evaluate registration accuracy, thereby facilitating the rectification of any registration issues. The proposed framework was tested on both sheep lumbar subjects and human lumbar phantoms, demonstrating the highest level of performance to date and indicating its potential for integration into surgical navigation systems.
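The abstract reports registration quality as an angular error in degrees and a translational error in mm, but does not state how these are defined. The sketch below shows one common convention for rigid registration, offered as an assumption rather than the authors' definition: the geodesic angle of the relative rotation and the Euclidean distance between translation vectors. The function names and the helper `rot_z` are hypothetical.

```python
import math


def transpose_times(a, b):
    """Return a^T @ b for 3x3 matrices given as nested lists."""
    return [
        [sum(a[k][i] * b[k][j] for k in range(3)) for j in range(3)]
        for i in range(3)
    ]


def angular_error_deg(r_est, r_ref):
    """Geodesic angle (degrees) between two rotation matrices."""
    rel = transpose_times(r_est, r_ref)
    trace = rel[0][0] + rel[1][1] + rel[2][2]
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_angle = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_angle))


def translational_error(t_est, t_ref):
    """Euclidean distance between two translation vectors (same unit as input)."""
    return math.dist(t_est, t_ref)


def rot_z(deg):
    """Rotation about the z-axis by `deg` degrees (helper for the example)."""
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


# Illustrative values only, not data from the paper.
angle = angular_error_deg(rot_z(10.0), rot_z(0.0))
shift = translational_error([3.0, 4.0, 0.0], [0.0, 0.0, 0.0])
```

With these toy inputs the estimated pose is off by a 10 degree rotation about z and a 5 mm shift. Averaging such per-case errors over a test set would give summary figures of the "mean ± spread" form quoted in the abstract.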




Source journal

IEEE Transactions on Automation Science and Engineering (Engineering & Technology: Automation & Control Systems)

CiteScore: 12.50
Self-citation rate: 14.30%
Articles published: 404
Review time: 3.0 months

About the journal: The IEEE Transactions on Automation Science and Engineering (T-ASE) publishes fundamental papers on Automation, emphasizing scientific results that advance efficiency, quality, productivity, and reliability. T-ASE encourages interdisciplinary approaches from computer science, control systems, electrical engineering, mathematics, mechanical engineering, operations research, and other fields. T-ASE welcomes results relevant to industries such as agriculture, biotechnology, healthcare, home automation, maintenance, manufacturing, pharmaceuticals, retail, security, service, supply chains, and transportation. T-ASE addresses a research community willing to integrate knowledge across disciplines and industries. For this purpose, each paper includes a Note to Practitioners that summarizes how its results can be applied or how they might be extended to apply in practice.