Machine-Learning-Based Multi-Modal Force Estimation for Steerable Ablation Catheters

IF 3.8 | Q2, ENGINEERING, BIOMEDICAL | IEEE Transactions on Medical Robotics and Bionics | Pub Date: 2024-03-31 | DOI: 10.1109/TMRB.2024.3407590

Elaheh Arefinia;Jayender Jagadeesan;Rajni V. Patel

Volume 6, Issue 3, pp. 1004-1016 | Journal Article | https://ieeexplore.ieee.org/document/10545309/

Citations: 0

Abstract


Machine-Learning-Based Multi-Modal Force Estimation for Steerable Ablation Catheters

Catheter-based cardiac ablation is a minimally invasive procedure for treating atrial fibrillation (AF). Electrophysiologists perform the procedure under image guidance, and the contact force between the heart tissue and the catheter tip determines the quality of the lesions created. This paper describes a novel multi-modal contact force estimator based on Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs). The estimator takes the shape and the optical flow of the deflectable distal section as two modalities, since individual frames and the motion between frames complement each other in capturing long-range context in the catheter video. The angle between the tissue and the catheter tip is used to complement the extracted shape. Because the catheter motion is constrained to the imaging plane, the data acquisition platform measures a two-degrees-of-freedom contact force alongside the video data. For experimental purposes, the images are captured by a camera that simulates single-view fluoroscopy. In this sensor-free approach, features of the image and optical flow modalities are extracted through transfer learning. Long Short-Term Memory networks (LSTMs) with a memory fusion network (MFN) are implemented to account for time dependency and for hysteresis due to friction. The architecture integrates spatial and temporal networks. Late fusion by concatenation with LSTMs, transformer decoders, and Gated Recurrent Units (GRUs) is also implemented to verify the feasibility of the proposed network-based approach and its superiority over single-modality networks. The resulting mean absolute error, only 2.84% of the total magnitude, was obtained on data collected under more realistic conditions than in previous studies. The reduction in error is considerably greater than that achieved by the individual modalities or by late fusion with concatenation. These results underline the practicality and relevance of a multi-modal network in real-world scenarios.
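The late-fusion-by-concatenation baseline and the percentage error mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the feature sizes, the synthetic data, and the least-squares linear head (standing in for the recurrent networks) are all assumptions made for illustration. The abstract does not define how the 2.84% figure is normalized, so dividing by the peak absolute target force in `mae_percent` is likewise an assumption.

```python
import numpy as np

def late_fusion_concat(shape_feats, flow_feats):
    """Concatenate per-frame features from two modalities along the feature axis."""
    if shape_feats.shape[0] != flow_feats.shape[0]:
        raise ValueError("both modalities must cover the same number of frames")
    return np.concatenate([shape_feats, flow_feats], axis=1)

def mae_percent(pred, target):
    """Mean absolute error as a percentage of the peak absolute target value.

    The normalization by peak magnitude is an assumption, not taken from the paper.
    """
    mae = np.mean(np.abs(pred - target))
    peak = np.max(np.abs(target))
    return 100.0 * mae / peak

# Synthetic stand-ins for per-frame CNN features (hypothetical sizes).
rng = np.random.default_rng(0)
shape_feats = rng.normal(size=(50, 8))   # 50 frames, 8 shape features
flow_feats = rng.normal(size=(50, 4))    # 50 frames, 4 optical-flow features
fused = late_fusion_concat(shape_feats, flow_feats)  # shape (50, 12)

# A linear head fitted by least squares replaces the LSTM/GRU stage here,
# mapping fused features to a two-degrees-of-freedom force.
true_w = rng.normal(size=(12, 2))
force = fused @ true_w
w, *_ = np.linalg.lstsq(fused, force, rcond=None)
pred = fused @ w
```

In a real pipeline the fused sequence would feed a temporal model rather than a per-frame linear map; the sketch only shows where concatenation sits and how a relative error figure might be computed.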


Source journal

IEEE Transactions on Medical Robotics and Bionics

CiteScore

6.80

Self-citation rate

0.00%

Articles published