Mastoidectomy Multi-View Synthesis from a Single Microscopy Image

arXiv - CS - Graphics Pub Date : 2024-08-31 DOI:arxiv-2409.03190

Yike Zhang, Jack Noble

Citations: 0

Abstract

Cochlear Implant (CI) procedures involve performing an invasive mastoidectomy to insert an electrode array into the cochlea. In this paper, we introduce a novel pipeline that generates synthetic multi-view videos from a single CI microscope image. We use a patient's pre-operative CT scan to predict the post-mastoidectomy surface using a method designed for this purpose. We manually align the surface with a selected microscope frame to obtain an accurate initial pose of the reconstructed CT mesh relative to the microscope. We then perform UV projection to transfer the colors from the frame to surface textures. Novel views of the textured surface can be used to generate a large dataset of synthetic frames with ground-truth poses. We evaluated the quality of synthetic views rendered using PyTorch3D and PyVista. Both rendering engines produce similarly high-quality synthetic novel-view frames compared to ground truth, with a structural similarity index averaging about 0.86 for both. A large dataset of novel views with known poses is critical for ongoing training of a method to automatically estimate microscope pose for 2D-to-3D registration with the pre-operative CT to facilitate augmented reality surgery. This dataset will empower various downstream tasks, such as integrating Augmented Reality (AR) in the operating room (OR), tracking surgical tools, and supporting other video analysis studies.
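
The color-transfer and evaluation steps described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes a simple pinhole camera with intrinsics `K` and pose `(R, t)`, assigns one color per mesh vertex by nearest-pixel lookup, and ignores occlusion. The similarity score is a simplified single-window SSIM computed over the whole image, not the sliding-window variant that image libraries typically provide. All function and variable names are hypothetical.

```python
import numpy as np


def project_vertices(vertices, K, R, t):
    """Project Nx3 world-space vertices to pixel coordinates (assumed pinhole model).

    Returns (pixels, depth): Nx2 pixel coordinates (x, y) and camera-space depth.
    """
    cam = vertices @ R.T + t            # world -> camera coordinates
    uvw = cam @ K.T                     # camera -> homogeneous pixel coordinates
    return uvw[:, :2] / uvw[:, 2:3], cam[:, 2]


def sample_vertex_colors(frame, pixels):
    """Look up the nearest pixel color for each projected vertex.

    Vertices that project outside the frame receive NaN colors.
    Occlusion is not handled in this sketch.
    """
    h, w = frame.shape[:2]
    cols = np.rint(pixels[:, 0]).astype(int)
    rows = np.rint(pixels[:, 1]).astype(int)
    inside = (cols >= 0) & (cols < w) & (rows >= 0) & (rows < h)
    colors = np.full((len(pixels), frame.shape[2]), np.nan)
    colors[inside] = frame[rows[inside], cols[inside]]
    return colors


def global_ssim(a, b, data_range=1.0):
    """Single-window SSIM between two grayscale images of equal shape."""
    c1 = (0.01 * data_range) ** 2       # standard SSIM stabilizing constants
    c2 = (0.03 * data_range) ** 2
    mu_a, mu_b = a.mean(), b.mean()
    var_a, var_b = a.var(), b.var()
    cov = ((a - mu_a) * (b - mu_b)).mean()
    numerator = (2 * mu_a * mu_b + c1) * (2 * cov + c2)
    denominator = (mu_a ** 2 + mu_b ** 2 + c1) * (var_a + var_b + c2)
    return numerator / denominator


# Toy setup: identity rotation, camera 5 units from the mesh along z.
K = np.array([[100.0, 0.0, 32.0],
              [0.0, 100.0, 24.0],
              [0.0, 0.0, 1.0]])
R = np.eye(3)
t = np.array([0.0, 0.0, 5.0])
vertices = np.array([[0.0, 0.0, 0.0],
                     [1.0, 0.0, 0.0],
                     [10.0, 0.0, 0.0]])  # last vertex falls outside the frame

frame = np.zeros((48, 64, 3))            # stand-in for a microscope frame
frame[24, 32] = [1.0, 0.0, 0.0]
frame[24, 52] = [0.0, 1.0, 0.0]

pixels, depth = project_vertices(vertices, K, R, t)
colors = sample_vertex_colors(frame, pixels)
```

In a full pipeline, the per-vertex colors would be replaced by a proper texture map with a visibility test, and rendered novel views would be compared against held-out real frames using a windowed SSIM.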
