Mastoidectomy Multi-View Synthesis from a Single Microscopy Image
Yike Zhang, Jack Noble
arXiv - CS - Graphics, 2024-08-31, arXiv:2409.03190
Cochlear Implant (CI) procedures involve performing an invasive mastoidectomy
to insert an electrode array into the cochlea. In this paper, we introduce a
novel pipeline that generates synthetic multi-view videos from a
single CI microscope image. In our approach, we use a patient's pre-operative
CT scan to predict the post-mastoidectomy surface using a method designed for
this purpose. We manually align the surface with a selected microscope frame to
obtain an accurate initial pose of the reconstructed CT mesh relative to the
microscope. We then perform UV projection to transfer the colors from the frame
to surface textures. Novel views of the textured surface can be used to
generate a large dataset of synthetic frames with ground truth poses. We
evaluated the quality of synthetic views rendered using PyTorch3D and PyVista.
Both rendering engines produced similarly high-quality synthetic novel-view
frames when compared with ground truth, with a structural similarity index
(SSIM) averaging about 0.86 for both. A large dataset of novel views with
known poses is critical for the ongoing training of a method that automatically
estimates microscope pose for 2D-to-3D registration with the pre-operative CT to
facilitate augmented reality (AR) surgery. This dataset will support various
downstream tasks, such as integrating AR in the operating room (OR),
tracking surgical tools, and supporting other video analysis studies.
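
As a rough illustration of the color-transfer step described above, the sketch below projects mesh vertices into a microscope frame with a pinhole camera and copies the nearest pixel color to each vertex. It is a simplification and not the authors' implementation: the abstract describes UV projection into surface textures, whereas this example assigns per-vertex colors and ignores occlusion. The intrinsics (`fx`, `fy`, `cx`, `cy`), the pose (`R`, `t`), the toy image, and the function names are all assumptions made for the example.

```python
from typing import List, Optional, Sequence, Tuple

Vec3 = Tuple[float, float, float]
Color = Tuple[int, int, int]
Matrix3 = Sequence[Sequence[float]]


def project_point(
    p_world: Vec3, R: Matrix3, t: Vec3,
    fx: float, fy: float, cx: float, cy: float,
) -> Optional[Tuple[float, float]]:
    """Map a world-space point to pixel coordinates (u, v) with a pinhole camera.

    Returns None when the point lies on or behind the camera plane.
    """
    # Camera-space coordinates: p_cam = R @ p_world + t
    x, y, z = (
        sum(R[i][j] * p_world[j] for j in range(3)) + t[i] for i in range(3)
    )
    if z <= 0.0:
        return None
    # Perspective divide followed by the intrinsic scaling and offset.
    return (fx * x / z + cx, fy * y / z + cy)


def sample_vertex_colors(
    vertices: Sequence[Vec3], image: Sequence[Sequence[Color]],
    R: Matrix3, t: Vec3,
    fx: float, fy: float, cx: float, cy: float,
) -> List[Optional[Color]]:
    """Give each vertex the nearest pixel color, or None if it is not visible.

    Assumption: no occlusion test is performed, so hidden vertices that
    project inside the frame still receive a color.
    """
    height, width = len(image), len(image[0])
    colors: List[Optional[Color]] = []
    for vertex in vertices:
        uv = project_point(vertex, R, t, fx, fy, cx, cy)
        if uv is None:
            colors.append(None)
            continue
        col, row = int(round(uv[0])), int(round(uv[1]))
        inside = 0 <= row < height and 0 <= col < width
        colors.append(image[row][col] if inside else None)
    return colors


# Toy 5x5 "frame" whose pixel at (row, col) has color (10*row, 10*col, 0).
image = [[(10 * r, 10 * c, 0) for c in range(5)] for r in range(5)]

# Hypothetical pose (identity rotation, zero translation) and intrinsics.
R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
t = (0.0, 0.0, 0.0)

vertices = [
    (0.0, 0.0, 1.0),   # projects to the image center
    (1.0, 0.0, 2.0),   # projects one pixel to the right of center
    (0.0, 0.0, -1.0),  # behind the camera
    (5.0, 0.0, 1.0),   # in front of the camera but outside the frame
]

colors = sample_vertex_colors(vertices, image, R, t, fx=2.0, fy=2.0, cx=2.0, cy=2.0)
print(colors)
```

Once the mesh carries colors from the aligned frame, rendering it under perturbed values of `R` and `t` would yield synthetic views whose poses are known by construction, which is the property the dataset relies on.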