Title: A Multiscale Spatial Transformer U-Net for Simultaneously Automatic Reorientation and Segmentation of 3-D Nuclear Cardiac Images
Authors: Yangfan Ni; Duo Zhang; Gege Ma; Fan Rao; Yuanfeng Wu; Lijun Lu; Zhongke Huang; Wentao Zhu
Journal: IEEE Transactions on Radiation and Plasma Medical Sciences (impact factor 4.6; JCR Q1, Radiology, Nuclear Medicine & Medical Imaging)
DOI: 10.1109/TRPMS.2024.3382318
Published: 2024-04-01
Publication type: Journal Article
URL: https://ieeexplore.ieee.org/document/10488032/
Citations: 0
Abstract
Accurate reorientation and segmentation of the left ventricle (LV) are essential for the quantitative analysis of myocardial perfusion imaging (MPI). This study proposes an end-to-end model, named the multiscale spatial transformer UNet (MS-ST-UNet), which combines a multiscale spatial transformer network (MSSTN) module and a multiscale UNet (MSUNet) module to perform simultaneous reorientation and segmentation of the LV region from nuclear cardiac images. The multiscale sampler produces images at varying resolutions, while scale transformer (ST) blocks align the scales of features. The proposed method is trained and tested on two nuclear cardiac image modalities: $^{13}\text{N}$-ammonia positron emission tomography (PET) and $^{99m}$Tc-sestamibi single-photon emission computed tomography (SPECT). MS-ST-UNet attains Dice similarity coefficient (DSC) scores of 91.48% and 94.81% for PET LV myocardium (LV-MY) and SPECT LV-MY, respectively. Additionally, the mean-square error (MSE) between the predicted rigid registration parameters and the ground truth falls below $1.4 \times 10^{-2}$. The experimental findings indicate that MS-ST-UNet yields notably lower registration errors and more precise boundary detection for the LV structure than existing methods. The joint learning framework promotes mutual enhancement between the reorientation and segmentation tasks, leading to cutting-edge performance and an efficient image processing workflow.
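The two metrics reported in the abstract can be illustrated with a minimal Python sketch. It is not the authors' evaluation code: the function names `dice_coefficient` and `mean_square_error`, the flattened-mask representation, the six-element parameter vector, and all numeric values below are assumptions chosen for illustration. The abstract does not state how the rigid registration parameters are laid out or normalised.

```python
from typing import Sequence


def dice_coefficient(pred: Sequence[int], truth: Sequence[int]) -> float:
    """Dice similarity coefficient between two flattened binary masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must contain the same number of voxels")
    # Voxels labelled foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    # Total foreground voxels across the two masks.
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Convention assumed here: two empty masks agree perfectly.
    return 1.0 if total == 0 else 2.0 * intersection / total


def mean_square_error(pred: Sequence[float], truth: Sequence[float]) -> float:
    """Mean-square error between predicted and reference parameter vectors."""
    if len(pred) != len(truth) or len(pred) == 0:
        raise ValueError("parameter vectors must be non-empty and equal in length")
    return sum((p - t) ** 2 for p, t in zip(pred, truth)) / len(pred)


# Toy values for illustration only; these are not data from the paper.
pred_mask = [0, 1, 1, 1, 0, 0]
true_mask = [0, 1, 1, 0, 0, 1]
dsc = dice_coefficient(pred_mask, true_mask)

# Hypothetical six-parameter rigid transform (three rotations, three translations).
pred_params = [0.10, -0.20, 0.05, 1.0, 2.0, -1.0]
true_params = [0.00, -0.20, 0.05, 1.0, 2.0, -1.0]
mse = mean_square_error(pred_params, true_params)

print(f"DSC = {dsc:.4f}, MSE = {mse:.6f}")
```

In practice the masks would be 3-D volumes flattened to one dimension, and DSC is usually reported as a percentage, as in the abstract.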