Morphological simulation tests the limits on phenotype discovery in 3D image analysis
Rachel A Roston, Sophie M Whikehart, Sara M Rolfe, A. Murat Maga
bioRxiv - Zoology, published 2024-07-02. DOI: 10.1101/2024.06.30.601430 (https://doi.org/10.1101/2024.06.30.601430)
Citations: 0
Abstract
Over the past few decades, advances in 3D imaging have created new opportunities for reverse genetic screens. Rapidly growing datasets of 3D images of genetic knockouts require high-throughput, automated computational approaches for identifying and characterizing new phenotypes. However, the exploratory image analysis pipelines used to discover these phenotypes can be difficult to validate because, by their nature, the expected outcome is not known a priori. Introducing known morphological variation through simulation can help distinguish real phenotypic differences from random variation, elucidate the effects of sample size, and test the sensitivity and reproducibility of morphometric analyses. Here we present a novel approach for 3D morphological simulation that uses open-source, open-access tools available in 3D Slicer, SlicerMorph, and Advanced Normalization Tools in R (ANTsR). Although we focus on diffusible-iodine contrast-enhanced micro-CT (diceCT) images, this approach can be used on any volumetric image. In our approach to morphological simulation, we first generate a simulated deformation based on a reference image and then propagate this deformation to subjects using inverse transforms obtained from the registration of subjects to the reference. This produces a new dataset with a shifted population mean while retaining individual variability, because how much each sample deforms depends on how much it differs from the reference. We then use our simulated datasets to test whether tensor-based morphometry (TBM), a widely used technique that statistically compares local volume differences associated with local deformations, can recover our introduced differences; to test how effect size and sample size affect detectability; and to determine the reproducibility of our results.
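The local volume differences that TBM compares are commonly quantified as the Jacobian determinant of a deformation (often on a log scale). The abstract does not give this level of detail, so the following is only a minimal 2D sketch of that quantity, not the authors' ANTsR pipeline. The function names `jacobian_determinant_2d` and `make_mapping`, the unit grid, and the two toy mappings are all assumptions made for illustration.

```python
import math

def jacobian_determinant_2d(phi_x, phi_y, spacing=1.0):
    """Return det(J) of a 2D mapping at interior grid points (central differences).

    phi_x[i][j] and phi_y[i][j] are the mapped x and y coordinates of the
    grid point in row i (y direction) and column j (x direction).
    """
    rows, cols = len(phi_x), len(phi_x[0])
    two_h = 2.0 * spacing
    det = []
    for i in range(1, rows - 1):
        det_row = []
        for j in range(1, cols - 1):
            dxdx = (phi_x[i][j + 1] - phi_x[i][j - 1]) / two_h
            dxdy = (phi_x[i + 1][j] - phi_x[i - 1][j]) / two_h
            dydx = (phi_y[i][j + 1] - phi_y[i][j - 1]) / two_h
            dydy = (phi_y[i + 1][j] - phi_y[i - 1][j]) / two_h
            det_row.append(dxdx * dydy - dxdy * dydx)
        det.append(det_row)
    return det

def make_mapping(size, fx, fy):
    """Sample the mapping (x, y) -> (fx(x, y), fy(x, y)) on a size-by-size unit grid."""
    phi_x = [[fx(j, i) for j in range(size)] for i in range(size)]
    phi_y = [[fy(j, i) for j in range(size)] for i in range(size)]
    return phi_x, phi_y

# Hypothetical toy mapping 1: uniform 10% expansion, so area grows by 1.1**2.
expansion = make_mapping(5, lambda x, y: 1.1 * x, lambda x, y: 1.1 * y)
# Hypothetical toy mapping 2: shear, which changes shape but preserves area.
shear = make_mapping(5, lambda x, y: x + 0.5 * y, lambda x, y: y)

det_expansion = jacobian_determinant_2d(*expansion)
det_shear = jacobian_determinant_2d(*shear)

# A log-Jacobian above 0 indicates local expansion, below 0 local contraction.
log_jac = math.log(det_expansion[1][1])
```

The shear case is included because it leaves the determinant at 1: a purely volume-based measure of this kind registers local expansion or contraction, not every change in shape.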
Our results show that TBM recovered the introduced morphological differences, but detectability depended on the effect size, the sample size, and the region of interest (ROI) included in the analysis. Detectability of subtle phenotypes can be improved both by increasing the sample size and by limiting analyses to specific body regions. However, it is not always feasible to increase sample sizes in screens of essential genes. Therefore, methodical use of ROIs is a promising way to increase the power of TBM to detect subtle phenotypes. Generating known morphological variation through simulation has broad applicability in developmental, evolutionary, and biomedical morphometrics and is a useful way to distinguish between a failure to detect a morphological difference and a true lack of one. Morphological simulation can also be applied to AI-based supervised learning to augment datasets and overcome dataset limitations.
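The dependence of detectability on effect size, sample size, and ROI can be illustrated with a textbook power calculation. The sketch below is not the statistical procedure used in the study. It assumes a two-sided, two-sample z-test at each voxel with a known within-group standard deviation, and a Bonferroni correction over the number of voxels tested. The function name `detection_power` and all scenario values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()

def detection_power(effect_size, n_per_group, n_tests=1, alpha=0.05):
    """Power of a two-sided, two-sample z-test with Bonferroni correction.

    effect_size is the group difference in units of the (assumed known)
    within-group standard deviation; n_tests is the number of voxels tested.
    """
    # Bonferroni: split alpha across all tests, and across both tails.
    z_crit = _STD_NORMAL.inv_cdf(1.0 - alpha / (2.0 * n_tests))
    # Mean of the test statistic under the alternative hypothesis.
    shift = effect_size * sqrt(n_per_group / 2.0)
    # Probability that the statistic falls in either rejection tail.
    return _STD_NORMAL.cdf(shift - z_crit) + _STD_NORMAL.cdf(-shift - z_crit)

# Hypothetical scenarios: (effect size, samples per group, voxels tested).
scenarios = [
    (1.0, 10, 100_000),  # baseline: whole-image analysis
    (1.0, 10, 1_000),    # same data, analysis limited to a small ROI
    (1.0, 30, 100_000),  # larger sample size
    (2.0, 10, 100_000),  # larger effect size
]
for d, n, m in scenarios:
    print(f"d={d}, n={n}, voxels={m}: power={detection_power(d, n, m):.3f}")
```

Under these assumptions, restricting the analysis to fewer voxels loosens the corrected threshold, which raises power without adding samples. Real voxels are spatially correlated and Bonferroni is conservative, so the numbers indicate direction only.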