Enhanced adaptive permutation test with negative binomial distribution in genome-wide omics datasets.

IF 1.7, Zone 4 (Biology), Q4 BIOCHEMISTRY & MOLECULAR BIOLOGY, Genes & genomics. Pub Date: 2025-01-01. Epub Date: 2024-11-06. DOI: 10.1007/s13258-024-01584-w

Iksoo Huh, Taesung Park

Genes & genomics, pages 59-70. Journal Article.

Citations: 0

Abstract

Background: The permutation test has been widely used to provide the p-values of statistical tests when the standard test statistics do not follow parametric null distributions. However, the permutation test may require huge numbers of iterations, especially when the detection of very small p-values is required for multiple testing adjustments in the analysis of datasets with a large number of features.

Objective: To overcome this computational burden, we propose a novel enhanced adaptive permutation test that estimates p-values using the negative binomial (NB) distribution. With this method, the number of permutations is determined separately for each feature according to its potential significance.

Methods: The permutation procedure for a feature stops once the test statistics from the permuted datasets have exceeded the observed statistic from the original dataset a predefined number of times. We showed that this procedure reduced the number of permutations, especially when there were many non-significant features. For significant features, we further reduced the number of permutations by applying Stouffer's method after splitting the datasets.
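The stopping rule described in the Methods can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the statistic (`abs_mean_diff`), the threshold `r`, the cap `max_perm`, and the estimator `r / n_perm` (the simple ratio under negative binomial sampling) are all choices made here for illustration. The abstract does not give the exact estimator, the confidence interval, or how the datasets are split before Stouffer's method is applied, so `stouffer_combine` shows only the standard unweighted combination of p-values.

```python
import math
import random
from statistics import NormalDist


def abs_mean_diff(values, labels):
    """Absolute difference in group means; labels are 0/1 (illustrative statistic)."""
    g0 = [v for v, g in zip(values, labels) if g == 0]
    g1 = [v for v, g in zip(values, labels) if g == 1]
    return abs(sum(g1) / len(g1) - sum(g0) / len(g0))


def adaptive_permutation_pvalue(values, labels, stat=abs_mean_diff,
                                r=10, max_perm=100_000, seed=0):
    """Permute labels until r permuted statistics reach the observed one.

    r and max_perm are hypothetical defaults, not values from the paper.
    Returns (p_hat, n_perm, hits).
    """
    rng = random.Random(seed)
    observed = stat(values, labels)
    shuffled = list(labels)
    hits = 0
    n_perm = 0
    while hits < r and n_perm < max_perm:
        rng.shuffle(shuffled)
        n_perm += 1
        if stat(values, shuffled) >= observed:
            hits += 1
    if hits == r:
        # Stopped by the rule: the number of permutations is the random
        # quantity (negative binomial sampling), so use the ratio r / n_perm.
        p_hat = r / n_perm
    else:
        # Cap reached first: fall back to the usual fixed-n estimate.
        p_hat = (hits + 1) / (n_perm + 1)
    return p_hat, n_perm, hits


def stouffer_combine(p_values):
    """Unweighted Stouffer combination of one-sided p-values in (0, 1)."""
    nd = NormalDist()
    z_scores = [-nd.inv_cdf(p) for p in p_values]
    combined_z = sum(z_scores) / math.sqrt(len(z_scores))
    return nd.cdf(-combined_z)
```

A feature with no signal reaches `r` exceedances after only a handful of permutations, while a strongly associated feature keeps running until the cap. That is the source of the savings when most features are non-significant.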

Results: In the simulation study, the enhanced adaptive permutation test dramatically reduced the number of permutations compared with the ordinary permutation test, while keeping the precision of the permutation p-value within a small range. In the real data analysis, we applied the enhanced adaptive permutation test to a genome-wide single nucleotide polymorphism (SNP) dataset of 327,872 features.
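To give a sense of the burden that an ordinary permutation test would face on a dataset of this size, the short calculation below works out the minimum number of permutations per feature. It assumes a Bonferroni correction at a family-wise level of 0.05 and the common fixed-n estimate (b + 1) / (n + 1). Both are assumptions made here; the abstract does not state which multiple-testing adjustment was used.

```python
import math
from fractions import Fraction


def min_permutations(n_features, alpha=Fraction(1, 20)):
    """Smallest n for which 1 / (n + 1) can reach the Bonferroni threshold.

    alpha = 1/20 (0.05) is an assumed family-wise level. Exact fractions
    avoid floating-point rounding at the boundary.
    """
    threshold = Fraction(alpha) / n_features
    return math.ceil(1 / threshold) - 1


n_min = min_permutations(327_872)
print(n_min)  # 6557439 permutations per feature, at minimum
```

Under these assumptions, every one of the 327,872 features would need more than 6.5 million permutations just for its smallest attainable p-value to reach the threshold.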

Conclusion: The analysis with the enhanced adaptive permutation test ran in a feasible time for genome-wide omics datasets and successfully identified features with highly significant p-values and reasonable confidence intervals.

Source journal

Genes & genomics (Biology: Biochemistry & Molecular Biology)

CiteScore: 3.70

Self-citation rate: 4.80%

Articles published: 131

Review time: 6-12 weeks

About the journal: Genes & Genomics is an official journal of the Korean Genetics Society (http://kgenetics.or.kr/). Although it is an official publication of the Genetics Society of Korea, membership of the Society is not required for contributors. It is a peer-reviewed international journal published in print (ISSN 1976-9571) and online (E-ISSN 2092-9293). It covers all disciplines of genetics and genomics, from prokaryotes to eukaryotes and from fundamental heredity to molecular aspects. Articles can be reviews, research articles, or short communications.