Symmetric directional false discovery rate control

Statistical Methodology (Mathematics). Pub Date: 2016-12-01. DOI: 10.1016/j.stamet.2016.08.002

Sarah E. Holte, Eva K. Lee, Yajun Mei

Statistical Methodology, Volume 33, Pages 71-82. Journal Article. https://www.sciencedirect.com/science/article/pii/S1572312716300247

Citations: 1

Abstract


This research is motivated by the analysis of real gene expression data, which aims to identify a subset of “interesting” or “significant” genes for further study. When we blindly applied standard false discovery rate (FDR) methods, our biology collaborators were suspicious or confused, because the selected list of significant genes was highly unbalanced: there were ten times more under-expressed genes than over-expressed genes. Their concerns led us to realize that the observed two-sample $t$-statistics were highly skewed and asymmetric, and thus that the standard FDR methods might be inappropriate. To tackle this case, we propose a symmetric directional FDR control method that categorizes the genes as “over-expressed” or “under-expressed”, pairs “over-expressed” genes with “under-expressed” genes, defines $p$-values for the gene pairs via column permutations, and then applies the standard FDR method to select “significant” gene pairs instead of “significant” individual genes. We compare the proposed symmetric directional FDR method with the standard FDR method on simulated data and several well-known real data sets.
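The four steps named in the abstract (categorize, pair, permutation $p$-values, FDR selection) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' procedure: the abstract does not give the pairing rule or the pair statistic, so the rank-based pairing in `pair_genes`, the "weaker effect" statistic in `pair_p_values`, the use of Welch $t$-statistics, and the choice of the Benjamini-Hochberg step-up rule as "the standard FDR method" are all assumptions made here. The function names and the simulated data are hypothetical.

```python
import numpy as np


def t_statistics(x, labels):
    """Welch two-sample t-statistic per gene (rows = genes, columns = samples)."""
    a, b = x[:, labels == 1], x[:, labels == 0]
    se = np.sqrt(a.var(axis=1, ddof=1) / a.shape[1] + b.var(axis=1, ddof=1) / b.shape[1])
    return (a.mean(axis=1) - b.mean(axis=1)) / se


def pair_genes(t):
    """ASSUMED pairing rule: k-th most over-expressed with k-th most under-expressed."""
    over = np.flatnonzero(t > 0)
    under = np.flatnonzero(t < 0)
    over = over[np.argsort(-t[over])]    # largest positive t first
    under = under[np.argsort(t[under])]  # most negative t first
    k = min(len(over), len(under))
    return over[:k], under[:k]


def pair_p_values(x, labels, over, under, n_perm=200, seed=0):
    """Permutation p-value per gene pair, permuting the column (sample) labels."""
    rng = np.random.default_rng(seed)
    t = t_statistics(x, labels)
    # ASSUMED pair statistic: the weaker of the two effects in the pair.
    observed = np.minimum(t[over], -t[under])
    exceed = np.zeros(len(over))
    for _ in range(n_perm):
        t_perm = t_statistics(x, rng.permutation(labels))
        exceed += np.minimum(t_perm[over], -t_perm[under]) >= observed
    return (1 + exceed) / (1 + n_perm)


def benjamini_hochberg(p, alpha=0.05):
    """Benjamini-Hochberg step-up rule; returns a boolean rejection mask."""
    p = np.asarray(p, dtype=float)
    m = len(p)
    order = np.argsort(p)
    passed = p[order] <= alpha * np.arange(1, m + 1) / m
    reject = np.zeros(m, dtype=bool)
    if passed.any():
        cutoff = np.max(np.flatnonzero(passed))
        reject[order[: cutoff + 1]] = True
    return reject


# Hypothetical simulated data: 50 genes, 6 treated and 6 control samples.
rng = np.random.default_rng(1)
labels = np.array([1] * 6 + [0] * 6)
x = rng.normal(size=(50, 12))
x[:5, labels == 1] += 3.0    # five truly over-expressed genes
x[5:10, labels == 1] -= 3.0  # five truly under-expressed genes

over, under = pair_genes(t_statistics(x, labels))
p = pair_p_values(x, labels, over, under)
selected = benjamini_hochberg(p, alpha=0.05)
print(f"{selected.sum()} of {len(p)} gene pairs selected")
```

Because every selected unit is a pair containing one gene of each sign, the resulting list is balanced between the two directions by construction, which is the property the abstract motivates. Note that this sketch fixes the pairs from the observed data before permuting, a simplification that the paper may handle differently.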


Source journal

Statistical Methodology (STATISTICS & PROBABILITY)

CiteScore: 0.59

Self-citation rate: 0.00%

Journal description: Statistical Methodology aims to publish articles of high quality reflecting the varied facets of contemporary statistical theory as well as of significant applications. In addition to helping to stimulate research, the journal intends to bring about interactions among statisticians and scientists in other disciplines broadly interested in statistical methodology. The journal focuses on traditional areas such as statistical inference, multivariate analysis, design of experiments, sampling theory, regression analysis, re-sampling methods, time series, nonparametric statistics, etc., and also gives special emphasis to established as well as emerging applied areas.