UGM: a more stable procedure for large-scale multiple testing problems, new solutions to identify oncogene.

Pub Date : 2019-12-23 DOI:10.1186/s12976-019-0117-1

Chengyou Liu, Leilei Zhou, Yuhe Wang, Shuchang Tian, Junlin Zhu, Hang Qin, Yong Ding, Hongbing Jiang

Volume 16, Issue 1, Page 20. Publication type: Journal Article.


Abstract

Variations in gene expression levels play an important role in tumors. Numerous methods exist to identify differentially expressed genes in high-throughput sequencing data, and several algorithms attempt to identify distinctive genetic patterns associated with susceptibility to particular diseases. Although these procedures have proved successful, the number of non-differentially expressed genes estimated under false discovery rate (FDR) control has a large standard deviation, and the misidentification rate (type I error) grows rapidly as the number of genes to be tested increases. In this study we developed a new method, Unit Gamma Measurement (UGM), which accounts for the distribution of the multiple-hypothesis test statistics and could reduce the dependency problem. Simulated expression profile data and breast cancer RNA-Seq data were used to verify the accuracy of UGM. The results show that the number of non-differentially expressed genes identified by UGM is very close to the actual number in the data, and that UGM also has a smaller standard error, range, interquartile range and RMS error. In addition, UGM can be used to screen many breast cancer-associated genes, such as BRCA1, BRCA2, PTEN and BRIP1, and it provides better accuracy, robustness and efficiency for identifying differentially expressed genes in high-throughput sequencing.
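To make the multiple-testing problem in the abstract concrete, the following minimal sketch shows two standard textbook pieces of background. It is not the UGM method, whose details the abstract does not give. The first function computes how the chance of at least one type I error grows with the number of tests, assuming independent tests at a fixed level. The second is the usual Benjamini-Hochberg step-up procedure for FDR control. The function names and the example p-values are hypothetical and chosen only for illustration.

```python
import numpy as np


def familywise_error(alpha: float, m: int) -> float:
    """Probability of at least one false positive among m independent
    true-null tests, each run at significance level alpha."""
    return 1.0 - (1.0 - alpha) ** m


def benjamini_hochberg(pvalues, q: float = 0.05) -> np.ndarray:
    """Boolean mask of hypotheses rejected by the Benjamini-Hochberg
    step-up procedure at target FDR level q (a baseline, not UGM)."""
    p = np.asarray(pvalues, dtype=float)
    m = p.size
    order = np.argsort(p)                      # indices of p-values, ascending
    thresholds = q * np.arange(1, m + 1) / m   # q * rank / m for each rank
    below = p[order] <= thresholds
    reject = np.zeros(m, dtype=bool)
    if below.any():
        k = np.nonzero(below)[0].max()         # largest rank meeting its threshold
        reject[order[: k + 1]] = True          # reject everything up to that rank
    return reject


if __name__ == "__main__":
    # Type I error inflation as the number of tested genes grows.
    for m in (1, 10, 100, 1000):
        print(m, round(familywise_error(0.05, m), 4))

    # Hypothetical p-values for four genes.
    print(benjamini_hochberg([0.6, 0.001, 0.04, 0.008], q=0.05))
```

With thousands of genes tested at once, the uncorrected error probability approaches 1, which is why FDR-style corrections are applied. The abstract's claim is that the resulting estimates remain unstable, which is the gap UGM is intended to address.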


