False and true positives in arthropod thermal adaptation candidate gene lists.

IF 1.3 | Zone 4, Biology | Q4 GENETICS & HEREDITY | Genetica | Pub Date: 2021-06-01 | Epub Date: 2021-05-07 | DOI: 10.1007/s10709-021-00122-w
Maike Herrmann, Lev Y Yampolsky
Citations: 6

Abstract

Genome-wide studies are prone to false positives due to inherently low priors and statistical power. One approach to ameliorate this problem is to seek validation of reported candidate genes across independent studies: genes with repeatedly discovered effects are less likely to be false positives. Inversely, genes reported only as many times as expected by chance alone, while possibly representing novel discoveries, are also more likely to be false positives. We show that, across over 30 genome-wide studies that reported Drosophila and Daphnia genes with possible roles in thermal adaptation, the combined lists of candidate genes and orthologous groups are rapidly approaching the total number of genes and orthologous groups in the respective genomes. This is consistent with the expectation of high frequency of false positives. The majority of these spurious candidates have been identified by one or a few studies, as expected by chance alone. In contrast, a noticeable minority of genes have been identified by numerous studies with the probabilities of such discoveries occurring by chance alone being exceedingly small. For this subset of genes, different studies are in agreement with each other despite differences in the ecological settings, genomic tools and methodology, and reporting thresholds. We provide a reference set of presumed true positives among Drosophila candidate genes and orthologous groups involved in response to changes in temperature, suitable for cross-validation purposes. Despite this approach being prone to false negatives, this list of presumed true positives includes several hundred genes, consistent with the "omnigenic" concept of genetic architecture of complex traits.
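
The chance-alone reasoning in the abstract can be illustrated with a small calculation. The sketch below is not the authors' method. It assumes, purely for illustration, that under the null each study draws its candidate list uniformly at random from the genome, so the number of studies reporting a given gene follows a Poisson-binomial distribution. The genome size, the number of studies, and the list sizes are hypothetical placeholders, not values from the paper.

```python
from typing import List


def chance_report_pmf(list_sizes: List[int], genome_size: int) -> List[float]:
    """PMF of how many studies report one given gene by chance alone.

    Assumption (hypothetical null model): study i picks list_sizes[i] genes
    uniformly at random, so it reports any given gene with probability
    list_sizes[i] / genome_size, independently of the other studies.
    Returns pmf where pmf[k] = P(exactly k studies report the gene).
    """
    pmf = [1.0]  # before any study is counted, the gene has 0 reports
    for size in list_sizes:
        p = size / genome_size
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1.0 - p)  # this study misses the gene
            nxt[k + 1] += mass * p      # this study reports the gene
        pmf = nxt
    return pmf


def tail_probability(pmf: List[float], k: int) -> float:
    """P(a gene is reported by at least k studies) under the null model."""
    return sum(pmf[k:])


# Hypothetical inputs: 30 studies, each listing 500 of 14,000 genes.
GENOME_SIZE = 14_000
LIST_SIZES = [500] * 30

pmf = chance_report_pmf(LIST_SIZES, GENOME_SIZE)

# Fraction of the genome expected on the combined list if every hit is random.
union_fraction = 1.0 - pmf[0]
print(f"Expected fraction of genome on the combined list: {union_fraction:.3f}")

# Expected number of genes reaching each report count by chance alone.
for k in (1, 3, 5, 8):
    expected_genes = GENOME_SIZE * tail_probability(pmf, k)
    print(f"Expected genes reported by >= {k} studies: {expected_genes:.2f}")
```

Under these assumed inputs, the combined list covers more than half of the genome even though every entry is random, while the expected number of genes reported by many studies falls off quickly. Genes observed far more often than this expectation are the ones that would be treated as presumed true positives.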

Source journal
Genetica (Biology: Genetics)
CiteScore: 2.70
Self-citation rate: 0.00%
Articles published: 32
Review time: >12 weeks
Journal description: Genetica publishes papers dealing with genetics, genomics, and evolution. Our journal covers novel advances in the fields of genomics, conservation genetics, genotype-phenotype interactions, evo-devo, population and quantitative genetics, and biodiversity. Genetica publishes original research articles addressing novel conceptual, experimental, and theoretical issues in these areas, whatever the taxon considered. Biomedical papers and papers on animal and plant breeding genetics are not within the scope of Genetica, unless framed in an evolutionary context. Recent advances in genetics, genomics and evolution are also covered in thematic issues and synthesis papers written by experts in the field.