A computationally efficient algorithm to leverage average information REML for (co)variance component estimation in the genomic era

Genetics Selection Evolution (IF 3.6; Zone 1, Agricultural and Forestry Sciences; Q1, AGRICULTURE, DAIRY & ANIMAL SCIENCE). Pub Date: 2024-11-21. DOI: 10.1186/s12711-024-00939-x
Ismo Strandén, Esa A. Mäntysaari, Martin H. Lidauer, Robin Thompson, Hongding Gao
Citations: 0

Abstract

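Methods for estimating variance components (VC) using restricted maximum likelihood (REML) typically require elements from the inverse of the coefficient matrix of the mixed model equations (MME). As genomic information becomes more prevalent, the coefficient matrix of the MME becomes denser, which makes large datasets challenging to analyze. Computational algorithms based on iterative solving and Monte Carlo approximation of the inverse of the coefficient matrix are therefore appealing. The standard average information REML (AI-REML) is known for its rapid convergence, but its computational intensity imposes limitations. In particular, the standard AI-REML requires solving the MME for each VC, which can be computationally demanding, especially for complex models with many VC.

To bridge this gap, here we (1) present a computationally efficient and tractable algorithm, named the augmented AI-REML, which facilitates the AI-REML by solving an augmented MME only once within each REML iteration; and (2) implement this approach for VC estimation in a general framework of a multi-trait GBLUP model. VC estimation was investigated as a function of the number of VC in the model, using two-trait, three-trait, four-trait, and five-trait GBLUP models. We compared the augmented AI-REML with the standard AI-REML in terms of computing time per REML iteration. Direct and iterative solving methods were used to assess the advantages of the augmented AI-REML.

When using the direct solving method, the augmented AI-REML and the standard AI-REML required similar computing times for models with a small number of VC (the two- and three-trait GBLUP models), while the augmented AI-REML showed more notable reductions in computing time as the number of VC in the model increased. When using the iterative solving method, the augmented AI-REML was substantially more computationally efficient than the standard AI-REML: the elapsed time of each REML iteration was reduced by 75%, 84%, and 86% for the two-, three-, and four-trait GBLUP models, respectively.

The augmented AI-REML can considerably reduce the computing time within each REML iteration, particularly when using an iterative solver. Our results demonstrate the potential of the augmented AI-REML as an appealing approach for large-scale VC estimation in the genomic era.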
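To make the cost of the standard AI-REML concrete, the following is a minimal dense sketch of how the average information matrix can be formed from MME solves, with one solve per VC. It is not the augmented algorithm from the paper and not the authors' implementation. It assumes a single-trait model y = Xb + Zu + e with var(u) = G·sigma_u2 and var(e) = I·sigma_e2, a full-column-rank X, and a positive definite G; the function names `project` and `average_information` are hypothetical.

```python
import numpy as np

def project(C, W, sigma_e2, rhs):
    """Return P @ rhs by solving the MME with rhs used as the data vector.

    Relies on the identity P r = R^{-1} (r - W s), where s solves
    C s = W' R^{-1} r and R = sigma_e2 * I (assumed residual structure).
    """
    sol = np.linalg.solve(C, W.T @ rhs / sigma_e2)
    return (rhs - W @ sol) / sigma_e2

def average_information(y, X, Z, G, sigma_u2, sigma_e2):
    """Toy AI matrix for the two VC (sigma_u2, sigma_e2) of the assumed model."""
    W = np.hstack([X, Z])
    p = X.shape[1]
    # MME coefficient matrix: W'R^{-1}W plus G^{-1}/sigma_u2 on the random block.
    C = W.T @ W / sigma_e2
    C[p:, p:] += np.linalg.inv(G) / sigma_u2
    Py = project(C, W, sigma_e2, y)
    # Working variates f_i = (dV/d theta_i) P y for each VC.
    f = [Z @ (G @ (Z.T @ Py)), Py]
    # Standard AI-REML: one additional MME solve per VC.
    Pf = [project(C, W, sigma_e2, fi) for fi in f]
    # AI_ij = 0.5 * f_i' P f_j
    AI = 0.5 * np.array([[fi @ Pfj for Pfj in Pf] for fi in f])
    return AI, Py
```

In this toy setting the list comprehension over `f` is where the per-VC cost appears: each working variate needs its own solve against the same coefficient matrix, so the number of solves grows with the number of VC. With a dense direct solver the factorization could be reused across right-hand sides; `np.linalg.solve` is called repeatedly here only to keep the sketch short.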
Source journal: Genetics Selection Evolution
Category: Biology, Dairy and Animal Science
CiteScore: 6.50
Self-citation rate: 9.80%
Articles published: 74
Review time: 1 month
About the journal: Genetics Selection Evolution invites basic, applied and methodological content that will aid the current understanding and the utilization of genetic variability in domestic animal species. Although the focus is on domestic animal species, research on other species is invited if it contributes to the understanding of the use of genetic variability in domestic animals. Genetics Selection Evolution publishes results from all levels of study, from the gene to the quantitative trait, from the individual to the population, the breed or the species. Contributions concerning both the biological approach, from molecular genetics to quantitative genetics, as well as the mathematical approach, from population genetics to statistics, are welcome. Specific areas of interest include but are not limited to: gene and QTL identification, mapping and characterization, analysis of new phenotypes, high-throughput SNP data analysis, functional genomics, cytogenetics, genetic diversity of populations and breeds, genetic evaluation, applied and experimental selection, genomic selection, selection efficiency, and statistical methodology for the genetic analysis of phenotypes with quantitative and mixed inheritance.
Latest articles from the journal
A computationally efficient algorithm to leverage average information REML for (co)variance component estimation in the genomic era
On the ability of the LR method to detect bias when there is pedigree misspecification and lack of connectedness
Empirical versus estimated accuracy of imputation: optimising filtering thresholds for sequence imputation
The effect of phenotyping, adult selection, and mating strategies on genetic gain and rate of inbreeding in black soldier fly breeding programs
Investigating genotype by environment interaction for beef cattle fertility traits in commercial herds in northern Australia with multi-trait analysis