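Performance Comparison of Genomic Best Linear Unbiased Prediction and Four Machine Learning Models for Estimating Genomic Breeding Values in Working Dogs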

IF 3.2 | Zone 2, Agricultural and Forestry Sciences | JCR Q1, AGRICULTURE, DAIRY & ANIMAL SCIENCE | Animals | Pub Date: 2025-02-02 | DOI: 10.3390/ani15030408
Joseph A Thorsrud, Katy M Evans, Kyle C Quigley, Krishnamoorthy Srikanth, Heather J Huson
Citations: 0

Abstract


Performance Comparison of Genomic Best Linear Unbiased Prediction and Four Machine Learning Models for Estimating Genomic Breeding Values in Working Dogs.

This study investigates the efficacy of five genomic prediction models, namely Genomic Best Linear Unbiased Prediction (GBLUP), Random Forest (RF), Support Vector Machine (SVM), Extreme Gradient Boosting (XGB), and Multilayer Perceptron (MLP), in predicting genomic breeding values (gEBVs). The phenotypic data include three binary health traits (anodontia, distichiasis, oral papillomatosis) and one behavioral trait (distraction) in a population of guide dogs. These traits impact the potential for success in guide dogs and are therefore routinely characterized; they were chosen here for their differences in heritability and case counts, specifically to assess gEBV model performance. Utilizing a dataset from The Seeing Eye organization, which includes German Shepherds (n = 482), Golden Retrievers (n = 239), Labrador Retrievers (n = 1188), and Labrador and Golden Retriever crosses (n = 111), we assessed model performance within and across different breeds, trait heritability, case counts, and SNP marker densities. We found no significant differences in model performance across varying heritabilities, case counts, or SNP densities; all models performed similarly. Because it requires no parameter optimization, GBLUP was the most efficient model. Distichiasis showed the highest overall predictive performance, likely due to its higher heritability, while anodontia and distraction exhibited moderate accuracy, and oral papillomatosis had the lowest accuracy, correlating with its low heritability. These findings underscore that lower-density SNP datasets can effectively construct gEBVs, suggesting that high-cost, high-density genotyping may not always be necessary. Additionally, the similar performance of all models indicates that simpler models like GBLUP, which requires less fine-tuning, may be sufficient for genomic prediction in canine breeding programs. The research highlights the importance of standardized phenotypic assessments and carefully constructed reference populations to optimize the utility of genomic selection in canine breeding programs.
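
For readers unfamiliar with GBLUP, the following is a minimal sketch of the general technique, not the authors' pipeline. It assumes a VanRaden-style genomic relationship matrix, a continuous phenotype, and a known heritability h2; the abstract does not state how the binary traits were modelled or how variance components were estimated. All data are simulated, and the function names `vanraden_g` and `gblup` are hypothetical.

```python
import numpy as np

def vanraden_g(M):
    """Genomic relationship matrix from 0/1/2 genotype codes (animals x SNPs)."""
    p = M.mean(axis=0) / 2.0                 # allele frequency of each SNP
    Z = M - 2.0 * p                          # centre every SNP column
    return Z @ Z.T / (2.0 * np.sum(p * (1.0 - p)))

def gblup(G, y, train, h2):
    """Genomic values for all animals, using phenotypes of the `train` animals only."""
    lam = (1.0 - h2) / h2                    # assumed variance ratio sigma_e^2 / sigma_g^2
    mu = y[train].mean()                     # simple fixed effect: overall mean
    G_tt = G[np.ix_(train, train)]
    alpha = np.linalg.solve(G_tt + lam * np.eye(len(train)), y[train] - mu)
    return G[:, train] @ alpha               # one predicted value per animal

# Simulated data (assumption: purely illustrative, unrelated to the study's dogs).
rng = np.random.default_rng(0)
n_animals, n_snps, h2 = 200, 500, 0.5
freq = rng.uniform(0.1, 0.9, size=n_snps)
M = rng.binomial(2, freq, size=(n_animals, n_snps)).astype(float)
g_true = (M - 2.0 * freq) @ rng.normal(size=n_snps)
noise_sd = np.sqrt(g_true.var() * (1.0 - h2) / h2)
y = g_true + rng.normal(scale=noise_sd, size=n_animals)

train = np.arange(150)                       # reference population
test = np.arange(150, n_animals)             # validation animals
G = vanraden_g(M)
gebv = gblup(G, y, train, h2)
accuracy = np.corrcoef(gebv[test], y[test])[0, 1]
print(f"correlation between predicted values and phenotypes: {accuracy:.2f}")
```

The only quantity to set here is the variance ratio, which follows from the heritability; this is one way to see why the abstract describes GBLUP as needing no parameter optimization compared with the machine learning models.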

Source journal: Animals (Agricultural and Biological Sciences: Animal Science and Zoology)
CiteScore: 4.90
Self-citation rate: 16.70%
Articles published: 3015
Review time: 20.52 days
About the journal: Animals (ISSN 2076-2615) is an international and interdisciplinary scholarly open access journal. It publishes original research articles, reviews, communications, and short notes that are relevant to any field of study that involves animals, including zoology, ethnozoology, animal science, animal ethics and animal welfare. However, preference will be given to those articles that provide an understanding of animals within a larger context (i.e., the animals' interactions with the outside world, including humans). There is no restriction on the length of the papers. Our aim is to encourage scientists to publish their experimental and theoretical research in as much detail as possible. Full experimental details and/or method of study must be provided for research articles. Articles submitted that involve subjecting animals to unnecessary pain or suffering will not be accepted, and all articles must be submitted with the necessary ethical approval (please refer to the Ethical Guidelines for more information).