Comparison of methods for deriving phenotypes from incomplete observation data with an application to age at puberty in dairy cattle.

Impact factor: 6.5; Zone 1 (Agricultural and Forestry Sciences); Q1 (Agricultural and Biological Sciences); Journal of Animal Science and Biotechnology; Pub Date: 2023-09-09; DOI: 10.1186/s40104-023-00921-5

Melissa A Stephen, Chris R Burke, Jennie E Pryce, Nicole M Steele, Peter R Amer, Susanne Meier, Claire V C Phyn, Dorian J Garrick

Journal of Animal Science and Biotechnology 14(1):119. Publication type: Journal Article. Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10492402/pdf/

Citations: 0

Abstract

Background: Many phenotypes in animal breeding are derived from incomplete measures, especially if they are challenging or expensive to measure precisely. Examples include time-dependent traits such as reproductive status or lifespan. Incomplete measures for these traits result in phenotypes that are subject to left-, interval- and right-censoring, where phenotypes are only known to fall below an upper bound, between a lower and upper bound, or above a lower bound, respectively. Here we compare three methods for deriving phenotypes from incomplete data using age at first elevation (> 1 ng/mL) in blood plasma progesterone (AGEP4), which generally coincides with onset of puberty, as an example trait.

Methods: We produced AGEP4 phenotypes from three blood samples collected at about 30-day intervals from approximately 5,000 Holstein-Friesian or Holstein-Friesian × Jersey cross-bred dairy heifers managed in 54 seasonal-calving, pasture-based herds in New Zealand. We used these actual data to simulate 7 different visit scenarios, increasing the extent of censoring by disregarding data from one or two of the three visits. Three methods for deriving phenotypes from these data were explored: 1) ordinal categorical variables which were analysed using categorical threshold analysis; 2) continuous variables, with a penalty of 31 d assigned to right-censored phenotypes; and 3) continuous variables, sampled from within a lower and upper bound using a data augmentation approach.
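
The censoring logic and the three phenotype derivations described in the Methods can be sketched in Python as follows. This is an illustrative sketch only, not the authors' code. The function names, the example ages, the `min_age`/`max_age` limits, and the choice to use the age at the detecting visit as the continuous value for left- and interval-censored records are all assumptions; only the 1 ng/mL threshold and the 31 d penalty come from the abstract. The uniform draw in `sampled_phenotype` is a simplified placeholder: in an actual data augmentation scheme the draw would be made from the model's conditional distribution at each sampling iteration.

```python
import math
import random
from dataclasses import dataclass
from typing import Sequence

THRESHOLD_NG_ML = 1.0  # progesterone elevation threshold (from the abstract)
PENALTY_DAYS = 31.0    # penalty added to right-censored records (from the abstract)


@dataclass
class Bounds:
    lower: float   # AGEP4 lies above this age in days (-inf if left-censored)
    upper: float   # AGEP4 lies at or below this age in days (+inf if right-censored)
    category: int  # index of first visit with elevated progesterone; n_visits if none


def censoring_bounds(ages: Sequence[float], p4: Sequence[float]) -> Bounds:
    """Derive censoring bounds from visits ordered by age at sampling."""
    for k, (age, conc) in enumerate(zip(ages, p4)):
        if conc > THRESHOLD_NG_ML:
            # Elevated at the first visit: left-censored (no lower bound).
            # Elevated at a later visit: interval-censored.
            lower = ages[k - 1] if k > 0 else -math.inf
            return Bounds(lower, age, k)
    # Never elevated: right-censored above the age at the last visit.
    return Bounds(ages[-1], math.inf, len(ages))


def ordinal_phenotype(b: Bounds) -> int:
    """Method 1: ordinal category for a categorical threshold analysis."""
    return b.category


def penalty_phenotype(b: Bounds) -> float:
    """Method 2: continuous value; right-censored records get a 31 d penalty.

    Assumption: other records take the age at the visit where elevation
    was first observed.
    """
    if math.isinf(b.upper):
        return b.lower + PENALTY_DAYS
    return b.upper


def sampled_phenotype(b: Bounds, rng: random.Random,
                      min_age: float = 200.0, max_age: float = 600.0) -> float:
    """Method 3 (placeholder): one uniform draw between the bounds.

    min_age and max_age are hypothetical limits that close the open-ended
    bounds of left- and right-censored records.
    """
    lo = max(b.lower, min_age)
    hi = min(b.upper, max_age)
    return rng.uniform(lo, hi)


if __name__ == "__main__":
    ages = [330.0, 360.0, 390.0]   # hypothetical ages (days) at the three visits
    p4 = [0.4, 2.3, 3.0]           # hypothetical progesterone values (ng/mL)
    b = censoring_bounds(ages, p4)
    print(b, ordinal_phenotype(b), penalty_phenotype(b))
    print(sampled_phenotype(b, random.Random(1)))
```

A reduced visit scenario can be mimicked by passing only the retained visits, for example `censoring_bounds([330.0, 390.0], [0.4, 3.0])`, which widens the interval between the lower and upper bound.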

Results: Credibility intervals for heritability estimations overlapped across all methods and visit scenarios, but estimated heritabilities tended to be higher when left censoring was reduced. For sires with at least 5 daughters, the correlations between estimated breeding values (EBVs) from our three-visit scenario and each reduced data scenario varied by method, ranging from 0.65 to 0.95. The estimated breed effects also varied by method, but breed differences were smaller as phenotype censoring increased.
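
The sire-level comparison reported above could be computed along these lines. This is a minimal sketch, assuming a Pearson correlation (the abstract does not state which correlation was used) and hypothetical dictionaries of EBVs and daughter counts keyed by sire ID. The numbers in the usage lines are made up for illustration and are not results from the study.

```python
from math import sqrt
from typing import Dict, Sequence

MIN_DAUGHTERS = 5  # sires need at least this many daughters (from the abstract)


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def ebv_correlation(full: Dict[str, float],
                    reduced: Dict[str, float],
                    daughters: Dict[str, int]) -> float:
    """Correlate EBVs from two scenarios over sires with enough daughters."""
    sires = sorted(
        s for s in full
        if s in reduced and daughters.get(s, 0) >= MIN_DAUGHTERS
    )
    return pearson([full[s] for s in sires], [reduced[s] for s in sires])


if __name__ == "__main__":
    # Hypothetical EBVs (days) and daughter counts, for illustration only.
    full = {"A": 1.0, "B": 2.0, "C": 3.0, "D": 10.0}
    reduced = {"A": 2.0, "B": 4.0, "C": 6.0, "D": -5.0}
    daughters = {"A": 5, "B": 8, "C": 12, "D": 3}
    print(ebv_correlation(full, reduced, daughters))  # sire D is excluded
```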

Conclusion: Our results indicate that using some methods, phenotypes derived from one observation per offspring for a time-dependent trait such as AGEP4 may provide comparable sire rankings to three observations per offspring. This has implications for the design of large-scale phenotyping initiatives where animal breeders aim to estimate variance parameters and estimated breeding values (EBVs) for phenotypes that are challenging to measure or prohibitively expensive.

Source journal

Journal of Animal Science and Biotechnology (category: Agriculture, Dairy & Animal Science)

CiteScore: 9.90

Self-citation rate: 2.90%

Articles published: 822

Review time: 17 weeks

About the journal: Journal of Animal Science and Biotechnology is an open access, peer-reviewed journal that encompasses all aspects of animal science and biotechnology. This includes domestic animal production, animal genetics and breeding, animal reproduction and physiology, animal nutrition and biochemistry, feed processing technology and bioevaluation, animal biotechnology, and meat science.