
Emerging Themes in Epidemiology: latest articles

Cannons and sparrows: an exact maximum likelihood non-parametric test for meta-analysis of k 2 × 2 tables.
IF 2.3 Q1 PUBLIC, ENVIRONMENTAL & OCCUPATIONAL HEALTH Pub Date : 2018-06-26 DOI: 10.1186/s12982-018-0077-7
Lawrence M Paul

Background: The use of meta-analysis to aggregate multiple studies has increased dramatically over the last 30 years. For meta-analysis of homogeneous data where the effect sizes for the studies contributing to the meta-analysis differ only by statistical error, the Mantel-Haenszel technique has typically been utilized. If homogeneity cannot be assumed or established, the most popular technique is the inverse-variance DerSimonian-Laird technique. However, both of these techniques are based on large sample, asymptotic assumptions and are, at best, an approximation especially when the number of cases observed in any cell of the corresponding contingency tables is small.

Results: This paper develops an exact, non-parametric test based on a maximum likelihood test statistic as an alternative to the asymptotic techniques. Further, the test can be used across a wide range of heterogeneity. Monte Carlo simulations show that, for the homogeneous case, the ML-NP-EXACT technique is generally more powerful than the DerSimonian-Laird inverse-variance technique for realistic, smaller values of disease probability, and across a large range of odds ratios, numbers of contributing studies, and sample sizes. Possibly most important, for large values of heterogeneity, the pre-specified level of Type I Error is much better maintained by the ML-NP-EXACT technique than by the DerSimonian-Laird technique. A fully tested implementation in the R statistical language is freely available from the author.

Conclusions: This research has developed an exact test for the meta-analysis of dichotomous data. The ML-NP-EXACT technique was strongly superior to the DerSimonian-Laird technique in maintaining a pre-specified level of Type I Error. As shown, the DerSimonian-Laird technique demonstrated many large violations of this level. Given the various biases towards finding statistical significance prevalent in epidemiology today, a strong focus on maintaining a pre-specified level of Type I Error would seem critical.
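
As a point of reference for the comparison above, the following is a minimal Python sketch of the asymptotic DerSimonian-Laird inverse-variance estimator applied to log odds ratios from k 2 × 2 tables. It is not the ML-NP-EXACT test, whose R implementation is available from the author. The function name `dersimonian_laird`, the cell ordering `(a, b, c, d)`, and the 0.5 continuity correction for tables containing a zero are assumptions made for illustration.

```python
import math


def dersimonian_laird(tables):
    """Pool log odds ratios from 2x2 tables with the DerSimonian-Laird estimator.

    Each table is (a, b, c, d): exposed cases, exposed non-cases,
    unexposed cases, unexposed non-cases (ordering is an assumption).
    Returns (pooled log OR, standard error, tau^2).
    """
    y, v = [], []
    for a, b, c, d in tables:
        # Assumption of this sketch: add 0.5 to every cell of a table
        # that contains a zero, so the log odds ratio is defined.
        if 0 in (a, b, c, d):
            a, b, c, d = a + 0.5, b + 0.5, c + 0.5, d + 0.5
        y.append(math.log((a * d) / (b * c)))
        v.append(1 / a + 1 / b + 1 / c + 1 / d)

    # Fixed-effect (inverse-variance) estimate and Cochran's Q.
    w = [1 / vi for vi in v]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))

    # Method-of-moments between-study variance, truncated at zero.
    k = len(tables)
    c_term = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c_term) if c_term > 0 else 0.0

    # Random-effects weights and pooled estimate.
    w_star = [1 / (vi + tau2) for vi in v]
    pooled = sum(wi * yi for wi, yi in zip(w_star, y)) / sum(w_star)
    se = math.sqrt(1 / sum(w_star))
    return pooled, se, tau2
```

The standard error here rests on a large-sample normal approximation, which is exactly what the abstract cautions against when cell counts are small.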

Citations: 4
The contributions and future direction of Program Science in HIV/STI prevention.
IF 2.3 Q1 PUBLIC, ENVIRONMENTAL & OCCUPATIONAL HEALTH Pub Date : 2018-05-28 eCollection Date: 2018-01-01 DOI: 10.1186/s12982-018-0076-8
Marissa Becker, Sharmistha Mishra, Sevgi Aral, Parinita Bhattacharjee, Rob Lorway, Kalada Green, John Anthony, Shajy Isac, Faran Emmanuel, Helgar Musyoki, Lisa Lazarus, Laura H Thompson, Eve Cheuk, James F Blanchard

Background: Program Science is an iterative, multi-phase research and program framework where programs drive the scientific inquiry, and both program and science are aligned towards a collective goal of improving population health.

Discussion: To achieve this, Program Science involves the systematic application of theoretical and empirical knowledge to optimize the scale, quality and impact of public health programs. Program Science tools and approaches developed for strategic planning, program implementation, and program management and evaluation have been incorporated into HIV and sexually transmitted infection prevention programs in Kenya, Nigeria, India, and the United States.

Conclusion: In this paper, we highlight key scientific contributions that emerged from the growing application of Program Science in the field of HIV and STI prevention, and conclude by proposing future directions for Program Science.

Citations: 8
Change in quality of malnutrition surveys between 1986 and 2015.
IF 2.3 Q1 PUBLIC, ENVIRONMENTAL & OCCUPATIONAL HEALTH Pub Date : 2018-05-28 eCollection Date: 2018-01-01 DOI: 10.1186/s12982-018-0075-9
Emmanuel Grellety, Michael H Golden

Background: Representative surveys collecting weight, height and MUAC are used to estimate the prevalence of acute malnutrition. The results are then used to assess the scale of malnutrition in a population and type of nutritional intervention required. There have been changes in methodology over recent decades; the objective of this study was to determine if these have resulted in higher quality surveys.

Methods: In order to examine the change in reliability of such surveys we have analysed the statistical distributions of the derived anthropometric parameters from 1843 surveys conducted by 19 agencies between 1986 and 2015.

Results: With the introduction of standardised guidelines and software by 2003 and their more general application from 2007, the mean standard deviation, kurtosis and skewness of the parameters used to assess nutritional status have each moved to approximate the distribution of the WHO standards when the exclusion of outliers from analysis is based upon the SMART flagging procedure. Where WHO flags, which only exclude data incompatible with life, are used, the quality of anthropometric surveys has improved and the results now approach those seen with SMART flags and the WHO standards distribution. Agencies vary in their uptake of and adherence to standard guidelines. Those agencies that fully implement the guidelines achieve the most consistently reliable results.

Conclusions: Standard methods should be universally used to produce reliable data and tests of data quality and SMART type flagging procedures should be applied and reported to ensure that the data are credible and therefore inform appropriate intervention. Use of SMART guidelines has coincided with reliable anthropometric data since 2007.
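
The quality checks described above can be sketched roughly as follows: exclude flagged z-scores, then summarise the remaining distribution by its standard deviation, skewness and kurtosis. This is an illustrative Python sketch, not the SMART software itself. The function names are hypothetical, and the cut-offs used (±3 z-score units around the observed survey mean for SMART-type flags, fixed limits of ±5 for WHO-type flags on weight-for-height) are assumptions stated here for the example.

```python
import statistics


def apply_flags(z_scores, method="smart"):
    """Return the z-scores that survive outlier flagging.

    Assumed cut-offs for this sketch:
      "smart": keep values within 3 z-score units of the observed mean.
      "who":   keep values within the fixed range [-5, 5].
    """
    if method == "smart":
        centre = statistics.fmean(z_scores)
        lo, hi = centre - 3.0, centre + 3.0
    elif method == "who":
        lo, hi = -5.0, 5.0
    else:
        raise ValueError(f"unknown flagging method: {method}")
    return [z for z in z_scores if lo <= z <= hi]


def distribution_summary(z_scores):
    """Mean, sample SD, moment skewness and excess kurtosis of z-scores."""
    n = len(z_scores)
    mean = statistics.fmean(z_scores)
    m2 = sum((z - mean) ** 2 for z in z_scores) / n
    m3 = sum((z - mean) ** 3 for z in z_scores) / n
    m4 = sum((z - mean) ** 4 for z in z_scores) / n
    return {
        "mean": mean,
        "sd": statistics.stdev(z_scores),
        "skewness": m3 / m2 ** 1.5,
        # Excess kurtosis: 0 for a normal distribution.
        "kurtosis": m4 / m2 ** 2 - 3.0,
    }
```

A survey whose summary after flagging sits close to SD 1, skewness 0 and excess kurtosis 0 would resemble the reference distribution. How close is close enough is a judgment this sketch does not encode.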

Citations: 0
Role of survey response rates on valid inference: an application to HIV prevalence estimates.
IF 3.6 Q1 PUBLIC, ENVIRONMENTAL & OCCUPATIONAL HEALTH Pub Date : 2018-03-05 eCollection Date: 2018-01-01 DOI: 10.1186/s12982-018-0074-x
Miguel Marino, Marcello Pagano

Background: Nationally-representative surveys suggest that females have a higher prevalence of HIV than males in most African countries. Unfortunately, these results are made on the basis of surveys with non-ignorable missing data. This study evaluates the impact that differential survey nonresponse rates between males and females can have on the point estimate of the HIV prevalence ratio of these two classifiers.

Methods: We study 29 Demographic and Health Surveys (DHS) from 2001 to 2010. Instead of employing often used multiple imputation models with a Missing at Random assumption that may not hold in this setting, we assess the effect of ignoring the information contained in the missing HIV information for males and females through three proposed statistical measures. These measures can be used in settings where the interest is comparing the prevalence of a disease between two groups. The proposed measures do not utilize parametric models and can be implemented by researchers of any level. They are: (1) an upper bound on the potential bias of the usual practise of using reported HIV prevalence estimates that ignore subjects who have missing HIV outcomes. (2) Plausible range intervals to account for nonresponses, without any additional parametric modeling assumptions. (3) Prevalence ratio inflation factors to correct the point estimate of the HIV prevalence ratio, if estimates of nonresponders' HIV prevalences were known.

Results: In 86% of countries, males have higher upper bounds of HIV prevalence than females; this is consonant with males possibly having higher infection rates than females. Additionally, 74% of surveys have a plausible range that crosses 1.0, suggesting a plausible equivalence between male and female HIV prevalences.

Conclusions: It is quite reasonable to conclude that there is so much DHS nonresponse in evaluating the HIV status question, that existing data is plausibly generated by the situation where the virus is equally distributed between the sexes.
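
To illustrate the kind of assumption-free reasoning described in the Methods above, the Python sketch below computes worst-case bounds on prevalence when some subjects have missing outcomes, the resulting range for a female-to-male prevalence ratio, and a corrected prevalence when the nonresponders' prevalence is supplied. These are generic bounds written for illustration and are not claimed to be the authors' exact measures. All function and parameter names are hypothetical.

```python
def prevalence_bounds(p_observed, response_rate):
    """Worst-case bounds on true prevalence under nonresponse.

    Lower bound: every nonresponder is negative.
    Upper bound: every nonresponder is positive.
    """
    lower = response_rate * p_observed
    upper = response_rate * p_observed + (1.0 - response_rate)
    return lower, upper


def prevalence_ratio_range(p_f, r_f, p_m, r_m):
    """Range of the female/male prevalence ratio implied by the bounds.

    Requires a strictly positive lower bound for males.
    """
    f_lo, f_hi = prevalence_bounds(p_f, r_f)
    m_lo, m_hi = prevalence_bounds(p_m, r_m)
    if m_lo <= 0:
        raise ValueError("male lower bound must be positive")
    return f_lo / m_hi, f_hi / m_lo


def corrected_prevalence(p_observed, response_rate, p_nonresponders):
    """Overall prevalence if the nonresponders' prevalence were known."""
    return response_rate * p_observed + (1.0 - response_rate) * p_nonresponders
```

For example, with hypothetical inputs of 12% observed prevalence and 90% response among females, and 8% observed prevalence and 80% response among males, the ratio range spans values both below and above 1, so equal prevalence cannot be ruled out without further assumptions.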

Citations: 0
Modelling fertility in rural South Africa with combined nonlinear parametric and semi-parametric methods.
IF 2.3 Q1 PUBLIC, ENVIRONMENTAL & OCCUPATIONAL HEALTH Pub Date : 2018-03-02 eCollection Date: 2018-01-01 DOI: 10.1186/s12982-018-0073-y
Robert W Eyre, Thomas House, F Xavier Gómez-Olivé, Frances E Griffiths

Background: Central to the study of populations, and therefore to the analysis of the development of countries undergoing major transitions, is the calculation of fertility patterns and their dependence on different variables such as age, education, and socio-economic status. Most epidemiological research on these matters relies on the often unjustified assumption of (generalised) linearity, or alternatively makes a parametric assumption (e.g. for age-patterns).

Methods: We consider nonlinearity of fertility in the covariates by combining an established nonlinear parametric model for fertility over age with nonlinear modelling of fertility over other covariates. For the latter, we use the semi-parametric method of Gaussian process regression which is a popular methodology in many fields including machine learning, computer science, and systems biology. We applied the method to data from the Agincourt Health and Socio-Demographic Surveillance System, annual census rounds performed on a poor rural region of South Africa since 1992, to analyse fertility patterns over age and socio-economic status.

Results: We capture a previously established age-pattern of fertility, whilst being able to more robustly model the relationship between fertility and socio-economic status without unjustified a priori assumptions of linearity. Peak fertility over age is shown to be increasing over time, as is fertility among adolescents, but not among those later in life, for whom fertility is generally decreasing over time.

Conclusions: Combining Gaussian process regression with nonlinear parametric modelling of fertility over age allowed for the incorporation of further covariates into the analysis without needing to assume a linear relationship. This enabled us to provide further insights into the fertility patterns of the Agincourt study area, in particular the interaction between age and socio-economic status.
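
The Gaussian process component mentioned above can be illustrated with a small, dependency-free Python sketch of the posterior mean under a zero-mean prior and a squared-exponential kernel. It does not reproduce the authors' combined model with the parametric age-pattern. The kernel choice, hyperparameter defaults, and all function names are assumptions made for illustration.

```python
import math


def rbf(x1, x2, length_scale=1.0, variance=1.0):
    """Squared-exponential (RBF) covariance between two scalar inputs."""
    return variance * math.exp(-0.5 * ((x1 - x2) / length_scale) ** 2)


def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def solve_cholesky(low, b):
    """Solve (L L^T) x = b by forward then backward substitution."""
    n = len(b)
    z = [0.0] * n
    for i in range(n):
        z[i] = (b[i] - sum(low[i][k] * z[k] for k in range(i))) / low[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (z[i] - sum(low[k][i] * x[k] for k in range(i + 1, n))) / low[i][i]
    return x


def gp_posterior_mean(x_train, y_train, x_new, noise=1e-2, **kernel_args):
    """Posterior mean of a zero-mean GP at x_new given noisy observations."""
    n = len(x_train)
    k = [
        [rbf(x_train[i], x_train[j], **kernel_args) + (noise if i == j else 0.0)
         for j in range(n)]
        for i in range(n)
    ]
    alpha = solve_cholesky(cholesky(k), y_train)
    return [
        sum(rbf(xs, xi, **kernel_args) * ai for xi, ai in zip(x_train, alpha))
        for xs in x_new
    ]
```

Because the kernel only encodes smoothness, the fitted curve can bend wherever the data require, which is the property that removes the need for a linearity assumption over a covariate such as socio-economic status.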

Citations: 1
Novel metrics for growth model selection.
IF 2.3 Q1 PUBLIC, ENVIRONMENTAL & OCCUPATIONAL HEALTH Pub Date : 2018-02-23 eCollection Date: 2018-01-01 DOI: 10.1186/s12982-018-0072-z
Matthew R Grigsby, Junrui Di, Andrew Leroux, Vadim Zipunnikov, Luo Xiao, Ciprian Crainiceanu, William Checkley

Background: Literature surrounding the statistical modeling of childhood growth data involves a diverse set of potential models from which investigators can choose. However, the lack of a comprehensive framework for comparing non-nested models leads to difficulty in assessing model performance. This paper proposes a framework for comparing non-nested growth models using novel metrics of predictive accuracy based on modifications of the mean squared error criteria.

Methods: Three metrics were created: normalized, age-adjusted, and weighted mean squared error (MSE). Predictive performance metrics were used to compare linear mixed effects models and functional regression models. Prediction accuracy was assessed by partitioning the observed data into training and test datasets. This partitioning was constructed to assess prediction accuracy for backward (i.e., early growth), forward (i.e., late growth), in-range, and new-individual predictions. Analyses were done with height measurements from 215 Peruvian children with data spanning from near birth to 2 years of age.

Results: Functional models outperformed linear mixed effects models in all scenarios tested. In particular, prediction errors for functional concurrent regression (FCR) and functional principal component analysis models were approximately 6% lower when compared to linear mixed effects models. When we weighted subject-specific MSEs according to subject-specific growth rates during infancy, we found that FCR was the best performer in all scenarios.

Conclusion: With this novel approach, we can quantitatively compare non-nested models and weight subgroups of interest to select the best performing growth model for a particular application or problem at hand.
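
The idea of weighting subject-specific MSEs, described in the Methods and Results above, can be sketched in Python as follows. The abstract does not give the exact definitions of the three metrics, so the weighting scheme below (a weighted average of per-subject MSEs) is an assumed form chosen for illustration, and the function names are hypothetical.

```python
def subject_mse(observed, predicted):
    """Mean squared prediction error for one subject's held-out measurements."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("observed and predicted must be non-empty and equal length")
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


def weighted_mse(subject_mses, weights):
    """Weighted average of subject-specific MSEs.

    Weights are assumed non-negative, for example proportional to each
    subject's growth rate in infancy; they are normalised to sum to 1.
    """
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(w * m for w, m in zip(weights, subject_mses)) / total
```

Computing the same weighted score for two non-nested models on an identical test partition gives a directly comparable number, with lower values indicating better prediction for the subgroup the weights emphasise.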

Citations: 0
Effect of correcting for gestational age at birth on population prevalence of early childhood undernutrition.
IF 2.3 Q1 PUBLIC, ENVIRONMENTAL & OCCUPATIONAL HEALTH Pub Date : 2018-02-06 eCollection Date: 2018-01-01 DOI: 10.1186/s12982-018-0070-1
Nandita Perumal, Daniel E Roth, Johnna Perdrizet, Aluísio J D Barros, Iná S Santos, Alicia Matijasevich, Diego G Bassani

Background: Postmenstrual and/or gestational age-corrected age (CA) is required to apply child growth standards to children born preterm (< 37 weeks gestational age [GA]). Yet CA is rarely used in epidemiologic studies in low- and middle-income countries (LMICs), which may bias population estimates of childhood undernutrition. To evaluate the effect of accounting for GA in the application of growth standards, we compared two strategies in the 2004 Pelotas (Brazil) Birth Cohort. The 'CA' strategy used GA-specific standards at birth (INTERGROWTH-21st newborn size standards) together with CA for preterm-born children when applying the World Health Organization Child Growth Standards postnatally; the alternative used postnatal age for all children. Under each strategy we estimated mean length-for-age (LAZ) and weight-for-age (WAZ) z scores at 0, 3, 12, 24, and 48 months of age.

Results: At birth (n = 4066), mean LAZ was higher and the prevalence of stunting (LAZ < -2) was lower using CA versus postnatal age (mean ± SD: -0.36 ± 1.19 versus -0.67 ± 1.32; prevalence: 8.3% versus 11.6%). At birth, using CA rather than postnatal age attenuated the odds ratio (OR) and population attributable risk (PAR) of stunting due to preterm birth and changed the inference [OR 1.32 (95% confidence interval (CI) 0.95, 1.82) vs 14.7 (95% CI 11.7, 18.4); PAR 3.1% vs 42.9%]; differences in inferences persisted at 3 months. At 12, 24, and 48 months, preterm birth was associated with stunting, but ORs and PARs remained attenuated using CA compared to postnatal age. Findings were similar for weight-for-age z scores.

Conclusions: Population-based epidemiologic studies in LMICs in which GA is unused or unavailable may overestimate the prevalence of early childhood undernutrition and inflate the fraction of undernutrition attributable to preterm birth.
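
As a rough illustration of why a smaller OR goes with a smaller PAR, one common definition of population attributable risk is Levin's formula. The abstract does not say which PAR formula the authors used, so this is an assumption, and the symbols p_e (the proportion of children born preterm) and RR (the relative risk of stunting for preterm versus term birth) are introduced here only for illustration.

```latex
\mathrm{PAR} \;=\; \frac{p_e\,(\mathrm{RR}-1)}{1 + p_e\,(\mathrm{RR}-1)}
```

For a fixed p_e, PAR shrinks towards 0 as RR approaches 1, which is the direction of change reported when CA replaces postnatal age. The OR only approximates the RR when the outcome is uncommon, so the reported ORs should not be substituted into this expression without care.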

Citations: 0
Contextual factors in maternal and newborn health evaluation: a protocol applied in Nigeria, India and Ethiopia.
IF 2.3 Q1 PUBLIC, ENVIRONMENTAL & OCCUPATIONAL HEALTH Pub Date : 2018-02-06 eCollection Date: 2018-01-01 DOI: 10.1186/s12982-018-0071-0
Kate Sabot, Tanya Marchant, Neil Spicer, Della Berhanu, Meenakshi Gautham, Nasir Umar, Joanna Schellenberg

Background: Understanding the context of a health programme is important in interpreting evaluation findings and in considering the external validity for other settings. Public health researchers can be imprecise and inconsistent in their usage of the word "context" and its application to their work. This paper presents an approach to defining context, to capturing relevant contextual information and to using such information to help interpret findings from the perspective of a research group evaluating the effect of diverse innovations on coverage of evidence-based, life-saving interventions for maternal and newborn health in Ethiopia, Nigeria, and India.

Methods: We define "context" as the background environment or setting of any programme, and "contextual factors" as those elements of context that could affect implementation of a programme. Through a structured, consultative process, contextual factors were identified while trying to strike a balance between comprehensiveness and feasibility. Thematic areas included demographics and socio-economics, epidemiological profile, health systems and service uptake, infrastructure, education, environment, politics, policy and governance. We outline an approach for capturing and using contextual factors while maximizing use of existing data. Methods include desk reviews, secondary data extraction and key informant interviews. Outputs include databases of contextual factors and summaries of existing maternal and newborn health policies and their implementation. Use of contextual data will be qualitative in nature and may assist in interpreting findings in both quantitative and qualitative aspects of programme evaluation.

Discussion: Applying this approach was more resource intensive than expected, in part because routinely available information was not consistently available across settings and more primary data collection was required than anticipated. Data was used only minimally, partly due to a lack of evaluation results that needed further explanation, but also because contextual data was not available for the precise units of analysis or time periods of interest. We would advise others to consider integrating contextual factors within other data collection activities, and to conduct regular reviews of maternal and newborn health policies. This approach and the learnings from its application could help inform the development of guidelines for the collection and use of contextual factors in public health evaluation.

Citations: 3
An introduction to instrumental variable assumptions, validation and estimation.
IF 2.3 Q1 PUBLIC, ENVIRONMENTAL & OCCUPATIONAL HEALTH Pub Date : 2018-01-22 eCollection Date: 2018-01-01 DOI: 10.1186/s12982-018-0069-7
Mette Lise Lousdal

The instrumental variable method has been employed within economics to infer causality in the presence of unmeasured confounding. Emphasising the parallels to randomisation may increase understanding of the underlying assumptions within epidemiology. An instrument is a variable that predicts exposure, but conditional on exposure shows no independent association with the outcome. The random assignment in trials is an example of what would be expected to be an ideal instrument, but instruments can also be found in observational settings with a naturally varying phenomenon e.g. geographical variation, physical distance to facility or physician's preference. The fourth identifying assumption has received less attention, but is essential for the generalisability of estimated effects. The instrument identifies the group of compliers in which exposure is pseudo-randomly assigned leading to exchangeability with regard to unmeasured confounders. Underlying assumptions can only partially be tested empirically and require subject-matter knowledge. Future studies employing instruments should carefully seek to validate all four assumptions, possibly drawing on parallels to randomisation.
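
The idea of an instrument can be sketched with a small simulation. The example below is not taken from the paper: the data-generating model, the parameter values, and the function names (`simulate`, `naive_estimate`, `wald_estimate`) are all assumptions made for illustration. It uses a binary instrument that mimics random assignment, an unmeasured confounder, and the simple Wald (ratio) estimator, which divides the instrument's effect on the outcome by its effect on the exposure.

```python
import random
from statistics import mean


def simulate(n=50000, true_effect=2.0, seed=0):
    """Simulate (instrument, exposure, outcome) rows with an unmeasured confounder."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z = rng.random() < 0.5       # instrument, e.g. random assignment
        u = rng.gauss(0.0, 1.0)      # unmeasured confounder
        # Exposure depends on the instrument and on the confounder.
        # Raising z can only switch exposure on, so there are no defiers.
        x = (0.8 * z + 0.5 * u + rng.gauss(0.0, 1.0)) > 0.4
        # Outcome depends on exposure and confounder, not directly on z.
        y = true_effect * x + 1.5 * u + rng.gauss(0.0, 1.0)
        rows.append((int(z), int(x), y))
    return rows


def naive_estimate(rows):
    """Difference in mean outcome between exposed and unexposed (confounded)."""
    exposed = mean(y for _, x, y in rows if x == 1)
    unexposed = mean(y for _, x, y in rows if x == 0)
    return exposed - unexposed


def wald_estimate(rows):
    """Wald estimator: effect of z on the outcome over effect of z on exposure."""
    y1 = mean(y for z, _, y in rows if z == 1)
    y0 = mean(y for z, _, y in rows if z == 0)
    x1 = mean(x for z, x, _ in rows if z == 1)
    x0 = mean(x for z, x, _ in rows if z == 0)
    return (y1 - y0) / (x1 - x0)


if __name__ == "__main__":
    rows = simulate()
    print("naive:", round(naive_estimate(rows), 2))
    print("wald: ", round(wald_estimate(rows), 2))
```

In this set-up the naive comparison is pushed away from the true effect of 2.0 by the confounder, while the Wald estimate should land close to it. That only holds because the simulation builds the identifying assumptions in by construction; with real data they require the subject-matter justification the abstract describes.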

Citations: 94
Multiple imputation using linked proxy outcome data resulted in important bias reduction and efficiency gains: a simulation study.
IF 3.6 Q1 PUBLIC, ENVIRONMENTAL & OCCUPATIONAL HEALTH Pub Date : 2017-12-19 eCollection Date: 2017-01-01 DOI: 10.1186/s12982-017-0068-0
R P Cornish, J Macleod, J R Carpenter, K Tilling

Background: When an outcome variable is missing not at random (MNAR: probability of missingness depends on outcome values), estimates of the effect of an exposure on this outcome are often biased. We investigated the extent of this bias and examined whether the bias can be reduced through incorporating proxy outcomes obtained through linkage to administrative data as auxiliary variables in multiple imputation (MI).

Methods: Using data from the Avon Longitudinal Study of Parents and Children (ALSPAC) we estimated the association between breastfeeding and IQ (continuous outcome), incorporating linked attainment data (proxies for IQ) as auxiliary variables in MI models. Simulation studies explored the impact of varying the proportion of missing data (from 20 to 80%), the correlation between the outcome and its proxy (0.1-0.9), the strength of the missing data mechanism, and having a proxy variable that was incomplete.

Results: Incorporating a linked proxy for the missing outcome as an auxiliary variable reduced bias and increased efficiency in all scenarios, even when 80% of the outcome was missing. Using an incomplete proxy was similarly beneficial. High correlations (> 0.5) between the outcome and its proxy substantially reduced the missing information. Consistent with this, ALSPAC analysis showed inclusion of a proxy reduced bias and improved efficiency. Gains with additional proxies were modest.

Conclusions: In longitudinal studies with loss to follow-up, incorporating proxies for this study outcome obtained via linkage to external sources of data as auxiliary variables in MI models can give practically important bias reduction and efficiency gains when the study outcome is MNAR.
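
The mechanism described above can be sketched in a few lines. This is a deliberately simplified illustration, not the authors' simulation design: it targets the mean of a standardised outcome rather than an exposure-outcome association, it uses a bootstrap to approximate parameter uncertainty between imputations, and every name and parameter value (`simulate`, `fit_line`, `multiply_impute_mean`, `rho=0.8`, `m=20`) is an assumption made here. Outcomes with low values are more likely to be missing (MNAR), and a fully observed proxy is used as the only predictor in the imputation model.

```python
import math
import random
from statistics import mean, variance


def simulate(n=4000, rho=0.8, seed=1):
    """Return (outcome or None, proxy) pairs; low outcomes are more often missing."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        y = rng.gauss(0.0, 1.0)
        proxy = rho * y + math.sqrt(1 - rho ** 2) * rng.gauss(0.0, 1.0)
        observed = rng.random() < 1 / (1 + math.exp(-y))  # MNAR: depends on y
        data.append((y if observed else None, proxy))
    return data


def fit_line(pairs):
    """Least-squares fit of outcome on proxy: (intercept, slope, residual SD)."""
    xs = [p for _, p in pairs]
    ys = [y for y, _ in pairs]
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    intercept = my - slope * mx
    rss = sum((y - intercept - slope * x) ** 2 for x, y in zip(xs, ys))
    return intercept, slope, math.sqrt(rss / (len(xs) - 2))


def multiply_impute_mean(data, m=20, seed=2):
    """Impute missing outcomes m times from the proxy; pool with Rubin's rules."""
    rng = random.Random(seed)
    observed = [(y, p) for y, p in data if y is not None]
    estimates, within = [], []
    for _ in range(m):
        # Bootstrap the observed cases so each imputation uses different
        # regression parameters (an approximation to drawing from a posterior).
        a, b, s = fit_line(rng.choices(observed, k=len(observed)))
        completed = [y if y is not None else a + b * p + rng.gauss(0.0, s)
                     for y, p in data]
        estimates.append(mean(completed))
        within.append(variance(completed) / len(completed))
    pooled = mean(estimates)
    total_var = mean(within) + (1 + 1 / m) * variance(estimates)
    return pooled, total_var


if __name__ == "__main__":
    data = simulate()
    complete_case = mean(y for y, _ in data if y is not None)
    pooled, total_var = multiply_impute_mean(data)
    print("complete-case mean:", round(complete_case, 3))
    print("MI mean:", round(pooled, 3), "SE:", round(math.sqrt(total_var), 3))
```

The true mean is 0. The complete-case mean is biased upward because low outcomes are selectively missing, and imputing from the proxy should pull the estimate back towards 0 without removing the bias entirely, since the imputation model is fitted to the selected observed cases. Lowering `rho` in this sketch weakens the correction, in line with the reported role of the outcome-proxy correlation.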

Citations: 0