Decision trees in epidemiological research.

Emerging Themes in Epidemiology (IF 3.6, Q1, Public, Environmental & Occupational Health) | Pub Date: 2017-09-20 | eCollection Date: 2017-01-01 | DOI: 10.1186/s12982-017-0064-4
Ashwini Venkatasubramaniam, Julian Wolfson, Nathan Mitchell, Timothy Barnes, Meghan JaKa, Simone French
Volume: 14; pages: 11.
Citations: 86

Abstract



Background: In many studies, it is of interest to identify population subgroups that are relatively homogeneous with respect to an outcome. The nature of these subgroups can provide insight into effect mechanisms and suggest targets for tailored interventions. However, identifying relevant subgroups can be challenging with standard statistical methods.

Main text: We review the literature on decision trees, a family of techniques for partitioning the population, on the basis of covariates, into distinct subgroups who share similar values of an outcome variable. We compare two decision tree methods, the popular Classification and Regression tree (CART) technique and the newer Conditional Inference tree (CTree) technique, assessing their performance in a simulation study and using data from the Box Lunch Study, a randomized controlled trial of a portion size intervention. Both CART and CTree identify homogeneous population subgroups and offer improved prediction accuracy relative to regression-based approaches when subgroups are truly present in the data. An important distinction between CART and CTree is that the latter uses a formal statistical hypothesis testing framework in building decision trees, which simplifies the process of identifying and interpreting the final tree model. We also introduce a novel way to visualize the subgroups defined by decision trees. Our novel graphical visualization provides a more scientifically meaningful characterization of the subgroups identified by decision trees.
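To make the partitioning idea concrete, here is a minimal sketch of a single CART-style split on one numeric covariate, choosing the threshold that minimizes the within-subgroup sum of squared errors of the outcome. It is an illustration only, not the authors' code or any library's implementation; the function names (`sse`, `best_split`) and the toy data are hypothetical.

```python
from statistics import fmean


def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    m = fmean(values)
    return sum((v - m) ** 2 for v in values)


def best_split(x, y):
    """Find the split `x <= threshold` minimizing total within-subgroup SSE of y.

    Returns (threshold, sse_after_split). If no split improves on the
    unsplit SSE, returns (None, sse(y)).
    """
    best_threshold, best_sse = None, sse(y)
    distinct = sorted(set(x))
    # Candidate thresholds are midpoints between consecutive distinct values.
    for lo, hi in zip(distinct, distinct[1:]):
        threshold = (lo + hi) / 2
        left = [yi for xi, yi in zip(x, y) if xi <= threshold]
        right = [yi for xi, yi in zip(x, y) if xi > threshold]
        total = sse(left) + sse(right)
        if total < best_sse:
            best_threshold, best_sse = threshold, total
    return best_threshold, best_sse


# Hypothetical toy data: the outcome jumps once the covariate exceeds about 40.
age = [22, 25, 31, 38, 45, 52, 60, 67]
outcome = [1.0, 1.2, 0.9, 1.1, 3.0, 3.2, 2.9, 3.1]

threshold, remaining_sse = best_split(age, outcome)
print(threshold, round(remaining_sse, 3))
```

On this toy data the chosen threshold is 41.5, the midpoint between 38 and 45, which separates the two homogeneous subgroups. A full tree would apply the same search recursively within each subgroup and across all covariates, and CART would then typically prune the result.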

Conclusions: Decision trees are a useful tool for identifying homogeneous subgroups defined by combinations of individual characteristics. While all decision tree techniques generate subgroups, we advocate the use of the newer CTree technique due to its simplicity and ease of interpretation.
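The hypothesis-testing idea behind CTree can be sketched as follows: before splitting, test each covariate for association with the outcome and split only if the smallest multiplicity-adjusted p-value falls below a significance level. The code below is a simplified stand-in that uses a Monte Carlo permutation test and a Bonferroni adjustment; it is not the CTree algorithm as implemented in any package, and the function names, the default `alpha`, and the toy data are assumptions made for illustration.

```python
import random


def permutation_p_value(x, y, n_permutations=2000, seed=0):
    """Monte Carlo permutation p-value for association between x and y.

    The statistic is the absolute sum of centered cross-products. Under the
    null hypothesis of independence, shuffling y does not change its
    distribution, so shuffled statistics form the reference distribution.
    """
    rng = random.Random(seed)
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    xc = [xi - mean_x for xi in x]

    def statistic(ys):
        return abs(sum(a * (b - mean_y) for a, b in zip(xc, ys)))

    observed = statistic(y)
    shuffled = list(y)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if statistic(shuffled) >= observed:
            extreme += 1
    # The add-one correction keeps the estimate strictly positive.
    return (extreme + 1) / (n_permutations + 1)


def should_split(covariates, y, alpha=0.05):
    """Select the covariate with the smallest Bonferroni-adjusted p-value.

    Returns (name, adjusted_p) if adjusted_p < alpha, otherwise
    (None, adjusted_p), meaning the node is left unsplit.
    """
    adjusted = {
        name: min(1.0, permutation_p_value(x, y) * len(covariates))
        for name, x in covariates.items()
    }
    name = min(adjusted, key=adjusted.get)
    p = adjusted[name]
    return (name if p < alpha else None), p


# Hypothetical toy data: the outcome depends on age but not on the noise covariate.
age = [22, 25, 28, 31, 34, 38, 45, 49, 52, 56, 60, 67]
noise = [1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2]
outcome = [1.0, 1.2, 0.9, 1.1, 1.0, 1.2, 3.0, 3.2, 2.9, 3.1, 3.0, 3.2]

print(should_split({"age": age, "noise": noise}, outcome))
print(should_split({"noise": noise}, outcome))
```

Because the stopping rule is a significance test, tree growth ends when no covariate shows a significant association, so no separate pruning step is needed. This is one way to read the abstract's claim that the testing framework simplifies identifying the final tree.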

Source journal: Emerging Themes in Epidemiology (Medicine: Epidemiology)
CiteScore: 4.40
Self-citation rate: 4.30%
Articles published: 9
Review time: 28 weeks
About the journal: Emerging Themes in Epidemiology is an open access, peer-reviewed, online journal that aims to promote debate and discussion on practical and theoretical aspects of epidemiology. Combining statistical approaches with an understanding of the biology of disease, epidemiologists seek to elucidate the social, environmental and host factors related to adverse health outcomes. Although research findings from epidemiologic studies abound in traditional public health journals, little publication space is devoted to discussion of the practical and theoretical concepts that underpin them. Because of its immediate impact on public health, an openly accessible forum is needed in the field of epidemiology to foster such discussion.