Regression Trees and Ensemble for Multivariate Outcomes.

Sankhya-Series B-Applied and Interdisciplinary Statistics (IF 0.7, JCR Q4, Statistics & Probability). Pub Date: 2023-05-01. Epub Date: 2023-02-16. DOI: 10.1007/s13571-023-00301-z
Evan L Reynolds, Brian C Callaghan, Michael Gaies, Mousumi Banerjee
Volume 85, No. 1, pages 77-109. Journal Article. Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12711322/pdf/

Abstract

Tree-based methods have become one of the most flexible, intuitive, and powerful analytic tools for exploring complex data structures. The best-documented, and arguably most popular, uses of tree-based methods are in biomedical research, where multivariate outcomes are common (e.g., diastolic and systolic blood pressure, and nerve conduction measures in studies of neuropathy). Existing tree-based methods for multivariate outcomes do not appropriately account for the correlation present in such data. In this paper, we develop goodness-of-split measures for building multivariate regression trees for continuous multivariate outcomes. We propose two general approaches: maximizing within-node homogeneity and maximizing between-node separation. Within-node homogeneity is measured using the average Mahalanobis distance and the determinant of the variance-covariance matrix (smaller values indicate a more homogeneous node). Between-node separation is measured using the Mahalanobis distance, Euclidean distance, and standardized Euclidean distance. To enhance prediction accuracy, we extend the single multivariate regression tree to an ensemble of multivariate trees. Extensive simulations are presented to examine the properties of our goodness-of-split measures. Finally, the proposed methods are illustrated using two clinical datasets, one on neuropathy and one on pediatric cardiac surgery.
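The abstract names the split measures but does not give their formulas, so the following Python sketch is only one plausible reading, not the authors' definitions. It assumes that the covariance used for the Mahalanobis-type measures is estimated from the parent node (an assumption; if a node's own sample covariance were used, the average squared Mahalanobis distance would always equal q(n-1)/n for q outcomes and n observations, and so could not rank splits). The function names and the `min_leaf` parameter are hypothetical.

```python
import numpy as np

def avg_mahalanobis(Y, cov):
    """Mean squared Mahalanobis distance of the rows of Y to the node mean,
    using a reference covariance `cov` (assumed: the parent node's)."""
    diff = Y - Y.mean(axis=0)
    solved = np.linalg.solve(cov, diff.T).T  # row i is cov^{-1} (y_i - mean)
    return float(np.mean(np.sum(diff * solved, axis=1)))

def det_cov(Y):
    """Determinant of the node's sample variance-covariance matrix."""
    return float(np.linalg.det(np.cov(Y, rowvar=False)))

def between_node_distances(left, right, cov):
    """Distances between the mean outcome vectors of two child nodes."""
    d = left.mean(axis=0) - right.mean(axis=0)
    return {
        "mahalanobis": float(np.sqrt(d @ np.linalg.solve(cov, d))),
        "euclidean": float(np.linalg.norm(d)),
        # Scale each outcome by its variance only, ignoring correlation.
        "standardized_euclidean": float(np.sqrt(np.sum(d**2 / np.diag(cov)))),
    }

def best_split(x, Y, min_leaf=5):
    """Scan thresholds on one predictor x and return (threshold, score) for
    the split x <= threshold that maximizes the Mahalanobis distance between
    child means, or None if no split leaves min_leaf rows on each side."""
    cov = np.cov(Y, rowvar=False)  # parent-node covariance (assumption)
    best = None
    for t in np.unique(x)[:-1]:
        mask = x <= t
        if mask.sum() < min_leaf or (~mask).sum() < min_leaf:
            continue
        score = between_node_distances(Y[mask], Y[~mask], cov)["mahalanobis"]
        if best is None or score > best[1]:
            best = (float(t), score)
    return best
```

A full tree would apply `best_split` over every candidate predictor and recurse on the two child nodes; an ensemble in the spirit of the abstract could then average the predicted mean vectors of trees grown on bootstrap samples, though the abstract does not say how the authors build theirs.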

Source journal
CiteScore: 1.50
Self-citation rate: 0.00%
Articles published: 24
Journal description: Sankhya, Series A, publishes original, high-quality research articles in various areas of modern statistics, such as probability, theoretical statistics, mathematical statistics, and machine learning. The areas are interpreted in a broad sense. Articles are judged on the basis of their novelty and technical correctness. Sankhya, Series B, primarily covers applied and interdisciplinary statistics, including data sciences. Applied articles should preferably include analysis of original data of broad interest, novel applications of methodology, and development of methods and techniques of immediate practical use. Authoritative reviews and comprehensive discussion articles in areas of vigorous current research are also welcome.