Detecting clusters in multivariate response regression

IF 4.4 · Zone 2 (Mathematics) · Q1 STATISTICS & PROBABILITY · Wiley Interdisciplinary Reviews-Computational Statistics · Pub Date: 2021-02-03 · DOI: 10.1002/wics.1551
Bradley S. Price, Corban Allenbrand, Ben Sherwood
Citations: 3

Abstract

Multivariate regression, which can also be posed as a multitask machine learning problem, is used to better understand multiple outputs based on a given set of inputs. Many methods have been proposed to utilize shared information about responses, with applications in fields such as economics, genomics, advanced manufacturing, and precision medicine. Interest in these areas, coupled with the rise of large data sets (“big data”), has generated interest not only in making the computations more efficient but also in developing methods that account for the heterogeneity that may exist between responses. One way to exploit this heterogeneity is to use methods that detect groups, also called clusters, of related responses. These methods provide a framework that can increase computational speed and account for the complexity of relationships among a large number of responses. With this flexibility come additional challenges, such as how to identify these clusters of responses, how to perform model selection, and how to develop more complex algorithms that combine concepts from both the supervised and unsupervised learning literature. We explore current state-of-the-art methods, present a framework to better understand methods that utilize or detect clusters of responses, and provide insights on the computational challenges associated with this framework. Specifically, we present a simulation study that discusses the challenges with model selection when detecting clusters of responses of interest. We also comment on extensions and open problems that are of interest to both the research and practitioner communities.
Source journal: CiteScore 6.20 · Self-citation rate 0.00% · Articles published 31