Markov-switching decision trees
Timo Adam, Marius Ötting, Rouven Michels
AStA - Advances in Statistical Analysis, 108 (2), pp. 461-476. Published 2024-05-29.
DOI: 10.1007/s10182-024-00501-6
https://link.springer.com/article/10.1007/s10182-024-00501-6
Citations: 0
Abstract
Decision trees constitute a simple yet powerful and interpretable machine learning tool. While tree-based methods are designed only for cross-sectional data, we propose an approach that combines decision trees with time series modeling and thereby bridges the gap between machine learning and statistics. In particular, we combine decision trees with hidden Markov models where, for any time point, an underlying (hidden) Markov chain selects the tree that generates the corresponding observation. We propose an estimation approach that is based on the expectation-maximisation algorithm and assess its feasibility in simulation experiments. In our real-data application, we use eight seasons of National Football League (NFL) data to predict play calls conditional on covariates, such as the current quarter and the score, where the model’s states can be linked to the teams’ strategies. R code that implements the proposed method is available on GitHub.
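To make the mechanism described in the abstract concrete, the following is a minimal sketch in Python of a hidden Markov chain that selects, at each time point, which decision tree generates the observation, together with a scaled forward recursion that evaluates the likelihood. It is not the authors' R implementation and it does not perform the expectation-maximisation estimation. The two hand-written stumps, the covariate `score_diff`, and every number in `GAMMA` and `DELTA` are hypothetical values chosen purely for illustration.

```python
import math
import random

# Hypothetical example: each hidden state owns one tiny decision tree (a stump)
# that maps a covariate to P(pass play). All thresholds and probabilities are
# made up for illustration; none come from the paper.
def tree_state_0(score_diff):
    # Assumed "conservative" state: passes mostly when trailing.
    return 0.7 if score_diff < 0 else 0.3

def tree_state_1(score_diff):
    # Assumed "aggressive" state: passes often regardless of the score.
    return 0.9 if score_diff < 0 else 0.6

TREES = [tree_state_0, tree_state_1]
GAMMA = [[0.9, 0.1], [0.2, 0.8]]  # assumed transition probability matrix
DELTA = [0.5, 0.5]                # assumed initial state distribution

def simulate(covariates, rng):
    """Draw hidden states and binary observations (1 = pass, 0 = run)."""
    states, obs = [], []
    for t, x in enumerate(covariates):
        probs = DELTA if t == 0 else GAMMA[states[-1]]
        s = 0 if rng.random() < probs[0] else 1        # the chain picks the state
        y = 1 if rng.random() < TREES[s](x) else 0     # that state's tree generates y
        states.append(s)
        obs.append(y)
    return states, obs

def log_likelihood(obs, covariates):
    """Scaled forward recursion: log-likelihood of obs given the covariates."""
    n = len(TREES)
    alpha = list(DELTA)
    loglik = 0.0
    for t, (y, x) in enumerate(zip(obs, covariates)):
        if t > 0:
            # Propagate the filtered state probabilities through the chain.
            alpha = [sum(alpha[i] * GAMMA[i][j] for i in range(n)) for j in range(n)]
        for j in range(n):
            # Weight by the probability that state j's tree assigns to y.
            p = TREES[j](x)
            alpha[j] *= p if y == 1 else 1.0 - p
        c = sum(alpha)
        loglik += math.log(c)
        alpha = [a / c for a in alpha]  # rescale to avoid numerical underflow
    return loglik

rng = random.Random(1)
covs = [rng.choice([-7, 0, 7]) for _ in range(50)]
states, obs = simulate(covs, rng)
ll = log_likelihood(obs, covs)
```

In a full estimation procedure the trees would be learned from data rather than fixed by hand. The sketch only shows how state-specific trees act as the state-dependent distributions of a hidden Markov model.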
About the journal:
AStA - Advances in Statistical Analysis, a journal of the German Statistical Society, is published quarterly and presents original contributions on statistical methods and applications, as well as review articles.