From Black Box to Shining Spotlight: Using Random Forest Prediction Intervals to Illuminate the Impact of Assumptions in Linear Regression
Authors: Andrew J. Sage, Yang Liu, Joe Sato
Journal: The American Statistician
Published: 2022-07-29
DOI: https://doi.org/10.1080/00031305.2022.2107568
Abstract
We introduce a pair of Shiny web applications that allow users to visualize random forest prediction intervals alongside those produced by linear regression models. The apps are designed to help undergraduate students deepen their understanding of the role that assumptions play in statistical modeling by comparing and contrasting intervals produced by regression models with those produced by more flexible algorithmic techniques. We describe the mechanics of each approach, illustrate the features of the apps, provide examples highlighting the insights students can gain through their use, and discuss our experience implementing them in an undergraduate class. We argue that, contrary to their reputation as a black box, random forests can be used as a spotlight, for educational purposes, illuminating the role of assumptions in regression models and their impact on the shape, width, and coverage rates of prediction intervals.
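To make the contrast in the abstract concrete, the two kinds of interval can be sketched as follows. The first expression is the standard textbook prediction interval for simple linear regression with one predictor. The second is one common way to build a random forest interval from out-of-bag (OOB) errors; it is shown here only as an illustrative assumption and is not necessarily the construction the apps use. The symbols x_0 (the new predictor value), \hat{f} (the forest prediction), e_i (OOB errors), and q_p (their empirical p-quantile) are introduced for this sketch.

```latex
% Linear regression: 100(1 - \alpha)% prediction interval at x_0
\hat{y}_0 \;\pm\; t_{n-2,\,1-\alpha/2}\; s
\sqrt{1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2}},
\qquad
s^2 = \frac{1}{n-2} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2

% Random forest (assumed OOB-error construction, for illustration only)
\left[\, \hat{f}(x_0) + q_{\alpha/2},\;\; \hat{f}(x_0) + q_{1-\alpha/2} \,\right],
\qquad
e_i = y_i - \hat{f}_{\mathrm{oob}}(x_i)
```

The regression interval is symmetric about the fitted line, and its width depends on x_0 only through the distance from \bar{x}, which reflects the linearity, constant-variance, and normality assumptions. The quantile-based forest interval relies on none of these, so it can be asymmetric, which is the kind of difference in shape, width, and coverage the abstract refers to.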