Tree-Based Models Using Random Grid Search Optimization for Disease Classification Based on Environmental Factors: A Case Study on Asthma Hospitalizations

P. Nanthakumaran, L. Liyanage
Published in: 2021 IEEE 2nd International Conference on Pattern Recognition and Machine Learning (PRML), 2021-07-16
DOI: 10.1109/PRML52754.2021.9520720 (https://doi.org/10.1109/PRML52754.2021.9520720)

Abstract

Understanding which environmental exposures aggravate the global disease burden can help mitigate it. Disease burden is generally predicted with generalized linear models and generalized additive models, whereas tree-based models are underused. This paper evaluates the performance of four tree-based models (decision tree, random forest, gradient boosted trees, and stochastic gradient boosted trees) in predicting asthma attacks from short-term exposure to environmental factors, and examines which environmental factors trigger asthma attacks. A sample of patients from different parts of Victoria during 2013-2015 was considered. Over the study period, the study area had reasonably good air quality and a relatively humid environment. The tree-based models were tuned using random grid search optimization with bootstrapping to address over-fitting. All models considered performed well in predicting asthma attacks in terms of area under the receiver operating characteristic curve (ROC AUC > 0.82). The gradient boosted tree models (accuracy = 76%; recall = 63%; F2-score = 64%) showed better overall prediction, whereas the decision tree (accuracy = 71%; recall = 75%; F2-score = 71%) outperformed the other models in identifying positive cases. The tree-based models revealed that O3 exposure consistently influences asthma. Further, the decision tree revealed that O3 exposure < 13 ppb, or high O3 exposure (>= 13 ppb) combined with SO2 exposure < 0.5 ppb and maximum wind speed > 5.4 km/h, influenced asthma. In addition, relative humidity and exposure to CO were detected by the other tree-based models as relevant predictors triggering asthma attacks.
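The abstract names three ingredients without spelling them out: the F2-score, random sampling from a hyperparameter grid, and bootstrap resampling. The following is a minimal sketch of each in plain Python, not the authors' implementation. The function names, the example grid values, and the use of unsampled (out-of-bag) rows as the held-out set are all assumptions made for illustration; the abstract does not state the grid, the number of draws, or the tuning objective.

```python
import itertools
import random


def fbeta(tp, fp, fn, beta=2.0):
    """F-beta from confusion counts; beta=2 weights recall above precision."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def sample_grid(grid, n_iter, rng):
    """Draw up to n_iter distinct configurations at random from the full grid."""
    keys = sorted(grid)
    combos = list(itertools.product(*(grid[k] for k in keys)))
    chosen = rng.sample(combos, min(n_iter, len(combos)))
    return [dict(zip(keys, combo)) for combo in chosen]


def bootstrap_split(n_rows, rng):
    """Resample row indices with replacement; rows never drawn are held out."""
    train = [rng.randrange(n_rows) for _ in range(n_rows)]
    in_bag = set(train)
    held_out = [i for i in range(n_rows) if i not in in_bag]
    return train, held_out


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical grid; the paper's actual hyperparameter ranges are not given.
    grid = {"max_depth": [2, 4, 6], "n_estimators": [50, 100]}
    for config in sample_grid(grid, n_iter=4, rng=rng):
        train_idx, held_out_idx = bootstrap_split(20, rng)
        print(config, len(train_idx), len(held_out_idx))
    # Hypothetical confusion counts, not results from the paper.
    print(fbeta(tp=6, fp=4, fn=2))
```

In a full tuning loop, each sampled configuration would be fitted on the bootstrap rows and scored on the held-out rows, with the best-scoring configuration kept. Using F2 rather than F1 as the score would favour models that miss fewer positive cases, which matches the abstract's emphasis on recall.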