Interpretable machine learning for predicting urban flash flood hotspots using intertwined land and built-environment features
Authors: Zhewei Liu, Tyler Felton, Ali Mostafavi
Journal: Computers, Environment and Urban Systems, volume 110, Article 102096
Published: 2024-03-13
DOI: 10.1016/j.compenvurbsys.2024.102096
URL: https://www.sciencedirect.com/science/article/pii/S0198971524000255
Citation count: 0
Abstract
Pluvial flash floods are fast-moving hazards that cause significant disruption in urban areas. With heavy precipitation increasing, the ability to proactively identify flash flood hotspots in cities is critical for flood nowcasting and predictive risk monitoring. Although rainfall-runoff and hydrologic models are useful for flash flood prediction, they are too computationally expensive and effort-intensive to be used for flood nowcasting. To address this challenge, this study presents interpretable machine learning models for predicting urban flash flood hotspots based on intertwined land and built-environment features. The task of predicting flash flood hotspots is formulated as a binary classification problem, and three recent flash flood events in U.S. cities are selected for data collection and model validation. Various features related to land and built-environment characteristics are constructed from diverse datasets, and flash flood occurrences are captured using crowdsourced data from the events. Using these features and datasets, the flash flood hotspots of cities are predicted with two ensemble models based on decision trees. The results demonstrate that the models achieve good accuracy (0.8) in distinguishing flooded from non-flooded locations. In particular, the models achieve a high true positive rate (0.83–0.89) and a low missing rate, demonstrating the methods' practicality for predicting flood hotspots. The model interpretation results indicate that land features related to hydrological and topological characteristics have a greater impact on flash flood risk than built-environment features. Further analysis reveals that feature importance, model performance, and model transferability vary among cities, so localized model specifications are needed to predict flash floods accurately for a particular city. The data-driven machine learning models presented in this study provide a useful tool for predicting flash flood hotspots from the intertwined features of land and the built environment in cities. They enable nowcasting and proactive monitoring of flash flood hotspots for emergency response, and they can inform integrated urban design and development aimed at reducing flash flood risk.
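The abstract reports accuracy, true positive rate, and missing rate without defining them. The following is a minimal sketch of how these quantities are conventionally computed for a binary flooded/non-flooded classifier. It assumes the standard confusion-matrix definitions and reads "missing rate" as the false negative rate (1 minus the true positive rate). The function names and the toy labels are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Confusion-matrix counts; label 1 = flooded, 0 = non-flooded."""
    tp: int  # flooded locations predicted as flooded
    fn: int  # flooded locations predicted as non-flooded (misses)
    fp: int  # non-flooded locations predicted as flooded
    tn: int  # non-flooded locations predicted as non-flooded


def confusion_counts(y_true, y_pred):
    """Tally the four confusion-matrix cells from paired 0/1 labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = fn = fp = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 1 and pred == 0:
            fn += 1
        elif truth == 0 and pred == 1:
            fp += 1
        else:
            tn += 1
    return ConfusionCounts(tp=tp, fn=fn, fp=fp, tn=tn)


def accuracy(c):
    """Share of all locations classified correctly."""
    return (c.tp + c.tn) / (c.tp + c.fn + c.fp + c.tn)


def true_positive_rate(c):
    """Share of truly flooded locations that the model flags."""
    return c.tp / (c.tp + c.fn)


def miss_rate(c):
    """Share of truly flooded locations the model misses (assumed = 1 - TPR)."""
    return c.fn / (c.tp + c.fn)


if __name__ == "__main__":
    # Toy labels for illustration only; not data from the paper.
    y_true = [1, 1, 1, 1, 0, 0, 0, 0]
    y_pred = [1, 1, 1, 0, 0, 0, 1, 0]
    counts = confusion_counts(y_true, y_pred)
    print(accuracy(counts), true_positive_rate(counts), miss_rate(counts))
```

Under these definitions a high true positive rate and a low missing rate are two views of the same quantity, which is why the abstract reports them together: for hotspot detection, a missed flooded location is usually the costlier error.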
About the journal:
Computers, Environment and Urban Systems is an interdisciplinary journal publishing cutting-edge and innovative computer-based research on environmental and urban systems that privileges the geospatial perspective. The journal welcomes original, high-quality scholarship of a theoretical, applied, or technological nature, and provides a stimulating presentation of perspectives, research developments, overviews of important new technologies, and uses of major computational, information-based, and visualization innovations. Applied and theoretical contributions demonstrate the scope of computer-based analysis in fostering a better understanding of environmental and urban systems, their spatial scope, and their dynamics.