Road traffic injuries continue to pose a significant public health challenge in Australia, with pedestrians representing one of the most vulnerable road user groups. Accurate prediction of injury severity, particularly fatal outcomes, is essential for improving road safety interventions and resource allocation. This study applies advanced machine learning techniques to predict pedestrian crash severity using national hospitalization and mortality data collected from 2011 to 2021. The analysis focuses on addressing class imbalance, a common issue in injury data, by evaluating the impact of several data balancing methods, including SMOTE, ADASYN, Random Oversampling (ROS), and Threshold Moving. We implement and compare four supervised learning algorithms: Logistic Regression, Support Vector Machine (SVM), Decision Tree, and XGBoost. Model performance is assessed using F1-score and macro-accuracy, with a focus on the minority (fatality) class. Results show that XGBoost combined with Threshold Moving achieves the highest performance, yielding an F1-score of 72% for fatality classification and a macro-accuracy of 84%. Additionally, feature importance analysis using SHAP values reveals age, gender, road user type, and crash location as key predictors of injury severity. The study highlights the critical role of data balancing strategies in enhancing predictive accuracy for rare but high-impact outcomes. These findings provide actionable insights for transport authorities and policymakers seeking to develop data-driven, targeted safety measures to protect pedestrians and reduce the severity of crash outcomes.
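Threshold Moving, as named above, leaves the training data untouched and instead shifts the probability cut-off used to assign the minority class. The following is a minimal sketch of that idea in plain Python, not the study's implementation: the abstract does not state how the threshold was chosen, so the grid search that maximizes the minority-class F1-score, the function names `f1_at_threshold` and `best_threshold`, and the toy labels and probabilities are all assumptions made for illustration.

```python
from typing import Sequence, Tuple


def f1_at_threshold(y_true: Sequence[int], y_prob: Sequence[float],
                    threshold: float) -> float:
    """F1-score of the positive (minority) class when predicting 1 iff prob >= threshold."""
    tp = fp = fn = 0
    for label, prob in zip(y_true, y_prob):
        pred = 1 if prob >= threshold else 0
        if pred == 1 and label == 1:
            tp += 1
        elif pred == 1 and label == 0:
            fp += 1
        elif pred == 0 and label == 1:
            fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def best_threshold(y_true: Sequence[int], y_prob: Sequence[float],
                   steps: int = 99) -> Tuple[float, float]:
    """Sweep an evenly spaced grid in (0, 1); return (threshold, f1) with the highest F1.

    Assumption: the selection criterion is minority-class F1 on held-out data.
    Ties keep the earliest (lowest) threshold that reached the best score.
    """
    best_t, best_f1 = 0.5, f1_at_threshold(y_true, y_prob, 0.5)
    for i in range(1, steps + 1):
        t = i / (steps + 1)
        score = f1_at_threshold(y_true, y_prob, t)
        if score > best_f1:
            best_t, best_f1 = t, score
    return best_t, best_f1


# Toy validation data, made up for illustration: 1 = fatal, 0 = non-fatal.
# The probabilities stand in for the output of any fitted classifier.
y_true = [0, 0, 0, 0, 0, 0, 0, 1, 0, 1]
y_prob = [0.05, 0.10, 0.12, 0.20, 0.08, 0.15, 0.30, 0.35, 0.25, 0.45]

default_f1 = f1_at_threshold(y_true, y_prob, 0.5)   # conventional 0.5 cut-off
threshold, tuned_f1 = best_threshold(y_true, y_prob)
print(f"F1 at 0.5: {default_f1:.2f}; tuned threshold: {threshold:.2f}, F1: {tuned_f1:.2f}")
```

On imbalanced data a classifier rarely assigns the rare class a probability above 0.5, so the default cut-off can miss every fatal case; lowering the threshold trades some false positives for recall on the minority class. In practice the threshold would be selected on a validation split and then applied unchanged to the test set.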
