Prediction of bike-sharing station demand using explainable artificial intelligence

Machine Learning with Applications (IF 4.9) | Pub Date: 2024-09-01 | Epub Date: 2024-08-09 | DOI: 10.1016/j.mlwa.2024.100582

Frank Ngeni, Boniphace Kutela, Tumlumbe Juliana Chengula, Cuthbert Ruseruka, Hannah Musau, Norris Novat, Debbie Aisiana Indah, Sarah Kasomi


Citations: 0

Abstract




Bike-sharing systems have grown in popularity in metropolitan areas, providing a convenient and environmentally friendly transportation choice for commuters and visitors alike. As demand for bike-sharing programs grows, efficient capacity planning becomes critical to ensuring a good user experience and system sustainability in terms of demand. This study used a random forest model, a strong ensemble learning approach that can capture complicated nonlinear relationships and interactions between input variables, to predict bike-sharing station demand. Data from the Smart Location Database (SLD) were used to test the model's accuracy in estimating station demand, and an explainable artificial intelligence (XAI) function was applied to better understand the machine learning (ML) predictions, given the black-box tendencies of ML models. Vehicle Miles of Travel (VMT) and Greenhouse Gas (GHG) emissions were the most important features for predicting docking-station demand when considered individually, but not holistically, based on the datasets. The percentage of zero-car households, gross residential density, road network density, aggregate frequency of transit service, and gross activity density had a moderate influence on the prediction model. Further, there may be a better prediction model that generates sensible results for every type of explanatory variable, but their contributions to the prediction outcome are minimal. By measuring each feature's contribution to demand prediction in feature engineering, bike-sharing operators can better understand bike-sharing station capacity and forecast future demand during planning. At the same time, ML models will need further assessment before a holistic conclusion can be drawn.
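The workflow the abstract describes, fitting a random forest and then measuring each feature's contribution to the prediction, can be illustrated with a minimal sketch. The abstract does not say which XAI function the authors used, so this example substitutes permutation importance from scikit-learn as one common model-agnostic option. The feature names are hypothetical stand-ins for the SLD variables named above, and the data are synthetic with arbitrary weights; nothing here reproduces the study's data or results.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Hypothetical feature names, loosely modeled on variables named in the abstract.
FEATURES = [
    "vmt",
    "ghg",
    "pct_zero_car_households",
    "residential_density",
    "road_network_density",
    "transit_frequency",
    "activity_density",
]


def rank_feature_contributions(n_stations=500, seed=0):
    """Fit a random forest on synthetic station data and rank the features
    by permutation importance measured on a held-out split."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(0.0, 1.0, size=(n_stations, len(FEATURES)))
    # Synthetic demand: the weights are arbitrary, NOT estimates from the study.
    y = 5.0 * X[:, 0] + 1.0 * X[:, 2] + rng.normal(0.0, 0.1, size=n_stations)

    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=seed
    )
    model = RandomForestRegressor(n_estimators=100, random_state=seed)
    model.fit(X_train, y_train)

    # Shuffle one column at a time and record the drop in the model's
    # default score (R^2 for regressors); a larger drop means the model
    # relies more heavily on that feature.
    result = permutation_importance(
        model, X_test, y_test, n_repeats=10, random_state=seed
    )
    order = np.argsort(result.importances_mean)[::-1]
    return [(FEATURES[i], float(result.importances_mean[i])) for i in order]


ranking = rank_feature_contributions()
for name, score in ranking:
    print(f"{name:26s} {score:.3f}")
```

Permutation importance scores each feature in isolation, so correlated inputs can share or mask one another's contribution. That is one reason a feature can rank highly on its own without dominating the model as a whole.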


Source journal

Machine Learning with Applications (Management Science and Operations Research; Artificial Intelligence; Computer Science Applications)

Self-citation rate: 0.00%

Review time: 98 days