Prediction of bike-sharing station demand using explainable artificial intelligence

Frank Ngeni, Boniphace Kutela, Tumlumbe Juliana Chengula, Cuthbert Ruseruka, Hannah Musau, Norris Novat, Debbie Aisiana Indah, Sarah Kasomi
Machine learning with applications, vol. 17, Article 100582. Published 2024-08-09. DOI: 10.1016/j.mlwa.2024.100582
Full text: https://www.sciencedirect.com/science/article/pii/S2666827024000586
Citations: 0

Abstract

Bike-sharing systems have grown in popularity in metropolitan areas, providing a convenient and environmentally friendly transportation option for commuters and visitors alike. As demand for bike-sharing programs grows, efficient capacity planning becomes critical to ensuring a good user experience and keeping the system sustainable as demand rises. This study used a random forest model to predict bike-sharing station demand; random forest is considered a strong ensemble learning approach that can capture complex nonlinear relationships and interactions between input variables. The study used data from the Smart Location Database (SLD) to test the model's accuracy in estimating station demand, and applied an explainable artificial intelligence (XAI) function to interpret the machine learning (ML) predictions, since ML models tend to behave as black boxes. Based on the datasets, Vehicle Miles of Travel (VMT) and Greenhouse Gas (GHG) emissions were the most important features for predicting docking station demand when considered individually, but not holistically. The percentage of zero-car households, gross residential density, road network density, aggregate frequency of transit service, and gross activity density were found to have a moderate influence on the prediction model. Further, a better prediction model may exist that generates sensible results for every type of explanatory variable, but the contributions of those variables to the prediction outcome are minimal. By measuring each feature's contribution to demand prediction during feature engineering, bike-sharing operators can better understand bike-sharing station capacity and forecast future demand during planning. At the same time, ML models will need further assessment before a holistic conclusion can be drawn.
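
The workflow the abstract describes (fit a random forest, then ask how much each input contributes to the prediction) can be illustrated with a small sketch. Everything below is an assumption made for illustration: the data are synthetic rather than drawn from the SLD, the column names are hypothetical, the forest is a stripped-down standard-library version, and permutation importance stands in for the XAI function, which the abstract does not name. In practice a library such as scikit-learn would normally be used instead of hand-written trees.

```python
import random


def avg(values):
    return sum(values) / len(values)


def grow_tree(X, y, depth, rng, n_try):
    """Grow a small regression tree, trying a random feature subset per split."""
    if depth == 0 or len(y) < 4 or min(y) == max(y):
        return avg(y)  # leaf: predict the mean target
    best = None
    for j in rng.sample(range(len(X[0])), n_try):
        values = sorted({row[j] for row in X})
        step = max(1, len(values) // 8)  # test only a handful of thresholds
        for t in values[step::step]:
            left = [yi for row, yi in zip(X, y) if row[j] < t]
            right = [yi for row, yi in zip(X, y) if row[j] >= t]
            ml, mr = avg(left), avg(right)
            sse = sum((v - ml) ** 2 for v in left) + sum((v - mr) ** 2 for v in right)
            if best is None or sse < best[0]:
                best = (sse, j, t)
    if best is None:
        return avg(y)
    _, j, t = best
    lo = [(row, yi) for row, yi in zip(X, y) if row[j] < t]
    hi = [(row, yi) for row, yi in zip(X, y) if row[j] >= t]
    return (j, t,
            grow_tree([r for r, _ in lo], [v for _, v in lo], depth - 1, rng, n_try),
            grow_tree([r for r, _ in hi], [v for _, v in hi], depth - 1, rng, n_try))


def predict_tree(node, row):
    while isinstance(node, tuple):  # internal nodes are tuples, leaves are floats
        j, t, left, right = node
        node = left if row[j] < t else right
    return node


def fit_forest(X, y, n_trees=25, depth=4, seed=0):
    rng = random.Random(seed)
    n_try = max(1, len(X[0]) // 2)  # features considered at each split
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in range(len(X))]  # bootstrap sample
        forest.append(grow_tree([X[i] for i in idx], [y[i] for i in idx], depth, rng, n_try))
    return forest


def predict(forest, row):
    return avg([predict_tree(tree, row) for tree in forest])


def mse(forest, X, y):
    return avg([(predict(forest, row) - yi) ** 2 for row, yi in zip(X, y)])


def permutation_importance(forest, X, y, seed=1):
    """Increase in MSE when one column is shuffled; larger means more important."""
    rng = random.Random(seed)
    base = mse(forest, X, y)
    scores = []
    for j in range(len(X[0])):
        column = [row[j] for row in X]
        rng.shuffle(column)
        shuffled = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
        scores.append(mse(forest, shuffled, y) - base)
    return scores


# Synthetic data: only the first column drives the target (hypothetical names).
data_rng = random.Random(42)
X = [[data_rng.random() for _ in range(4)] for _ in range(200)]
y = [10 * row[0] + data_rng.gauss(0, 0.5) for row in X]

forest = fit_forest(X, y)
importance = permutation_importance(forest, X, y)  # scored on the training rows
for name, score in zip(["driver", "noise_a", "noise_b", "noise_c"], importance):
    print(f"{name}: {score:.3f}")
```

Because the synthetic target depends only on the first column, that column should receive by far the largest score. Scoring on the training rows keeps the sketch short; a held-out set would give a fairer estimate of each feature's contribution.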

Source journal: Machine learning with applications
Subject areas: Management Science and Operations Research, Artificial Intelligence, Computer Science Applications
Self-citation rate: 0.00%
Articles published: 0
Review time: 98 days