Town trip forecasting based on data mining techniques

Mohammad Fili, Majid Khedmati
Journal: Journal of Industrial Engineering International, volume 154 1, pages 1-13
Published: 2020-12-01
DOI: 10.30495/JIEI.2020.678774 (https://doi.org/10.30495/JIEI.2020.678774)

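Abstract

This paper proposes a data mining approach for predicting the duration (travel time) of town trips in New York City. First, two novel approaches, one mathematical and one statistical, are proposed for grouping categorical variables that have a very large number of levels. Both approaches work from a cost matrix generated by repeated post-hoc tests for different pairs. Next, a random forest model is built to predict the trip type, short or long. Finally, for each trip type and each of the mathematical and statistical approaches, a separate artificial neural network (ANN) is developed to predict trip duration. According to the results, the mathematical approach is more accurate than the statistical approach. The proposed methods are also compared with other methods from the literature, and the results show that they outperform all of them. The RMSE of the mathematical and statistical approaches is 4.23 and 4.27 minutes, respectively, for short trips, and 9.5 minutes for long trips. In addition, a modified version of the nearest neighborhood approach, called modified nearest neighborhood (MNN), is proposed for predicting trip duration; its RMSE is 4.45 minutes.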
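The abstract reports RMSE separately for short and long trips. As a rough illustration of how such per-type scores can be computed, the sketch below groups (trip type, actual, predicted) records and evaluates the standard RMSE formula for each group. The function names, the record layout, and the sample durations are all assumptions made for illustration; none of them come from the paper, and this is not the authors' evaluation code.

```python
import math
from collections import defaultdict


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def rmse_by_group(records):
    """Compute RMSE per trip type from (trip_type, actual, predicted) tuples."""
    grouped = defaultdict(lambda: ([], []))
    for trip_type, actual, predicted in records:
        grouped[trip_type][0].append(actual)
        grouped[trip_type][1].append(predicted)
    return {t: rmse(a, p) for t, (a, p) in grouped.items()}


# Made-up durations in minutes, for illustration only (not data from the paper).
records = [
    ("short", 10.0, 13.0),
    ("short", 8.0, 4.0),
    ("long", 40.0, 46.0),
    ("long", 55.0, 47.0),
]
scores = rmse_by_group(records)
```

Reporting the error per trip type, rather than as one pooled number, keeps the larger errors on long trips from masking how well short trips are predicted.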
Source journal
Journal of Industrial Engineering International (Engineering: Industrial and Manufacturing Engineering)
CiteScore: 4.20
Self-citation rate: 0.00%
Articles published: 0
Review time: 12 weeks
About the journal: Journal of Industrial Engineering International is an international journal dedicated to the latest advancement of industrial engineering. The goal of this journal is to provide a platform for engineers and academicians all over the world to promote, share, and discuss various new issues and developments in different areas of industrial engineering. All manuscripts must be prepared in English and are subject to a rigorous and fair peer-review process. Accepted articles will immediately appear online. The journal publishes original research articles, review articles, technical notes, case studies and letters to the Editor, including but not limited to the following fields: Operations Research and Decision-Making Models, Production Planning and Inventory Control, Supply Chain Management, Quality Engineering, Applications of Fuzzy Theory in Industrial Engineering, Applications of Stochastic Models in Industrial Engineering, Applications of Metaheuristic Methods in Industrial Engineering.