
Proceedings of the 2019 International Conference on Data Mining and Machine Learning: Latest Papers

Demand Forecasting Based on Machine Learning for Mass Customization in Smart Manufacturing
Myungsoo Kim, Jongpil Jeong, Sang-Pil Bae
Mass customization is essential for smart manufacturing. In particular, generating demand forecasts is undoubtedly the most important part of any industry. Appropriate demand forecasts improve the quality of S&OP (sales and operations planning), which contributes greatly to overall corporate management. In addition, proper stock levels can be maintained, saving the cost of maintaining multiple warehouses. In this paper, we examine why mass customization is needed in smart manufacturing and identify appropriate demand forecasting techniques by comparing the traditional time series technique, ARIMA analysis, with a nonlinear network model. Afterwards, the company develops an algorithm that evaluates the sales process by estimating the expected inventory through mathematical modelling and then finalizing the production plan.
DOI: https://doi.org/10.1145/3335656.3335658 (published 2019-04-28)
Citations: 11
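The abstract compares ARIMA with a nonlinear network model but does not describe the procedure. As a minimal sketch of how such a comparison could be scored, the Python below fits an AR(1) model by least squares (the simplest member of the ARIMA family, used here only as a stand-in) and compares its one-step-ahead holdout error with a naive last-value forecast. The function names, the toy demand series, and the choice of mean absolute error are all assumptions made for illustration, not the authors' method.

```python
def fit_ar1(series):
    """Least-squares fit of y[t] = c + phi * y[t-1]; returns (c, phi)."""
    prev, curr = series[:-1], series[1:]
    n = len(prev)
    mean_prev = sum(prev) / n
    mean_curr = sum(curr) / n
    var_prev = sum((p - mean_prev) ** 2 for p in prev)
    cov = sum((p - mean_prev) * (q - mean_curr) for p, q in zip(prev, curr))
    phi = cov / var_prev
    return mean_curr - phi * mean_prev, phi


def walk_forward(train, test, predict):
    """One-step-ahead forecasts over test, each made from the previous actual value."""
    forecasts, last = [], train[-1]
    for actual in test:
        forecasts.append(predict(last))
        last = actual
    return forecasts


def mean_absolute_error(actual, predicted):
    """Average absolute gap between actual values and forecasts."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical demand series that decays toward a steady level (toy data).
demand = [10.0, 7.0, 5.5, 4.75, 4.375, 4.1875, 4.09375, 4.046875]
train, test = demand[:6], demand[6:]

c, phi = fit_ar1(train)
ar1_error = mean_absolute_error(test, walk_forward(train, test, lambda y: c + phi * y))
naive_error = mean_absolute_error(test, walk_forward(train, test, lambda y: y))
```

Any candidate forecaster, including a neural network, can be passed to `walk_forward` as the `predict` callable, so every model is scored on the same holdout points.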
Assistant Decision-making Method of "NIMBY" Crisis Conversion in Waste Incineration Based on "Reputation and Benefit Space"
Enyuan Liu, Minxuan Li, Shengya Liu
Waste incineration power generation, a waste disposal method characterized by "reduction, harmlessness and resource utilization", is an important measure for improving the national well-being index and for achieving the goal of an overall well-off society. However, projects are difficult to put into place because of "NIMBY" (not in my back yard) opposition. To address this social problem scientifically and quantitatively, this paper constructs a network case analysis method based on a reputation and benefit space, and uses case clustering to abstract cluster centers with scientific management significance, from which the reputation and benefit space is developed. On this basis, a decision support method based on similarity calculation is constructed to support the transformation of "NIMBY" crises.
DOI: https://doi.org/10.1145/3335656.3335686 (published 2019-04-28)
Citations: 1
Research on Code Plagiarism Detection Model Based on Random Forest and Gradient Boosting Decision Tree
Huang Qiubo, Tang Jingdong, Fang Guo-zheng
This paper studies Online Judge systems for assignments such as programming exercises. Code submitted by students is sometimes plagiarized [1]. In addition to calculating the similarity degree between codes, we extract other features to determine whether a submitted code is suspected of plagiarism. By combining Random Forest and Gradient Boosting Decision Tree, we can also obtain its suspicion level. The model first calculates the similarity degree between the newly submitted code and all previously submitted codes, and determines whether plagiarism is suspected. For codes where plagiarism is difficult to confirm, we extract the programming style similarity degree, the student's submission behavior pattern (such as the similar target concentration degree), and other features to build decision tree models such as Random Forest and Gradient Boosting Decision Trees, which help determine the level of plagiarism suspicion. If the level is medium, the teacher marks the code as plagiarized or not. Finally, the learning model is trained incrementally to improve its accuracy and classification results. Experimental results show that the accuracy can reach 95.9%. As a result, the model can deter students from plagiarizing while minimizing the teacher's workload.
DOI: https://doi.org/10.1145/3335656.3335692 (published 2019-04-28)
Citations: 1
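The first stage described in the abstract, scoring a new submission against earlier ones and sorting it into a suspicion level, can be sketched as follows. This is an illustrative assumption rather than the paper's implementation: the token-based similarity uses Python's standard `difflib.SequenceMatcher`, the `low` and `high` thresholds are hypothetical, and the Random Forest and Gradient Boosting stage that the paper applies to uncertain cases is not reproduced here.

```python
import re
from difflib import SequenceMatcher


def tokenize(code):
    """Split source text into identifiers, numbers, and single symbols, ignoring whitespace."""
    return re.findall(r"[A-Za-z_]\w*|\d+|\S", code)


def similarity(code_a, code_b):
    """Similarity degree in [0, 1] based on matching runs of tokens."""
    return SequenceMatcher(None, tokenize(code_a), tokenize(code_b)).ratio()


def screen(new_code, previous_codes, low=0.5, high=0.9):
    """Return (level, best_score) for a new submission; thresholds are hypothetical."""
    best = max((similarity(new_code, old) for old in previous_codes), default=0.0)
    if best >= high:
        return "high", best
    if best >= low:
        return "medium", best
    return "low", best


# Toy submissions: the second differs from the first only in spacing.
earlier = ["total = 0\nfor i in range(10):\n    total += i"]
level, score = screen("total=0\nfor i in range(10):\n  total+=i", earlier)
```

Because whitespace is discarded during tokenization, reformatting alone does not lower the score; under the scheme in the abstract, the "medium" band is what would be passed to the teacher for a manual decision.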
Joint Transceiver Design for Fully-Duplex Cloud-Access DEINs
Qin Yu, Junliang Yu, Jie Hu, Kun Yang, Taijun Wang, Rongsheng Ding
With the rapid development of communication technology, the demand for higher data rates in communication networks keeps growing, and energy consumption is becoming an increasingly serious problem. Data and Energy Integrated communication Networks (DEINs) can transmit information and energy to a terminal simultaneously, which greatly improves terminal convenience and makes battery-free devices possible in the future. This paper studies the joint design of transceivers in a full-duplex cloud-access DEIN. The system model considers both uplink and downlink users. Because resources must be allocated jointly for the uplink and downlink, the model accounts for full-duplex operation and the self-interference it causes. The optimization goal is to minimize total power consumption under uplink and downlink SINR and EH constraints. For this non-convex optimization problem, an algorithm combining ZF beamforming and MRT beamforming is proposed. In the hybrid beamforming algorithm, the zero-forcing (ZF) beamformer and the MRT beamformer are linearly combined, which reduces the optimization of the downlink beam vector to the optimization of the combination ratio. Simulation results show that the power consumed in the half-duplex scenario is higher than in the full-duplex scenario, and that the running time of the hybrid beamforming algorithm does not change as the number of RRH antennas increases.
DOI: https://doi.org/10.1145/3335656.3335691 (published 2019-04-28)
Citations: 0
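To make the linear combination of the ZF and MRT beamformers concrete, here is a minimal sketch for a two-user downlink, which is a simplifying assumption; the paper's system has more users plus uplink, SINR, and energy-harvesting constraints that are not modelled here. MRT (maximum ratio transmission) points the beam along the target user's channel, ZF removes the component that would leak to the other user, and a single `ratio` blends the two. The channel vectors and function names are hypothetical.

```python
import math


def inner(a, b):
    """Hermitian inner product a^H b of two complex vectors."""
    return sum(x.conjugate() * y for x, y in zip(a, b))


def normalize(v):
    """Scale a vector to unit norm (unit transmit power)."""
    norm = math.sqrt(sum(abs(x) ** 2 for x in v))
    return [x / norm for x in v]


def mrt_beam(h_target):
    """MRT: align the beam with the target user's channel."""
    return normalize(h_target)


def zf_beam(h_target, h_other):
    """ZF: project the target channel onto the null space of the other user's channel."""
    scale = inner(h_other, h_target) / inner(h_other, h_other)
    return normalize([t - scale * o for t, o in zip(h_target, h_other)])


def hybrid_beam(h_target, h_other, ratio):
    """Blend ZF (ratio = 1) and MRT (ratio = 0), then renormalize."""
    zf, mrt = zf_beam(h_target, h_other), mrt_beam(h_target)
    return normalize([ratio * z + (1 - ratio) * m for z, m in zip(zf, mrt)])


# Hypothetical three-antenna channels for two downlink users.
h1 = [1 + 1j, 2 - 1j, 0.5j]
h2 = [0.5 - 0.2j, 1j, 1 + 0j]

leak_zf = abs(inner(h2, zf_beam(h1, h2)))            # interference toward user 2 under ZF
leak_mrt = abs(inner(h2, mrt_beam(h1)))              # interference under MRT
leak_hybrid = abs(inner(h2, hybrid_beam(h1, h2, 0.5)))
```

With the beam direction reduced to one scalar, a search over `ratio` in [0, 1] replaces a search over the full beam vector, which is the simplification the abstract describes.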
The extraction research on evaluation rules for students based on discernibility matrix
Fan Yan-ying, Zhang Zi-min, Chen Guan-ping, Zheng Shi-yong
We apply rough set theory to the process of evaluating students. First, we build the evaluation decision information table and use the discernibility matrix of rough set theory to perform attribute reduction on the evaluation data, thereby removing unnecessary evaluation indicators. On this basis, we perform value reduction and rule extraction, and then mine general student evaluation rules from large amounts of evaluation data to provide a decision basis for student work at school. This evaluation process lets the data speak and reduces the influence of human factors; as a result, the evaluation outcome is more objective and fair.
DOI: https://doi.org/10.1145/3335656.3335680 (published 2019-04-28)
Citations: 0
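The discernibility matrix step can be illustrated with a small sketch. For every pair of records with different decisions, the matrix entry is the set of condition attributes on which the two records differ; attributes that appear alone in some entry form the core, which no reduct can drop. The evaluation table, attribute names, and function names below are hypothetical and chosen only to show the mechanics.

```python
from itertools import combinations


def discernibility_matrix(rows, condition_attrs, decision_attr):
    """Map each pair of rows with different decisions to the attributes that tell them apart."""
    matrix = {}
    for (i, a), (j, b) in combinations(enumerate(rows), 2):
        if a[decision_attr] != b[decision_attr]:
            matrix[(i, j)] = frozenset(k for k in condition_attrs if a[k] != b[k])
    return matrix


def core_attributes(matrix):
    """Attributes that are the only way to discern some pair, so they must be kept."""
    return {next(iter(entry)) for entry in matrix.values() if len(entry) == 1}


# Hypothetical student evaluation table (toy data).
students = [
    {"attendance": "high", "homework": "good", "exam": "pass", "rating": "excellent"},
    {"attendance": "high", "homework": "good", "exam": "fail", "rating": "average"},
    {"attendance": "low", "homework": "good", "exam": "pass", "rating": "excellent"},
    {"attendance": "low", "homework": "poor", "exam": "fail", "rating": "average"},
]
conditions = ["attendance", "homework", "exam"]

matrix = discernibility_matrix(students, conditions, "rating")
core = core_attributes(matrix)
```

In this toy table the first two students differ only in `exam`, so `exam` lands in the core, while `attendance` and `homework` never appear alone and are candidates for removal during attribute reduction.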
Spatio-temporal Changes of Tourists Based on Multi-source data in Chengdu
R. Yuan
The popularity of the mobile internet accelerates the dissemination and exchange of information and changes how tourists obtain it. Tourists no longer rely on officially published travel brochures and TV programs for tourism information. Through Twitter, Sina Weibo, Facebook and other We-Media channels, tourists can get first-hand information about a destination. Large volumes of GPS trajectory data, such as taxi trajectory data and mobile signaling data, are generated by widely deployed GPS sensors and have been widely used in research on traffic and resident travel. Because tourists are unfamiliar with the road layout and traffic rules of the destination city, taxis are an important mode of travel for non-local tourists, and taxi OD (origin-destination) points reflect tourists' travel needs and travel characteristics. This paper therefore applies taxi data to tourism research. In our study, the CFSDPF clustering algorithm is used to cluster Sina Weibo data into tourism ROIs (regions of interest), and the tourism ROIs are then used to cluster taxi OD data. Multi-source data can reflect the travel characteristics of tourists fully and accurately. At two scales, citywide and central city, we comprehensively analyze the relationship between the travel characteristics of tourists in Chengdu and the tourism ROIs.
DOI: https://doi.org/10.1145/3335656.3335696 (published 2019-04-28)
Citations: 1
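The abstract does not say how taxi OD points are grouped by tourism ROI. One plausible reading, offered here purely as an assumption, is to attach each OD point to the nearest ROI center within a fixed radius. The sketch below uses the haversine great-circle distance; the ROI names, coordinates, and the 1 km radius are made up for illustration, and the clustering step that produces the ROIs is not shown.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def assign_to_roi(point, roi_centers, max_km=1.0):
    """Return the name of the nearest ROI within max_km of the point, or None."""
    best_name, best_dist = None, max_km
    for name, (lat, lon) in roi_centers.items():
        dist = haversine_km(point[0], point[1], lat, lon)
        if dist <= best_dist:
            best_name, best_dist = name, dist
    return best_name


# Hypothetical ROI centers and taxi drop-off points (made-up coordinates).
roi_centers = {"roi_a": (30.66, 104.06), "roi_b": (30.70, 104.10)}
dropoffs = [(30.661, 104.061), (30.699, 104.101), (31.50, 105.00)]

assignments = [assign_to_roi(p, roi_centers) for p in dropoffs]
```

Counting assignments per ROI and per hour would then give a simple spatio-temporal profile of taxi arrivals around each region of interest.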
Simulation for Agglomeration Effect of Internet Crowdfunding Model
Yunjie Ji, YanXia Zhu
Crowdfunding has become an important channel for transforming innovation achievements. How crowdfunding can develop healthily and rapidly is a hot topic of academic research. This paper uses a multi-agent system to simulate the agglomeration effect in the development of the crowdfunding model. It finds that properly supporting superior enterprises or high-quality projects, and concentrating resources to stimulate innovation and transformation, helps raise the overall development level of the crowdfunding system without reducing the stability of its operation.
DOI: https://doi.org/10.1145/3335656.3335682 (published 2019-04-28)
Citations: 0
Research on Vehicle Identification Method Based on Computer Vision
Zhou Yan, Deming Yuan, Zhou Jun
Identifying the vehicle ahead on the road is an important research topic for active safety and intelligent driving. A vehicle identification algorithm based on computer vision is proposed, using the supervised machine learning algorithm AdaBoost and Haar-like features. First, for feature selection, dimensionality is reduced along two aspects, feature type and feature size, and the integral image is used to accelerate the calculation of Haar-like feature values. Second, a more efficient classifier is constructed from a small number of effective features, and a single strong classifier is used to identify and verify the vehicle ahead. Finally, the complete vehicle identification algorithm is tested on 350 frames captured from a highway video set and 450 frames captured from an urban road video set. The results show that the algorithm has a high detection rate and a low detection error rate.
DOI: https://doi.org/10.1145/3335656.3335700 (published 2019-04-28)
Citations: 3
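The speed-up from the integral image mentioned in the abstract can be shown in a few lines. Once the cumulative table is built, the sum of any rectangle costs four lookups, so a Haar-like feature (a difference of adjacent rectangle sums) is cheap regardless of its size. The sketch below is a generic illustration with a toy image and hypothetical function names; it does not reproduce the paper's feature set or its AdaBoost training.

```python
def integral_image(image):
    """Cumulative sum table with one extra zero row and column at the top and left."""
    rows, cols = len(image), len(image[0])
    table = [[0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        for c in range(cols):
            table[r + 1][c + 1] = image[r][c] + table[r][c + 1] + table[r + 1][c] - table[r][c]
    return table


def rect_sum(table, top, left, height, width):
    """Sum of pixels in a rectangle using four table lookups."""
    bottom, right = top + height, left + width
    return table[bottom][right] - table[top][right] - table[bottom][left] + table[top][left]


def haar_two_rect_horizontal(table, top, left, height, width):
    """Two-rectangle Haar-like feature: left half sum minus right half sum (width must be even)."""
    half = width // 2
    left_sum = rect_sum(table, top, left, height, half)
    right_sum = rect_sum(table, top, left + half, height, half)
    return left_sum - right_sum


# Toy 3x4 grayscale patch.
patch = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
]
table = integral_image(patch)
feature = haar_two_rect_horizontal(table, 0, 0, 3, 4)
```

A boosted classifier would evaluate many such features at different positions and scales, which is why the constant-time rectangle sum matters.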
The Design of Word Cloud Rendering Platform and Its Application on Measuring Systematic Financial Risks
Shifen Wang, Yining Sun
With the development of the Internet, the volume of data produced each day keeps increasing, as does the value contained in the data. Meanwhile, the difficulty of data mining and the complexity of data analysis have risen sharply. A new data processing system is urgently needed, especially in the macroeconomic field. A word cloud is a popular way to visualize hot topics. This paper first explains the design of a distributed, batch-based website word cloud rendering platform, which combines big data processing with traditional web crawler design to collect all the information on a website and present it as a word cloud. The platform is then put into practice and applied to the measurement of systemic financial risks.
DOI: https://doi.org/10.1145/3335656.3335698 (published 2019-04-28)
Citations: 0
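The data behind a word cloud is a table of word frequencies mapped to display sizes. The sketch below shows that final step only, under assumptions of its own: the stop-word list, the linear size scaling, and the sample sentence are illustrative, and the distributed crawling and batch processing described in the abstract are out of scope.

```python
import re
from collections import Counter

# Illustrative subset of stop words; a real list would be much longer.
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "are", "on"}


def word_frequencies(text, stop_words=STOP_WORDS):
    """Count lowercase alphabetic words, skipping stop words."""
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(w for w in words if w not in stop_words)


def font_sizes(frequencies, min_size=12, max_size=48):
    """Scale each word's size linearly with its count relative to the most frequent word."""
    if not frequencies:
        return {}
    top = max(frequencies.values())
    return {
        word: min_size + (max_size - min_size) * count / top
        for word, count in frequencies.items()
    }


# Toy text standing in for crawled page content.
text = "Risk and credit risk: the risk of default is rising"
frequencies = word_frequencies(text)
sizes = font_sizes(frequencies)
```

A rendering layer would then place each word at its computed size; in a batch setting, per-page counters can be merged with `Counter.update` before sizing.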
Analysis and Research on the Use Situation of Public Bicycles Based on Spark Machine Learning
Chengang Li, Yu Liu, Chengcheng Li
Public bicycles are a healthy and environmentally friendly means of transportation that makes travel easier. However, because urban travel is uncertain, and especially because of the tidal phenomenon, public bicycles are often "difficult to borrow" and "difficult to return". This leads to an unreasonable distribution of stations in the public bicycle system, unbalanced bicycle flows across stations during peak hours, and unbalanced operation and management, all of which restrict the development of public bicycles. This paper uses data from the San Francisco Bay Area as its experimental data. Spark SQL and Spark DataFrame are used to analyze how users and stations use public bicycles, including the impact of different user types on usage, and the K-means clustering algorithm is used to analyze station usage. Based on the Spark MLlib machine learning library, the gradient usage algorithm is used to predict daily usage.
DOI: https://doi.org/10.1145/3335656.3335704 (published 2019-04-28)
Citations: 1
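The paper runs K-means through Spark; the plain-Python sketch below is a stand-in that shows the same algorithm (Lloyd's iterations of assignment and mean update) on a handful of stations, so it can be read without a cluster. The two features per station, the sample values, and the caller-supplied initial centers are assumptions for illustration only.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, initial_centers, iterations=10):
    """Lloyd's algorithm: assign points to the nearest center, then move centers to the mean."""
    centers = [tuple(c) for c in initial_centers]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: index of the closest center for every point.
        labels = [
            min(range(len(centers)), key=lambda k: squared_distance(p, centers[k]))
            for p in points
        ]
        # Update step: mean of each cluster; an empty cluster keeps its old center.
        new_centers = []
        for k, old in enumerate(centers):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:
                new_centers.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                new_centers.append(old)
        centers = new_centers
    return centers, labels


# Hypothetical stations described by (morning departures, evening departures), toy values.
stations = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.2), (8.0, 8.0), (8.2, 7.8), (7.8, 8.2)]
centers, labels = kmeans(stations, [(0.0, 0.0), (10.0, 10.0)])
```

Stations that end up in the same cluster share a usage pattern, which is the kind of grouping the abstract uses to analyze station usage.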