2019 IEEE 35th International Conference on Data Engineering Workshops (ICDEW): Latest Papers

Big Stream Processing Systems: An Experimental Evaluation
Pub Date : 2019-04-01 DOI: 10.1109/ICDEW.2019.00-35
E. Shahverdi, Ahmed Awad, S. Sakr
As the world gets more instrumented and connected, we are witnessing a flood of digital data generated from various hardware (e.g., sensors) or software in the format of flowing streams of data. Real-time processing for such massive amounts of streaming data is a crucial requirement in several application domains including financial markets, surveillance systems, manufacturing, smart cities, and scalable monitoring infrastructure. In the last few years, several big stream processing engines have been introduced to tackle this challenge. In this article, we present an extensive experimental study of five popular systems in this domain, namely, Apache Storm, Apache Flink, Apache Spark, Kafka Streams and Hazelcast Jet. We report and analyze the performance characteristics of these systems. In addition, we report a set of insights and important lessons that we have learned from conducting our experiments.
Citations: 20
Multi-camera Background and Scene Activity Modelling Based on Spearman Correlation Analysis and Inception-V3 Network
Pub Date : 2019-04-01 DOI: 10.1109/ICDEW.2019.00058
Keyang Cheng, Muhammad Saddam Khokhar, Yunbo Rao, Rabia Tahir
This paper introduces a novel approach to background and scene activity modelling that combines Spearman correlation analysis with a customized deep learning model. It detects and provides correlated analytics between casual and temporal regional activities on the basis of similarities and primary dissimilarities in the same scene captured by several cameras. The experiment is carried out on four overlapping videos captured inside a hall by four cameras. As detected and analyzed by our model, 17.32% of the correlated co-occurrences are actual correlations among all videos. The remaining 82.68% of the videos is background that shows similar and repetitive features in the Spearman rank tied results. Simulation results demonstrate that the proposed method can detect high correlation among all activities during the frame rate, with the ability to handle tied features.
Citations: 4
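The multi-camera entry above relies on Spearman rank correlation with tied ranks. The abstract does not give the authors' implementation, so the following is only a minimal sketch of the standard statistic: it ranks two series, assigns tied values the mean of their positions, and takes the Pearson correlation of the ranks. The function names and the idea of feeding it per-frame activity scores from two cameras are assumptions made for illustration.

```python
from typing import Sequence


def average_ranks(values: Sequence[float]) -> list[float]:
    """Rank values from 1..n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (var_x * var_y) ** 0.5


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation of the tie-averaged ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    # Hypothetical per-frame activity scores from two cameras (not paper data).
    camera_a = [0.1, 0.4, 0.4, 0.9, 0.2]
    camera_b = [0.2, 0.5, 0.3, 0.8, 0.1]
    print(spearman(camera_a, camera_b))
```

Computing Pearson on averaged ranks, rather than using the shortcut formula based on squared rank differences, keeps the result correct when ties are present.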
Incorporating Latent Space Correlation Coefficients to Collaborative Filtering
Pub Date : 2019-04-01 DOI: 10.1109/ICDEW.2019.00-17
Zongxi Li, Haoran Xie, Yingchao Zhao, Qing Li
Collaborative Filtering (CF) is a popular approach that predicts a target user's rating of an item by aggregating neighbor users' ratings; these ratings are weighted by a correlation coefficient between the two users. Thus, user-user similarity computation is a significant step in CF for selecting a proper neighborhood and exploiting suitable correlation coefficients for prediction, and multiple weighting techniques have been proposed to enhance performance. However, existing approaches compute the similarity directly from users' rating vectors, which may cause the system to suffer from a severe low-sparsity problem and also makes it less interpretable, because a rating only represents a user's preference for a certain item and does not include extra feature information such as attributes or genres. In this paper, we propose a method that computes users' correlations in latent space by incorporating the matrix factorization (MF) technique and exploits these correlation coefficients in the prediction step of CF. We have evaluated the proposed approach against variant methods on the MovieLens dataset to validate its effectiveness in CF.
Citations: 1
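The collaborative filtering entry above describes computing user-user correlations in latent space and using them as weights in the CF prediction step. The abstract does not specify the exact formulas, so this is a hedged sketch under stated assumptions: the user latent vectors are taken as already produced by some matrix factorization step (not shown), similarity is the Pearson correlation of those vectors, and prediction uses the common mean-centered weighted neighborhood formula. All names and the toy data are hypothetical.

```python
from math import sqrt

Ratings = dict[str, dict[str, float]]   # user -> item -> rating
Factors = dict[str, list[float]]        # user -> latent vector (assumed from MF)


def pearson(u: list[float], v: list[float]) -> float:
    """Pearson correlation of two latent vectors; 0.0 if either is constant."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    num = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    den = sqrt(sum((a - mean_u) ** 2 for a in u) * sum((b - mean_v) ** 2 for b in v))
    return num / den if den else 0.0


def mean_rating(ratings: Ratings, user: str) -> float:
    return sum(ratings[user].values()) / len(ratings[user])


def predict(target: str, item: str, ratings: Ratings, factors: Factors, k: int = 2) -> float:
    """Mean-centered neighborhood prediction weighted by latent-space correlation."""
    base = mean_rating(ratings, target)
    # Candidate neighbors: every other user who has rated the item.
    sims = [
        (pearson(factors[target], factors[u]), u)
        for u in ratings
        if u != target and item in ratings[u]
    ]
    top = sorted(sims, reverse=True)[:k]  # k most similar neighbors
    den = sum(abs(s) for s, _ in top)
    if den == 0:
        return base  # no informative neighbor: fall back to the user's mean
    num = sum(s * (ratings[u][item] - mean_rating(ratings, u)) for s, u in top)
    return base + num / den


if __name__ == "__main__":
    # Toy data for illustration only.
    factors = {"a": [1.0, 2.0, 3.0], "b": [2.0, 4.0, 6.0], "c": [3.0, 2.0, 1.0]}
    ratings = {
        "a": {"i1": 4, "i2": 2},
        "b": {"i1": 5, "i2": 3, "i3": 5},
        "c": {"i1": 1, "i2": 5, "i3": 2},
    }
    print(predict("a", "i3", ratings, factors, k=1))
```

Because similarity is computed on latent vectors rather than on raw rating vectors, two users can be compared even when they have rated few items in common, which is the motivation the abstract gives.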
VQL: Providing Query Efficiency and Data Authenticity in Blockchain Systems
Pub Date : 2019-04-01 DOI: 10.1109/ICDEW.2019.00-44
Zhe Peng, Haotian Wu, Bin Xiao, Songtao Guo
Blockchain, as the underlying technique of cryptocurrency, has triggered a wave of innovation in decentralized computing. Despite some research on blockchain data query, a primary obstacle to making blockchain fully practical is inefficient data querying and the lack of guaranteed authenticity for query results. To provide both efficient and verifiable data query services for blockchain-based systems, we propose a Verifiable Query Layer (VQL). The middleware layer extracts transactions stored in the underlying blockchain system and efficiently reorganizes them in databases to provide various query services for public users. To prevent falsified data from being stored in the middleware, a cryptographic hash value is calculated for each constructed database. The database fingerprint, which includes the hash value and some database properties, is first verified by miners and then stored in the blockchain. We implement VQL and conduct extensive experiments based on a practical blockchain system, Ethereum. The evaluation results demonstrate that VQL can effectively support various data query services and guarantee the authenticity of query results for the blockchain system.
Citations: 27
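The VQL entry above mentions a database fingerprint made of a cryptographic hash plus some database properties. The abstract does not say which hash function, serialization, or properties VQL uses, so the sketch below is purely illustrative: SHA-256 over a canonical JSON serialization, with a row count and block range as the assumed properties. The field names (`tx_hash`, `first_block`, `last_block`) are hypothetical.

```python
import hashlib
import json


def database_fingerprint(rows: list[dict], first_block: int, last_block: int) -> dict:
    """Build a fingerprint: a hash of the rows plus a few descriptive properties.

    Rows are sorted by an assumed 'tx_hash' key and serialized with sorted keys,
    so the same set of rows always yields the same digest regardless of order.
    """
    canonical = json.dumps(
        sorted(rows, key=lambda r: r["tx_hash"]),
        sort_keys=True,
        separators=(",", ":"),
    )
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return {
        "hash": digest,
        "row_count": len(rows),
        "first_block": first_block,
        "last_block": last_block,
    }


def verify(rows: list[dict], fingerprint: dict) -> bool:
    """Recompute the fingerprint from the rows and compare it to the stored one."""
    recomputed = database_fingerprint(
        rows, fingerprint["first_block"], fingerprint["last_block"]
    )
    return recomputed == fingerprint


if __name__ == "__main__":
    # Toy transactions for illustration only.
    rows = [
        {"tx_hash": "0xb2", "from": "alice", "to": "bob", "value": 5},
        {"tx_hash": "0xa1", "from": "bob", "to": "carol", "value": 2},
    ]
    fp = database_fingerprint(rows, first_block=100, last_block=101)
    print(verify(rows, fp))
```

In this sketch a client that trusts the on-chain fingerprint can detect a tampered middleware database by recomputing the digest; how miners verify the fingerprint before it is stored is outside what the abstract describes.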
A Framework for Self-Managing Database Systems
Pub Date : 2019-04-01 DOI: 10.1109/ICDEW.2019.00-27
Jan Kossmann, R. Schlosser
Database systems that autonomously manage their configuration and physical database design face numerous challenges: They need to anticipate future workloads, find satisfactory and robust configurations efficiently, and learn from recent actions. We describe a component-based framework for self-managed database systems to facilitate development and database integration with low overhead by relying on a clear separation of concerns. Our framework results in exchangeable and reusable components, which simplify experiments and promote further research. Furthermore, we propose an LP-based algorithm to find an efficient order to tune multiple dependent features in a recursive way.
Citations: 6
Guided Bayesian Optimization to AutoTune Memory-Based Analytics
Pub Date : 2019-04-01 DOI: 10.1109/ICDEW.2019.00-22
Mayuresh Kunjir
There is a lot of interest today in building autonomous (or self-driving) data processing systems. An emerging school of thought is to leverage the "black-box" algorithm of Bayesian Optimization for problems of this flavor, both for its wider applicability and for its theoretical guarantees on the quality of the results produced. The black-box approach, however, can be time- and labor-intensive, or can get stuck in a local minimum. We study the important problem of auto-tuning the memory allocation for applications running on modern distributed data processing systems. A simple "white-box" model is developed which can quickly separate good configurations from bad ones. To combine the benefits of the two approaches to tuning, we build a framework called Guided Bayesian Optimization (GBO) that uses the white-box model as a guide during the Bayesian Optimization exploration process. An evaluation carried out on Apache Spark using industry-standard benchmark applications shows that GBO consistently provides performance speedups across the application workload, with the magnitude of savings being close to 2x.
Citations: 3
Reducing Forks in the Blockchain via Probabilistic Verification
Pub Date : 2019-04-01 DOI: 10.1109/ICDEW.2019.00-42
Bing Liu, Yang Qin, X. Chu
Blockchain is a disruptive technique that finds many applications in FinTech, IoT, and the token economy. Because of the asynchrony of the network, the competition in mining, and the nondeterministic block propagation delay, forks in the blockchain occur frequently, which not only wastes a lot of computing resources but also results in potential security issues. This paper introduces PvScheme, a probabilistic verification scheme that can effectively reduce the block propagation delay and hence reduce the occurrence of blockchain forks. We further enhance the security of PvScheme to provide reliable block delivery. We also analyze the resistance of PvScheme to fake blocks and double-spending attacks. The results of several comparative experiments show that our scheme can indeed reduce forks and improve blockchain performance.
Citations: 13
[Copyright notice]
Pub Date : 2019-04-01 DOI: 10.1109/icdew.2019.00003
Citations: 0
Driving Big Data: A First Look at Driving Behavior via a Large-Scale Private Car Dataset
Pub Date : 2019-04-01 DOI: 10.1109/ICDEW.2019.00-34
Tong Li, A. Alhilal, Anlan Zhang, M. A. Hoque, Dimitris Chatzopoulos, Zhu Xiao, Yong Li, P. Hui
The increasing number of privately owned vehicles in large metropolitan cities has contributed to traffic congestion, increased energy waste, raised CO2 emissions, and impacted our living conditions negatively. Analysis of data representing citizens' driving behavior can provide insights to reverse these conditions. This article presents a large-scale driving status and trajectory dataset consisting of 426,992,602 records collected from 68,069 vehicles over a month. From the dataset, we analyze the driving behavior and produce random distributions of trip duration and mileage to characterize car trips. We have found that a private car has more than a 17% probability of making four trips per day, and that a trip has more than a 25% probability of lasting 20-30 minutes and a 33% probability of covering 10 kilometers. The collective distributions of trip mileage and duration follow the Weibull distribution, whereas the hourly trips follow the well-known diurnal pattern, as does the hourly fuel efficiency. Based on these findings, we have developed an application that recommends nearby gas stations and possible favorite places from past trips to drivers. We further highlight that our dataset can be applied to developing dynamic Green maps for fuel-efficient routing, modeling efficient Vehicle-to-Vehicle (V2V) communications, verifying existing V2V protocols, and understanding user behavior in driving private cars.
Citations: 11
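The driving-behavior entry above reports that trip mileage and duration follow a Weibull distribution. As a small illustration of what that implies, the sketch below evaluates the standard two-parameter Weibull density and cumulative distribution and uses them to compute the probability that a trip falls in a given duration range. The shape and scale values in the usage example are made up for demonstration and are not the parameters fitted in the paper.

```python
from math import exp


def weibull_pdf(x: float, shape: float, scale: float) -> float:
    """Two-parameter Weibull density f(x) = (k/s) * (x/s)^(k-1) * exp(-(x/s)^k).

    Returns 0.0 for x <= 0, which sidesteps the boundary behaviour at x = 0.
    """
    if x <= 0:
        return 0.0
    z = x / scale
    return (shape / scale) * z ** (shape - 1) * exp(-(z ** shape))


def weibull_cdf(x: float, shape: float, scale: float) -> float:
    """Cumulative distribution F(x) = 1 - exp(-(x/s)^k) for x > 0, else 0."""
    if x <= 0:
        return 0.0
    return 1.0 - exp(-((x / scale) ** shape))


def interval_probability(lo: float, hi: float, shape: float, scale: float) -> float:
    """Probability that a Weibull-distributed value lies in (lo, hi]."""
    return weibull_cdf(hi, shape, scale) - weibull_cdf(lo, shape, scale)


if __name__ == "__main__":
    # Hypothetical parameters for trip duration in minutes (not from the paper).
    shape, scale = 1.5, 30.0
    print(interval_probability(20.0, 30.0, shape, scale))
```

With shape equal to 1 the Weibull reduces to the exponential distribution, which is a convenient sanity check for any implementation.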
TVDP: Translational Visual Data Platform for Smart Cities
Pub Date : 2019-04-01 DOI: 10.1109/ICDEW.2019.00-36
S. H. Kim, Abdullah Alfarrarjeh, G. Constantinou, C. Shahabi
This paper proposes a platform, dubbed "Translational Visual Data Platform (TVDP)", to collect, manage, and analyze urban visual data. It enables connected community members not only to enhance their individual operations but also to smartly incorporate visual data acquisition, access, and analysis methods and results among themselves. Specifically, we focus on geo-tagged visual data, since location information is essential in many smart city applications and provides a fundamental connection for managing and sharing data among collaborators. Furthermore, our study targets an image-based machine learning platform to prepare users for the upcoming era of machine learning (ML) and artificial intelligence (AI) applications. TVDP will be used to pilot, test, and apply various visual data-intensive applications in a collaborative way. New data, methods, and extracted knowledge from one application can be effectively translated into other applications, ultimately making visual data and analysis a smart city infrastructure. The goal is to make value creation through visual data and their analysis as broadly available as possible, and thus to make social and economic problem solving more distributed and collaborative among users. This paper reports the design and in-progress implementation of TVDP, along with partial experimental results that demonstrate its feasibility.
Citations: 8