Complex Event Processing (CEP) systems enable the extraction of higher-level information from real-time data streams produced by distributed sources. However, these systems are subject to changes in the user environment, e.g., the density of sources, the rate at which events occur, and the mobility of sources. It therefore becomes difficult to satisfy stringent performance requirements, posed as Quality of Service (QoS) demands, in such a dynamic environment. This work investigates the adaptive use of CEP mechanisms, e.g., operator placement and operator migration, by supporting transitions, i.e., the dynamic exchange of these mechanisms. In particular, we build a transition-capable CEP system, Tcep, to enable the integration of multiple heterogeneous CEP mechanisms and allow cost-efficient and seamless transitions between them. As a proof of concept, we have recently designed and developed an initial architecture named Tcep, with which we have shown the benefits of transitions among operator placement mechanisms. In ongoing research, we explore other CEP mechanisms, e.g., operator migration, and investigate whether transitions can bring performance benefits under the execution of different strategies. In the future, we will investigate whether mechanism transitions in CEP are beneficial in middleware infrastructures, including information-centric networks.
{"title":"Adapting to Dynamic User Environments in Complex Event Processing System using Transitions","authors":"Manisha Luthra","doi":"10.1145/3210284.3226051","DOIUrl":"https://doi.org/10.1145/3210284.3226051","journal":{"name":"Proceedings of the 12th ACM International Conference on Distributed and Event-based Systems"},"publicationDate":"2018-06-25","platform":"Semanticscholar","paperid":"123406844"}
Ciprian Amariei, Paul Diac, Emanuel Onica, Valentin Rosca
The 2018 Grand Challenge targets the problem of accurate predictions on data streams produced by automatic identification system (AIS) equipment, which describe naval traffic. This paper reports the technical details of a custom solution that exposes multiple tuning parameters, making configurability one of its main strengths. Our solution employs a cell grid architecture, essentially based on a sequence of hash tables, built specifically for the targeted use case. This makes it particularly effective for prediction on AIS data, achieving high accuracy and scalable performance. Moreover, the proposed architecture also accommodates an optional semi-supervised learning process in addition to the basic supervised mode.
{"title":"Cell Grid Architecture for Maritime Route Prediction on AIS Data Streams","authors":"Ciprian Amariei, Paul Diac, Emanuel Onica, Valentin Rosca","doi":"10.1145/3210284.3220503","DOIUrl":"https://doi.org/10.1145/3210284.3220503","journal":{"name":"Proceedings of the 12th ACM International Conference on Distributed and Event-based Systems"},"publicationDate":"2018-06-25","platform":"Semanticscholar","paperid":"126026349"}
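The cell grid idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the cell sizes, the choice of key (grid cell plus departure port), the fine-to-coarse fallback, and all names (`CellGrid`, `cell_of`, `train`, `predict`) are assumptions made for illustration. It only shows how a sequence of hash tables keyed by grid cell could back a destination prediction.

```python
import math
from collections import Counter, defaultdict


def cell_of(lat, lon, size):
    """Map a position to the integer index of its grid cell (size in degrees)."""
    return (math.floor(lat / size), math.floor(lon / size))


class CellGrid:
    """Hypothetical sequence of hash tables, ordered from fine to coarse cells."""

    def __init__(self, sizes=(0.1, 1.0)):
        # One hash table per cell size (assumed tuning parameter).
        self.sizes = sizes
        self.tables = [defaultdict(Counter) for _ in sizes]

    def train(self, lat, lon, departure_port, arrival_port):
        # Supervised mode: count observed arrival ports per (cell, departure port).
        for size, table in zip(self.sizes, self.tables):
            table[(cell_of(lat, lon, size), departure_port)][arrival_port] += 1

    def predict(self, lat, lon, departure_port):
        # Try the finest table first and fall back to coarser ones.
        for size, table in zip(self.sizes, self.tables):
            counts = table.get((cell_of(lat, lon, size), departure_port))
            if counts:
                return counts.most_common(1)[0][0]
        return None  # no training data for this region


grid = CellGrid()
grid.train(10.05, 20.05, "A", "B")
grid.train(10.05, 20.05, "A", "B")
grid.train(10.05, 20.05, "A", "C")
print(grid.predict(10.06, 20.04, "A"))  # same fine cell as the training points
print(grid.predict(10.7, 20.3, "A"))    # only the coarse cell matches
```

In this sketch the cell sizes play the role of the tuning parameters the abstract mentions; a real system would likely expose more of them.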
{"title":"Proceedings of the 12th ACM International Conference on Distributed and Event-based Systems","doi":"10.1145/3210284","DOIUrl":"https://doi.org/10.1145/3210284","journal":{"name":"Proceedings of the 12th ACM International Conference on Distributed and Event-based Systems"},"publicationDate":"2018-06-25","platform":"Semanticscholar","paperid":"114822523"}
Manisha Luthra, B. Koldehofe, P. Weisenburger, G. Salvaneschi, Raheel Arif
Operator placement has a profound impact on the performance of a distributed complex event processing (DCEP) system. Since the behavior of a placement mechanism strongly depends on its environment, a single placement mechanism is often not enough to fulfill stringent performance requirements under environmental changes. In this paper, we show how DCEP can benefit from the adaptive use of multiple placement mechanisms. We propose Tcep, a DCEP system that integrates multiple placement mechanisms. By enabling transitions, Tcep can seamlessly exchange distinct operator mechanisms at runtime. We make two main contributions that are highly important for a cost-efficient transition: i) a transition strategy for efficiently scheduling state migrations and ii) a lightweight learning algorithm to adaptively select an appropriate placement mechanism as a consequence of a transition. Our evaluations of important decentralized placement mechanisms in the context of an IoT scenario show that transitions can better fulfill QoS demands in a dynamic environment. Moreover, efficient scheduling of state migrations can speed up the completion of transitions by up to 94%.
{"title":"TCEP","authors":"Manisha Luthra, B. Koldehofe, P. Weisenburger, G. Salvaneschi, Raheel Arif","doi":"10.1145/3210284.3210292","DOIUrl":"https://doi.org/10.1145/3210284.3210292","journal":{"name":"Proceedings of the 12th ACM International Conference on Distributed and Event-based Systems"},"publicationDate":"2018-06-25","platform":"Semanticscholar","paperid":"126152670"}
Nicolo Rivetti, Nikos Zacheilas, A. Gal, V. Kalogeraki
In a networked world, events are transmitted from multiple distributed sources into CEP systems, where events are related to one another along multiple dimensions, e.g., temporal and spatial, to create complex events. The big data era brought with it an increase in the scale and frequency of event reporting. The Internet of Things adds another layer of complexity with multiple, continuously changing event sources, not all of which are perfectly reliable and which often suffer from late arrivals. In this work, we propose a probabilistic model to deal with the problem of reduced reliability of event arrival time. We use statistical theories to fit the distributions of inter-generation delays at the source and of network delays, per event type. Equipped with these distributions, we propose a predictive method for determining whether an event belonging to a window has yet to arrive. Given user-defined tolerance levels (on quality and timeliness), we propose an algorithm for dynamically determining the amount of time a complex event time-window should remain open. Using a thorough empirical analysis, we compare the proposed algorithm against state-of-the-art mechanisms for delayed arrival of events and show the superiority of our proposed method.
{"title":"Probabilistic Management of Late Arrival of Events","authors":"Nicolo Rivetti, Nikos Zacheilas, A. Gal, V. Kalogeraki","doi":"10.1145/3210284.3210293","DOIUrl":"https://doi.org/10.1145/3210284.3210293","journal":{"name":"Proceedings of the 12th ACM International Conference on Distributed and Event-based Systems"},"publicationDate":"2018-06-25","platform":"Semanticscholar","paperid":"134198586"}
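To make the late-arrival abstract above more concrete, the following sketch shows one simple way a fitted delay distribution could determine how long a window stays open. It is an illustration under strong assumptions, not the paper's algorithm: delays are assumed to be exponentially distributed, the only tolerance modeled is the acceptable probability of missing a late event, and the function names (`fit_mean_delay`, `wait_time`, `window_extension`) are hypothetical.

```python
import math
from statistics import mean


def fit_mean_delay(samples):
    """Fit an exponential delay model; its single parameter is the mean delay."""
    return mean(samples)


def wait_time(mean_delay, miss_tolerance):
    """Smallest wait t with P(delay > t) <= miss_tolerance.

    For an exponential delay, P(delay > t) = exp(-t / mean_delay),
    so t = -mean_delay * ln(miss_tolerance).
    """
    if not 0 < miss_tolerance < 1:
        raise ValueError("miss_tolerance must be strictly between 0 and 1")
    return -mean_delay * math.log(miss_tolerance)


def window_extension(delays_by_type, miss_tolerance):
    """Keep the window open long enough for the slowest event type."""
    return max(
        wait_time(fit_mean_delay(samples), miss_tolerance)
        for samples in delays_by_type.values()
    )


# Observed delays in seconds per event type (made-up illustration data).
observed = {"temperature": [1.0, 3.0], "position": [0.5, 0.5]}
print(window_extension(observed, miss_tolerance=0.05))
```

A tighter tolerance yields a longer wait, which is the quality versus timeliness trade-off the abstract refers to; the exponential model is only the simplest choice that gives a closed form.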
Marco Balduini, Sivam Pasupathipillai, Emanuele Della Valle
Distributed systems have become the preferred solution for dealing with Big Data analysis tasks. These systems are able to achieve superior performance by managing a large pool of resources as a single entity. However, in many contexts, performance is not the only metric to consider. When comparing two performance-equivalent solutions, their cost becomes an important factor. Distributed systems are usually more expensive to deploy than traditional single-threaded applications. In this work, we build on these considerations by presenting an empirical study that compares the cost of two performance-equivalent solutions for a real streaming data analysis task in the telecommunication industry. The first solution is built on a popular distributed processing engine (Apache Spark), while the second is a single-threaded application built on a home-brew stream processing framework (Natron). We show that, in the case of continuous analysis, the benefits of distributed processing are outweighed by the distributed data ingestion costs. This is also the case for periodic analysis. However, if data ingestion costs are fixed and small, we show that the most cost-effective solution depends on the dataset size.
{"title":"Cost-Aware Streaming Data Analysis: Distributed vs Single-Thread","authors":"Marco Balduini, Sivam Pasupathipillai, Emanuele Della Valle","doi":"10.1145/3210284.3210294","DOIUrl":"https://doi.org/10.1145/3210284.3210294","journal":{"name":"Proceedings of the 12th ACM International Conference on Distributed and Event-based Systems"},"publicationDate":"2018-06-25","platform":"Semanticscholar","paperid":"124679672"}
Distributed event-sourced systems adopt a fairly new architectural style for data-intensive applications that maintains the full history of the application state. However, the performance implications of such systems are not yet well explored, let alone how their performance can be improved. A central issue is the lack of systematic performance engineering approaches that take into account the specific characteristics of these systems. To address this problem, we suggest a methodology for performance engineering and performance analysis of distributed event-sourced systems based on specific measurements and subsequent, targeted optimizations. The methodology blends well into existing software engineering processes and helps developers identify bottlenecks and resolve performance issues. Using our structured approach, we improved an existing event-sourced system prototype and increased its performance considerably.
{"title":"Performance Engineering in Distributed Event-sourced Systems","authors":"Dominik Meißner, Benjamin Erb, F. Kargl","doi":"10.1145/3210284.3219770","DOIUrl":"https://doi.org/10.1145/3210284.3219770","journal":{"name":"Proceedings of the 12th ACM International Conference on Distributed and Event-based Systems"},"publicationDate":"2018-06-25","platform":"Semanticscholar","paperid":"124274901"}
Chun-Xun Lin, Tsung-Wei Huang, Guannan Guo, Martin D. F. Wong
In this paper, we present MtDetector, a high-performance marine traffic detector that can predict the destination and the arrival time of travelling vessels. MtDetector accepts streaming data reported by the moving vessels and generates continuous predictions of the arrival port and arrival time for those vessels. To predict the destination of a ship, MtDetector builds a neural network for every port and infers the arrival port for vessels based on their departure port. For the arrival time prediction, we derive informative features from training data and apply a deep neural network (DNN) to estimate the travelling time. MtDetector is built on top of DtCraft [1,2], a high-performance distributed execution engine for stream programming. By utilizing the task-based parallelism in DtCraft, MtDetector can process multiple predictions concurrently to achieve high throughput and low latency.
{"title":"MtDetector","authors":"Chun-Xun Lin, Tsung-Wei Huang, Guannan Guo, Martin D. F. Wong","doi":"10.1145/3210284.3220504","DOIUrl":"https://doi.org/10.1145/3210284.3220504","journal":{"name":"Proceedings of the 12th ACM International Conference on Distributed and Event-based Systems"},"publicationDate":"2018-06-25","platform":"Semanticscholar","paperid":"114160954"}
Vincenzo Gulisano, Zbigniew Jerzak, Pavel Smirnov, M. Strohbach, H. Ziekow, D. Zissis
The ACM DEBS 2018 Grand Challenge is the eighth in a series of challenges which seek to provide a common ground and evaluation criteria for a competition aimed at both research and industrial event-based systems. The focus of the 2018 Grand Challenge is on the application of machine learning to spatio-temporal streaming data. The goal of the challenge is to make the naval transportation industry more reliable by providing predictions for vessels' destinations and arrival times. This paper describes the specifics of the data streams and queries that define the DEBS 2018 Grand Challenge. It also describes the benchmarking platform that supports testing of corresponding solutions.
{"title":"The DEBS 2018 Grand Challenge","authors":"Vincenzo Gulisano, Zbigniew Jerzak, Pavel Smirnov, M. Strohbach, H. Ziekow, D. Zissis","doi":"10.1145/3210284.3220510","DOIUrl":"https://doi.org/10.1145/3210284.3220510","journal":{"name":"Proceedings of the 12th ACM International Conference on Distributed and Event-based Systems"},"publicationDate":"2018-06-25","platform":"Semanticscholar","paperid":"114930623"}
Stateful applications are based on the state they hold and how it changes over time. This history of state changes is usually discarded as the application progresses. By building on concepts from event processing and storing the application history, we envision a novel programming paradigm that supports retroaction. Retroactive computing introduces new opportunities for a developer to access and even modify an application timeline. By enabling the exploration of alternative scenarios, retroactive computing establishes powerful new ways to debug systems and introduces new approaches to solving problems. Initial work has shown the practicality and possibilities of this new programming paradigm and raises further research questions and challenges.
{"title":"Towards Time Travel in Distributed Event-sourced Systems","authors":"Dominik Meißner","doi":"10.1145/3210284.3219499","DOIUrl":"https://doi.org/10.1145/3210284.3219499","journal":{"name":"Proceedings of the 12th ACM International Conference on Distributed and Event-based Systems"},"publicationDate":"2018-06-25","platform":"Semanticscholar","paperid":"121696175"}
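The event-sourcing and retroaction ideas in the abstract above can be sketched in a few lines. This is a toy, single-process illustration, not the author's system: the account-balance domain, the `Event` type, and the helpers `apply_event`, `replay`, and `retroactive_replace` are all assumptions. It shows state derived by replaying a stored event log, access to a past state, and exploration of an alternative timeline by changing one past event on a copy of the log.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    kind: str    # "deposit" or "withdraw" (hypothetical example domain)
    amount: int


def apply_event(state, event):
    """Pure transition function: current state plus one event gives the next state."""
    if event.kind == "deposit":
        return state + event.amount
    if event.kind == "withdraw":
        return state - event.amount
    raise ValueError(f"unknown event kind: {event.kind}")


def replay(log, upto=None):
    """Rebuild the state from the first `upto` events (all events if None)."""
    state = 0
    for event in log[:upto]:
        state = apply_event(state, event)
    return state


def retroactive_replace(log, index, new_event):
    """Return an alternative timeline; the original log is left untouched."""
    branch = list(log)
    branch[index] = new_event
    return branch


log = [Event("deposit", 100), Event("withdraw", 30), Event("deposit", 5)]
print(replay(log))           # current state
print(replay(log, upto=2))   # state as of an earlier point in the timeline
what_if = retroactive_replace(log, 1, Event("withdraw", 50))
print(replay(what_if))       # state in the alternative scenario
```

Branching on a copy keeps the recorded history immutable, which is one plausible way to explore alternative scenarios without losing the original timeline.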