The ability to detect and analyze interesting subgraph patterns on large and dynamic graph-structured data in near-real time is crucial for many applications; examples include anomaly detection in phone call networks, advertisement targeting in social networks, and malware detection in file download graphs. Such patterns often need to reason about how the nodes are connected to each other (i.e., the structural component) as well as how the nodes behave in the network (i.e., the activity component). An example of such an activity-driven subgraph pattern is a clique of users in a social network (the structural predicate) who have each posted more than 10 messages in the last 2 hours (the activity-based predicate). In this paper, we present Casqd, a system for continuous detection and analysis of such active subgraph pattern queries over large dynamic graphs. Some of the key challenges in executing such queries include handling a wide variety of user-specified activities of interest, low selectivities of activity-based predicates and the resultant exponential search space, and high ingestion rates. A key abstraction in Casqd is the graph-view, which acts as an independence layer between the query language and the underlying physical representation of the graph and the active attributes. This abstraction is aimed at simplifying the query language while empowering the query optimizer. Considering the balance between expressibility (i.e., patterns that cover many real-world use cases) and optimizability of such patterns, we primarily focus on efficient continuous detection of active regular structures (specifically, active cliques, active stars, and active bi-cliques). We develop a series of optimization techniques for efficient query evaluation, including model-based neighborhood exploration, lazy evaluation of the activity predicates, and neighborhood-based search space pruning. We perform a thorough comparative study of the execution strategies under various settings and show that our system can achieve an event processing throughput of over 800k events/s on a single, powerful machine.
{"title":"CASQD: continuous detection of activity-based subgraph pattern queries on dynamic graphs","authors":"J. Mondal, A. Deshpande","doi":"10.1145/2933267.2933316","DOIUrl":"https://doi.org/10.1145/2933267.2933316","journal":{"name":"Proceedings of the 10th ACM International Conference on Distributed and Event-based Systems"},"publicationDate":"2016-06-13"}
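The example query in the CASQD abstract (a clique of users who each posted more than 10 messages in the last 2 hours) can be illustrated with a minimal Python sketch. This is not Casqd's implementation: the `ActivityWindow` class, the function names, and the adjacency-dictionary representation are hypothetical, and the only idea borrowed from the abstract is that the activity predicate is evaluated lazily, after the structural predicate has passed.

```python
from collections import defaultdict, deque
from itertools import combinations


class ActivityWindow:
    """Counts each user's posts inside a sliding time window (hypothetical helper)."""

    def __init__(self, window_seconds):
        self.window = window_seconds
        self.posts = defaultdict(deque)  # user -> timestamps of recent posts

    def record(self, user, ts):
        # Timestamps are assumed to arrive in non-decreasing order per user.
        self.posts[user].append(ts)

    def count(self, user, now):
        q = self.posts[user]
        # Expire timestamps that have fallen out of the window.
        while q and q[0] <= now - self.window:
            q.popleft()
        return len(q)


def is_clique(adj, nodes):
    """Structural predicate: every pair of nodes is connected."""
    return all(v in adj.get(u, set()) for u, v in combinations(nodes, 2))


def is_active_clique(adj, activity, nodes, now, min_posts):
    """Check the structural predicate first; evaluate activity only if it holds."""
    if not is_clique(adj, nodes):
        return False
    return all(activity.count(u, now) > min_posts for u in nodes)


# Usage: three mutually connected users with 11 posts each, plus a sparse user "d".
adj = {"a": {"b", "c", "d"}, "b": {"a", "c"}, "c": {"a", "b"}, "d": {"a"}}
activity = ActivityWindow(window_seconds=2 * 60 * 60)
for user in "abc":
    for t in range(11):
        activity.record(user, 1000 + t)
activity.record("d", 1000)
print(is_active_clique(adj, activity, ["a", "b", "c"], now=2000, min_posts=10))
```

Checking structure before activity is only one possible ordering; the abstract indicates that the system chooses among several execution strategies.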
Detecting stateful complex event patterns with parallel programming features is challenging because event detection operators are stateful. Parallelization of event detection tasks needs to be implemented in a way that keeps track of state changes caused by newly arriving events. In this paper, we describe our implementation of a customized complex event detection engine using Open Multi-Processing (OpenMP), a shared-memory programming model. In our system, event detection is implemented using Deterministic Finite Automata (DFAs). We implemented a data stream aggregator that merges 4 given event streams into a sequence of C++ objects in a buffer, which is used as the source event stream for event detection in the next processing step. We describe implementation details and 3 architectural variations for stream aggregation and parallelization of event processing. We conducted performance experiments with each of the variations and report some of our experimental results. A comparison of our performance results shows that, for event processing on a single multi-core machine with limited memory, using multiple threads with a shared buffer yields better stream processing performance than an implementation with multiple processes and shared memory.
{"title":"Stateful complex event detection on event streams using parallelization of event stream aggregations and detection tasks","authors":"Saeed Fathollahzadeh, Kia Teymourian, M. Sharifi","doi":"10.1145/2933267.2933518","DOIUrl":"https://doi.org/10.1145/2933267.2933518","journal":{"name":"Proceedings of the 10th ACM International Conference on Distributed and Event-based Systems"},"publicationDate":"2016-06-13"}
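The two stages named in the abstract above (merging several event streams into one source stream, then running DFA-based detection over it) can be sketched sequentially in Python. The authors' engine is written in C++ with OpenMP; this sketch leaves out parallelism entirely, and the `(timestamp, event_type)` tuple format, the `SequenceDFA` class, and the A-then-B-then-C pattern are assumptions made for illustration.

```python
import heapq


def merge_streams(*streams):
    """Merge timestamp-ordered streams of (timestamp, event_type) tuples."""
    return heapq.merge(*streams, key=lambda event: event[0])


class SequenceDFA:
    """DFA whose state is the number of pattern symbols matched so far.

    A matching symbol advances the state; any other symbol leaves it unchanged.
    """

    def __init__(self, pattern):
        self.pattern = pattern
        self.state = 0

    def step(self, event_type):
        if event_type == self.pattern[self.state]:
            self.state += 1
        if self.state == len(self.pattern):
            self.state = 0  # accepting state reached: report and reset
            return True
        return False


def detect(streams, pattern):
    """Return the timestamps at which the pattern completes."""
    dfa = SequenceDFA(pattern)
    return [ts for ts, etype in merge_streams(*streams) if dfa.step(etype)]


# Usage: four input streams, as in the abstract; the pattern is hypothetical.
s1 = [(1, "A"), (5, "A")]
s2 = [(2, "B"), (6, "B")]
s3 = [(3, "C"), (7, "C")]
s4 = [(4, "X")]
print(detect([s1, s2, s3, s4], ["A", "B", "C"]))
```

The single `state` field is the part that makes parallelization hard: two workers cannot advance the same automaton independently without coordinating on it.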
Vincenzo Gulisano, Zbigniew Jerzak, Spyros Voulgaris, H. Ziekow
The DEBS Grand Challenge is a series of challenges that address problems in event stream processing. The focus of the Grand Challenge in 2016 is on processing data streams that originate from social networks. Hence, the data represents an evolving graph structure. With this challenge we take up the general scenario and data source from the 2014 SIGMOD contest. However, in contrast to the SIGMOD contest, the DEBS Grand Challenge explicitly focuses on continuous processing of streaming data and thus on dynamic changes in graphs. This paper describes the specifics of the data streams and continuous queries that define the DEBS Grand Challenge 2016.
{"title":"The DEBS 2016 grand challenge","authors":"Vincenzo Gulisano, Zbigniew Jerzak, Spyros Voulgaris, H. Ziekow","doi":"10.1145/2933267.2933519","DOIUrl":"https://doi.org/10.1145/2933267.2933519","journal":{"name":"Proceedings of the 10th ACM International Conference on Distributed and Event-based Systems"},"publicationDate":"2016-06-13"}
An indoor tracking system is inherently an asynchronous and distributed system that contains various types of events (e.g., detection, selection, and fusion). One of the key challenges with regard to indoor tracking is the efficient selection and arrangement of sensor devices in the environment. Selecting the "right" subset of these sensors for tracking an object as it traverses an indoor environment is a necessary precondition for accurate indoor tracking. With the recent proliferation of mobile devices, specifically those with many onboard sensors, this challenge has increased in both complexity and scale. One can no longer assume that the sensor infrastructure is static; rather, indoor tracking systems must consider and properly plan for a wide variety of sensors, both static and mobile, to be present. In such a dynamic setup, sensors need to be properly selected using an opportunistic approach. This opportunistic tracking allows for a new dimension of indoor tracking that was previously often infeasible or impractical due to the logistic or financial constraints of most entities. In this paper, we propose a selection technique that uses trust, as manifested by a quality-of-service (QoS) feature, accuracy, in a sensor selection function. We first outline how classification of sensors is achieved in a dynamic manner, then how accuracy can be discerned from this classification to identify the trust of a tracking sensor, and finally how this information is used to improve the sensor selection process. We conclude this paper with a discussion of the results of this implementation on a prototype indoor tracking system to demonstrate the overall effectiveness of this selection technique.
{"title":"Infusing trust in indoor tracking: poster","authors":"Ryan Rybarczyk, R. Raje, M. Tuceryan","doi":"10.1145/2933267.2933538","DOIUrl":"https://doi.org/10.1145/2933267.2933538","journal":{"name":"Proceedings of the 10th ACM International Conference on Distributed and Event-based Systems"},"publicationDate":"2016-06-13"}
San Yeung, S. Madria, M. Linderman, James R. Milligan
Airborne image sensing systems are equipped on piloted or remotely piloted aerial vehicles to collect imagery data. The equipped image sensors are often underutilized. The objective is to increase sensor system utilization by enabling dynamic multitasking so that ground operators can access and transmit sensor task requests to an aerial vehicle. However, this may cause the aerial vehicle to deviate from its original route. In this paper, we investigate this new problem of generating a new route to follow, as long as the assigned target points and original waypoints are not affected. Our goal is to find, on the fly, an optimal route between the given original waypoints that satisfies the maximum number of sensor task requests from ground users with the minimum sum of deviations, subject to a maximum deviation from the original route, without violating the original mission and flight maneuvering constraints. With the given constraints, finding an optimal route is an NP-hard problem. Therefore, we propose two heuristic-based methods: the FPCA approach, which utilizes the idea of footprint diameter, and the SWCA approach, which tackles the problem via task clustering. The performance of these algorithms is compared through experiments using data from real flight trajectories. Our results show that SWCA outperforms FPCA in most settings.
{"title":"Routing and scheduling of spatio-temporal tasks for optimizing airborne sensor system utilization","authors":"San Yeung, S. Madria, M. Linderman, James R. Milligan","doi":"10.1145/2933267.2933301","DOIUrl":"https://doi.org/10.1145/2933267.2933301","journal":{"name":"Proceedings of the 10th ACM International Conference on Distributed and Event-based Systems"},"publicationDate":"2016-06-13"}
Evaluating continuous queries in real time while receiving event notifications requires a balanced exploitation of available resources. In this paper, we present our solution for aggregating events from social streams in memory. By combining a compact representation of the data with an efficient tree data structure, we were able to minimize the cost of the updates required when an event enters or leaves the current window, which led to low and stable latencies and high throughput.
{"title":"In-memory indexation of event streams","authors":"Ahmad Hasan, A. Paschke","doi":"10.1145/2933267.2933511","DOIUrl":"https://doi.org/10.1145/2933267.2933511","journal":{"name":"Proceedings of the 10th ACM International Conference on Distributed and Event-based Systems"},"publicationDate":"2016-06-13"}
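The window maintenance described in the abstract above (cheap updates when an event enters or leaves the current window) can be sketched as follows. The abstract does not say which aggregate or which tree the authors use, so everything here is an assumption: the aggregate is a per-key event count, the query is a top-k ranking, and a sorted Python list maintained with `bisect` stands in for the tree. A balanced tree would make each update logarithmic, whereas list insertion is linear.

```python
import bisect
from collections import deque


class WindowedTopK:
    """Sliding-window event counts per key with an ordered ranking index."""

    def __init__(self, window):
        self.window = window
        self.events = deque()  # (timestamp, key) in arrival order
        self.counts = {}       # key -> number of events currently in the window
        self.ranking = []      # sorted list of (count, key); stand-in for a tree

    def _set_count(self, key, new):
        old = self.counts.get(key, 0)
        if old:
            # Remove the stale ranking entry before inserting the new one.
            self.ranking.pop(bisect.bisect_left(self.ranking, (old, key)))
        if new:
            bisect.insort(self.ranking, (new, key))
            self.counts[key] = new
        else:
            self.counts.pop(key, None)

    def add(self, ts, key):
        # Events leaving the window: decrement their key's count.
        while self.events and self.events[0][0] <= ts - self.window:
            _, old_key = self.events.popleft()
            self._set_count(old_key, self.counts[old_key] - 1)
        # Event entering the window: increment its key's count.
        self.events.append((ts, key))
        self._set_count(key, self.counts.get(key, 0) + 1)

    def top(self, k):
        """Return up to k keys with the highest counts (k >= 1)."""
        return [key for _, key in reversed(self.ranking[-k:])]


# Usage with hypothetical post identifiers and a 10-time-unit window.
w = WindowedTopK(window=10)
for ts, post in [(1, "p1"), (2, "p2"), (3, "p1")]:
    w.add(ts, post)
print(w.top(2))
```

Each arrival touches only the keys of the events that expire and the key of the new event, so the update cost is independent of how many distinct keys are in the window.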
V. Cardellini, V. Grassi, F. L. Presti, Matteo Nardelli
Data Stream Processing (DSP) applications are widely used to extract information in a timely manner from distributed data sources, such as sensing devices, monitoring stations, and social networks. To successfully handle this ever-increasing amount of data, recent trends investigate the possibility of exploiting decentralized computational resources (e.g., Fog computing) to define the application placement. Several placement policies have been proposed in the literature, but they are based on different assumptions and optimization goals and, as such, they are not completely comparable to each other. In this paper we study the placement problem for distributed DSP applications. Our contributions are twofold. We provide a general formulation of the optimal DSP placement (for short, ODP) as an Integer Linear Programming problem which takes explicitly into account the heterogeneity of computing and networking resources and which encompasses - as special cases - the different solutions proposed in the literature. We present an ODP-based scheduler for the Apache Storm DSP framework. This allows us to compare some well-known centralized and decentralized placement solutions. We also extensively analyze the ODP scalability with respect to various parameter settings.
{"title":"Optimal operator placement for distributed stream processing applications","authors":"V. Cardellini, V. Grassi, F. L. Presti, Matteo Nardelli","doi":"10.1145/2933267.2933312","DOIUrl":"https://doi.org/10.1145/2933267.2933312","journal":{"name":"Proceedings of the 10th ACM International Conference on Distributed and Event-based Systems"},"publicationDate":"2016-06-13"}
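To make the placement problem in the abstract above concrete, the following sketch solves a toy instance by exhaustive search rather than with an ILP solver. It is not the ODP formulation: the cost model (sum of link latencies along data-flow edges), the single CPU capacity constraint, and all names are assumptions chosen to keep the example small. An ILP would express the same choice with one binary variable per (operator, node) pair.

```python
from itertools import product


def optimal_placement(operators, edges, nodes, latency):
    """Exhaustively search operator-to-node assignments.

    operators: {operator: required_cpu}
    edges:     [(upstream_operator, downstream_operator)] data-flow links
    nodes:     {node: cpu_capacity}
    latency:   {(node_a, node_b): network delay between the two nodes}
    Returns (placement, cost), or (None, inf) if no assignment is feasible.
    """
    ops = list(operators)
    best_cost, best = float("inf"), None
    for assignment in product(nodes, repeat=len(ops)):
        placement = dict(zip(ops, assignment))
        # Capacity constraint: total demand on a node must not exceed its capacity.
        load = {n: 0 for n in nodes}
        for op, n in placement.items():
            load[n] += operators[op]
        if any(load[n] > nodes[n] for n in nodes):
            continue
        # Objective: total network delay over all data-flow edges.
        cost = sum(latency[(placement[a], placement[b])] for a, b in edges)
        if cost < best_cost:
            best_cost, best = cost, placement
    return best, best_cost


# Usage: a three-operator pipeline on two heterogeneous locations (hypothetical values).
operators = {"source": 1, "filter": 1, "sink": 1}
edges = [("source", "filter"), ("filter", "sink")]
nodes = {"edge": 2, "cloud": 2}
latency = {("edge", "edge"): 0, ("cloud", "cloud"): 0,
           ("edge", "cloud"): 10, ("cloud", "edge"): 10}
placement, cost = optimal_placement(operators, edges, nodes, latency)
print(placement, cost)
```

The search space grows as the number of nodes raised to the number of operators, which is why scalability with respect to problem size is a central concern for exact formulations.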
In recent years, the proliferation of highly dynamic graph-structured data streams has fueled the demand for real-time data analytics. For instance, detecting recent trends in social networks enables new applications in areas such as disaster detection, business analytics, or health care. Parallel Complex Event Processing has evolved as the paradigm of choice to analyze data streams in a timely manner, where the incoming data streams are split and processed independently by parallel operator instances. However, the degree of parallelism is limited by the feasibility of splitting the data streams into independent parts such that correctness of event processing is still ensured. In this paper, we overcome this limitation for graph-structured data by further parallelizing individual operator instances using modern graph processing systems. These systems partition the graph data and execute graph algorithms in a highly parallel fashion, for instance using cloud resources. To this end, we propose a novel graph-based Complex Event Processing system, GraphCEP, and evaluate its performance in the setting of two case studies from the DEBS Grand Challenge 2016.
{"title":"GraphCEP: real-time data analytics using parallel complex event and graph processing","authors":"R. Mayer, C. Mayer, M. Tariq, K. Rothermel","doi":"10.1145/2933267.2933509","DOIUrl":"https://doi.org/10.1145/2933267.2933509","journal":{"name":"Proceedings of the 10th ACM International Conference on Distributed and Event-based Systems"},"publicationDate":"2016-06-13"}
Yuanzhen Ji, A. Nica, Zbigniew Jerzak, Gregor Hackenbroich, C. Fetzer
Handling timestamp disorder among stream tuples is a basic requirement for data stream processing, and involves an inevitable tradeoff between the latency and the quality of stream query results. To meet the tradeoff requirements of diverse streaming applications, the approach of buffer-based, quality-driven disorder handling (QDDH) was proposed recently, which aims to minimize the sizes of stream-sorting buffers, and thus the result latency, while honoring user-specified result-quality requirements. Previous work on QDDH focuses only on individual stream queries. However, streaming systems often run multiple queries concurrently, and may exploit sharing opportunities across the concurrent queries. Under such shared query execution, stream-sorting buffers can be shared across queries as well, which can potentially reduce the overall memory cost incurred by the sorting buffers. In this paper, focusing on windowed stream queries, we propose a solution for QDDH over concurrent queries that share common source and stream-filtering operators. Experimental results show that our solution can determine the optimal way of sharing sorting buffers across the concurrent queries, such that the goal of quality-driven result-latency minimization is achieved for each query at a minimum memory cost.
{"title":"Quality-driven disorder handling for concurrent windowed stream queries with shared operators","authors":"Yuanzhen Ji, A. Nica, Zbigniew Jerzak, Gregor Hackenbroich, C. Fetzer","doi":"10.1145/2933267.2933307","DOIUrl":"https://doi.org/10.1145/2933267.2933307","journal":{"name":"Proceedings of the 10th ACM International Conference on Distributed and Event-based Systems"},"publicationDate":"2016-06-13"}
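The latency-versus-quality tradeoff of a stream-sorting buffer, as discussed in the abstract above, can be shown with a fixed-slack sorting buffer. This sketch uses the generic K-slack idea with a constant `k`; it does not implement QDDH, which according to the abstract sizes the buffers from user-specified quality requirements and shares them across queries. The class and parameter names are assumptions.

```python
import heapq


class KSlackBuffer:
    """Sorting buffer that holds each tuple back until it is k time units old."""

    def __init__(self, k):
        self.k = k
        self.heap = []                 # min-heap of (timestamp, value)
        self.max_ts = float("-inf")    # largest timestamp seen so far

    def push(self, ts, value):
        """Insert a tuple and return the tuples that are now safe to release."""
        heapq.heappush(self.heap, (ts, value))
        self.max_ts = max(self.max_ts, ts)
        released = []
        # A tuple is released once the stream has advanced k past its timestamp.
        while self.heap and self.heap[0][0] <= self.max_ts - self.k:
            released.append(heapq.heappop(self.heap))
        return released

    def flush(self):
        """Release everything still buffered, in timestamp order."""
        return [heapq.heappop(self.heap) for _ in range(len(self.heap))]


# Usage: tuples arrive out of order (timestamps 1, 3, 2, 5, 4).
buf = KSlackBuffer(k=2)
output = []
for ts, value in [(1, "a"), (3, "b"), (2, "c"), (5, "d"), (4, "e")]:
    output.extend(buf.push(ts, value))
output.extend(buf.flush())
print(output)
```

A larger `k` reorders more late tuples correctly but delays every result by longer and keeps more tuples in memory, which is the tradeoff a quality-driven approach tunes.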
Sukanya Bhowmik, M. Tariq, J. Grunert, K. Rothermel
With the vision of the Internet of Things gaining popularity at a global level, efficient publish/subscribe middleware for communication within and across datacenters is extremely desirable. In this respect, Software-defined Networking (SDN), which enables publish/subscribe middleware to perform line-rate filtering of events directly on hardware, can prove to be very useful. While deploying content filters directly on the switches of a software-defined network allows optimized paths, high throughput rates, and low end-to-end latency, it suffers from inherent limitations with respect to the number of bits available on hardware switches to represent these filters. Such a limitation affects the expressiveness of filters, resulting in unnecessary traffic in the network. In this paper, we explore various techniques to represent content filters expressively while being limited by hardware. We implement and evaluate techniques that i) use workload, in terms of events and subscriptions, to represent content, and ii) efficiently select attributes to reduce redundancy in content. Moreover, these techniques complement each other and can be combined to further enhance performance. Our detailed performance evaluations show the potential of these techniques in reducing unnecessary traffic when subjected to different workloads.
{"title":"Bandwidth-efficient content-based routing on software-defined networks","authors":"Sukanya Bhowmik, M. Tariq, J. Grunert, K. Rothermel","doi":"10.1145/2933267.2933310","DOIUrl":"https://doi.org/10.1145/2933267.2933310","journal":{"name":"Proceedings of the 10th ACM International Conference on Distributed and Event-based Systems"},"publicationDate":"2016-06-13"}
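The claim in the abstract above that a limited number of filter bits reduces expressiveness and causes unnecessary traffic can be illustrated with a toy encoding. The encoding below is an assumption, not the paper's scheme: a single numeric attribute in [0, 1) is mapped to a binary prefix by repeated halving, and an event is forwarded when its prefix equals the subscription's prefix. With fewer bits, each prefix covers a wider value range, so more non-matching events are forwarded.

```python
def prefix_for(value, bits, lo=0.0, hi=1.0):
    """Encode `value` in [lo, hi) as the first `bits` steps of a binary search."""
    prefix = ""
    for _ in range(bits):
        mid = (lo + hi) / 2
        if value < mid:
            prefix += "0"
            hi = mid
        else:
            prefix += "1"
            lo = mid
    return prefix


def forwarded(event_value, subscription_value, bits):
    """A switch forwards the event if both values share the same `bits`-long prefix."""
    return prefix_for(event_value, bits) == prefix_for(subscription_value, bits)


# Usage: a subscriber interested in values near 0.30 and an event with value 0.45.
for bits in (1, 2, 3):
    print(bits, prefix_for(0.30, bits), prefix_for(0.45, bits),
          forwarded(0.45, 0.30, bits))
```

In this toy setting the event at 0.45 is still forwarded with 1 or 2 bits and is filtered out only once a third bit is available, which is the kind of false positive the abstract refers to as unnecessary traffic.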