
Scientific and statistical database management : International Conference, SSDBM ... : proceedings. International Conference on Scientific and Statistical Database Management: latest articles

Protection of sensitive trajectory datasets through spatial and temporal exchange
Elham Naghizade, L. Kulik, E. Tanin
Privacy concerns are a major impediment to publishing and exchanging trajectory data across companies and institutions. This has urged researchers to address privacy issues prior to trajectory data release. Current privacy-preserving solutions distort the original data unnecessarily and hence degrade data utility, making such data less useful for third parties. We consider a trajectory as a sequence of stops and moves, and propose an approach that exploits features of a trajectory as a means of preserving privacy while maintaining a high level of utility. We introduce the concept of sensitivity for stops, based on the assumption that they are more vulnerable to privacy threats. We propose an efficient algorithm that either substitutes sensitive stop points of a trajectory with moves from the same trajectory or introduces a minimal detour if a less sensitive stop cannot be found on the same route. Our experiments show that our method balances user privacy and data utility: it protects privacy by preventing an adversary from making inferences about sensitive stops while maintaining a high level of data similarity to the original dataset.
DOI: https://doi.org/10.1145/2618243.2618278 | Published: 2014-06-30 | Pages: 40:1-40:4
Citations: 16
A provable algorithmic approach to product selection problems for market entry and sustainability
Silei Xu, Yishi Lin, Hong Xie, John C.S. Lui
In a globalized economy, processing heterogeneous web data to extract customers' purchase behavior is crucial to manufacturers who want to enter, or sustain themselves in, a competitive market. To maximize sales, manufacturers must not only decide which products to produce to meet diverse customer requirements but also compete with competitors' products. In this paper, we present a general framework for the following product selection problems: (1) the k-BSP problem, in which a manufacturer enters a competitive market, and (2) the k-BBP problem, in which a manufacturer sustains itself in a competitive market. We propose several product adoption models to describe the complex purchase behavior of customers, and formally show that these problems are NP-hard in general. To tackle these problems, we propose computationally efficient greedy-based approximation algorithms. Based on a submodularity analysis, we prove that our algorithms guarantee a (1 - 1/e)-approximation ratio relative to the optimal solutions. We perform large-scale data analysis to show the efficiency and accuracy of our framework. In our experiments, we observe speedups of 1,300 to 250,000 times over exhaustive algorithms, and our solutions achieve on average 96% of the quality of the optimal solutions. Finally, we apply our algorithms to a web dataset to show the impact of customers' different purchase behavior on the results of product selection.
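The (1 - 1/e) ratio mentioned in this abstract is the classical bound for greedy maximization of a monotone submodular function under a cardinality constraint. The following is a minimal sketch of that general greedy pattern, assuming maximum coverage as a stand-in objective; it is not the paper's k-BSP or k-BBP algorithm, and the product names, customer sets, and the function `greedy_select` are hypothetical.

```python
def greedy_select(candidates, k):
    """Pick up to k candidates, each time taking the largest marginal gain.

    candidates maps a (hypothetical) product name to the set of customers
    it would satisfy. Coverage is monotone and submodular, so this greedy
    loop falls under the classical (1 - 1/e) guarantee.
    """
    chosen, covered = [], set()
    for _ in range(k):
        best, best_gain = None, 0
        for name, customers in candidates.items():
            if name in chosen:
                continue
            gain = len(customers - covered)  # newly covered customers only
            if gain > best_gain:
                best, best_gain = name, gain
        if best is None:  # no remaining candidate adds coverage
            break
        chosen.append(best)
        covered |= candidates[best]
    return chosen, covered


# Illustrative data only.
candidates = {
    "p1": {1, 2, 3, 4},
    "p2": {3, 4, 5},
    "p3": {5, 6},
    "p4": {1, 6},
}
chosen, covered = greedy_select(candidates, 2)
print(chosen, sorted(covered))
```

Each round costs one pass over the candidates, which is why a greedy scheme of this kind can be far cheaper than enumerating all size-k subsets.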
DOI: https://doi.org/10.1145/2618243.2618250 | Published: 2014-06-30 | Pages: 19:1-19:12
Citations: 2
On efficiently generating realistic social media timeline structures
Chengcheng Yu, Fan Xia, Weining Qian, Aoying Zhou, Jianlong Chang
This paper proposes a synthetic data generator framework for social media timeline structures, which is useful for benchmarking query processing over social media data and for validating hypotheses about users' behavior. The framework can flexibly generate synthetic data with different distributions. With its asynchronous parallel processing model and delayed update strategy, it outputs timeline structures with high throughput. We show in experiments that our method can generate realistic social media timeline structures efficiently.
DOI: https://doi.org/10.1145/2618243.2618272 | Published: 2014-06-30 | Pages: 45:1-45:4
Citations: 2
A case study in optimizing continuous queries using the magic update technique
Andreas Behrend, Gereon Schüller
The evaluation of continuous queries over data streams often becomes difficult as soon as static context data must be combined with dynamic stream data. This is especially the case if the context data is organized in the form of view hierarchies and thus computed from some base facts. In this scenario, typical algebraic optimization strategies fail to provide a well-optimized query evaluation plan that effectively combines the stream and classical view subparts of the given query. The Magic Update method represents a possible solution to this problem, as it allows new selection conditions to be generated dynamically from the data stream and pushed into the view hierarchy of context data. In this paper we present a case study that shows the performance gain of this technique when optimizing anomaly detection views in an air-traffic surveillance scenario.
DOI: https://doi.org/10.1145/2618243.2618285 | Published: 2014-06-30 | Pages: 31:1-31:4
Citations: 9
A system for efficient and simultaneous processing of moving K nearest neighbor and spatial keyword queries
Chongsheng Zhang
We study the efficient, generic processing of moving K nearest neighbor (MKNN) and top-K spatial keyword (MKSK) queries. Such generic processing is attractive during high query loads. We propose GridVoronoi, an index that enables users to find the spatial nearest neighbor (NN) in uniformly distributed datasets in almost O(1) time. GridVoronoi is based on the Voronoi diagram, which has proven to be highly efficient for exploring the local neighborhood of a given Voronoi cell. However, a Voronoi diagram needs a method to quickly determine which Voronoi cell contains the query point, so we add a virtual (i.e., conceptual) grid to the Voronoi diagram. For any query point, GridVoronoi first uses the grid to compute which Voronoi cell contains the query, then uses the Voronoi diagram to quickly find the NN and KNN (i.e., K nearest neighbors) of the query. On top of GridVoronoi we introduce the UniSpatial framework, which can process MKNN and MKSK queries simultaneously. For each keyword, UniSpatial builds a GridVoronoi index that enables fast retrieval of the spatial Web objects containing this keyword. UniSpatial employs the same method to process MKNN and MKSK queries, but for MKSK queries it needs to rank the retrieved objects by their proximity to the query location and their textual relevance to the input keywords. In the demo, we will use real datasets to show the functionality and performance of UniSpatial.
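The idea of using a grid to jump directly to the neighborhood of a query point can be illustrated with a much simpler stand-in. The following is a minimal sketch, assuming a uniform grid of buckets that is searched ring by ring rather than a real Voronoi diagram; it is not the GridVoronoi index itself, and `build_grid`, `nearest`, the cell size, and the sample points are all hypothetical.

```python
import math
from collections import defaultdict


def build_grid(points, cell):
    """Bucket 2D points into square cells of side length `cell`."""
    grid = defaultdict(list)
    for p in points:
        grid[(int(p[0] // cell), int(p[1] // cell))].append(p)
    return grid


def nearest(grid, cell, q, max_ring=64):
    """Return the point nearest to q, scanning cells in growing rings."""
    cx, cy = int(q[0] // cell), int(q[1] // cell)
    best, best_d = None, math.inf
    for r in range(max_ring + 1):
        # Every point in ring r is more than (r - 1) * cell away from q,
        # so the search can stop once the best distance is within that bound.
        if best_d <= (r - 1) * cell:
            break
        for ix in range(cx - r, cx + r + 1):
            for iy in range(cy - r, cy + r + 1):
                if max(abs(ix - cx), abs(iy - cy)) != r:
                    continue  # only visit cells on the ring boundary
                for p in grid.get((ix, iy), ()):
                    d = math.dist(p, q)
                    if d < best_d:
                        best, best_d = p, d
    return best


# Illustrative data only.
points = [(0.5, 0.5), (2.2, 2.9), (3.9, 0.1), (2.6, 2.1)]
grid = build_grid(points, 1.0)
query = (2.4, 2.4)
result = nearest(grid, 1.0, query)
print(result)
```

When points are spread uniformly and the cell size matches the point density, the loop usually ends after the first one or two rings, which is the intuition behind a near-constant lookup time.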
DOI: https://doi.org/10.1145/2618243.2618290 | Published: 2014-06-30 | Pages: 50:1-50:4
Citations: 0
Data perturbation for outlier detection ensembles
A. Zimek, R. Campello, J. Sander
Outlier detection and ensemble learning are well-established research directions in data mining, yet the application of ensemble techniques to outlier detection has rarely been studied. Building an ensemble requires learning diverse models and combining them in an appropriate way. We propose data perturbation as a new technique to induce diversity in individual outlier detectors, as well as a rank accumulation method for combining the individual outlier rankings, in order to construct an outlier detection ensemble. In an extensive evaluation, we study the impact, potential, and shortcomings of this new approach for outlier detection ensembles. We show that this ensemble can significantly improve over weakly performing base methods.
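The general pattern of perturbing the data for each ensemble member and then merging the resulting rankings can be sketched briefly. The following is a minimal illustration, assuming Gaussian noise as the perturbation, a k-nearest-neighbor distance as the base detector, and a plain sum of rank positions as the combination step; none of these choices is taken from the paper, and the function names and data are hypothetical.

```python
import random


def knn_score(data, i, k):
    """Outlier score: distance from data[i] to its k-th nearest neighbor."""
    dists = sorted(abs(data[i] - x) for j, x in enumerate(data) if j != i)
    return dists[k - 1]


def ensemble_ranking(data, members=10, noise=0.05, k=2, seed=0):
    """Rank indices from most to least outlying across perturbed copies."""
    rng = random.Random(seed)
    rank_sum = [0] * len(data)
    for _ in range(members):
        # Each member sees its own noisy copy of the data (assumed scheme).
        perturbed = [x + rng.gauss(0.0, noise) for x in data]
        scores = [knn_score(perturbed, i, k) for i in range(len(data))]
        order = sorted(range(len(data)), key=lambda i: -scores[i])
        for rank, i in enumerate(order):
            rank_sum[i] += rank  # low rank position means more outlying
    return sorted(range(len(data)), key=lambda i: rank_sum[i])


# Illustrative one-dimensional data with one obvious outlier at index 5.
data = [0.1, 0.2, 0.25, 0.3, 0.4, 5.0]
ranking = ensemble_ranking(data)
print(ranking[0])
```

Summing rank positions rather than raw scores keeps members with different score scales comparable, which is one common reason to combine at the rank level.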
DOI: https://doi.org/10.1145/2618243.2618257 | Published: 2014-06-30 | Pages: 13:1-13:12
Citations: 40
SLACID - sparse linear algebra in a column-oriented in-memory database system
D. Kernert, F. Köhler, Wolfgang Lehner
Scientific computations and analytical business applications are often based on linear algebra operations on large, sparse matrices. With the hardware shift of the primary storage from disc into memory it is now feasible to execute linear algebra queries directly in the database engine. This paper presents and compares different approaches of storing sparse matrices in an in-memory column-oriented database system. We show that a system layout derived from the compressed sparse row representation integrates well with a columnar database design and that the resulting architecture is moreover amenable to a wide range of non-numerical use cases when dictionary encoding is used. Dynamic matrix manipulation operations, like online insertion or deletion of elements, are not covered by most linear algebra frameworks. Therefore, we present a hybrid architecture that consists of a read-optimized main and a write-optimized delta structure and evaluate the performance for dynamic sparse matrix workloads by applying workflows of nuclear science and network graphs.
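The compressed sparse row (CSR) representation mentioned in this abstract stores a matrix as three flat arrays, which is what makes it a natural fit for columnar storage. The following is a minimal sketch of the standard CSR layout and a matrix-vector product over it, written in plain Python; it does not reproduce the paper's system layout, and the helper names `to_csr` and `csr_matvec` are hypothetical.

```python
def to_csr(dense):
    """Convert a dense row-major matrix into CSR arrays.

    values  : the nonzero entries, row by row
    col_idx : the column index of each nonzero entry
    row_ptr : row i occupies values[row_ptr[i]:row_ptr[i + 1]]
    """
    values, col_idx, row_ptr = [], [], [0]
    for row in dense:
        for j, v in enumerate(row):
            if v != 0:
                values.append(v)
                col_idx.append(j)
        row_ptr.append(len(values))
    return values, col_idx, row_ptr


def csr_matvec(values, col_idx, row_ptr, x):
    """Multiply a CSR matrix by the dense vector x."""
    y = []
    for i in range(len(row_ptr) - 1):
        start, end = row_ptr[i], row_ptr[i + 1]
        y.append(sum(values[p] * x[col_idx[p]] for p in range(start, end)))
    return y


# Illustrative 3x3 matrix with three nonzero entries.
dense = [
    [5, 0, 0],
    [0, 0, 2],
    [0, 3, 0],
]
values, col_idx, row_ptr = to_csr(dense)
y = csr_matvec(values, col_idx, row_ptr, [1, 2, 3])
print(values, col_idx, row_ptr, y)
```

Because inserting a single element shifts every later entry in `values` and `col_idx`, a layout like this is read-friendly but expensive to update in place, which is the kind of trade-off a separate write-optimized structure is meant to absorb.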
DOI: https://doi.org/10.1145/2618243.2618254 | Published: 2014-06-30 | Pages: 11:1-11:12
Citations: 19
Managing evolving shapes in sensor networks
Besim Avci, Goce Trajcevski, P. Scheuermann
This work addresses the problem of efficient distributed detection and tracking of mobile and evolving/deformable spatial shapes in Wireless Sensor Networks (WSN). The shapes correspond to contiguous regions bounding the locations of sensors whose readings satisfy a particular threshold-based criterion related to the values of the physical phenomenon that they measure. We formalize the predicates representing the shapes in such settings and present detection algorithms. In addition, we provide a lightweight protocol and aggregation methods for energy-efficient distributed execution of those algorithms. Another contribution of this work is the development of efficient techniques for detecting the co-occurrence of shapes within a given proximity of each other. Our experiments demonstrate that, compared with centralized techniques (that is, predicates being detected in a dedicated sink) as well as distributed periodic contour construction, our methodologies yield significant energy and communication savings.
DOI: https://doi.org/10.1145/2618243.2618264 | Published: 2014-06-30 | Pages: 22:1-22:12
Citations: 10
Toward efficient and reliable genome analysis using main-memory database systems
Sebastian Dorok, S. Breß, H. Läpple, G. Saake
Improvements in DNA sequencing technologies make it possible to sequence complete human genomes in a short time and at acceptable cost. Hence, the vision of genome analysis as a standard procedure to support and improve medical treatment becomes reachable. In this vision paper, we describe important data-management challenges that have to be met to make this vision come true. Besides genome-analysis performance, data-management capabilities such as data provenance and data integrity become increasingly important to enable comprehensible and reliable genome analysis. We argue that these challenges can be met by using main-memory database technologies, which combine fast processing capabilities with extensive data-management capabilities. Finally, we discuss possibilities for integrating genome-analysis tasks into DBMSs and derive new research questions.
DOI: https://doi.org/10.1145/2618243.2618276 | Published: 2014-06-30 | Pages: 34:1-34:4
Citations: 6
Extending the SQL array concept to support scientific analytics
D. Misev, P. Baumann
Arrays are among the data types that contribute the most to Big Data: examples include satellite images and weather simulation output in the Earth sciences, confocal microscopy and CAT scans in the Life sciences, as well as telescope and cosmological observations in Space science, to name but a few. Traditionally, the database community has neglected this, with the effect that ad-hoc implementations prevail. With the advent of NewSQL in recent years, however, the database scope has broadened, and array modelling and query support are being seriously considered. Different models have been suggested, some of which are implemented or under implementation, and a consolidation of concepts can be observed. Consequently, integration of array queries into SQL is being addressed. We fill this gap by proposing a generic model, ASQL, for modelling and querying multi-dimensional arrays in ISO SQL. The model integrates concepts from the three major array models seen today: rasdaman, SciQL, and SciDB. It is declarative, optimizable, minimal, yet powerful enough for application domains in science, engineering, and beyond. ASQL has been implemented and is currently being discussed in ISO for extending standard SQL.
DOI: 10.1145/2618243.2618255 (https://doi.org/10.1145/2618243.2618255). Published 2014-06-30, pages 10:1-10:11.
Citations: 17