2014 IEEE 34th International Conference on Distributed Computing Systems: Latest Papers

Learning from the Past: Intelligent On-Line Weather Monitoring Based on Matrix Completion
Pub Date : 2014-06-30 DOI: 10.1109/ICDCS.2014.26
Kun Xie, Lele Wang, Xin Wang, Jigang Wen, Gaogang Xie
Matrix completion has emerged very recently and provides a new avenue for low-cost data gathering in wireless sensor networks (WSNs). Existing schemes often assume that the data matrix has a known and fixed low rank, which is unlikely to hold in a practical monitoring system such as weather data gathering. Weather data vary in both the temporal and spatial domains. By analyzing a large set of weather data collected from 196 sensors in ZhuZhou, China, we reveal that weather data exhibit low rank, temporal stability, and relative rank stability. Taking advantage of these features, we propose an on-line data gathering scheme based on matrix completion theory, named MC-Weather, to adaptively sample different locations according to environmental and weather conditions. To better schedule the sampling process while satisfying the required reconstruction accuracy, we propose several novel techniques, including three sample learning principles, an adaptive sampling algorithm based on matrix completion, and a uniform time slot and cross sample model. With these techniques, our MC-Weather scheme can collect the sensory data at the required accuracy while substantially reducing the cost of sensing, communication, and computation. We perform extensive simulations based on real weather data sets, and the simulation results validate the efficiency and efficacy of the proposed scheme.
Citations: 27
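As a rough illustration of the matrix-completion idea behind the MC-Weather abstract above, the sketch below fills missing sensor readings with a rank-1 fit found by alternating least squares. It is not the paper's adaptive sampling algorithm: the function name `complete_rank1`, the rank-1 assumption, and the toy sensors-by-time-slots matrix are all assumptions made for illustration.

```python
def complete_rank1(matrix, iters=100):
    """Fill None entries of `matrix` with a rank-1 estimate u[i] * v[j].

    `matrix` is a list of rows (sensors x time slots); None marks an
    unsampled reading. Hypothetical helper, not part of MC-Weather.
    """
    rows, cols = len(matrix), len(matrix[0])
    u = [1.0] * rows
    v = [1.0] * cols
    for _ in range(iters):
        # Fix v, solve the least-squares problem for each row factor u[i]
        # using only the observed entries of that row.
        for i in range(rows):
            obs = [j for j in range(cols) if matrix[i][j] is not None]
            den = sum(v[j] ** 2 for j in obs)
            if den > 0:
                u[i] = sum(matrix[i][j] * v[j] for j in obs) / den
        # Fix u, solve for each column factor v[j] in the same way.
        for j in range(cols):
            obs = [i for i in range(rows) if matrix[i][j] is not None]
            den = sum(u[i] ** 2 for i in obs)
            if den > 0:
                v[j] = sum(matrix[i][j] * u[i] for i in obs) / den
    # Keep observed readings, fill the gaps with the low-rank estimate.
    return [
        [matrix[i][j] if matrix[i][j] is not None else u[i] * v[j]
         for j in range(cols)]
        for i in range(rows)
    ]


# Toy data: 3 sensors, 4 time slots, two readings deliberately not sampled.
readings = [
    [2.0, 4.0, 6.0, None],
    [4.0, 8.0, 12.0, 16.0],
    [6.0, None, 18.0, 24.0],
]
for row in complete_rank1(readings):
    print([round(x, 3) for x in row])
```

Real weather matrices are only approximately low-rank, and the abstract stresses that the rank is neither known nor fixed, so a practical scheme would have to estimate the rank and choose which locations to sample; this sketch does neither.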
Will They Blend?: Exploring Big Data Computation Atop Traditional HPC NAS Storage
Pub Date : 2014-06-30 DOI: 10.1109/ICDCS.2014.60
E. Wilson, M. Kandemir, Garth A. Gibson
The Apache Hadoop framework has rung in a new era in how data-rich organizations can process, store, and analyze large amounts of data. This has resulted in increased potential for an infrastructure exodus from the traditional solution of commercial database ad-hoc analytics on network-attached storage (NAS). While many data-rich organizations can afford to either move entirely to Hadoop for their Big Data analytics, or to maintain their existing traditional infrastructures and acquire a new set of infrastructure solely for Hadoop jobs, most supercomputing centers do not enjoy either of those possibilities. Too much of the existing scientific code is tailored to work on massively parallel file systems unlike the Hadoop Distributed File System (HDFS), and their datasets are too large to reasonably maintain and/or ferry between two distinct storage systems. Nevertheless, as scientists search for easier-to-program frameworks with a lower time-to-science to post-process their huge datasets after execution, there is increasing pressure to enable use of MapReduce within these traditional High Performance Computing (HPC) architectures. Therefore, in this work we explore potential means to enable use of the easy-to-program Hadoop MapReduce framework without requiring a complete infrastructure overhaul from existing HPC NAS solutions. We demonstrate that retaining function-dedicated resources like NAS is not only possible, but can even be effected efficiently with MapReduce. In our exploration, we unearth subtle pitfalls resultant from this mash-up of new-era Big Data computation on conventional HPC storage and share the clever architectural configurations that allow us to avoid them. Last, we design and present a novel Hadoop File System, the Reliable Array of Independent NAS File System (RainFS), and experimentally demonstrate its improvements in performance and reliability over the previous architectures we have investigated.
Citations: 10
Optimal Energy Cost for Strongly Stable Multi-hop Green Cellular Networks
Pub Date : 2014-06-30 DOI: 10.1109/ICDCS.2014.15
Weixian Liao, Ming Li, Sergio Salinas, Pan Li, M. Pan
With the ever-increasing user adoption of mobile devices like smart phones and tablets, the cellular service providers' energy consumption and cost are fast-growing and have received tremendous attention. How to effectively reduce the energy cost of cellular networks and achieve green communications while satisfying cellular users' rocketing traffic demands has become an urgent and challenging problem. In this paper, we investigate the minimization of the long-term time-averaged expected energy cost of a cellular service provider while guaranteeing the strong stability of the network. We first formulate an offline optimization problem with a joint consideration of flow routing, link scheduling, and energy (e.g., renewable energy resource, energy storage unit) constraints. Since the formulated problem is a time-coupling stochastic Mixed-Integer Non-Linear Programming (MINLP) problem, it is prohibitively expensive to solve. Then, we reformulate the problem by employing Lyapunov optimization theory. A decomposition-based algorithm is developed to solve the problem and is proved to guarantee strong network stability. Both the lower and upper bounds on the optimal result of the original problem are derived and proven. Simulation results demonstrate that the obtained lower and upper bounds are very tight, and that the proposed scheme results in noticeable energy cost savings.
Citations: 4
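The Lyapunov optimization step mentioned in the abstract above can be illustrated with the standard drift-plus-penalty rule on a toy single-queue system. This is only a sketch under strong assumptions: one queue, one server, a known per-slot energy price, and a hypothetical trade-off parameter `v`. It does not reproduce the paper's decomposition algorithm, which also covers flow routing, link scheduling, and energy storage.

```python
def drift_plus_penalty(arrivals, prices, v, max_service):
    """Toy drift-plus-penalty controller for a single data queue.

    Each slot picks the service amount b in [0, max_service] that minimizes
    v * price * b - queue * b. The objective is linear in b, so the choice
    is all-or-nothing: serve at full rate only when the backlog outweighs
    the weighted energy price. All names here are illustrative assumptions.
    """
    queue = 0.0
    total_cost = 0.0
    backlog_history = []
    for arrived, price in zip(arrivals, prices):
        b = max_service if queue > v * price else 0
        served = min(b, queue)          # cannot serve more than is queued
        total_cost += price * served    # energy cost paid in this slot
        queue = queue - served + arrived
        backlog_history.append(queue)
    return total_cost, backlog_history


cost, backlog = drift_plus_penalty(
    arrivals=[2, 2, 2, 2], prices=[1, 5, 1, 5], v=1, max_service=3
)
print(cost, backlog)  # prints 3.0 [2.0, 4.0, 3.0, 5.0]
```

A larger `v` makes the controller wait for cheaper slots at the price of a longer backlog, which is the cost-versus-queue-length trade-off that Lyapunov optimization makes explicit.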
OpenSample: A Low-Latency, Sampling-Based Measurement Platform for Commodity SDN
Pub Date : 2014-06-30 DOI: 10.1109/ICDCS.2014.31
Junho Suh, T. Kwon, C. Dixon, Wes Felter, J. Carter
In this paper we propose, implement and evaluate OpenSample: a low-latency, sampling-based network measurement platform targeted at building faster control loops for software-defined networks. OpenSample leverages sFlow packet sampling to provide near-real-time measurements of both network load and individual flows. While OpenSample is useful in any context, it is particularly useful in an SDN environment where a network controller can quickly take action based on the data it provides. Using sampling for network monitoring allows OpenSample to have a 100-millisecond control loop rather than the 1-5 second control loop of prior polling-based approaches. We implement OpenSample in the Floodlight OpenFlow controller and evaluate it both in simulation and on a testbed composed of commodity switches. When used to inform traffic engineering, OpenSample provides up to a 150% throughput improvement over both static equal-cost multi-path routing and a polling-based solution with a one-second control loop.
Citations: 159
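To make the sampling-based measurement idea in the OpenSample abstract above concrete, the sketch below scales sampled packet sizes by the sampling rate to estimate per-flow throughput over one control-loop window. It is a simplified assumption of how 1-in-N packet samples can be turned into rate estimates, not OpenSample's actual estimator; the function name and the `(flow_id, packet_bytes)` sample format are hypothetical.

```python
from collections import defaultdict


def estimate_flow_rates(samples, sampling_rate, window_s):
    """Estimate per-flow throughput in bits per second from packet samples.

    `samples` is an iterable of (flow_id, packet_bytes) pairs collected
    during one window; `sampling_rate` is N for 1-in-N packet sampling.
    Each sampled packet is assumed to stand for N packets of similar size.
    """
    sampled_bytes = defaultdict(int)
    for flow_id, packet_bytes in samples:
        sampled_bytes[flow_id] += packet_bytes
    return {
        flow_id: total * sampling_rate * 8 / window_s
        for flow_id, total in sampled_bytes.items()
    }


# One 100 ms window with 1-in-100 sampling (illustrative numbers only).
window_samples = [("flow-a", 1500), ("flow-a", 1500), ("flow-b", 500)]
for flow, bps in estimate_flow_rates(window_samples, 100, 0.1).items():
    print(flow, round(bps / 1e6, 2), "Mbit/s")
```

With few samples per window the estimate is noisy, so a controller acting on a 100-millisecond loop would likely want to smooth or bound these values before rerouting traffic.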
Enabling Privacy-Preserving Image-Centric Social Discovery
Pub Date : 2014-06-30 DOI: 10.1109/ICDCS.2014.28
Xingliang Yuan, Xinyu Wang, Cong Wang, A. Squicciarini, K. Ren
The increasing popularity of images at social media sites is posing new opportunities for social discovery applications, i.e., suggesting new friends and discovering new social groups with similar interests via exploring images. To effectively handle the explosive growth of images involved in social discovery, one common trend for many emerging social media sites is to leverage the commercial public cloud as their robust backend data center. While extremely convenient, directly exposing content-rich images and the related social discovery results to the public cloud also raises new acute privacy concerns. In light of this observation, in this paper we propose a privacy-preserving social discovery service architecture based on encrypted images. As the core of such social discovery is to compare and quantify similar images, we first adopt the effective Bag-of-Words model to extract the "visual similarity content" of users' images into image profile vectors, and then model the problem as similarity retrieval of encrypted high-dimensional image profiles. To support fast and scalable similarity search over hundreds of thousands of encrypted images, we propose a secure and efficient indexing structure. The resulting design enables social media sites to obtain secure, practical, and accurate social discovery from the public cloud, without disclosing the encrypted image content. We formally prove the security and discuss further extensions on user image update and the compatibility with existing image sharing social functionalities. Extensive experiments on a large Flickr image dataset demonstrate the practical performance of the proposed design. Our qualitative social discovery results show consistency with human perception.
Citations: 47
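The Bag-of-Words step named in the abstract above can be sketched in plaintext as follows: local descriptors are quantized to their nearest "visual word", the word counts form an image profile vector, and profiles are compared by cosine similarity. This covers only the unencrypted profile construction; the encrypted index and secure search are not shown. The tiny two-word vocabulary, the 2-D descriptors, and all function names are assumptions for illustration.

```python
import math


def nearest_word(descriptor, vocabulary):
    """Index of the visual word closest to `descriptor` (squared Euclidean)."""
    return min(
        range(len(vocabulary)),
        key=lambda k: sum((d - w) ** 2 for d, w in zip(descriptor, vocabulary[k])),
    )


def profile_vector(descriptors, vocabulary):
    """Normalized histogram of visual-word occurrences for one image."""
    hist = [0.0] * len(vocabulary)
    for descriptor in descriptors:
        hist[nearest_word(descriptor, vocabulary)] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total else hist


def cosine_similarity(p, q):
    """Cosine of the angle between two profile vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm if norm else 0.0


# Hypothetical vocabulary and descriptors; real systems use many more words
# and high-dimensional descriptors extracted from image patches.
vocabulary = [(0.0, 0.0), (10.0, 10.0)]
image_a = profile_vector([(1, 0), (0, 1), (9, 10)], vocabulary)
image_b = profile_vector([(0, 0), (1, 1), (10, 9)], vocabulary)
image_c = profile_vector([(10, 10), (9, 9)], vocabulary)
print(cosine_similarity(image_a, image_b), cosine_similarity(image_a, image_c))
```

Images A and B share the same mix of visual words and therefore score higher than A and C, which is the kind of ranking a social discovery service would build on.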
Exploring the Use of Diverse Replicas for Big Location Tracking Data
Pub Date : 2014-06-30 DOI: 10.1109/ICDCS.2014.17
Ye Ding, Haoyu Tan, Wuman Luo, L. Ni
The value of large amounts of location tracking data has received wide attention in many applications including human behavior analysis, urban transportation planning, and various location-based services (LBS). Nowadays, both scientific and industrial communities are encouraged to collect as much location tracking data as possible, which brings about two issues: 1) it is challenging to process queries on big location tracking data efficiently, and 2) it is expensive to store several exact data replicas for fault tolerance. So far, several dedicated storage systems have been proposed to address these issues. However, they do not work well when the query ranges vary widely. In this paper, we present the design of a storage system using a diverse replica scheme, which improves query processing efficiency while reducing the cost of storage space. To the best of our knowledge, we are the first to investigate data storage and processing in the context of big location tracking data. Specifically, we conduct in-depth theoretical and empirical analysis of the trade-offs between different spatio-temporal partitioning schemes as well as data encoding schemes. Then we propose an effective approach to select an appropriate set of diverse replicas, which is optimized for the expected query loads while conforming to the given storage space budget. The experiment results confirm that using diverse replicas can significantly improve the overall query performance. The results also demonstrate that the proposed algorithms for the replica selection problem are both effective and efficient.
Citations: 7
Impact Analysis of Topology Poisoning Attacks on Economic Operation of the Smart Power Grid
Pub Date : 2014-06-30 DOI: 10.1109/ICDCS.2014.72
M. Rahman, E. Al-Shaer, R. Kavasseri
The Optimal Power Flow (OPF) routine used in energy control centers allocates individual generator outputs by minimizing the overall cost of generation subject to system level operating constraints. The OPF relies on the outputs of two other modules, namely topology processor and state estimator. The topology processor maps the grid topology based on statuses received from the switches and circuit breakers across the system. The state estimator computes the system state, i.e., voltage magnitudes with phase angles, transmission line flows, and system loads based on real-time meter measurements. However, topology statuses and meter measurements are vulnerable to false data injection attacks. Recent research has shown that such cyber attacks can be launched against state estimation where adversaries can corrupt the states but still remain undetected. In this paper, we show how the stealthy topology poisoning attacks can compromise the integrity of OPF, and thus undermine economic operation. We describe a formal verification based framework to systematically analyze the impact of such attacks on OPF. The proposed framework is illustrated with an example. We also evaluate the scalability of the framework with respect to time and memory requirements.
Citations: 31
Columbus: Configuration Discovery for Clouds
Pub Date : 2014-06-30 DOI: 10.1109/ICDCS.2014.41
R. Balani, Deepak Jeswani, Dipyaman Banerjee, Akshat Verma
Low-cost, accurate and scalable software configuration discovery is the key to simplifying many cloud management tasks. However, the lack of standardization across software configuration techniques has prevented the development of a fully automated and application independent configuration discovery solution. In this work, we present Columbus, an application-agnostic system to automatically discover environmental configuration parameters or Points of Variability (PoV) in clustered applications with high accuracy. Columbus uses the insight that even though configuration mechanisms and files vary across different software, the PoVs are encoded using a few common patterns. It uses a novel rule framework to annotate file content with PoVs and a Bayesian network to estimate confidence for annotated PoVs. Our experiments confirm that Columbus can accurately discover configuration for a diverse set of enterprise and cloud applications. It has subsequently been integrated in three real-world systems that analyze this information for discovery of distributed application dependencies, enterprise IT migration and virtual application configuration.
Citations: 2
Scalable Traffic-Aware Virtual Machine Management for Cloud Data Centers
Pub Date : 2014-06-30 DOI: 10.1109/ICDCS.2014.32
Fung Po Tso, K. Oikonomou, Eleni Kavvadia, D. Pezaros
Virtual Machine (VM) management is a powerful mechanism for providing elastic services over Cloud Data Centers (DCs). At the same time, the resulting network congestion has been repeatedly reported as the main bottleneck in DCs, even when the overall resource utilization of the infrastructure remains low. However, most current VM management strategies are traffic-agnostic, while the few that are traffic-aware only concern a static initial allocation, ignore bandwidth oversubscription, or do not scale. In this paper, we present S-CORE, a scalable VM migration algorithm to dynamically reallocate VMs to servers while minimizing the overall communication footprint of active traffic flows. We formulate the aggregate VM communication as an optimization problem and then define a novel distributed migration scheme that iteratively adapts to dynamic traffic changes. Through extensive simulation and implementation results, we show that S-CORE achieves significant (up to 87%) communication cost reduction while incurring minimal overhead and downtime.
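To make the idea of a "communication footprint" concrete, the following is a minimal sketch and not S-CORE itself. It assumes a hypothetical cost model (traffic rate multiplied by a rack-based distance of 0, 1 or 3) and a single greedy migration step; the names `hop_distance`, `communication_cost` and `best_migration` are illustrative, and server capacity and migration overhead are ignored.

```python
def hop_distance(s1, s2):
    """Assumed distance between two servers given as (rack, host) tuples."""
    if s1 == s2:
        return 0  # same physical server
    if s1[0] == s2[0]:
        return 1  # same rack
    return 3      # different racks


def communication_cost(placement, traffic):
    """Sum of traffic rate times distance over all communicating VM pairs."""
    return sum(
        rate * hop_distance(placement[a], placement[b])
        for (a, b), rate in traffic.items()
    )


def best_migration(vm, servers, placement, traffic):
    """Return (server, cost) minimising total cost when only `vm` moves."""
    best_server = placement[vm]
    best_cost = communication_cost(placement, traffic)
    for server in servers:
        trial = {**placement, vm: server}  # copy with one VM relocated
        cost = communication_cost(trial, traffic)
        if cost < best_cost:
            best_server, best_cost = server, cost
    return best_server, best_cost


# Hypothetical example: vm1 talks mostly to vm2, which sits in another rack.
servers = [("r1", "h1"), ("r2", "h1"), ("r2", "h2")]
placement = {"vm1": ("r1", "h1"), "vm2": ("r2", "h1"), "vm3": ("r2", "h2")}
traffic = {("vm1", "vm2"): 10.0, ("vm1", "vm3"): 1.0}

initial_cost = communication_cost(placement, traffic)
target, new_cost = best_migration("vm1", servers, placement, traffic)
```

In this toy setting, moving `vm1` next to its heaviest peer removes the cross-rack traffic. A scheme that iterates over VMs and adapts to traffic changes, as the abstract describes, would repeat such decisions as the traffic matrix is updated.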
Citations: 30
Efficient Data Forwarding in Mobile Social Networks with Diverse Connectivity Characteristics
Pub Date : 2014-06-30 DOI: 10.1109/ICDCS.2014.12
Xiaomei Zhang, G. Cao
A Mobile Social Network (MSN) with diverse connectivity characteristics is a combination of an opportunistic network and a mobile ad hoc network. Since the major difficulty of data forwarding lies in the opportunistic part, techniques designed for opportunistic networks are commonly used to forward data in MSNs. However, this may not be the best solution, since these techniques do not consider the ubiquitous existence of Transient Connected Components (TCCs), where nodes inside a TCC can reach each other by multi-hop wireless communications. In this paper, we first identify the existence of TCCs and analyze their properties based on five real traces. Then, we propose TCC-aware data forwarding strategies which exploit the special characteristics of TCCs to increase the contact opportunities and thereby improve the performance of data forwarding. Trace-driven simulations show that our TCC-aware data forwarding strategies outperform existing data forwarding strategies in terms of data delivery ratio and network overhead.
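As an illustration of what a TCC is, the sketch below groups nodes into the connected components of a single contact snapshot, so that every node in a group can reach the others over one or more hops. It assumes the snapshot is available as a list of currently active pairwise links; the function name `transient_connected_components` and the sample data are hypothetical, and the paper's forwarding strategies are not reproduced here.

```python
from collections import defaultdict, deque


def transient_connected_components(contacts):
    """Return the connected components of one contact snapshot.

    contacts: iterable of (u, v) pairs, each an active wireless link.
    The result is a list of node sets, ordered by smallest member.
    """
    adjacency = defaultdict(set)
    for u, v in contacts:
        adjacency[u].add(v)
        adjacency[v].add(u)

    seen = set()
    components = []
    for start in sorted(adjacency):
        if start in seen:
            continue
        # Breadth-first search collects every node reachable by multi-hop.
        component = {start}
        seen.add(start)
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    component.add(neighbour)
                    queue.append(neighbour)
        components.append(component)
    return components


# Hypothetical snapshot: a-b and b-c are in contact, as are d-e.
snapshot = [("a", "b"), ("b", "c"), ("d", "e")]
components = transient_connected_components(snapshot)
```

Here `a` and `c` are not in direct contact but fall into the same component through `b`, which is the kind of multi-hop reachability a purely pairwise, opportunistic view would miss.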
Citations: 22