Social Trove: A Self-Summarizing Storage Service for Social Sensing

2015 IEEE International Conference on Autonomic Computing, pp. 41-50. Pub Date: 2015-07-07. DOI: 10.1109/ICAC.2015.47

Md. Tanvir Al Amin, Shen Li, Muntasir Raihan Rahman, P. Seetharamu, Shiguang Wang, T. Abdelzaher, Indranil Gupta, M. Srivatsa, R. Ganti, Reaz Ahmed, H. Le


Citations: 19


Social Trove: A Self-Summarizing Storage Service for Social Sensing

The increasing availability of smartphones, cameras, and wearables with instant data-sharing capabilities, together with the exploitation of social networks for information broadcast, heralds a future of real-time information overload. As worldwide streaming data such as images, geotags, text annotations, and sensory measurements continue to grow, data summarization will become an increasingly common service. The objective of such a service will be to obtain a representative sampling of large data streams at a configurable granularity, in real time, for subsequent consumption by a range of data-centric applications. This paper describes a general-purpose self-summarizing storage service, called Social Trove, for social sensing applications. The service summarizes data streams from human sources, or sensors in their possession, by hierarchically clustering received information according to an application-specific distance metric. It then serves a sampling of the produced clusters at a configurable granularity in response to application queries. While Social Trove is a general service, we illustrate its functionality and evaluate it in the specific context of workloads collected from Twitter. Results show that Social Trove supports high query throughput while maintaining low access latency to the real-time, application-specific data summaries it produces. As a specific application case study, we implement a fact-finding service on top of Social Trove.
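The clustering-and-sampling idea described in the abstract can be illustrated with a minimal Python sketch. This is not the paper's implementation. It assumes a plain single-linkage agglomerative clustering, a token-set Jaccard distance as a stand-in for the application-specific metric, and a distance threshold as the "granularity" knob. The names `jaccard_distance`, `summarize`, and `granularity` are hypothetical and chosen only for illustration.

```python
from typing import Callable, List, Sequence


def jaccard_distance(a: str, b: str) -> float:
    """Token-set Jaccard distance (assumed stand-in for an application metric)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 0.0
    return 1.0 - len(ta & tb) / len(ta | tb)


def summarize(items: Sequence[str],
              distance: Callable[[str, str], float],
              granularity: float) -> List[str]:
    """Cluster items bottom-up and return one representative per cluster.

    Clusters are merged (single linkage) while the closest pair is no farther
    apart than `granularity`. The representative is the earliest item in each
    cluster; results are returned in arrival order.
    """
    # Each cluster is a sorted list of indices into `items`.
    clusters: List[List[int]] = [[i] for i in range(len(items))]

    def linkage(c1: List[int], c2: List[int]) -> float:
        # Single linkage: distance between the two closest members.
        return min(distance(items[i], items[j]) for i in c1 for j in c2)

    while len(clusters) > 1:
        best = None  # (distance, index of first cluster, index of second)
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                d = linkage(clusters[x], clusters[y])
                if best is None or d < best[0]:
                    best = (d, x, y)
        d, x, y = best
        if d > granularity:
            break  # remaining clusters are too far apart to merge
        clusters[x] = sorted(clusters[x] + clusters[y])
        del clusters[y]

    return [items[c[0]] for c in sorted(clusters, key=lambda c: c[0])]


if __name__ == "__main__":
    stream = [
        "fire on main street",
        "fire on main street now",
        "road closed near bridge",
        "bridge road closed",
    ]
    # A larger threshold yields a coarser summary (fewer representatives).
    for g in (0.0, 0.3, 1.0):
        print(g, summarize(stream, jaccard_distance, g))
```

In this sketch, raising `granularity` merges more clusters and shortens the summary, while a value of 0 returns every distinct item. The quadratic pairwise search is for clarity only and would not suit the streaming workloads the abstract targets.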
