2010 IEEE 26th International Conference on Data Engineering Workshops (ICDEW 2010): Latest Papers

On the influence of social factors on team recommendations
Pub Date : 2010-03-01 DOI: 10.1109/ICDEW.2010.5452716
Michele Brocco, Georg Groh, C. Kern
In the last 10 years a new paradigm for creating innovations by also using external sources and paths to market has emerged and become popular. This paradigm is known as open innovation. By including these external sources in the innovation process, a larger number of people (and thereby knowledge and skills) become available. People and organizations are connected in a network of collaboration (a so-called open innovation network). These networks are valuable and provide an important source for composing teams that work on specific open innovation projects inside an open innovation community. We address the problem of composing such a team, given the complexity of the network and of the innovation tasks, with algorithmic team recommendation. This raises several challenges, such as including different aspects of team composition that have been the subject of research in the social and psychological sciences. We base this article on our previous work on the categorization of aspects that influence team composition, and create a team composition model based solely on social aspects as an example of mapping classical team composition models onto our categorization. Furthermore, we describe typical issues that arise when creating team composition models from scratch and mapping them onto our proposed meta model, which represents the main component of our recommender approach.
Citations: 7
Towards discovery of eras in social networks
Pub Date : 2010-03-01 DOI: 10.1109/ICDEW.2010.5452713
M. Berlingerio, M. Coscia, F. Giannotti, A. Monreale, D. Pedreschi
In the last decades, much research has been devoted to topics related to Social Network Analysis. One important direction in this area is to analyze the temporal evolution of a network. So far, previous approaches have analyzed this setting at both the global and the local level. In this paper, we focus on finding a way to detect temporal eras in an evolving network. We lay the basis for a general framework that aims to help the analyst browse the temporal clusters in both a top-down and a bottom-up way, exploring the network at any level of temporal detail. We show the effectiveness of our approach on real data by applying our proposed methodology to a co-authorship network extracted from a bibliographic dataset. Our first results are encouraging, and open the way for the definition and implementation of a general framework for discovering eras in evolving social networks.
Citations: 6
Evaluating path queries over route collections
Pub Date : 2010-03-01 DOI: 10.1109/ICDEW.2010.5452732
Panagiotis Bouros, Y. Vassiliou
Nowadays, vast amounts of routing data, such as sequences of points of interest, landmarks, etc., are available due to the proliferation of geodata services. We refer to these sequences as routes and the involved points simply as nodes. In this thesis, we consider the problem of evaluating path queries on frequently updated route collections. We present our current work for two path queries: (i) identifying a path between two nodes of the collection, and (ii) identifying a constrained shortest path. Finally, some interesting open problems are described and our future work directions are clearly stated.
Citations: 2
On dynamic data clustering and visualization using swarm intelligence
Pub Date : 2010-03-01 DOI: 10.1109/ICDEW.2010.5452721
Esin Saka, O. Nasraoui
Clustering and visualizing high-dimensional sparse data simultaneously is a very attractive goal, yet it is also a challenging problem. Our previous studies using a special type of swarm, known as flocks of agents, provided some promising approaches to this challenging problem on several limited-size UCI machine learning data sets and Web usage sessions (from web access logs) [1], [2]. However, dynamic domains, such as practically any data generated on the Web, may require frequent, costly updates of the clusters (and the visualization) whenever new data records are added to the dataset. The newly arriving data may be due to new user activity on a website (clickstreams) or a search engine (queries), or new Web pages in the case of document clustering, etc. Additionally, data records may result in a change of clustering over time. Therefore, clusters may need to be updated, thus leading to the need to mine dynamic clusters. This paper summarizes our initial studies in designing a simultaneous clustering and visualization algorithm and proposes the Dynamic-FClust Algorithm, which is based on flocks of agents as a biological metaphor. This algorithm falls within the swarm-based clustering family, which is unique compared to other approaches, because its model is an ongoing swarm of agents that socially interact with each other, and is therefore inherently dynamic.
Citations: 11
Ontology alignment argumentation with mutual dependency between arguments and mappings
Pub Date : 2010-03-01 DOI: 10.1109/ICDEW.2010.5452705
P. Maio, Nuno Silva
For successful communication, autonomous entities (e.g. agents, web services, peers) must reconcile the vocabulary used in their ontologies. The result is a set of mappings between ontology entities. Since each party might have its own perspective on which mappings are best, conflicts will arise. Toward building a mapping consensus between information-exchanging parties, this paper proposes an approach based on a formal argumentation framework, in which existing ontology matching algorithms generate the mappings, which are further interpreted into semantic arguments employed during the argumentation. The proposal models a mutual dependency between the mappings and arguments, which goes beyond the state of the art in argumentation-based ontology alignment negotiation, better reflecting the requirements of the task.
Citations: 2
A first step towards integration independence
Pub Date : 2010-03-01 DOI: 10.1109/ICDEW.2010.5452753
L. Haas, Renée J. Miller, Donald Kossmann, Martin Hentschel
Two major forms of information integration, federation and materialization, continue to dominate the market, embedded in separate products, each with their strengths and weaknesses. Application developers must make difficult choices among techniques and products, choices that are hard to change later. We propose a new design principle, Integration Independence, for integration engines. Integration independence frees the application designer from deciding how to integrate data. We then describe a new, adaptive information integration engine that provides the ability to index base data or to materialize transformed data, giving us a flexible platform for experimentation.
Citations: 9
Coordination of data in heterogenous domains
Pub Date : 2010-03-01 DOI: 10.1109/ICDEW.2010.5452757
Michael K. Lawrence, R. Pottinger, S. Staub-French
Existing semantic integration approaches to coordinating data do not meet the needs of real-world scenarios that contain fine-grained relationships between data sources. In this paper, we describe extensions to the popular GLAV mapping formalism to express such relationships. We outline methods for solving the data coordination problem using these mappings, and discuss future research problems that must be addressed for data coordination to be realized in the heterogeneous domain scenarios that occur in practice.
Citations: 3
Towards best-effort merge of taxonomically organized data
Pub Date : 2010-03-01 DOI: 10.1109/ICDEW.2010.5452756
D. Thau, S. Bowers, Bertram Ludäscher
We consider the task of merging datasets that have been organized using different, but aligned taxonomies. We assume such a merge is intended to create a single dataset that unambiguously describes the information in the source datasets using the alignment. We also assume that the merged result should reflect the observations of the datasets as specifically as possible. Typically, there will be no single merge result that is both unambiguous and maximally specific. In this case, a user may be provided with a set of possible merged datasets. If the user requires a single dataset, that dataset loses specificity. Here we examine whether the data exchange setting can provide a way to derive a "best-effort" merge. We find that the data exchange setting might be a good candidate for providing the merge, but further research is needed.
Citations: 5
Statistics-driven workload modeling for the Cloud
Pub Date : 2010-03-01 DOI: 10.1109/ICDEW.2010.5452742
A. Ganapathi, Yanpei Chen, A. Fox, R. Katz, D. Patterson
A recent trend for data-intensive computations is to use pay-as-you-go execution environments that scale transparently to the user. However, providers of such environments must tackle the challenge of configuring their system to provide maximal performance while minimizing the cost of resources used. In this paper, we use statistical models to predict resource requirements for Cloud computing applications. Such a prediction framework can guide system design and deployment decisions such as scale, scheduling, and capacity. In addition, we present an initial design of a workload generator that can be used to evaluate alternative configurations without the overhead of reproducing a real workload. This paper focuses on statistical modeling and its application to data-intensive workloads.
Citations: 259
A generic auto-provisioning framework for cloud databases
Pub Date : 2010-03-01 DOI: 10.1109/ICDEW.2010.5452746
Jennie Duggan, Olga Papaemmanouil, U. Çetintemel
We discuss the problem of resource provisioning for database management systems operating on top of an Infrastructure-As-A-Service (IaaS) cloud. To solve this problem, we describe an extensible framework that, given a target query workload, continually optimizes the system's operational cost, estimated based on the IaaS provider's pricing model, while satisfying QoS expectations. Specifically, we describe two different approaches: a "white-box" approach that uses a fine-grained estimation of the expected resource consumption for a workload, and a "black-box" approach that relies on coarse-grained profiling to characterize the workload's end-to-end performance across various cloud resources. We formalize both approaches as a constraint programming problem and use a generic constraint solver to efficiently tackle them. We present preliminary experimental numbers, obtained by running TPC-H queries with PostgreSQL on Amazon's EC2, that provide evidence of the feasibility and utility of our approaches. We also briefly discuss the pertinent challenges and directions of on-going research.
Citations: 51