2008 The 28th International Conference on Distributed Computing Systems: Latest Papers

Fair K Mutual Exclusion Algorithm for Peer to Peer Systems

2008 The 28th International Conference on Distributed Computing Systems

Pub Date : 2008-06-17 DOI: 10.1109/ICDCS.2008.76

V. Korthikanti, Prateek Mittal, Indranil Gupta

k-mutual exclusion is an important problem for resource-intensive peer-to-peer applications ranging from aggregation to file downloads. In order to be practically useful, k-mutual exclusion algorithms not only need to be safe and live, but they also need to be fair across hosts. We propose a new solution to the k-mutual exclusion problem that provides a notion of time-based fairness. Specifically, our algorithm attempts to minimize the spread of access time for the critical resource. While a client's access time is the time between it requesting and accessing the resource, the spread is defined as a system-wide metric that measures some notion of the variance of access times across a homogeneous host population, e.g., the difference between max and mean. We analytically prove the correctness of our algorithm, and evaluate its fairness experimentally using simulations. Our evaluation under two settings - a LAN setting and a WAN based on the King latency data set - shows that even with 100 hosts accessing one resource, the spread of access time is within 15 seconds.
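The spread metric mentioned in the abstract can be illustrated with a minimal sketch. It assumes the "difference between max and mean" variant that the abstract gives as an example; the function name and the sample access times are hypothetical and are not taken from the paper.

```python
from statistics import mean


def access_time_spread(access_times):
    """Return max minus mean of per-host access times (in seconds).

    This is only the example variant named in the abstract; the paper
    may use other notions of spread.
    """
    if not access_times:
        raise ValueError("need at least one access time")
    return max(access_times) - mean(access_times)


# Hypothetical access times (seconds) for five hosts.
times = [2.0, 3.0, 4.0, 5.0, 6.0]
print(access_time_spread(times))
```

A smaller spread means that no host waits much longer than the average host, which is the fairness goal the abstract describes.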

Citations: 12

Game Theoretic Peer Selection for Resilient Peer-to-Peer Media Streaming Systems

2008 The 28th International Conference on Distributed Computing Systems

Pub Date : 2008-06-17 DOI: 10.1109/ICDCS.2008.69

M. Yeung, Yu-Kwong Kwok

Peer-to-peer (P2P) media streaming has quickly emerged as an important application over the Internet. A plethora of approaches have been suggested and implemented to support P2P media streaming. In our study, we first classified existing approaches and studied their characteristics by looking at three important quantities: number of upstream peers (parents), number of downstream peers (children) and average number of links per peer. We find that in existing approaches, peers are assigned a fixed number of parents without regard to their contributions, measured by the amount of outgoing bandwidth. Obviously, this is an undesirable arrangement as it leads to highly inefficient use of the P2P links. This observation motivates us to model the peer selection process as a cooperative game among peers. This results in a novel peer selection protocol such that the number of upstream peers of a peer is related to its outgoing bandwidth. Specifically, peers with larger outgoing bandwidth are given more parents, which makes them less vulnerable to peer dynamics. Simulation results show that the proposed protocol improves delivery ratio with a similar number of links per peer, compared with existing approaches in a wide range of settings.
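The idea that a peer's number of parents should grow with its outgoing bandwidth can be sketched as follows. The linear scaling rule, the function name, and the `min_parents`/`max_parents` bounds are assumptions made for illustration; the abstract does not state the game-theoretic rule the protocol actually uses.

```python
def assign_parent_counts(out_bandwidth, min_parents=1, max_parents=4):
    """Map each peer's outgoing bandwidth to a parent count.

    Hypothetical rule: scale linearly between min_parents and
    max_parents relative to the largest contributor.
    """
    top = max(out_bandwidth.values())
    counts = {}
    for peer, bw in out_bandwidth.items():
        share = bw / top if top > 0 else 0.0
        counts[peer] = min_parents + int(share * (max_parents - min_parents))
    return counts


# Hypothetical outgoing bandwidths in kbit/s.
peers = {"a": 100, "b": 400, "c": 800}
print(assign_parent_counts(peers))
```

Under this rule the largest contributor gets the most parents, so losing any single parent costs it a smaller fraction of its incoming stream.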

Citations: 21

Detecting Click Fraud in Pay-Per-Click Streams of Online Advertising Networks

2008 The 28th International Conference on Distributed Computing Systems

Pub Date : 2008-06-17 DOI: 10.1109/ICDCS.2008.98

Linfeng Zhang, Y. Guan

With the rapid growth of the Internet, online advertising plays an increasingly important role in the advertising market. One of the current and widely used revenue models for online advertising involves charging for each click based on the popularity of keywords and the number of competing advertisers. This pay-per-click model leaves room for individuals or rival companies to generate false clicks (i.e., click fraud), which pose serious problems for the development of a healthy online advertising market. To detect click fraud, an important issue is to detect duplicate clicks over decaying window models, such as jumping windows and sliding windows. Decaying window models can be very helpful in defining and determining click fraud. However, although algorithms are available to detect duplicates, there is still a lack of practical and effective solutions to detect click fraud in pay-per-click streams over decaying window models. In this paper, we address the problem of detecting duplicate clicks in pay-per-click streams over jumping windows and sliding windows, and are the first to propose two innovative algorithms that make only one pass over click streams and require significantly less memory space and fewer operations. The GBF algorithm is built on group Bloom filters, which can process click streams over jumping windows with a small number of sub-windows, while the TBF algorithm is based on a new data structure called the timing Bloom filter, which detects click fraud over sliding windows and jumping windows with a large number of sub-windows. Both the GBF algorithm and the TBF algorithm have zero false negatives. Furthermore, both theoretical analysis and experimental results show that our algorithms can achieve a low false positive rate when detecting duplicate clicks in pay-per-click streams over jumping windows and sliding windows.
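To make the jumping-window idea concrete, here is a simplified sketch that keeps one plain Bloom filter per sub-window and drops the oldest filter when the window jumps. It is not the paper's GBF or TBF algorithm; the class names, the count-based sub-window boundary, and all parameter values are assumptions chosen for illustration.

```python
import hashlib
from collections import deque


class BloomFilter:
    """Plain Bloom filter: may report false positives, never false negatives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for clarity

    def _positions(self, item):
        # Derive num_hashes positions by salting SHA-256 with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def __contains__(self, item):
        return all(self.bits[pos] for pos in self._positions(item))


class JumpingWindowDetector:
    """Flag a click as duplicate if its ID was seen in any live sub-window."""

    def __init__(self, num_subwindows=4, clicks_per_subwindow=100):
        self.clicks_per_subwindow = clicks_per_subwindow
        # deque with maxlen silently discards the oldest filter on append.
        self.filters = deque([BloomFilter()], maxlen=num_subwindows)
        self.count = 0

    def is_duplicate(self, click_id):
        seen = any(click_id in f for f in self.filters)
        if self.count == self.clicks_per_subwindow:
            self.filters.append(BloomFilter())  # the window jumps forward
            self.count = 0
        self.filters[-1].add(click_id)
        self.count += 1
        return seen


detector = JumpingWindowDetector()
print(detector.is_duplicate("ip1:ad42"))
print(detector.is_duplicate("ip1:ad42"))
```

Because a Bloom filter never forgets an inserted item, a repeated click inside the live window is always flagged, which matches the zero-false-negative property the abstract claims for its own algorithms. The cost is a small false positive rate that depends on the filter size and the number of hash functions.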

Citations: 154

Autotuning Configurations in Distributed Systems for Performance Improvements Using Evolutionary Strategies

2008 The 28th International Conference on Distributed Computing Systems

Pub Date : 2008-06-17 DOI: 10.1109/ICDCS.2008.11

A. Saboori, Guofei Jiang, Haifeng Chen

Distributed systems usually have many configurable parameters such as those included in common configuration files. Performance of distributed systems is partially dependent on these system configurations. While operators may choose default settings or manually tune parameters based on their experience and intuition, the resulting settings may not be optimal for the specific services running on the distributed system. In this paper, we formulate the problem of autotuning configurations as a black-box optimization problem. This problem is quite challenging since the joint parameter search space is huge and no explicit relationship between performance and configurations exists. We propose to use a well-known evolutionary algorithm called covariance matrix adaptation (CMA) to automatically tune system parameters. We compare the CMA algorithm to another existing technique called smart hill climbing (SHC) and demonstrate that the CMA algorithm outperforms the SHC algorithm both on synthetic data and in a real system.
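The black-box formulation can be sketched with a much simpler evolution strategy than CMA. The code below is an elitist (1+lambda) search with a decaying step size and no covariance adaptation, so it is not the algorithm evaluated in the paper; `simulated_latency` is a made-up stand-in for measuring a real system, and all names and constants are assumptions.

```python
import random


def tune(objective, start, sigma=1.0, offspring=8, generations=50, seed=0):
    """Minimize objective(config) with an elitist (1+lambda) evolution strategy."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    best, best_cost = list(start), objective(start)
    for _ in range(generations):
        # Sample offspring around the current best configuration.
        children = [[x + rng.gauss(0.0, sigma) for x in best]
                    for _ in range(offspring)]
        # Evaluate each child once; the objective may be expensive.
        scored = [(objective(child), child) for child in children]
        child_cost, child = min(scored, key=lambda pair: pair[0])
        if child_cost < best_cost:
            best, best_cost = child, child_cost
        sigma *= 0.95  # shrink the search radius over time
    return best, best_cost


def simulated_latency(config):
    """Hypothetical response surface with its optimum at (16, 64)."""
    threads, cache_mb = config
    return (threads - 16) ** 2 + (cache_mb - 64) ** 2 / 16.0


config, cost = tune(simulated_latency, [1.0, 1.0])
print(config, cost)
```

The search only ever calls `objective`, which is what makes it black-box: no gradient or explicit model of how parameters affect performance is needed.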

Citations: 34

J2EE Architecture for Database Cluster-Based High Volume E-Commerce Web Applications

2008 The 28th International Conference on Distributed Computing Systems

Pub Date : 2008-06-17 DOI: 10.1109/ICDCS.2008.105

Vishal S. Batra, Wen-Syan Li, Sumit Negi

High-volume database-driven e-commerce applications demand a cluster-based infrastructure to offer high availability, scalability and fault tolerance. The current J2EE architecture and containers restrict the transparent deployment of applications over database clusters without engineering data access logic into the applications. Our work extends the J2EE architecture to allow transparent deployment of J2EE applications on a database cluster. The key challenge is to load-balance read and write queries between the master and replica database instances while still providing the application with the most recent data in the cluster and enabling service-class-based query routing. We validate the applicability and effectiveness of the proposed architecture using the IBM WebSphere Trade3 stock trading application.
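The read/write balancing challenge can be illustrated with a small router sketch. It is written in Python for brevity rather than as J2EE code, and it is not the architecture proposed in the paper: the session-pinning rule, the class name, and the method names are all assumptions.

```python
import itertools

WRITE_VERBS = {"INSERT", "UPDATE", "DELETE"}


class QueryRouter:
    """Send writes to the master and spread reads over replicas.

    Hypothetical freshness rule: once a session has written, its reads
    go to the master until replication is reported as caught up.
    """

    def __init__(self, master, replicas):
        self.master = master
        self._replicas = itertools.cycle(replicas) if replicas else None
        self._pinned = set()  # sessions that must read from the master

    def route(self, sql, session_id):
        verb = sql.strip().split()[0].upper()
        if verb in WRITE_VERBS:
            self._pinned.add(session_id)
            return self.master
        if session_id in self._pinned or self._replicas is None:
            return self.master
        return next(self._replicas)  # round-robin over replicas

    def mark_replicated(self, session_id):
        """Call when the session's writes are visible on the replicas."""
        self._pinned.discard(session_id)


router = QueryRouter("master-db", ["replica-1", "replica-2"])
print(router.route("SELECT * FROM quotes", "s1"))
print(router.route("UPDATE account SET balance = 0", "s1"))
print(router.route("SELECT * FROM account", "s1"))
```

Pinning is a deliberately blunt way to guarantee that a session reads its own writes; it trades some read scalability for freshness.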

Citations: 0

Securing Wireless Data Networks against Eavesdropping using Smart Antennas

2008 The 28th International Conference on Distributed Computing Systems

Pub Date : 2008-06-17 DOI: 10.1109/ICDCS.2008.87

Sriram Lakshmanan, Cheng-Lin Tsao, Raghupathy Sivakumar, K. Sundaresan

In this paper, we focus on securing communication over wireless data networks from malicious eavesdroppers, using smart antennas. While conventional cryptography-based approaches focus on hiding the meaning of the information being communicated from the eavesdropper, we consider a complementary class of strategies that limit knowledge of the existence of the information from the eavesdropper. We profile the performance achievable with simple beamforming strategies using a newly defined metric called exposure region. We then present three strategies within the context of an approach called virtual arrays of physical arrays to significantly improve the exposure region performance of a wireless LAN environment. Using simulations and analysis, we validate and evaluate the proposed strategies.

Citations: 46

Utility-Based Opportunistic Routing in Multi-Hop Wireless Networks

2008 The 28th International Conference on Distributed Computing Systems

Pub Date : 2008-06-17 DOI: 10.1109/ICDCS.2008.90

Jie Wu, Mingming Lu, Feng Li

Recently, opportunistic routing (OR) has been widely used to compensate for the low packet delivery ratio of multi-hop wireless networks. Previous works either provide heuristic solutions without optimality analysis, or assume that unlimited retransmission is available for delivering a data packet. In this paper, we apply OR to utility-based routing, where the successful delivery of a data packet generates benefit. The objective is to maximize utility, defined as a function of benefit and cost of transmission. As the link reliability of each relay determines eventual packet delivery and hence utility, OR offers the ability to increase reliability through opportunistic relays. We explore the optimality of utility-based routing through OR without allowing retransmission, and observe that the optimal scheme requires exhaustive searching of all paths from source to destination. We then propose a heuristic solution to select relays and determine priorities among them. Finally, we provide distributed implementations for both schemes. Simulations on NS-2 and our customized simulator are conducted to verify the effectiveness of the heuristic compared with the optimal.
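A single-hop sketch shows why opportunistic relays can raise utility when retransmission is not allowed. It assumes independent link losses and a simple utility of the form benefit times delivery probability minus cost; the abstract only says that utility is a function of benefit and cost, so this exact formula and the function names are assumptions.

```python
from math import prod


def hop_success_probability(link_reliabilities):
    """Probability that at least one candidate relay hears the broadcast.

    Assumes link losses are independent (an illustrative assumption).
    """
    return 1.0 - prod(1.0 - p for p in link_reliabilities)


def expected_utility(benefit, cost, link_reliabilities):
    """Hypothetical one-hop utility: expected benefit minus transmission cost."""
    return benefit * hop_success_probability(link_reliabilities) - cost


# One relay versus two opportunistic relays, each with reliability 0.5.
print(expected_utility(10.0, 2.0, [0.5]))
print(expected_utility(10.0, 2.0, [0.5, 0.5]))
```

With one relay the packet gets through half the time; with two candidate relays it fails only when both miss it, so the same single transmission yields a higher expected utility.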

Citations: 83

Distributed Connected Dominating Set Construction in Geometric k-Disk Graphs

2008 The 28th International Conference on Distributed Computing Systems

Pub Date : 2008-06-17 DOI: 10.1109/ICDCS.2008.39

Kai Xing, Wei Cheng, E. Park, S. Rotenstreich

In this paper, we study the problem of minimum connected dominating set in geometric k-disk graphs. This research is motivated by the problem of virtual backbone construction in wireless ad hoc and sensor networks, where the coverage areas of nodes are disks with different radii. We derive the size relationship between any maximal independent set and the minimum connected dominating set in geometric k-disk graphs, and apply it to analyze the performance of two distributed connected dominating set algorithms we propose in this paper. These algorithms have a bounded performance ratio and low communication overhead, and therefore have the potential to be applied in real ad hoc and sensor networks.
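The link between maximal independent sets and dominating sets can be shown with a small centralized sketch. It relies on the standard fact that every maximal independent set is also a dominating set; the greedy degree ordering, the function names, and the example graph are assumptions, and the code does not reproduce the paper's distributed algorithms or add the connector nodes needed to make the set connected.

```python
def greedy_mis(adj):
    """Greedily build a maximal independent set from an adjacency dict of sets."""
    chosen, blocked = set(), set()
    # Visit high-degree nodes first, breaking ties by name for determinism.
    for node in sorted(adj, key=lambda n: (-len(adj[n]), n)):
        if node not in blocked:
            chosen.add(node)
            blocked.add(node)
            blocked.update(adj[node])  # neighbours can no longer be chosen
    return chosen


def is_dominating(adj, nodes):
    """True if every node is in the set or has a neighbour in it."""
    return all(n in nodes or adj[n] & nodes for n in adj)


# Hypothetical five-node path a - b - c - d - e.
graph = {
    "a": {"b"},
    "b": {"a", "c"},
    "c": {"b", "d"},
    "d": {"c", "e"},
    "e": {"d"},
}
backbone = greedy_mis(graph)
print(sorted(backbone), is_dominating(graph, backbone))
```

In this example the chosen nodes dominate the path but are not adjacent to each other, so a connector such as `c` would still be needed to form a connected backbone.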

Citations: 18

Online Optimization for Latency Assignment in Distributed Real-Time Systems

2008 The 28th International Conference on Distributed Computing Systems

Pub Date : 2008-06-17 DOI: 10.1109/ICDCS.2008.102

C. Lumezanu, S. Bhola, Mark Astley

As distributed real-time applications gain in popularity, a key challenge is to allocate resources so that diverse real-time requirements (including non-real-time applications), distributed application components and varying workloads can all be accommodated without violating timeliness constraints. We examine the problem of resource allocation in distributed soft real-time systems, where both network and CPU resources are consumed. The timeliness constraints of applications are expressed through utility functions, which compute "benefit" as a function of end-to-end latency. We present LLA (Lagrangian Latency Assignment), a scalable and efficient distributed algorithm which maximizes aggregate utility by computing an optimal trade-off between end-to-end latency and allocated resources. The algorithm runs continuously and adapts to both workload and resource variations. LLA is guaranteed to converge if the workload and resource requirements stabilize. We evaluate the quality of results and convergence characteristics under various workloads, using both simulation and real-world experimentation.

Citations: 10

Multi-query Optimization for Distributed Similarity Query Processing

2008 The 28th International Conference on Distributed Computing Systems

Pub Date : 2008-06-17 DOI: 10.1109/ICDCS.2008.58

Zhuang Yi, Qing Li, Lei Chen

This paper considers a multi-query optimization issue for distributed similarity query processing, which attempts to exploit the dependencies in the derivation of a query evaluation plan. To the best of our knowledge, this is the first work investigating a multi-query optimization technique for distributed similarity query processing (MDSQ). Four steps are incorporated in our MDSQ algorithm. First, when a number of query requests (i.e., m query vectors and m radii) are simultaneously submitted by users, a cost-based dynamic query scheduling (DQS) procedure is invoked to quickly and effectively identify the correlation among the query spheres (requests). After that, an index-based vector set reduction is performed at data node level in parallel. Finally, a refinement process of the candidate vectors is conducted to get the answer set. The proposed method includes a cost-based dynamic query scheduling, a Start-Distance (SD)-based load balancing scheme, and an index-based vector set reduction algorithm. The experimental results validate the efficiency and effectiveness of the algorithm in minimizing the response time and increasing the parallelism of I/O and CPU.
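One simple way to read "correlation among the query spheres" is geometric overlap: two range queries can share candidate vectors only if their spheres intersect. The sketch below tests that condition under Euclidean distance. Treating correlation as overlap, as well as the function names and sample queries, are assumptions and not the paper's DQS procedure.

```python
from itertools import combinations
from math import dist


def spheres_overlap(center_a, radius_a, center_b, radius_b):
    """Two query spheres intersect when their centres are within r_a + r_b."""
    return dist(center_a, center_b) <= radius_a + radius_b


def overlapping_pairs(queries):
    """Return index pairs of queries whose spheres intersect.

    queries is a list of (query_vector, radius) tuples.
    """
    return [
        (i, j)
        for (i, (ca, ra)), (j, (cb, rb)) in combinations(enumerate(queries), 2)
        if spheres_overlap(ca, ra, cb, rb)
    ]


# Hypothetical batch of three 2-D similarity queries.
batch = [((0.0, 0.0), 1.0), ((1.5, 0.0), 1.0), ((5.0, 5.0), 0.5)]
print(overlapping_pairs(batch))
```

Queries that overlap could be scheduled together so that vectors in the shared region are fetched and compared once rather than per query.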

Citations: 11
