Adaptive parallelism for web search

Proceedings of the Eleventh European Conference on Computer Systems, Pub Date: 2013-04-15, DOI: 10.1145/2465351.2465367

Myeongjae Jeon, Yuxiong He, S. Elnikety, A. Cox, S. Rixner

Pages: 155-168

Citations: 52

Abstract

A web search query made to Microsoft Bing is currently parallelized by distributing the query processing across many servers. Within each of these servers, however, the query is processed sequentially. Although each server may be processing multiple queries concurrently, with modern multicore servers, parallelizing the processing of an individual query within the server may nonetheless improve the user's experience by reducing the response time. In this paper, we describe the issues that make the parallelization of an individual query within a server challenging, and we present a parallelization approach that effectively addresses these challenges. Since each server may be processing multiple queries concurrently, we also present an adaptive resource management algorithm that chooses the degree of parallelism at run-time for each query, taking into account system load and parallelization efficiency. As a result, the servers now execute queries with a high degree of parallelism at low loads, gracefully reduce the degree of parallelism with increased load, and choose sequential execution under high load. We have implemented our parallelization approach and adaptive resource management algorithm in Bing servers and evaluated them experimentally with production workloads. The experimental results show that the mean and 95th-percentile response times for queries are reduced by more than 50% under light or moderate load. Moreover, under high load, where parallelization degrades system performance, the response times are kept the same as when queries are executed sequentially. In all cases, we observe no degradation in the relevance of the search results.
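The abstract does not spell out the resource management algorithm, so the following Python sketch only illustrates the behavior it describes: a high degree of parallelism at low load, a lower degree as load grows, and sequential execution under high load. Everything in it is an assumption made for illustration and is not taken from the paper: the function name `choose_degree`, the use of the number of active queries as the load signal, the speedup table, and the `min_efficiency` threshold.

```python
def choose_degree(active_queries, cores, speedup, min_efficiency=0.5):
    """Pick a per-query degree of parallelism (illustrative policy only).

    active_queries: queries currently on this server, including the new one
                    (hypothetical load signal).
    cores:          number of cores on the server.
    speedup:        dict mapping a degree to its measured speedup over
                    sequential execution (hypothetical profile).
    min_efficiency: lowest acceptable speedup/degree ratio (assumed knob).
    """
    # Threads this query can use without oversubscribing the cores.
    budget = max(1, cores // max(1, active_queries))

    best = 1  # sequential execution is always allowed
    for degree in sorted(speedup):
        if degree > budget:
            break
        # Keep the largest degree that still parallelizes efficiently.
        if speedup[degree] / degree >= min_efficiency:
            best = degree
    return best


# Hypothetical speedup profile with diminishing returns.
profile = {1: 1.0, 2: 1.8, 4: 3.0, 8: 3.6}

light = choose_degree(active_queries=1, cores=8, speedup=profile)
moderate = choose_degree(active_queries=3, cores=8, speedup=profile)
heavy = choose_degree(active_queries=20, cores=8, speedup=profile)
print(light, moderate, heavy)
```

With this made-up profile, a lone query is capped by efficiency rather than by core count, while a heavily loaded server falls back to sequential execution. A production policy would presumably use richer load and efficiency signals than this sketch does.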


