Query-level learning to rank using isotonic regression

2008 46th Annual Allerton Conference on Communication, Control, and Computing. Pub Date: 2008-09-01. DOI: 10.1109/ALLERTON.2008.4797684

Zhaohui Zheng, H. Zha, Gordon Sun


Citations: 28

Abstract

Ranking functions determine the relevance of the results returned by search engines, and learning ranking functions has become an active research area at the interface between Web search, information retrieval and machine learning. Generally, the training data for learning to rank come in two different forms: (1) absolute relevance judgments, which assess the degree of relevance of a document with respect to a query. Judgments of this type are also called labeled data and are usually obtained through human editorial effort; and (2) relative relevance judgments, which indicate that one document is more relevant than another with respect to a query. Judgments of this type are also called preference data and can usually be extracted from the abundantly available user click-through data that record users' interactions with the search results. Most existing learning-to-rank methods ignore query boundaries, treating the labeled data or preference data equally across queries. In this paper, we propose a minimum effort optimization method that takes into account the entire training data within a query at each iteration. We tackle this optimization problem using functional iterative methods in which the update at each iteration is computed by solving an isotonic regression problem. This more global approach results in faster convergence and significantly improved performance of the learned ranking functions over existing state-of-the-art methods. We demonstrate the effectiveness of the proposed method using data sets obtained from a commercial search engine as well as publicly available data.
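The abstract does not spell out the isotonic regression step, so the following is only a minimal sketch of what a per-query "minimum effort" update could look like, not the authors' implementation. It assumes that each query's documents carry graded labels, that the labels are treated as a single chain (ties broken by current score, which is one simple choice), and that the classic pool-adjacent-violators algorithm is used to find the least-squares non-decreasing fit. The function names `isotonic_regression`, `minimum_effort_targets` and `query_residuals` are hypothetical.

```python
def isotonic_regression(values):
    """Least-squares non-decreasing fit to `values` (pool-adjacent-violators)."""
    blocks = []  # each block is [sum of values, number of values]
    for v in values:
        blocks.append([float(v), 1])
        # Merge backwards while the previous block's mean exceeds the last one's.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    fitted = []
    for s, c in blocks:
        fitted.extend([s / c] * c)  # every member of a block gets the block mean
    return fitted


def minimum_effort_targets(scores, labels):
    """For ONE query, return adjusted scores that are non-decreasing in label.

    Assumption: documents are ordered by (label, current score), so the
    labels form a chain and isotonic regression applies directly.
    """
    order = sorted(range(len(scores)), key=lambda i: (labels[i], scores[i]))
    fitted = isotonic_regression([scores[i] for i in order])
    targets = [0.0] * len(scores)
    for pos, i in enumerate(order):
        targets[i] = fitted[pos]
    return targets


def query_residuals(scores, labels):
    """Per-document change (target - score); a functional iterative method
    could fit its next base learner to these values (an assumption here)."""
    targets = minimum_effort_targets(scores, labels)
    return [t - s for t, s in zip(targets, scores)]


# One query with three documents; label 2 is the most relevant.
scores = [3.0, 1.0, 2.0]   # current ranking-function outputs
labels = [0, 1, 2]         # editorial grades
print(minimum_effort_targets(scores, labels))  # [2.0, 2.0, 2.0]
print(query_residuals(scores, labels))         # [-1.0, 1.0, 0.0]
```

In this toy query the least relevant document currently has the highest score, so the smallest squared change that removes every ordering violation pulls all three scores to their common mean. The residuals are computed from the whole query at once, which is the sense in which such an update respects query boundaries.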

