Query-level learning to rank using isotonic regression

Zhaohui Zheng, H. Zha, Gordon Sun
Published in: 2008 46th Annual Allerton Conference on Communication, Control, and Computing
Publication date: 2008-09-01
DOI: 10.1109/ALLERTON.2008.4797684 (https://doi.org/10.1109/ALLERTON.2008.4797684)
Citations: 28

Abstract

Ranking functions determine the relevance of the results returned by search engines, and learning ranking functions has become an active research area at the interface between Web search, information retrieval, and machine learning. Generally, the training data for learning to rank come in two different forms: (1) absolute relevance judgments, which assess the degree of relevance of a document with respect to a query. Judgments of this type are also called labeled data and are usually obtained through human editorial effort; and (2) relative relevance judgments, which indicate that one document is more relevant than another with respect to a query. Judgments of this type are also called preference data and can usually be extracted from the abundantly available user click-through data that record users' interactions with the search results. Most existing learning-to-rank methods ignore query boundaries, treating the labeled data or preference data equally across queries. In this paper, we propose a minimum effort optimization method that takes into account the entire training data within a query at each iteration. We tackle this optimization problem using functional iterative methods in which the update at each iteration is computed by solving an isotonic regression problem. This more global approach results in faster convergence and significantly improved performance of the learned ranking functions over existing state-of-the-art methods. We demonstrate the effectiveness of the proposed method using data sets obtained from a commercial search engine as well as publicly available data.
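
To make the isotonic regression step concrete, the following is a minimal sketch and not the authors' implementation. It assumes the simplest case, in which the documents of one query are totally ordered by their relevance judgments with no ties, so the preference constraints form a chain. Under that assumption, the smallest least-squares change to the current scores that makes them agree with the ordering can be found with the pool adjacent violators algorithm. The function names `pava_non_decreasing` and `minimum_effort_update` are hypothetical and chosen only for illustration.

```python
from typing import List


def pava_non_decreasing(values: List[float]) -> List[float]:
    """Least-squares fit of a non-decreasing sequence (pool adjacent violators)."""
    sums: List[float] = []   # running sum of each pooled block
    counts: List[int] = []   # number of points in each pooled block
    for v in values:
        sums.append(v)
        counts.append(1)
        # Merge the last two blocks while their means violate monotonicity.
        while len(sums) > 1 and sums[-2] / counts[-2] > sums[-1] / counts[-1]:
            s, c = sums.pop(), counts.pop()
            sums[-1] += s
            counts[-1] += c
    fitted: List[float] = []
    for s, c in zip(sums, counts):
        fitted.extend([s / c] * c)  # every point in a block gets the block mean
    return fitted


def minimum_effort_update(scores_best_first: List[float]) -> List[float]:
    """Smallest squared-norm change that makes the scores non-increasing.

    ASSUMPTION: the input holds the current ranking scores of one query's
    documents, listed from most to least relevant, with no tied judgments.
    """
    # A non-increasing fit is a non-decreasing fit of the negated scores.
    negated = [-s for s in scores_best_first]
    targets = [-t for t in pava_non_decreasing(negated)]
    return [t - s for s, t in zip(scores_best_first, targets)]


if __name__ == "__main__":
    # The second and third documents are scored in the wrong order.
    print(minimum_effort_update([3.0, 1.0, 2.0, 0.5]))
```

In this example only the two mis-ordered documents receive a non-zero update; both are pulled to their common mean, and the documents that are already consistent with the judgments are left alone. One plausible reading of the functional iterative scheme in the abstract is that such per-document updates serve as regression targets for the next component of the ranking function, but that step is not specified here and is not shown. Relative judgments that form only a partial order would need a more general isotonic regression solver than this chain version.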