Untangling Result List Refinement and Ranking Quality: a Framework for Evaluation and Prediction

Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval. Pub Date: 2015-08-09. DOI: 10.1145/2766462.2767740

Jiyin He, M. Bron, A. de Vries, L. Azzopardi, M. de Rijke


Citations: 5

Abstract

Traditional batch evaluation metrics assume that user interaction with search results is limited to scanning down a ranked list. However, modern search interfaces come with additional elements supporting result list refinement (RLR) through facets and filters, making user search behavior increasingly dynamic. We develop an evaluation framework that takes a step beyond the interaction assumption of traditional evaluation metrics and allows for batch evaluation of systems with and without RLR elements. In our framework we model user interaction as switching between different sublists. This provides a measure of user effort based on the joint effect of user interaction with RLR elements and result quality. We validate our framework by conducting a user study and comparing model predictions with real user performance. Our model predictions show significant positive correlation with real user effort. Further, in contrast to traditional evaluation metrics, the predictions using our framework, of when users stand to benefit from RLR elements, reflect findings from our user study. Finally, we use the framework to investigate under what conditions systems with and without RLR elements are likely to be effective. We simulate varying conditions concerning ranking quality, users, task and interface properties demonstrating a cost-effective way to study whole system performance.
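
The idea of measuring user effort across sublists can be illustrated with a toy sketch. This is not the authors' model: the cost scheme (one unit per examined result plus a fixed `switch_cost` for using a refinement element), the function names, and the example data are all assumptions made for illustration only.

```python
from typing import Optional, Sequence


def scan_effort(relevance: Sequence[bool], target: int) -> Optional[int]:
    """Number of results examined, top-down, until `target` relevant ones are found.

    Assumed cost model: examining one result costs one unit of effort.
    Returns None when the list holds fewer than `target` relevant results.
    """
    if target <= 0:
        return 0
    found = 0
    for position, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            found += 1
        if found >= target:
            return position
    return None


def refined_effort(sublist: Sequence[bool], target: int, switch_cost: int) -> Optional[int]:
    """Effort when the user first switches to a refined sublist, then scans it.

    `switch_cost` is a hypothetical fixed cost for operating a facet or filter.
    """
    scanned = scan_effort(sublist, target)
    return None if scanned is None else switch_cost + scanned


# Hypothetical relevance judgments for a full ranked list.
ranked = [False, False, True, False, False, True, False, True]
# Hypothetical sublist produced by a facet (ranks 3, 5, 6, 8 of the full list).
facet = [True, False, True, True]

baseline = scan_effort(ranked, target=2)                        # plain scanning
cheap_switch = refined_effort(facet, target=2, switch_cost=2)   # refinement pays off
costly_switch = refined_effort(facet, target=2, switch_cost=4)  # refinement does not
print(baseline, cheap_switch, costly_switch)
```

In this toy setting, whether refinement reduces effort depends jointly on the cost of switching and on how well the original list is ranked, which is the kind of interaction the abstract describes.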