Liangda Li, Hongbo Deng, Anlei Dong, Yi Chang, H. Zha, R. Baeza-Yates
Query auto-completion (QAC) plays an important role in helping users type less when submitting a query. A QAC engine generally offers a list of suggested queries that start with the user's input as a prefix, and the list is updated to match the input after each keystroke. Rich user interactions can therefore be observed at each keystroke until the user clicks a suggestion or types the entire query manually. Analyzing and understanding users' interactions with the QAC engine is increasingly important for improving its performance. Existing work on QAC either ignores users' interaction data or assumes that interactions at each keystroke are independent of one another. Our paper focuses on users' sequential interactions with a QAC engine within and across QAC sessions, rather than on interactions at each keystroke of each session in isolation. Analyzing the dependencies in users' sequential interactions improves our understanding of three questions: 1) how is a user's skipping/viewing move at the current keystroke influenced by the move at the previous keystroke? 2) how can a search engine's query suggestions at early keystrokes (short prefixes) be improved based on those at later keystrokes (longer prefixes)? and 3) when the targeted query is shown in the suggestion list, why does a user decide to continue typing rather than click the intended suggestion? We propose a probabilistic model that addresses these three questions in a unified way, and illustrate how the model determines users' final click decisions. Compared with state-of-the-art methods, our proposed model suggests queries that better satisfy users' intents.
{"title":"Analyzing User's Sequential Behavior in Query Auto-Completion via Markov Processes","authors":"Liangda Li, Hongbo Deng, Anlei Dong, Yi Chang, H. Zha, R. Baeza-Yates","doi":"10.1145/2766462.2767723","DOIUrl":"https://doi.org/10.1145/2766462.2767723","journal":{"name":"Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval"},"publicationDate":"2015-08-09","platform":"Semanticscholar","paperid":"131254189"}
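The first question in the abstract above (how the skipping/viewing move at one keystroke depends on the move at the previous keystroke) can be illustrated with a first-order Markov chain. The sketch below is only an illustration under stated assumptions, not the authors' model: the state labels `"skip"` and `"view"`, the function name `estimate_transitions`, and the toy sessions are all hypothetical.

```python
from collections import Counter, defaultdict


def estimate_transitions(sessions):
    """Estimate P(state_t | state_{t-1}) by counting adjacent label pairs."""
    counts = defaultdict(Counter)
    for states in sessions:
        for prev, cur in zip(states, states[1:]):
            counts[prev][cur] += 1
    probs = {}
    for prev, row in counts.items():
        total = sum(row.values())
        probs[prev] = {cur: n / total for cur, n in row.items()}
    return probs


# Hypothetical toy logs: one list per QAC session, one label per keystroke.
toy_sessions = [
    ["skip", "skip", "view", "view"],
    ["skip", "view", "view"],
    ["skip", "skip", "skip", "view"],
]
transitions = estimate_transitions(toy_sessions)
print(transitions)
```

In this toy data a user who viewed the list at one keystroke always views it at the next, while a user who skipped is equally likely to skip or view; real logs would need smoothing for unseen transitions.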
Johanne R. Trippas, Damiano Spina, M. Sanderson, L. Cavedon
Presenting search results over a speech-only communication channel involves a number of challenges for users due to cognitive limitations and the serial nature of speech. We investigated the impact of search result summary length in speech-based web search, and compared our results to a text baseline. In a study with crowdsourced workers, we found that users preferred longer, more informative summaries for text presentation. For audio, user preferences depended on the style of query. For single-facet queries, shortened audio summaries were preferred; in addition, users judged relevance with accuracy similar to that achieved with text-based summaries. For multi-facet queries, user preferences were less clear, suggesting that more sophisticated techniques are required to handle such queries.
{"title":"Towards Understanding the Impact of Length in Web Search Result Summaries over a Speech-only Communication Channel","authors":"Johanne R. Trippas, Damiano Spina, M. Sanderson, L. Cavedon","doi":"10.1145/2766462.2767826","DOIUrl":"https://doi.org/10.1145/2766462.2767826","journal":{"name":"Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval"},"publicationDate":"2015-08-09","platform":"Semanticscholar","paperid":"133317139"}
The quantum probabilistic framework has recently been applied to Information Retrieval (IR). A representative example is the Quantum Language Model (QLM), which was developed for ad-hoc retrieval with single queries and has achieved significant improvements over traditional language models. In QLM, a density matrix, defined on the quantum probabilistic space, is estimated as a representation of the user's search intention with respect to a specific query. However, QLM is unable to capture the dynamics of the user's information need across the query history. This limitation restricts its application to dynamic search tasks such as session search. In this paper, we propose a Session-based Quantum Language Model (SQLM) for the multi-query session search task. In SQLM, a transformation model of density matrices is proposed to model the evolution of the user's information need in response to the user's interaction with the search engine, by incorporating features extracted from both positive feedback (clicked documents) and negative feedback (skipped documents). Extensive experiments conducted on TREC 2013 and 2014 Session Track data demonstrate the effectiveness of SQLM in comparison with the classic QLM.
{"title":"Modeling Multi-query Retrieval Tasks Using Density Matrix Transformation","authors":"Qiuchi Li, Jingfei Li, Peng Zhang, D. Song","doi":"10.1145/2766462.2767819","DOIUrl":"https://doi.org/10.1145/2766462.2767819","journal":{"name":"Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval"},"publicationDate":"2015-08-09","platform":"Semanticscholar","paperid":"114316445"}
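For readers unfamiliar with density matrices, the following sketch shows the basic object the abstract above refers to: a weighted mixture of projectors with unit trace, from which the probability of an event (a unit vector) is read off as a quadratic form. It is a generic illustration, not the estimation or transformation procedure of QLM or SQLM. The helper names are hypothetical, and the convex-combination update in `mix` is only an assumed stand-in for "evolving" a representation between queries.

```python
import math


def normalize(v):
    """Scale a real vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def density_matrix(vectors, weights):
    """Build rho = sum_i w_i |v_i><v_i| from unit vectors and weights summing to 1."""
    dim = len(vectors[0])
    rho = [[0.0] * dim for _ in range(dim)]
    for v, w in zip(vectors, weights):
        u = normalize(v)
        for i in range(dim):
            for j in range(dim):
                rho[i][j] += w * u[i] * u[j]
    return rho


def trace(m):
    return sum(m[i][i] for i in range(len(m)))


def event_probability(rho, v):
    """Probability of the projector onto v: <u|rho|u> with u = v / |v|."""
    u = normalize(v)
    n = len(u)
    return sum(u[i] * rho[i][j] * u[j] for i in range(n) for j in range(n))


def mix(rho_a, rho_b, alpha):
    """Assumed update rule: a convex combination, which keeps the trace at 1."""
    n = len(rho_a)
    return [[alpha * rho_a[i][j] + (1 - alpha) * rho_b[i][j] for j in range(n)]
            for i in range(n)]


# Toy two-term vocabulary; the vectors and weights are made up for illustration.
rho_q1 = density_matrix([[1, 0], [1, 1]], [0.5, 0.5])
rho_q2 = density_matrix([[0, 1]], [1.0])
rho_session = mix(rho_q1, rho_q2, alpha=0.5)
print(trace(rho_session), event_probability(rho_q1, [1, 0]))
```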
The primary purpose of this research is to explore the impact of perceived time pressure on search behaviors, searcher perceptions of the search system, and the search experience. Are there observable behavioral changes when a searcher is time-pressured? To what extent are differences in search behavior attributable to the objective experimental manipulation versus the subjective experience of time pressure? An important secondary purpose of this work is to identify appropriate outcome measures that allow session-level search behaviors to be compared when time is manipulated.
{"title":"Time Pressure in Information Search","authors":"Anita Crescenzi","doi":"10.1145/2766462.2767851","DOIUrl":"https://doi.org/10.1145/2766462.2767851","journal":{"name":"Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval"},"publicationDate":"2015-08-09","platform":"Semanticscholar","paperid":"114307612"}
We compute and evaluate relevance scores for knowledge-base triples from type-like relations. Such a score measures the degree to which an entity "belongs" to a type. For example, Quentin Tarantino has various professions, including Film Director, Screenwriter, and Actor. The first two would get a high score in our setting, because those are his main professions. The third would get a low score, because he mostly had cameo appearances in his own movies. Such scores are essential in the ranking for entity queries, e.g. "American actors" or "Quentin Tarantino professions". These scores are different from scores for "correctness" or "accuracy" (all three professions above are correct and accurate). We propose a variety of algorithms to compute these scores. For our evaluation we designed a new benchmark, which includes a ground truth based on about 14K human judgments obtained via crowdsourcing. Inter-judge agreement is slightly over 90%. Existing approaches from the literature give results far from the optimum. Our best algorithms achieve an agreement of about 80% with the ground truth.
{"title":"Relevance Scores for Triples from Type-Like Relations","authors":"H. Bast, Björn Buchhold, Elmar Haussmann","doi":"10.1145/2766462.2767734","DOIUrl":"https://doi.org/10.1145/2766462.2767734","journal":{"name":"Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval"},"publicationDate":"2015-08-09","platform":"Semanticscholar","paperid":"116310516"}
Search personalization tailors the search experience to individual searchers. To do this, search engines construct interest models comprising signals from observed behavior associated with machines, often via Web browser cookies or other user identifiers. However, shared device usage is common, meaning that the activities of multiple searchers may be interwoven in the interest models generated. Recent research on activity attribution has led to methods to automatically disentangle the histories of multiple searchers and correctly ascribe newly-observed search activity to the correct person. Building on this, we introduce attribution-based personalization (ABP), a procedure that extends traditional personalization to target individual searchers on shared devices. Activity attribution may improve personalization, but its benefits are not yet fully understood. We present an oracle study (with perfect knowledge of which searchers perform each action on each machine) to understand the effectiveness of ABP in predicting searchers' future interests. We utilize a large Web search log dataset containing both person identifiers and machine identifiers to quantify the gain in personalization performance from ABP, identify the circumstances under which ABP is most effective, and develop a classifier to determine when to apply it that yields sizable gains in personalization performance. ABP allows search providers to personalize experiences for individuals rather than targeting all users of a device collectively.
{"title":"Personalizing Search on Shared Devices","authors":"Ryen W. White, Ahmed Hassan Awadallah","doi":"10.1145/2766462.2767736","DOIUrl":"https://doi.org/10.1145/2766462.2767736","journal":{"name":"Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval"},"publicationDate":"2015-08-09","platform":"Semanticscholar","paperid":"116415460"}
Jonathan J. Dorando, Konstantine Arkoudas, P. Vasa, Gary Kazantsev, Gideon Mann
The financial markets are a rich domain for search, and it is not simple to serve the entire range of financial professionals, who make their living on accurate, timely, and deep information. The data sources are many and disparate. They include domains with rich structured data such as company and security attributes, textual data such as research reports, and time-sensitive news stories. Not only is the domain complicated, but some of the techniques that work for web search have to be adapted and reconsidered in an enterprise context with fewer eyeballs but equally complicated questions. At Bloomberg, we have been addressing these problems over the past four years in the search and discoverability group, drawing heavily on insights from the academic and open-source communities. We discuss our efforts in Natural Language Question & Answer (NLQA), learning to rank, federated search, and crowdsourcing, and how these all come together to make search effective for our users.
{"title":"Finding Money in the Haystack: Information Retrieval at Bloomberg","authors":"Jonathan J. Dorando, Konstantine Arkoudas, P. Vasa, Gary Kazantsev, Gideon Mann","doi":"10.1145/2766462.2776782","DOIUrl":"https://doi.org/10.1145/2766462.2776782","journal":{"name":"Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval"},"publicationDate":"2015-08-09","platform":"Semanticscholar","paperid":"123647694"}
Yiqun Liu, Ye Chen, Jinhui Tang, Jiashen Sun, Min Zhang, Shaoping Ma, Xuan Zhu
Satisfaction prediction is one of the prime concerns in search performance evaluation. It is a non-trivial task for two major reasons: (1) the definition of satisfaction is rather subjective, and different users may hold different opinions when judging satisfaction; (2) most existing studies on satisfaction prediction rely mainly on users' click-through or query reformulation behaviors, but many sessions contain no such interactions. To shed light on these research questions, we construct an experimental search engine that collects users' satisfaction feedback as well as mouse click-through and movement data. Unlike existing studies, we compare, for the first time, search users' and external assessors' opinions on satisfaction. We find that search users pay more attention to the utility of results, while external assessors emphasize the effort spent in search sessions. Inspired by recent studies on predicting result relevance from mouse movement patterns (namely motifs), we propose to estimate the utility of search results and the effort in search sessions with motifs extracted from mouse movement data on search engine result pages (SERPs). Besides the existing frequency-based motif selection method, two novel selection strategies (distance-based and distribution-based) are adopted to extract high-quality motifs for satisfaction prediction. Experimental results on over 1,000 user sessions show that the proposed strategies outperform existing methods and generalize promisingly across different users and queries.
{"title":"Different Users, Different Opinions: Predicting Search Satisfaction with Mouse Movement Information","authors":"Yiqun Liu, Ye Chen, Jinhui Tang, Jiashen Sun, Min Zhang, Shaoping Ma, Xuan Zhu","doi":"10.1145/2766462.2767721","DOIUrl":"https://doi.org/10.1145/2766462.2767721","journal":{"name":"Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval"},"publicationDate":"2015-08-09","platform":"Semanticscholar","paperid":"125901368"}
In some interactive image retrieval systems, users can select images from image search results and click to view similar or related images until they reach their targets. Existing image ranking options are based on relevance, update time, interestingness, and so on. Because of inexact descriptions of user targets or unsatisfactory performance of image retrieval methods, users may be unable to reach their targets in a single round of interaction. When multi-round interactions are considered, a useful question is how to help users select the images from which the targets can be reached in fewer rounds. In this paper, we offer users a new kind of ranking option that ranks images according to how difficult it is to reach potential targets from them. We model interactive image search behavior as navigation on an information network constructed from an image collection and an image retrieval method. We use the properties of this information network for reachability-based ranking. Experiments based on a social image collection show the effectiveness of our approach.
{"title":"Reachability based Ranking in Interactive Image Retrieval","authors":"Jiyi Li","doi":"10.1145/2766462.2767777","DOIUrl":"https://doi.org/10.1145/2766462.2767777","journal":{"name":"Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval"},"publicationDate":"2015-08-09","platform":"Semanticscholar","paperid":"124705287"}
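The idea of treating interactive image search as navigation on a network can be sketched with a plain breadth-first search. The code below is a minimal illustration under assumptions, not the paper's method: the graph (each image mapped to the related images shown after a click), the scoring rule (more reachable targets first, then fewer average clicks), and the function names are all hypothetical.

```python
from collections import deque


def hops_from(graph, start):
    """Breadth-first search: minimum number of clicks from start to each image."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, []):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def rank_by_reachability(graph, candidates, targets):
    """Order candidates so that images closer to the potential targets come first."""
    def score(image):
        dist = hops_from(graph, image)
        reached = [dist[t] for t in targets if t in dist]
        mean_hops = sum(reached) / len(reached) if reached else float("inf")
        # Assumed rule: prefer more reachable targets, then fewer average clicks.
        return (-len(reached), mean_hops)
    return sorted(candidates, key=score)


# Hypothetical network: image -> related images returned by the retrieval method.
toy_graph = {"a": ["b", "c"], "b": ["d"], "c": [], "d": [], "e": ["a"]}
ranking = rank_by_reachability(toy_graph, ["a", "b", "c"], targets=["d"])
print(ranking)
```

Here `b` ranks first because the target is one click away, `a` second at two clicks, and `c` last because no path leads from it to the target.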
Nouns are more important than other parts of speech in information retrieval and are more often found near the beginning or the end of sentences. In this paper, we investigate the effect on information retrieval of rewarding terms based on their location in sentences. In particular, we propose a novel Term Location (TEL) retrieval model based on BM25 to enhance probabilistic information retrieval, in which a kernel-based method is used to capture term placement patterns. Experiments on five TREC datasets of varied size and content indicate that the proposed model significantly outperforms the optimized BM25 and DirichletLM in MAP on all datasets with all kernel functions, and outperforms the optimized BM25 and DirichletLM on most of the datasets in P@5 and P@20 with different kernel functions.
{"title":"Using Term Location Information to Enhance Probabilistic Information Retrieval","authors":"Baiyan Liu, X. An, Xiangji Huang","doi":"10.1145/2766462.2767827","DOIUrl":"https://doi.org/10.1145/2766462.2767827","journal":{"name":"Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval"},"publicationDate":"2015-08-09","platform":"Semanticscholar","paperid":"128730192"}
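To make the two ingredients named in the abstract above concrete, the sketch below pairs a standard BM25 term score with a kernel that rewards positions near the start or end of a sentence. It does not reproduce the TEL model: the Gaussian kernel, the `sigma` value, the non-negative IDF variant, and the way the two functions might be combined are assumptions chosen for illustration.

```python
import math


def bm25_term(tf, df, n_docs, doc_len, avg_len, k1=1.2, b=0.75):
    """Standard BM25 contribution of one term (with a non-negative IDF variant)."""
    idf = math.log((n_docs - df + 0.5) / (df + 0.5) + 1.0)
    norm = tf + k1 * (1.0 - b + b * doc_len / avg_len)
    return idf * tf * (k1 + 1.0) / norm


def location_reward(position, sentence_length, sigma=0.25):
    """Assumed Gaussian kernel: peaks at the first and last token of a sentence."""
    if sentence_length > 1:
        x = position / (sentence_length - 1)  # relative position in [0, 1]
    else:
        x = 0.0
    at_start = math.exp(-(x ** 2) / (2 * sigma ** 2))
    at_end = math.exp(-((x - 1.0) ** 2) / (2 * sigma ** 2))
    return max(at_start, at_end)


# Hypothetical combination: replace raw tf by the sum of location rewards
# over the term's occurrences, given as (position, sentence_length) pairs.
occurrences = [(0, 10), (4, 10)]
weighted_tf = sum(location_reward(p, n) for p, n in occurrences)
score = bm25_term(weighted_tf, df=10, n_docs=1000, doc_len=100, avg_len=100)
print(weighted_tf, score)
```

With this kernel an occurrence at the first token counts fully, while one in the middle of the sentence counts for much less, so the same raw term frequency can yield different scores depending on placement.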