Towards Quantifying the Impact of Non-Uniform Information Access in Collaborative Information Retrieval. N. Htun, Martin Halvey, L. Baillie. SIGIR 2015. DOI: 10.1145/2766462.2767779

The majority of research into Collaborative Information Retrieval (CIR) has assumed uniformity of information access and visibility between collaborators. However, in a number of real-world scenarios, such as security and health, information access is not uniform across all collaborators in a team. This can be referred to as Multi-Level Collaborative Information Retrieval (MLCIR). To the best of our knowledge, there has not yet been any systematic investigation of the effect of MLCIR on search outcomes. To address this shortcoming, we present the results of a simulated evaluation conducted over four different non-uniform information access scenarios and three different collaborative search strategies. Results indicate that there is some tolerance to removing access to the collection and that the impact on performance is not always negative. We also highlight how different access scenarios and search strategies affect search outcomes.
Features of Disagreement Between Retrieval Effectiveness Measures. Timothy Jones, Paul Thomas, Falk Scholer, M. Sanderson. SIGIR 2015. DOI: 10.1145/2766462.2767824

Many IR effectiveness measures are motivated from intuition, theory, or user studies. In general, most effectiveness measures are well correlated with each other. But what about the cases where they don't correlate? Which rankings cause measures to disagree? Are these rankings predictable for particular pairs of measures? In this work, we examine how and where metrics disagree, and identify differences that should be considered when selecting metrics for use in evaluating retrieval systems.
An Efficient and Scalable MetaFeature-based Document Classification Approach based on Massively Parallel Computing. Sérgio D. Canuto, Marcos André Gonçalves, W. M. D. Santos, Thierson Couto, W. Martins. SIGIR 2015. DOI: 10.1145/2766462.2767743

The unprecedented growth of available data has stimulated the development of new methods for organizing and extracting useful knowledge from this immense amount of data. Automatic Document Classification (ADC) is one such method: it uses machine learning techniques to build models capable of automatically associating documents with well-defined semantic classes. ADC is the basis of many important applications such as language identification, sentiment analysis, recommender systems and spam filtering. Recently, the use of meta-features has been shown to substantially improve the effectiveness of ADC algorithms. In particular, meta-features that combine local information (through kNN-based features) and global information (through category centroids) have produced promising results. However, generating these meta-features is very costly in terms of both memory consumption and runtime, since the kNN algorithm must be called repeatedly. We take advantage of current manycore GPU architectures and present a massively parallel version of the kNN algorithm for high-dimensional and sparse datasets (which is the case for ADC). Our experimental results show that we can obtain speedups of up to 15x while reducing memory consumption by more than 5000x compared to a state-of-the-art parallel baseline. This opens up the possibility of applying meta-feature-based classification to large collections of documents that would otherwise take too much time or require an expensive computational platform.
Influence of Vertical Result in Web Search Examination. Zeyang Liu, Yiqun Liu, K. Zhou, Min Zhang, Shaoping Ma. SIGIR 2015. DOI: 10.1145/2766462.2767714

Research into how users examine results on search engine result pages (SERPs) helps improve result ranking, advertisement placement, performance evaluation and search UI design. Although examination behavior on organic search results (also known as "ten blue links") has been well studied in existing work, a thorough investigation of how users examine SERPs with verticals is still lacking. Considering that a large fraction of SERPs in practical Web search are served with one or more verticals, it is vitally important to understand the influence of vertical results on examination behavior. In this paper, we focus on five popular vertical types and study their influence on users' examination processes in both cases: when they are relevant and when they are irrelevant to the search query. Using examination behavior data collected with an eye-tracking device, we show the existence of vertical-aware user behavior effects, including a vertical attraction effect, an examination cut-off effect in the presence of a relevant vertical, and an examination spill-over effect in the presence of an irrelevant vertical. Furthermore, we are also among the first to systematically investigate examination behavior within the vertical results themselves. We believe that this work will improve our understanding of user interactions with federated search engines and benefit the construction of search performance evaluations.
Context-aware Point-of-Interest Recommendation Using Tensor Factorization with Social Regularization. Lina Yao, Quan Z. Sheng, Yongrui Qin, Xianzhi Wang, A. Shemshadi, Qi He. SIGIR 2015. DOI: 10.1145/2766462.2767794

Point-of-Interest (POI) recommendation is a new type of recommendation task that has emerged with the prevalence of location-based social networks in recent years. Compared with traditional tasks, it focuses more on personalized, context-aware recommendation results to provide a better user experience. To address this new challenge, we propose a Collaborative Filtering method based on Non-negative Tensor Factorization, a generalization of the Matrix Factorization approach that exploits a high-order tensor instead of the traditional User-Location matrix to model multi-dimensional contextual information. The factorization of this tensor leads to a compact model of the data that is especially suitable for context-aware POI recommendation. In addition, we fuse users' social relations into the factorization as regularization terms to improve recommendation accuracy. Experimental results on real-world datasets demonstrate the effectiveness of our approach.
Adapted B-CUBED Metrics to Unbalanced Datasets. Jose G. Moreno, G. Dias. SIGIR 2015. DOI: 10.1145/2766462.2767836

B-CUBED metrics have recently been adopted for the evaluation of clustering results as well as in many other related tasks. However, this family of metrics is not well adapted to unbalanced datasets. This issue is extremely frequent in Web results, where classes follow a strongly unbalanced distribution. In this paper, we present a modified version of the B-CUBED metrics to overcome this situation. Results on toy and real datasets indicate that the proposed adaptation correctly accounts for the particularities of unbalanced cases.
On the Cost of Phrase-Based Ranking. M. Petri, Alistair Moffat. SIGIR 2015. DOI: 10.1145/2766462.2767769

Effective postings list compression techniques, and the efficiency of postings list processing schemes such as WAND, have significantly improved the practical performance of ranked document retrieval using inverted indexes. Recently, suffix array-based index structures have been proposed as a complementary tool, to support phrase searching. The relative merits of these alternative approaches to ranked querying using phrase components are, however, unclear. Here we provide: (1) an overview of existing phrase indexing techniques; (2) a description of how to incorporate recent advances in list compression and processing; and (3) an empirical evaluation of state-of-the-art suffix-array and inverted file-based phrase retrieval indexes using a standard IR test collection.
Practical Lessons for Gathering Quality Labels at Scale. Omar Alonso. SIGIR 2015. DOI: 10.1145/2766462.2776778

Information retrieval researchers and engineers use human computation as a mechanism to produce labeled data sets for product development, research and experimentation. To gather useful results, a successful labeling task relies on many different elements: clear instructions, user interface guidelines, representative high-quality datasets, appropriate inter-rater agreement metrics, work quality checks, and channels for worker feedback. Furthermore, designing and implementing tasks that produce and use thousands or millions of labels is different from conducting small-scale research investigations. In this paper we present a perspective on collecting high-quality labels with an emphasis on practical problems and scalability. We focus on three main topics: programming crowds, debugging tasks with low agreement, and algorithms for quality control. We show examples from an industrial setting.
Sign-Aware Periodicity Metrics of User Engagement for Online Search Quality Evaluation. Alexey Drutsa. SIGIR 2015. DOI: 10.1145/2766462.2767814

Modern Internet companies continually improve the evaluation criteria of their data-driven decision-making, which is based on online controlled experiments (also known as A/B tests). The amplitude metrics of user engagement are known to be highly sensitive to service changes, but they cannot be used to determine whether the treatment effect is positive or negative. We propose to overcome this sign-agnostic issue by paying attention to the phase of the corresponding DFT sine wave. We refine the amplitude metrics of the first frequency with the phase ones and formalize our intuition in several novel overall evaluation criteria. These criteria are then verified over A/B experiments on real users of Yandex. We find that our approach retains the sensitivity level of the amplitudes while making their changes sign-aware with respect to the treatment effect.
How Random Decisions Affect Selective Distributed Search. Zhuyun Dai, Yubin Kim, James P. Callan. SIGIR 2015. DOI: 10.1145/2766462.2767796

Selective distributed search is a retrieval architecture that reduces search costs by partitioning a corpus into topical shards such that only a few shards need to be searched for each query. Prior research created topical shards by using random seed documents to cluster a random sample of the full corpus. The resource selection algorithm might use a different random sample of the corpus. These random components make selective search non-deterministic. This paper studies how these random components affect experimental results. Experiments on two ClueWeb09 corpora and four query sets show that in spite of random components, selective search is stable for most queries.