Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval: Latest Articles

Towards Quantifying the Impact of Non-Uniform Information Access in Collaborative Information Retrieval
N. Htun, Martin Halvey, L. Baillie
The majority of research into Collaborative Information Retrieval (CIR) has assumed uniform information access and visibility between collaborators. However, in a number of real-world scenarios (e.g. security and health), information access is not uniform across all collaborators in a team. This can be referred to as Multi-Level Collaborative Information Retrieval (MLCIR). To the best of our knowledge, there has not yet been any systematic investigation of the effect of MLCIR on search outcomes. To address this shortcoming, we present the results of a simulated evaluation conducted over 4 different non-uniform information access scenarios and 3 different collaborative search strategies. Results indicate that there is some tolerance to removing access to the collection and that there may not always be a negative impact on performance. We also highlight how different access scenarios and search strategies affect search outcomes.
DOI: https://doi.org/10.1145/2766462.2767779 · Published: 2015-08-09
Citations: 4
Features of Disagreement Between Retrieval Effectiveness Measures
Timothy Jones, Paul Thomas, Falk Scholer, M. Sanderson
Many IR effectiveness measures are motivated by intuition, theory, or user studies. In general, most effectiveness measures are well correlated with each other. But what about the cases where they do not correlate? Which rankings cause measures to disagree? Are these rankings predictable for particular pairs of measures? In this work, we examine how and where metrics disagree, and identify differences that should be considered when selecting metrics for use in evaluating retrieval systems.
DOI: https://doi.org/10.1145/2766462.2767824 · Published: 2015-08-09
Citations: 6
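As a small illustration of the kind of disagreement the abstract above asks about, the sketch below scores two rankings with reciprocal rank and precision at 3. The two rankings and the choice of these particular measures are assumptions made for illustration; they are not taken from the paper.

```python
def reciprocal_rank(rels):
    """Return 1/rank of the first relevant item, or 0.0 if none is relevant."""
    for rank, rel in enumerate(rels, start=1):
        if rel:
            return 1.0 / rank
    return 0.0


def precision_at_k(rels, k):
    """Return the fraction of the top-k items that are relevant."""
    return sum(rels[:k]) / k


# Hypothetical binary relevance judgments in rank order (1 = relevant).
ranking_a = [1, 0, 0, 0, 0]
ranking_b = [0, 1, 1, 0, 0]

rr_a, rr_b = reciprocal_rank(ranking_a), reciprocal_rank(ranking_b)
p3_a, p3_b = precision_at_k(ranking_a, 3), precision_at_k(ranking_b, 3)

# Reciprocal rank prefers A (1.0 vs 0.5); precision at 3 prefers B (1/3 vs 2/3).
measures_disagree = (rr_a > rr_b) and (p3_a < p3_b)
```

Here one measure rewards a relevant result at rank 1 while the other rewards the number of relevant results near the top, so the pair orders the two rankings differently.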
An Efficient and Scalable MetaFeature-based Document Classification Approach based on Massively Parallel Computing
Sérgio D. Canuto, Marcos André Gonçalves, W. M. D. Santos, Thierson Couto, W. Martins
The unprecedented growth of available data has stimulated the development of new methods for organizing and extracting useful knowledge from this immense amount of data. Automatic Document Classification (ADC) is one such method; it uses machine learning techniques to build models capable of automatically associating documents with well-defined semantic classes. ADC is the basis of many important applications such as language identification, sentiment analysis, recommender systems, and spam filtering, among others. Recently, the use of meta-features has been shown to substantially improve the effectiveness of ADC algorithms. In particular, meta-features that combine local information (through kNN-based features) and global information (through category centroids) have produced promising results. However, generating these meta-features is very costly in terms of both memory consumption and runtime, since the kNN algorithm must be called constantly. We take advantage of the current manycore GPU architecture and present a massively parallel version of the kNN algorithm for high-dimensional and sparse datasets (which is the case for ADC). Our experimental results show that we can obtain speedup gains of up to 15x while reducing memory consumption by more than 5000x when compared to a state-of-the-art parallel baseline. This opens up the possibility of applying meta-feature-based classification to large collections of documents that would otherwise take too much time or require the use of an expensive computational platform.
DOI: https://doi.org/10.1145/2766462.2767743 · Published: 2015-08-09
Citations: 10
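To make the idea of combining kNN-based (local) and centroid-based (global) meta-features more concrete, here is a minimal single-threaded sketch. It is not the authors' GPU implementation; the function names, the use of cosine similarity, the feature layout, and the toy data are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0


def centroid(vectors):
    """Term-wise mean of a list of sparse vectors."""
    acc = defaultdict(float)
    for vec in vectors:
        for t, w in vec.items():
            acc[t] += w / len(vectors)
    return dict(acc)


def meta_features(x, train, k):
    """Per class: the k highest neighbor similarities, then the centroid similarity."""
    by_class = defaultdict(list)
    for vec, label in train:
        by_class[label].append(vec)
    feats = []
    for label in sorted(by_class):
        sims = sorted((cosine(x, v) for v in by_class[label]), reverse=True)[:k]
        sims += [0.0] * (k - len(sims))  # pad classes with fewer than k documents
        feats.extend(sims)  # local (kNN-based) information
        feats.append(cosine(x, centroid(by_class[label])))  # global information
    return feats


# Hypothetical toy training set: (sparse term-weight vector, class label).
train = [
    ({"goal": 1.0, "match": 1.0}, "sport"),
    ({"goal": 1.0, "team": 1.0}, "sport"),
    ({"vote": 1.0, "party": 1.0}, "politics"),
]
features = meta_features({"goal": 1.0}, train, k=2)
```

Every call to `meta_features` scans all training documents, which hints at why the abstract describes meta-feature generation as costly and why a parallel kNN is attractive.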
Influence of Vertical Result in Web Search Examination
Zeyang Liu, Yiqun Liu, K. Zhou, Min Zhang, Shaoping Ma
Research into how users examine results on search engine result pages (SERPs) helps improve result ranking, advertisement placement, performance evaluation and search UI design. Although examination behavior on organic search results (also known as "ten blue links") has been well studied in existing work, a thorough investigation of how users examine SERPs with verticals is still lacking. Considering that a large fraction of SERPs are served with one or more verticals in practical Web search scenarios, it is of vital importance to understand the influence of vertical results on search examination behavior. In this paper, we focus on five popular vertical types and study their influence on users' examination processes, both when they are relevant and when they are irrelevant to the search queries. With examination behavior data collected with an eye-tracking device, we show the existence of vertical-aware user behavior effects, including a vertical attraction effect, an examination cut-off effect in the presence of a relevant vertical, and an examination spill-over effect in the presence of an irrelevant vertical. Furthermore, we are also among the first to systematically investigate the internal examination behavior within the vertical results. We believe that this work will promote our understanding of user interactions with federated search engines and benefit the construction of search performance evaluations.
DOI: https://doi.org/10.1145/2766462.2767714 · Published: 2015-08-09
Citations: 60
Context-aware Point-of-Interest Recommendation Using Tensor Factorization with Social Regularization
Lina Yao, Quan Z. Sheng, Yongrui Qin, Xianzhi Wang, A. Shemshadi, Qi He
Point-of-Interest (POI) recommendation is a new type of recommendation task that has come along with the prevalence of location-based social networks in recent years. Compared with traditional tasks, it focuses more on personalized, context-aware recommendation results to provide a better user experience. To address this new challenge, we propose a Collaborative Filtering method based on Non-negative Tensor Factorization, a generalization of the Matrix Factorization approach that exploits a high-order tensor instead of the traditional User-Location matrix to model multi-dimensional contextual information. The factorization of this tensor leads to a compact model of the data which is especially suitable for context-aware POI recommendations. In addition, we fuse users' social relations as regularization terms of the factorization to improve the recommendation accuracy. Experimental results on real-world datasets demonstrate the effectiveness of our approach.
DOI: https://doi.org/10.1145/2766462.2767794 · Published: 2015-08-09
Citations: 107
Adapted B-CUBED Metrics to Unbalanced Datasets
Jose G. Moreno, G. Dias
B-CUBED metrics have recently been adopted in the evaluation of clustering results as well as in many other related tasks. However, this family of metrics is not well adapted to unbalanced datasets. This issue is extremely frequent in Web results, where classes follow a strongly unbalanced distribution. In this paper, we present a modified version of B-CUBED metrics to overcome this situation. Results on toy and real datasets indicate that the proposed adaptation correctly considers the particularities of unbalanced cases.
DOI: https://doi.org/10.1145/2766462.2767836 · Published: 2015-08-09
Citations: 6
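For readers unfamiliar with the metric discussed in the abstract above, the following sketch computes the standard (unadapted) B-CUBED precision and recall. It does not implement the modification proposed in the paper, which the abstract does not spell out; the toy labels are hypothetical and chosen only to show how an unbalanced dataset can reward a trivial clustering.

```python
from collections import Counter


def bcubed(clusters, classes):
    """Standard B-CUBED precision and recall, averaged over items.

    clusters[i] is the predicted cluster of item i, classes[i] its gold class.
    """
    n = len(clusters)
    pair_counts = Counter(zip(clusters, classes))  # items sharing cluster and class
    cluster_sizes = Counter(clusters)
    class_sizes = Counter(classes)
    precision = sum(
        pair_counts[(c, g)] / cluster_sizes[c] for c, g in zip(clusters, classes)
    ) / n
    recall = sum(
        pair_counts[(c, g)] / class_sizes[g] for c, g in zip(clusters, classes)
    ) / n
    return precision, recall


# Hypothetical unbalanced dataset: nine items of class "A", one of class "B".
gold = ["A"] * 9 + ["B"]
# Trivial clustering that puts every item into a single cluster.
single_cluster = [0] * 10

precision, recall = bcubed(single_cluster, gold)
# precision is about 0.82 and recall is 1.0, although class "B" was never separated.
```

The single-cluster output scores well because the dominant class contributes nine of the ten per-item averages, which is one way to see the weakness on unbalanced data that the abstract refers to.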
On the Cost of Phrase-Based Ranking
M. Petri, Alistair Moffat
Effective postings list compression techniques, and the efficiency of postings list processing schemes such as WAND, have significantly improved the practical performance of ranked document retrieval using inverted indexes. Recently, suffix array-based index structures have been proposed as a complementary tool, to support phrase searching. The relative merits of these alternative approaches to ranked querying using phrase components are, however, unclear. Here we provide: (1) an overview of existing phrase indexing techniques; (2) a description of how to incorporate recent advances in list compression and processing; and (3) an empirical evaluation of state-of-the-art suffix-array and inverted file-based phrase retrieval indexes using a standard IR test collection.
DOI: https://doi.org/10.1145/2766462.2767769 · Published: 2015-08-09
Citations: 3
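As background for the inverted-file side of the comparison in the abstract above, the sketch below answers a phrase query with an uncompressed positional inverted index. It is a teaching example only: it includes none of the compression, WAND processing, or suffix-array structures the paper evaluates, and the documents and function names are hypothetical.

```python
from collections import defaultdict


def build_positional_index(docs):
    """Map each term to {doc_id: [positions of the term in that document]}."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in docs.items():
        for pos, term in enumerate(text.lower().split()):
            index[term][doc_id].append(pos)
    return index


def phrase_match(index, phrase):
    """Return {doc_id: number of occurrences of the exact phrase}."""
    terms = phrase.lower().split()
    if not terms or any(t not in index for t in terms):
        return {}
    # Candidate documents must contain every term of the phrase.
    candidates = set(index[terms[0]])
    for t in terms[1:]:
        candidates &= set(index[t])
    hits = {}
    for doc_id in candidates:
        later = [set(index[t][doc_id]) for t in terms[1:]]
        # A match starts wherever each following term sits at the next offset.
        count = sum(
            all(start + i + 1 in later[i] for i in range(len(later)))
            for start in index[terms[0]][doc_id]
        )
        if count:
            hits[doc_id] = count
    return hits


# Hypothetical toy collection.
docs = {
    1: "new york city is in new york state",
    2: "york is older than new york",
    3: "a new city near york",
}
index = build_positional_index(docs)
hits = phrase_match(index, "new york")
```

The per-document phrase counts in `hits` could then feed a ranking function; the cost of producing them at scale is the kind of trade-off the paper measures.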
Practical Lessons for Gathering Quality Labels at Scale
Omar Alonso
Information retrieval researchers and engineers use human computation as a mechanism to produce labeled data sets for product development, research and experimentation. To gather useful results, a successful labeling task relies on many different elements: clear instructions, user interface guidelines, representative high-quality datasets, appropriate inter-rater agreement metrics, work quality checks, and channels for worker feedback. Furthermore, designing and implementing tasks that produce and use several thousands or millions of labels is different from conducting small-scale research investigations. In this paper we present a perspective for collecting high-quality labels with an emphasis on practical problems and scalability. We focus on three main topics: programming crowds, debugging tasks with low agreement, and algorithms for quality control. We show examples from an industrial setting.
DOI: https://doi.org/10.1145/2766462.2776778 · Published: 2015-08-09
Citations: 12
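The abstract above lists inter-rater agreement metrics among the elements of a labeling task. One widely used example is Cohen's kappa for two raters, sketched below. The abstract does not say which agreement metrics the paper uses, so this choice, along with the toy labels, is an assumption for illustration.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa: chance-corrected agreement between two raters."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both raters must label the same non-empty item list")
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    # Agreement expected if each rater labeled at random with their own label rates.
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical relevance labels from two workers on the same eight items.
worker_1 = ["rel", "rel", "rel", "non", "non", "non", "rel", "non"]
worker_2 = ["rel", "rel", "non", "non", "non", "rel", "rel", "non"]

kappa = cohens_kappa(worker_1, worker_2)
# Observed agreement is 6/8 and chance agreement is 0.5, giving kappa = 0.5.
```

A low kappa on a batch is one signal that a task may need the kind of debugging the abstract mentions, for example clearer instructions.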
Sign-Aware Periodicity Metrics of User Engagement for Online Search Quality Evaluation
Alexey Drutsa
Modern Internet companies improve the evaluation criteria of their data-driven decision-making, which is based on online controlled experiments (also known as A/B tests). The amplitude metrics of user engagement are known to be highly sensitive to service changes, but they cannot be used to determine whether the treatment effect is positive or negative. We propose to overcome this sign-agnostic issue by paying attention to the phase of the corresponding DFT sine wave. We refine the amplitude metrics of the first frequency with the phase metrics and formalize our intuition in several novel overall evaluation criteria. These criteria are then verified over A/B experiments on real users of Yandex. We find that our approach retains the sensitivity level of the amplitudes and makes their changes sign-aware w.r.t. the treatment effect.
DOI: https://doi.org/10.1145/2766462.2767814 · Published: 2015-08-09
Citations: 10
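The sketch below illustrates why the amplitude of the first DFT frequency is sign-agnostic while its phase is not. It uses two hypothetical 14-day engagement series, one rising and one falling. How the paper actually combines amplitude and phase into its evaluation criteria is not described in the abstract, so nothing here should be read as the author's formulation.

```python
import cmath
import math


def dft_coefficient(series, k):
    """k-th coefficient of the discrete Fourier transform of a real series."""
    n = len(series)
    return sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(series))


def amplitude_and_phase(series, k=1):
    """Normalized amplitude and phase (in radians) of the k-th DFT coefficient."""
    coeff = dft_coefficient(series, k)
    return abs(coeff) / len(series), cmath.phase(coeff)


# Hypothetical daily engagement counts over a two-week experiment.
n_days = 14
rising = [10 + t for t in range(n_days)]
falling = [10 + (n_days - 1 - t) for t in range(n_days)]

amp_up, phase_up = amplitude_and_phase(rising)
amp_down, phase_down = amplitude_and_phase(falling)

# The amplitudes coincide, so amplitude alone cannot tell growth from decline;
# the phases differ by pi, which separates the two directions.
```

In this toy case the rising series has a positive phase and the falling one a negative phase, while both share the same amplitude.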
How Random Decisions Affect Selective Distributed Search
Zhuyun Dai, Yubin Kim, James P. Callan
Selective distributed search is a retrieval architecture that reduces search costs by partitioning a corpus into topical shards such that only a few shards need to be searched for each query. Prior research created topical shards by using random seed documents to cluster a random sample of the full corpus. The resource selection algorithm might use a different random sample of the corpus. These random components make selective search non-deterministic. This paper studies how these random components affect experimental results. Experiments on two ClueWeb09 corpora and four query sets show that in spite of random components, selective search is stable for most queries.
DOI: https://doi.org/10.1145/2766462.2767796 · Published: 2015-08-09
Citations: 4