
Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval: Latest Papers

cwl_eval: An Evaluation Tool for Information Retrieval
L. Azzopardi, Paul Thomas, Alistair Moffat
We present a tool ("cwl_eval") which unifies many metrics typically used to evaluate information retrieval systems using test collections. In the CWL framework metrics are specified via a single function which can be used to derive a number of related measurements: Expected Utility per item, Expected Total Utility, Expected Cost per item, Expected Total Cost, and Expected Depth. The CWL framework brings together several independent approaches for measuring the quality of a ranked list, and provides a coherent user model-based framework for developing measures based on utility (gain) and cost. Here we outline the CWL measurement framework; describe the cwl_eval architecture; and provide examples of how to use it. We provide implementations of a number of recent metrics, including Time Biased Gain, U-Measure, Bejewelled Measure, and the Information Foraging Based Measure, as well as previous metrics such as Precision, Average Precision, Discounted Cumulative Gain, Rank-Biased Precision, and INST. By providing state-of-the-art and traditional metrics within the same framework, we promote a standardised approach to evaluating search effectiveness.
DOI: https://doi.org/10.1145/3331184.3331398 | Published: 2019-07-18 | Citations: 22
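The cwl_eval abstract above describes metrics defined by a single function from which the five measurements are derived. A minimal Python sketch of that idea follows, assuming the function is a per-rank continuation probability and that the user stops at the end of the list. The names `cwl_measures`, `rbp_continuation`, and `precision_continuation` are hypothetical and are not the cwl_eval API; costs are omitted for brevity.

```python
from typing import Callable, Dict, Sequence


def cwl_measures(gains: Sequence[float],
                 continuation: Callable[[int], float]) -> Dict[str, float]:
    """Derive utility measurements for one ranked list.

    gains[i - 1] is the gain of the item at rank i, and continuation(i) is
    the assumed probability of moving on from rank i to rank i + 1.
    Simplification: the list is treated as ending at len(gains).
    """
    if not gains:
        raise ValueError("gains must not be empty")

    view_prob = []   # probability that the user inspects each rank
    reach = 1.0      # the first item is always inspected
    for rank in range(1, len(gains) + 1):
        view_prob.append(reach)
        reach *= continuation(rank)

    expected_depth = sum(view_prob)
    expected_total_utility = sum(v * g for v, g in zip(view_prob, gains))
    return {
        "expected_depth": expected_depth,
        "expected_total_utility": expected_total_utility,
        "expected_utility_per_item": expected_total_utility / expected_depth,
    }


def rbp_continuation(persistence: float) -> Callable[[int], float]:
    """Rank-Biased Precision style user: same continuation at every rank."""
    return lambda rank: persistence


def precision_continuation(k: int) -> Callable[[int], float]:
    """Precision@k style user: always continues until rank k, then stops."""
    return lambda rank: 1.0 if rank < k else 0.0


if __name__ == "__main__":
    print(cwl_measures([1, 0, 1], rbp_continuation(0.5)))
    print(cwl_measures([1, 0, 1, 1], precision_continuation(2)))
```

Because the sketch truncates at the list length, its Rank-Biased Precision values are normalised over the observed ranks only and will differ from the infinite-list definition.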
One-Class Order Embedding for Dependency Relation Prediction
Meng-Fen Chiang, Ee-Peng Lim, Wang-Chien Lee, Xavier Jayaraj Siddarth Ashok, Philips Kokoh Prasetyo
Learning the dependency relations among entities and the hierarchy formed by these relations by mapping entities into some order embedding space can effectively enable several important applications, including knowledge base completion and prerequisite relations prediction. Nevertheless, it is very challenging to learn a good order embedding due to the existence of partial ordering and missing relations in the observed data. Moreover, most application scenarios do not provide non-trivial negative dependency relation instances. We therefore propose a framework that performs dependency relation prediction by exploring both rich semantic and hierarchical structure information in the data. In particular, we propose several negative sampling strategies based on graph-specific centrality properties, which supplement the positive dependency relations with appropriate negative samples to effectively learn order embeddings. This research not only addresses the needs of automatically recovering missing dependency relations, but also unravels dependencies among entities using several real-world datasets, such as course dependency hierarchy involving course prerequisite relations, job hierarchy in organizations, and paper citation hierarchy. Extensive experiments are conducted on both synthetic and real-world datasets to demonstrate the prediction accuracy as well as to gain insights using the learned order embedding.
DOI: https://doi.org/10.1145/3331184.3331249 | Published: 2019-07-18 | Citations: 5
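For readers unfamiliar with order embeddings, the following Python sketch shows one common way a partial order can be encoded in a vector space: an order-violation penalty under the reversed product order. This formulation comes from the general order-embedding literature and is an assumption here; the abstract does not state which penalty or which negative sampling scheme the paper uses, and `order_violation` is a hypothetical name.

```python
from typing import Sequence


def order_violation(x: Sequence[float], y: Sequence[float]) -> float:
    """Penalty for the claim "x precedes y" under the reversed product order.

    Assumed convention: x precedes y exactly when x[i] >= y[i] in every
    coordinate, so the penalty is zero if and only if the relation holds.
    """
    if len(x) != len(y):
        raise ValueError("embeddings must have the same dimension")
    return sum(max(0.0, yi - xi) ** 2 for xi, yi in zip(x, y))


if __name__ == "__main__":
    print(order_violation([2.0, 3.0], [1.0, 3.0]))  # relation holds
    print(order_violation([1.0, 1.0], [2.0, 3.0]))  # relation violated
```

In training, a positive pair would be pushed toward zero penalty while a sampled negative pair would be pushed above a margin; how the negatives are chosen is the part the abstract attributes to graph centrality properties.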
Training Streaming Factorization Machines with Alternating Least Squares
Xueyu Mao, Saayan Mitra, Sheng Li
Factorization Machines (FM) have been widely applied in industrial applications for recommendations. Traditionally, FM models are trained in batch mode, which entails training the model with large datasets every few hours or days. Such a training procedure cannot capture trends that evolve in real time in large volumes of streaming data. In this paper, we propose an online training scheme for FM with the alternating least squares (ALS) technique, which has performance comparable to existing batch training algorithms. We incorporate an online update mechanism for the model parameters at the cost of storing a small cache. The mechanism also stabilizes the training error more than a traditional online training technique such as stochastic gradient descent (SGD) as data points come in, which is crucial for real-time applications. Experiments on large-scale datasets validate the efficiency and robustness of our method.
DOI: https://doi.org/10.1145/3331184.3331374 | Published: 2019-07-18 | Citations: 0
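As background for the abstract above, this Python sketch evaluates the standard second-order factorization machine model equation, using the usual linear-time rewrite of the pairwise term. It shows only the model being trained; it is not the paper's online ALS update or its cache, and the name `fm_predict` and the parameter layout are assumptions made for illustration.

```python
from typing import Sequence


def fm_predict(x: Sequence[float],
               w0: float,
               w: Sequence[float],
               v: Sequence[Sequence[float]]) -> float:
    """Second-order FM prediction for one feature vector.

    x  : feature values, length n
    w0 : global bias
    w  : linear weights, length n
    v  : factor matrix, n rows of k latent factors each
    """
    if not (len(x) == len(w) == len(v)):
        raise ValueError("x, w and v must have the same length")

    linear = w0 + sum(wi * xi for wi, xi in zip(w, x))

    # sum_{i<j} <v_i, v_j> x_i x_j, computed per factor f as
    # 0.5 * ((sum_i v_if x_i)^2 - sum_i (v_if x_i)^2)
    n_factors = len(v[0]) if v else 0
    pairwise = 0.0
    for f in range(n_factors):
        terms = [v[i][f] * x[i] for i in range(len(x))]
        total = sum(terms)
        pairwise += 0.5 * (total * total - sum(t * t for t in terms))

    return linear + pairwise


if __name__ == "__main__":
    print(fm_predict([1.0, 2.0], 0.5, [1.0, -1.0], [[1.0, 2.0], [3.0, 4.0]]))
```

An ALS-style trainer would repeatedly solve for one parameter of this equation in closed form while holding the others fixed; the streaming variant described in the abstract additionally keeps a small cache so that updates can be applied as data points arrive.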
Finding Camouflaged Needle in a Haystack?: Pornographic Products Detection via Berrypicking Tree Model
Guoxiu He, Yangyang Kang, Zhe Gao, Zhuoren Jiang, Changlong Sun, Xiaozhong Liu, Wei Lu, Qiong Zhang, Luo Si
It is an important and urgent research problem for decentralized eCommerce services, e.g., eBay, eBid, and Taobao, to detect illegal products, e.g., unclassified pornographic products. However, it is a challenging task, as some sellers may use and change camouflaged text to deceive the current detection algorithms. In this study, we propose a novel task: dynamically locating pornographic products in very large product collections. Unlike prior product classification efforts focusing on textual information, the proposed model, BerryPIcking TRee MoDel (BIRD), utilizes both product textual content and buyers' seeking behavior information as berrypicking trees. In particular, BIRD encodes both semantic information with respect to all branch sequences and the overall latent buyer intent during the whole seeking process. An extensive set of experiments has been conducted to demonstrate the advantage of the proposed model over alternative solutions. To facilitate further research on this practical and important problem, the code and buyers' seeking behavior data have been made publicly available.
DOI: https://doi.org/10.1145/3331184.3331197 | Published: 2019-07-18 | Citations: 9
Investigating the Interplay Between Searchers' Privacy Concerns and Their Search Behavior
Steven Zimmerman, Alistair Thorpe, C. Fox, Udo Kruschwitz
Privacy concerns are becoming a dominant focus in search applications, so there is a growing need to understand the implications of efforts to address these concerns. Our research investigates a search system with privacy warning labels, an approach inspired by decision making research on food nutrition labels. This approach is designed to alert users to potential privacy threats in their search for information as one possible avenue to address privacy concerns. Our primary goal is to understand the extent to which attitudes towards privacy are linked to behaviors that protect privacy. In the present study, participants were given a set of fact-based decision tasks from the domain of health search. Participants were rotated through variations of search engine results pages (SERPs) including a SERP with a privacy warning light system. Lastly, participants completed a survey to capture attitudes towards privacy, behaviors to protect privacy, and other demographic information. In addition to comparing interactive search behaviors on a privacy warning SERP with those on a control SERP, we compared self-report privacy measures with interactive search behaviors. Participants reported strong concerns around privacy of health information while simultaneously placing high importance on the correctness of this information. Analysis of our interactive experiment and self-report privacy measures indicates that 1) choice of privacy-protective browsers has a significant link to privacy attitudes and privacy-protective behaviors in a SERP and 2) there are no significant links between reported concerns towards privacy and recorded behavior in an information retrieval system with warnings that enable users to protect their privacy.
DOI: https://doi.org/10.1145/3331184.3331280 | Published: 2019-07-18 | Citations: 11
A Context-based Framework for Resource Citation Classification in Scientific Literatures
He Zhao, Zhunchen Luo, Chong Feng, Yuming Ye
In this paper, we introduce the task of resource citation classification for scientific literature using a context-based framework. This task is to analyze the purpose of citing an on-line resource in scientific text by modeling the role and function of each resource citation. It can be incorporated into resource indexing and recommendation systems to help better understand and classify on-line resources in scientific literature. We propose a new annotation scheme for this task and develop a dataset of 3,088 manually annotated resource citations. We adopt a neural-based model to build the classifiers and apply them on the large ARC dataset to examine the revolution of scientific resources from trends in their function over time.
DOI: https://doi.org/10.1145/3331184.3331348 | Published: 2019-07-18 | Citations: 11
Document Distance Metric Learning in an Interactive Exploration Process
Marco Wrzalik
Visualization of inter-document similarities is widely used for the exploration of document collections and interactive retrieval. However, similarity relationships between documents are multifaceted and measured distances by a given metric often do not match the perceived similarity of human beings. Furthermore, the user's notion of similarity can drastically change with the exploration objective or task at hand. Therefore, this research proposes to investigate online adjustments to the similarity model using feedback generated during exploration or exploratory search. In this course, rich visualizations and interactions will support users to give valuable feedback. Based on this, metric learning methodologies will be applied to adjust a similarity model in order to improve the exploration experience. At the same time, trained models are considered as valuable outcomes whose benefits for similarity-based tasks such as query-by-example retrieval or classification will be tested.
DOI: https://doi.org/10.1145/3331184.3331420 | Published: 2019-07-18 | Citations: 0
On Anonymous Commenting: A Greedy Approach to Balance Utilization and Anonymity for Instagram Users
Arian Askari, Asal Jalilvand, Mahmood Neshati
In many online services, users cannot comment anonymously; therefore, they cannot express critical opinions without facing the consequences. At present, naïve approaches to anonymous commenting are available, but they cause problems for services that analyze user comments. In this paper, we explore anonymous commenting approaches and their pros and cons. We also propose methods for anonymous commenting that protect user privacy while still allowing service providers to perform sentiment analytics. Our experiments, conducted on a real dataset gathered from Instagram comments, indicate the effectiveness of our proposed methods in privacy protection and sentiment analytics. The proposed methods are independent of any particular website and can be utilized in various domains.
DOI: https://doi.org/10.1145/3331184.3331364 | Published: 2019-07-18 | Citations: 2
Dynamic Sampling Meets Pooling
G. Cormack, Haotian Zhang, Nimesh Ghelani, Mustafa Abualsaud, Mark D. Smucker, Maura R. Grossman, Shahin Rahbariasl, Amira Ghenai
A team of six assessors used Dynamic Sampling (Cormack and Grossman 2018) and one hour of assessment effort per topic to form, without pooling, a test collection for the TREC 2018 Common Core Track. Later, official relevance assessments were rendered by NIST for documents selected by depth-10 pooling augmented by move-to-front (MTF) pooling (Cormack et al. 1998), as well as the documents selected by our Dynamic Sampling effort. MAP estimates rendered from dynamically sampled assessments using the xinfAP statistical evaluator are comparable to those rendered from the complete set of official assessments using the standard trec_eval tool. MAP estimates rendered using only documents selected by pooling, on the other hand, differ substantially. The results suggest that the use of Dynamic Sampling without pooling can, for an order of magnitude less assessment effort, yield information-retrieval effectiveness estimates that exhibit lower bias, lower error, and comparable ability to rank system effectiveness.
{"title":"Dynamic Sampling Meets Pooling","authors":"G. Cormack, Haotian Zhang, Nimesh Ghelani, Mustafa Abualsaud, Mark D. Smucker, Maura R. Grossman, Shahin Rahbariasl, Amira Ghenai","doi":"10.1145/3331184.3331354","DOIUrl":"https://doi.org/10.1145/3331184.3331354","url":null,"abstract":"A team of six assessors used Dynamic Sampling (Cormack and Grossman 2018) and one hour of assessment effort per topic to form, without pooling, a test collection for the TREC 2018 Common Core Track. Later, official relevance assessments were rendered by NIST for documents selected by depth-10 pooling augmented by move-to-front (MTF) pooling (Cormack et al. 1998), as well as the documents selected by our Dynamic Sampling effort. MAP estimates rendered from dynamically sampled assessments using the xinfAP statistical evaluator are comparable to those rendered from the complete set of official assessments using the standard trec_eval tool. MAP estimates rendered using only documents selected by pooling, on the other hand, differ substantially. The results suggest that the use of Dynamic Sampling without pooling can, for an order of magnitude less assessment effort, yield information-retrieval effectiveness estimates that exhibit lower bias, lower error, and comparable ability to rank system effectiveness.","PeriodicalId":20700,"journal":{"name":"Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval","volume":"1 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2019-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"72933854","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
Unified Collaborative Filtering over Graph Embeddings
Pengfei Wang, H. Chen, Yadong Zhu, Huawei Shen, Yongfeng Zhang
Collaborative Filtering (CF), which learns from the wisdom of crowds, has become one of the most important approaches in recommender systems research, and various CF models have been designed and applied to different scenarios. However, selecting the most appropriate CF model for a specific recommendation task remains challenging. In this paper, we propose a Unified Collaborative Filtering framework based on Graph Embeddings (UGrec for short) to solve the problem. Specifically, UGrec models user and item interactions within a graph network, and a sequential recommendation path is designed as the basic unit to capture the correlations between users and items. Mathematically, we show that many representative recommendation approaches and their variants can be mapped to a recommendation path in the graph. In addition, by applying a carefully designed attention mechanism to the recommendation paths, UGrec can determine the significance of each sequential recommendation path so as to conduct automatic model selection. Compared with state-of-the-art methods, our method shows significant improvements in recommendation quality. This work also leads to a deeper understanding of the connection between graph embeddings and recommendation algorithms.
DOI: https://doi.org/10.1145/3331184.3331224. Published: 2019-07-18.
Citations: 23