Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval: Latest Papers

FactCatch: Incremental Pay-as-You-Go Fact Checking with Minimal User Effort
T. Nguyen, M. Weidlich, Hongzhi Yin, Bolong Zheng, Q. Nguyen, Quoc Viet Hung Nguyen
The open nature of the Web enables users to produce and propagate any content without authentication, which has been exploited to spread thousands of unverified claims via millions of online documents. Maintenance of credible knowledge bases thus has to rely on fact checking that constructs a trusted set of facts through credibility assessment. Due to an inherent lack of ground truth information and language ambiguity, fact checking cannot be done in a purely automated manner without compromising accuracy. However, state-of-the-art fact checking services rely mostly on human validation, which is costly, slow, and non-transparent. This paper presents FactCatch, a human-in-the-loop system that guides users in fact checking and aims to minimise the invested effort. It supports incremental quality estimation, mistake mitigation, and pay-as-you-go instantiation of a high-quality fact database.
DOI: https://doi.org/10.1145/3397271.3401408 (published 2020-07-25)
Citations: 12
Models Versus Satisfaction: Towards a Better Understanding of Evaluation Metrics
Fan Zhang, Jiaxin Mao, Yiqun Liu, Xiaohui Xie, Weizhi Ma, Min Zhang, Shaoping Ma
Evaluation metrics play an important role in the batch evaluation of IR systems. Based on a user model that describes how users interact with the ranked list, an evaluation metric is defined to link the relevance scores of a list of documents to an estimation of system effectiveness and user satisfaction. Therefore, the validity of an evaluation metric has two facets: whether the underlying user model can accurately predict user behavior and whether the evaluation metric correlates well with user satisfaction. While a tremendous amount of work has been undertaken to design, evaluate, and compare different evaluation metrics, few studies have explored the consistency between these two facets of evaluation metrics. Specifically, we want to investigate whether the metrics that are well calibrated with user behavior data can perform as well in estimating user satisfaction. To shed light on this research question, we compare the performance of various metrics with the C/W/L framework in estimating user satisfaction when they are optimized to fit observed user behavior. Experimental results on both self-collected and publicly available user search behavior datasets show that the metrics optimized to fit users' click behavior can perform as well as those calibrated with user satisfaction feedback. We also investigate the reliability of the calibration process of evaluation metrics to find out how much data is required for parameter tuning. Our findings provide empirical support for the consistency between user behavior modeling and satisfaction measurement, as well as guidance for tuning the parameters in evaluation metrics.
DOI: https://doi.org/10.1145/3397271.3401162 (published 2020-07-25)
Citations: 26
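The abstract above refers to the C/W/L framework, in which a metric is characterised by a continuation probability C(i) at each rank. As a rough illustration only, the sketch below derives rank weights W(i) from a list of continuation probabilities and scores a ranking as the weighted sum of per-rank gains. The function names are hypothetical, and truncating the normalisation to the length of the list is a simplifying assumption made here, not something stated in the abstract.

```python
def cwl_weights(continuation):
    """Turn per-rank continuation probabilities into normalised rank weights.

    continuation[i] is the assumed probability that a user who inspected
    rank i+1 goes on to inspect rank i+2. The weight of a rank is
    proportional to the probability of reaching it.
    """
    reach = []
    prob = 1.0  # the first rank is always inspected
    for c in continuation:
        reach.append(prob)
        prob *= c
    total = sum(reach)
    return [r / total for r in reach]


def expected_rate_of_gain(gains, continuation):
    """Weighted sum of per-rank gains under the derived weights."""
    weights = cwl_weights(continuation)
    return sum(w * g for w, g in zip(weights, gains))


# A constant continuation probability gives geometrically decaying weights,
# similar in spirit to rank-biased precision on a truncated list.
weights = cwl_weights([0.5, 0.5, 0.5])
score = expected_rate_of_gain([1.0, 0.0, 1.0], [0.5, 0.5, 0.5])
```

Fitting a metric to behaviour data, as the abstract describes, would amount to choosing the continuation values so that the implied browsing pattern matches observed clicks; that fitting step is not shown here.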
Jointly Non-Sampling Learning for Knowledge Graph Enhanced Recommendation
Chong Chen, Min Zhang, Weizhi Ma, Yiqun Liu, Shaoping Ma
A knowledge graph (KG) contains well-structured external information and has been shown to be effective for high-quality recommendation. However, existing KG enhanced recommendation methods have largely focused on exploring advanced neural network architectures to better investigate the structural information of the KG. For model learning, these methods mainly rely on Negative Sampling (NS) to optimize the models for both the KG embedding task and the recommendation task. Since NS is not robust (e.g., sampling a small fraction of negative instances may lose lots of useful information), it is reasonable to argue that these methods are insufficient to capture collaborative information among users, items, and entities. In this paper, we propose a novel Jointly Non-Sampling learning model for Knowledge graph enhanced Recommendation (JNSKR). Specifically, we first design a new efficient non-sampling optimization algorithm for knowledge graph embedding learning. The subgraphs are then encoded by the proposed attentive neural network to better characterize user preference over items. Through novel designs of memorization strategies and a joint learning framework, JNSKR not only models the fine-grained connections among users, items, and entities, but also efficiently learns model parameters from the whole training data (including all non-observed data) with a rather low time complexity. Experimental results on two public benchmarks show that JNSKR significantly outperforms state-of-the-art methods such as RippleNet and KGAT. Remarkably, JNSKR also shows significant advantages in training efficiency (about 20 times faster than KGAT), which makes it more applicable to real-world large-scale systems.
DOI: https://doi.org/10.1145/3397271.3401040 (published 2020-07-25)
Citations: 63
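The JNSKR abstract above says the model learns from the whole training data, including all non-observed pairs, at low time complexity, but it does not say how. One generic way such whole-data terms can be made cheap for inner-product scores is to rearrange the double sum over users and items into a product of small Gram matrices. The sketch below only demonstrates that algebraic identity; it is an assumption for illustration, not a description of the paper's algorithm, and all names are hypothetical.

```python
def full_loss_naive(P, Q):
    """Sum of squared inner-product scores over every (user, item) pair.

    Cost grows with |users| * |items| * d.
    """
    total = 0.0
    for p in P:
        for q in Q:
            score = sum(pk * qk for pk, qk in zip(p, q))
            total += score * score
    return total


def gram(M):
    """d x d Gram matrix of the rows of M."""
    d = len(M[0])
    return [[sum(row[a] * row[b] for row in M) for b in range(d)]
            for a in range(d)]


def full_loss_decoupled(P, Q):
    """Same quantity via two Gram matrices.

    Cost grows with (|users| + |items|) * d^2, so no pair is ever enumerated.
    """
    gp, gq = gram(P), gram(Q)
    d = len(gp)
    return sum(gp[a][b] * gq[a][b] for a in range(d) for b in range(d))


# Tiny hand-checkable example: 2 users, 3 items, embedding size 2.
P = [[1.0, 2.0], [0.5, -1.0]]
Q = [[2.0, 0.0], [1.0, 1.0], [-1.0, 3.0]]
naive = full_loss_naive(P, Q)
fast = full_loss_decoupled(P, Q)
```

Both functions return the same value on this example; the second never loops over user-item pairs, which is why a loss over all non-observed data need not require sampling.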
Deep Critiquing for VAE-based Recommender Systems
Kai Luo, Hojin Yang, Ga Wu, S. Sanner
Providing explanations for recommended items not only allows users to understand the reason for receiving recommendations but also provides users with an opportunity to refine recommendations by critiquing undesired parts of the explanation. While much research focuses on improving the explanation of recommendations, less effort has gone into interactive recommendation that allows a user to critique explanations. Aside from traditional constraint- and utility-based critiquing systems, the only end-to-end deep learning based critiquing approach in the literature so far, CE-VNCF, suffers from unstable and inefficient training performance. In this paper, we propose a Variational Autoencoder (VAE) based critiquing system to mitigate these issues and improve overall performance. The proposed model generates keyphrase-based explanations of recommendations and allows users to critique the generated explanations to refine their personalized recommendations. Our experiments show promising results: (1) The proposed model is competitive in terms of general performance in comparison to state-of-the-art recommenders, despite having an augmented loss function to support explanation and critiquing. (2) The proposed model can generate high-quality explanations compared to user or item keyphrase popularity baselines. (3) The proposed model refines recommendations based on critiquing more effectively than CE-VNCF: the rank of critiquing-affected items drops while general recommendation performance remains stable. In summary, this paper presents a significantly improved VAE-based method for multi-step deep critiquing in recommender systems.
DOI: https://doi.org/10.1145/3397271.3401091 (published 2020-07-25)
Citations: 34
QuAChIE: Question Answering based Chinese Information Extraction System
Dongyu Ru, Zhenghui Wang, Lin Qiu, Hao Zhou, Lei Li, Weinan Zhang, Yong Yu
In this paper, we present the design of QuAChIE, a Question Answering based Chinese Information Extraction system. QuAChIE mainly depends on a well-trained question answering model to extract high-quality triples. A head entity and relation pair is treated as a question, with the input text as the context. For the training and evaluation of each model in the system, we build a large-scale information extraction dataset from Wikidata and Wikipedia pages by distant supervision. The advanced models implemented on top of the pre-trained language model and the enormous distant supervision data enable QuAChIE to extract relation triples from documents with cross-sentence correlations. The experimental results on the test set and the case study based on the interactive demonstration show satisfactory information extraction quality on Chinese document-level texts.
DOI: https://doi.org/10.1145/3397271.3401411 (published 2020-07-25)
Citations: 4
Active Learning Stopping Strategies for Technology-Assisted Sensitivity Review
G. Mcdonald, C. Macdonald, I. Ounis
Active learning strategies are often deployed in technology-assisted review tasks, such as e-discovery and sensitivity review, to learn a classifier that can assist the reviewers with their task. In particular, an active learning strategy selects the documents that are expected to be the most useful for learning an effective classifier, so that these documents can be reviewed before the less useful ones. However, when reviewing for sensitivity, the order in which the documents are reviewed can affect the reviewers' ability to perform the review. Therefore, when deploying active learning in technology-assisted sensitivity review, we want to know when a sufficiently effective classifier has been learned, such that the active learning can stop and the reviewing order of the documents can be selected by the reviewer instead of the classifier. In this work, we propose two active learning stopping strategies for technology-assisted sensitivity review. We evaluate the effectiveness of our proposed approaches in comparison with three state-of-the-art stopping strategies from the literature. We show that our best performing approach results in a significantly more effective sensitivity classifier (+6.6% F2) than the best performing stopping strategy from the literature (McNemar's test, p<0.05).
DOI: https://doi.org/10.1145/3397271.3401267 (published 2020-07-25)
Citations: 5
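The sensitivity review abstract above reports its gain in F2. For readers unfamiliar with the measure, the sketch below computes the general F-beta score from confusion counts; with beta = 2, recall is weighted more heavily than precision, which suits tasks where a missed sensitive document is costlier than a false alarm. The function name and the example counts are illustrative assumptions and are not taken from the paper.

```python
def f_beta(tp, fp, fn, beta=2.0):
    """F-beta score from true positives, false positives and false negatives.

    beta > 1 favours recall, beta < 1 favours precision, beta = 1 gives F1.
    Returns 0.0 when there are no true positives.
    """
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Two hypothetical classifiers with mirrored precision and recall.
high_precision = f_beta(tp=8, fp=2, fn=8)  # precision 0.8, recall 0.5
high_recall = f_beta(tp=8, fp=8, fn=2)     # precision 0.5, recall 0.8
```

Under F2 the high-recall classifier scores higher even though both would receive the same F1, which is the behaviour the measure is chosen for.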
WIRE: An Automated Report Generation System using Topical and Temporal Summarization
Yunseok Noh, Yongmin Shin, Junmo Park, A.-Yeong Kim, S. Choi, Hyun-Je Song, Seong-Bae Park, Seyoung Park
The demand for a tool that summarizes emerging topics is increasing in modern life, since such a tool can deliver well-organized information to its users. Even though there are already a number of successful search systems, systems that automatically summarize and organize the content of emerging topics are still in their infancy. To fulfill this demand, we introduce an automated report generation system that generates a well-summarized, human-readable report for emerging topics. In this report generation system, emerging topics are automatically discovered by a topic model and news articles are indexed by the discovered topics. Then, a topical summary and a timeline summary for each topic are generated by a topical multi-document summarizer and a timeline summarizer, respectively. To make the reports easier for users to comprehend, the proposed system provides two report modes. One is Today's Briefing, which summarizes five discovered topics each day, and the other is Full Report, which shows a long-term view of each topic with a detailed topical summary and an important event timeline.
DOI: https://doi.org/10.1145/3397271.3401409 (published 2020-07-25)
Citations: 7
Preference-based Evaluation Metrics for Web Image Search
Xiaohui Xie, Jiaxin Mao, Y. Liu, M. de Rijke, Haitian Chen, Min Zhang, Shaoping Ma
Following the success of Cranfield-like evaluation approaches in web search, web image search has also been evaluated with absolute judgments of (graded) relevance. However, recent research has found that collecting absolute relevance judgments may be difficult in image search scenarios due to the multi-dimensional nature of relevance for image results. Moreover, existing evaluation metrics based on absolute relevance judgments do not correlate well with search users' satisfaction perceptions in web image search. Unlike absolute relevance judgments, preference judgments do not require that relevance grades be pre-defined, i.e., how many levels to use and what those levels mean. Instead of considering each document in isolation, preference judgments consider a pair of documents and require judges to state their relative preference. Such preference judgments are usually more reliable than absolute judgments since the presence of (at least) two items establishes a certain context. While preference judgments have been studied extensively for general web search, there exists no thorough investigation of how preference judgments and preference-based evaluation metrics can be used to evaluate web image search systems. Compared to general web search, web image search may be an even better fit for preference-based evaluation because of its grid-based presentation style. The limited need for fresh results in web image search also makes preference judgments more reusable than for general web search. In this paper, we provide a thorough comparison of variants of preference judgments for web image search. We find that compared to strict preference judgments, weak preference judgments require less time and have better inter-assessor agreement. We also study how the absolute relevance levels of two given images affect preference judgments between them. Furthermore, we propose a preference-based evaluation metric named Preference-Winning-Penalty (PWP) to evaluate and compare two different image search systems. The proposed PWP metric outperforms existing evaluation metrics based on absolute relevance judgments in terms of agreement with the system-level preferences of actual users.
DOI: https://doi.org/10.1145/3397271.3401146 (published 2020-07-25)
Citations: 9
TADS: Learning Time-Aware Scheduling Policy with Dyna-Style Planning for Spaced Repetition
Zhengyu Yang, Jian Shen, Yunfei Liu, Yang Yang, Weinan Zhang, Yong Yu
The spaced repetition technique aims to improve long-term memory retention in human students through repeated, spaced reviews of learning content. Research on spaced repetition focuses on designing an optimal policy for scheduling that content. To the best of our knowledge, no existing method based on reinforcement learning takes into account the varying time intervals between two adjacent learning events of a student, even though these intervals are essential for determining a real-world schedule. In this paper, we aim to learn a scheduling policy that fully exploits the varying time-interval information with high sample efficiency. We propose the Time-Aware scheduler with Dyna-Style planning (TADS): a sample-efficient reinforcement learning framework for realistic spaced repetition. TADS learns a Time-LSTM policy that selects the optimal content according to the student's whole learning history and the time elapsed since the last learning event. In addition, Dyna-style planning is integrated into TADS to further improve sample efficiency. We evaluate our approach on three environments built from synthetic and real-world data, based on well-recognized cognitive models. Empirical results demonstrate that TADS outperforms state-of-the-art algorithms.
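To make the role of the time interval concrete, the following is a minimal sketch of a time-dependent memory model with a greedy baseline scheduler. It is not the TADS method: the abstract does not specify its cognitive models or update rules, so the exponential forgetting curve, the `Item` fields, the `gain` parameter, and all function names here are illustrative assumptions.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class Item:
    """Hypothetical memory state for one piece of learning content."""
    name: str
    strength: float     # assumed: larger means slower forgetting (in days)
    last_review: float  # time of the most recent review (in days)


def recall_probability(item: Item, now: float) -> float:
    """Assumed exponential forgetting curve: p = exp(-elapsed / strength)."""
    elapsed = max(0.0, now - item.last_review)
    return math.exp(-elapsed / item.strength)


def pick_next(items: List[Item], now: float) -> Item:
    """Greedy baseline: review the item most likely to have been forgotten."""
    return min(items, key=lambda it: recall_probability(it, now))


def review(item: Item, now: float, gain: float = 2.0) -> None:
    """Assumed update: a review multiplies strength by `gain` and resets the clock."""
    item.strength *= gain
    item.last_review = now
```

With two items reviewed at time 0, one with strength 1 and one with strength 4, `pick_next` at time 2 chooses the weaker item, because its recall probability has decayed further. A learned policy such as the one the abstract describes would replace this fixed heuristic with a choice conditioned on the whole learning history.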
DOI: https://doi.org/10.1145/3397271.3401316 (published 2020-07-25)
Citations: 4
Web of Scholars: A Scholar Knowledge Graph
Jiaying Liu, Jing Ren, Wenqing Zheng, Lianhua Chi, Ivan Lee, Feng Xia
In this work, we demonstrate a novel system, Web of Scholars, which integrates state-of-the-art mining techniques to search, mine, and visualize the complex networks behind scholars in the field of Computer Science. Relying on a knowledge graph, it provides fast, accurate, and intelligent semantic querying as well as powerful recommendations. In addition, to enable information sharing, it provides an open API that serves as the underlying architecture for advanced functions. Because Web of Scholars is built on a knowledge graph, it can access more knowledge as more searches are performed. It can serve as a useful and interoperable tool for scholars conducting in-depth analysis within the Science of Science.
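As a rough illustration of what querying a scholar knowledge graph can look like, here is a toy in-memory triple store with pattern matching. It does not reflect the actual Web of Scholars schema or API, which the abstract does not describe; the class name, the predicate names (`authored`, `works_in`), and the placeholder scholars are all assumptions.

```python
from typing import Iterable, List, Optional, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


class ToyScholarGraph:
    """Hypothetical triple store; not the Web of Scholars implementation."""

    def __init__(self, triples: Iterable[Triple]) -> None:
        self.triples: List[Triple] = list(triples)

    def match(self, s: Optional[str] = None, p: Optional[str] = None,
              o: Optional[str] = None) -> List[Triple]:
        """Return every triple matching the pattern; None acts as a wildcard."""
        return [t for t in self.triples
                if (s is None or t[0] == s)
                and (p is None or t[1] == p)
                and (o is None or t[2] == o)]

    def coauthors(self, scholar: str) -> List[str]:
        """Scholars who share at least one authored paper with `scholar`."""
        papers = {o for _, _, o in self.match(s=scholar, p="authored")}
        found = {s for s, _, o in self.match(p="authored")
                 if o in papers and s != scholar}
        return sorted(found)


# Placeholder data for illustration only.
graph = ToyScholarGraph([
    ("alice", "authored", "paper1"),
    ("bob", "authored", "paper1"),
    ("carol", "authored", "paper2"),
    ("alice", "works_in", "information_retrieval"),
])
```

Here `graph.coauthors("alice")` follows the `authored` edges through `paper1` and reaches `bob`. A production system would typically use a graph database and a dedicated query language rather than a linear scan over a list.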
DOI: https://doi.org/10.1145/3397271.3401405 (published 2020-07-25)
Citations: 33