
Proceedings of the 30th ACM International Conference on Information & Knowledge Management: Latest Papers

GAKG: A Multimodal Geoscience Academic Knowledge Graph
Cheng Deng, Yuting Jia, Hui Xu, Chong Zhang, Jingyao Tang, Luoyi Fu, Weinan Zhang, Haisong Zhang, Xinbing Wang, Cheng Zhou
Geoscience research plays an important role in helping people better understand the Earth. Given the enormous number of geoscience research papers, knowledge graphs (KGs) can be a powerful means to manage the relationships among data and integrate the knowledge extracted from them. However, existing geoscience KGs mainly focus on the external connections between concepts, whereas the abundant information contained in the papers' internal multimodal data is largely overlooked for more fine-grained knowledge mining. To this end, we propose GAKG, a large-scale multimodal academic KG based on 1.12 million papers published in various geoscience-related journals. In addition to the bibliometric elements, we also extract the internal illustrations, tables, and text of the articles, and mine the knowledge entities of the papers and the era and spatial attributes of the articles, coupling multimodal academic data and features. Specifically, GAKG realizes knowledge entity extraction under our proposed Human-In-the-Loop framework, whose novelty is to combine machine reading and information retrieval techniques with manual annotation by geoscientists in the loop. Because geoscience literature often contains more illustrations and time-scale information than that of other disciplines, we extract all the geographical information and eras from the papers' text and illustrations, mapping papers to the atlas and chronology. Based on GAKG, we build several knowledge discovery benchmarks for finding geoscience communities and predicting potential links. GAKG and its services have been made publicly available and user-friendly.
DOI: https://doi.org/10.1145/3459637.3482003. Published 2021-10-26.
Citations: 16
Efficient Multi-Scale Feature Generation Adaptive Network
Gwanghan Lee, Minhan Kim, Simon S. Woo
Recently, early exit networks, which dynamically adjust model complexity at inference time, have achieved remarkable performance and efficiency across various applications. So far, many researchers have focused on reducing the redundancy of input samples or model architectures. However, they have not resolved the performance drop of early classifiers, which make predictions with insufficient high-level feature information. Consequently, the degraded early classifiers have a devastating effect on the performance of the entire network that shares the backbone. In this paper, we propose an Efficient Multi-Scale Feature Generation Adaptive Network (EMGNet), which not only reduces the redundancy of the architecture but also generates multi-scale features to improve the performance of the early exit network. Our approach makes multi-scale feature generation highly efficient by sharing weights in the center of the convolution kernel. In addition, our gating network learns to automatically determine the proper multi-scale feature ratio required for each convolution layer at different locations in the network. We demonstrate that our proposed model outperforms state-of-the-art adaptive networks on the CIFAR10, CIFAR100, and ImageNet datasets. The implementation code is available at https://github.com/lee-gwang/EMGNet
DOI: https://doi.org/10.1145/3459637.3482337. Published 2021-10-26.
Citations: 0
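The EMGNet abstract above builds on early exit networks, which stop inference at a shallow classifier when possible. The following is a minimal sketch of that general idea only, assuming a confidence-threshold exit rule; the abstract does not state EMGNet's exit criterion, and the function name `early_exit_predict` and its parameters are hypothetical.

```python
def early_exit_predict(exit_outputs, threshold):
    """Pick the first exit whose top class probability reaches `threshold`.

    `exit_outputs` is any iterable of per-exit probability lists, ordered
    from the shallowest classifier to the deepest; passing a generator keeps
    deeper exits from being evaluated once an earlier one is confident.
    Returns (exit_index, predicted_class).
    """
    index, probs = -1, None
    for index, probs in enumerate(exit_outputs):
        if max(probs) >= threshold:
            break  # confident enough: skip the remaining exits
    if probs is None:
        raise ValueError("at least one classifier output is required")
    # If no exit was confident, the loop ends on the deepest exit.
    return index, probs.index(max(probs))


outputs = [[0.4, 0.6], [0.1, 0.9], [0.5, 0.5]]
print(early_exit_predict(outputs, threshold=0.8))  # (1, 1)
```

A lower threshold exits earlier and saves computation, which is why the accuracy of the early classifiers matters so much for the whole network.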
AdaptiveGCN
Dongyue Li, Tao Yang, Lun Du, Zhezhi He, Li Jiang
Graph Convolutional Networks (GCNs) have become the prevailing approach to efficiently learning representations from graph-structured data. Current GCN models adopt a neighborhood aggregation mechanism based on two primary operations, aggregation and combination. The workload of these two operations is determined by the input graph structure, making the graph input the bottleneck of GCN processing. Meanwhile, a large amount of task-irrelevant information in the graph can hurt the model's generalization performance. This creates an opportunity to study how to remove redundancy from graphs. In this paper, we aim to accelerate GCN models by removing the task-irrelevant edges in the graph. We present AdaptiveGCN, an efficient and supervised graph sparsification framework. AdaptiveGCN adopts an edge predictor module that obtains edge selection strategies by learning the downstream task feedback signals for each GCN layer separately and adaptively in the training stage; in the test stage, it performs inference with only the selected edges to speed up the GCN computation. The experimental results indicate that AdaptiveGCN yields an average GCN speed-up of 43% (on CPU) and 39% (on GPU) with comparable model performance on public graph learning benchmarks.
DOI: https://doi.org/10.1145/3459637.3482049. Published 2021-10-26.
Citations: 7
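The AdaptiveGCN abstract above describes dropping task-irrelevant edges so that aggregation touches fewer neighbors. The following is a minimal sketch of that idea, not the paper's method: the edge scores are taken as given (in the paper they come from a learned edge predictor), top-k selection by a `keep_ratio` is an assumption, and both function names are hypothetical.

```python
def sparsify_edges(edges, scores, keep_ratio):
    """Keep the highest-scoring fraction of edges, preserving original order."""
    if not 0.0 < keep_ratio <= 1.0:
        raise ValueError("keep_ratio must be in (0, 1]")
    keep = max(1, round(len(edges) * keep_ratio))
    ranked = sorted(range(len(edges)), key=lambda i: scores[i], reverse=True)
    return [edges[i] for i in sorted(ranked[:keep])]


def mean_aggregate(features, edges):
    """One mean-aggregation step over incoming edges, with a self-loop.

    `features` holds one scalar per node to keep the sketch short; the work
    done here grows with the number of edges, which is what sparsification cuts.
    """
    sums = list(features)          # self-loop contribution
    counts = [1] * len(features)
    for src, dst in edges:
        sums[dst] += features[src]
        counts[dst] += 1
    return [s / c for s, c in zip(sums, counts)]


edges = [(0, 1), (1, 2), (2, 0), (0, 2)]
scores = [0.9, 0.1, 0.8, 0.2]      # assumed output of an edge predictor
kept = sparsify_edges(edges, scores, keep_ratio=0.5)
print(kept)                         # [(0, 1), (2, 0)]
print(mean_aggregate([1.0, 2.0, 3.0], kept))
```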
Multi-task Learning for Bias-Free Joint CTR Prediction and Market Price Modeling in Online Advertising
Haizhi Yang, Tengyun Wang, Xiaoli Tang, Qianyu Li, Yueyue Shi, Siyu Jiang, Han Yu, Hengjie Song
The rapid rise of real-time bidding-based online advertising has brought significant economic benefits and attracted extensive research attention. From the perspective of an advertiser, it is crucial to perform accurate utility estimation and cost estimation for each individual auction in order to achieve cost-effective advertising. These problems are known as the click through rate (CTR) prediction task and the market price modeling task, respectively. However, existing approaches treat CTR prediction and market price modeling as two independent tasks to be optimized without regard to each other, thus resulting in suboptimal performance. Moreover, they do not make full use of unlabeled data from the losing bids during estimations, which makes them suffer from the sample selection bias issue. To address these limitations, we propose Multi-task Advertising Estimator (MTAE), an end-to-end joint optimization framework which performs both CTR prediction and market price modeling simultaneously. Through multi-task learning, both estimation tasks can take advantage of knowledge transfer to achieve improved feature representation and generalization abilities. In addition, we leverage the abundant bid price signals in the full-volume bid request data and introduce an auxiliary task of predicting the winning probability into the framework for unbiased learning. Through extensive experiments on two large-scale real-world public datasets, we demonstrate that our proposed approach has achieved significant improvements over the state-of-the-art models under various performance metrics.
DOI: https://doi.org/10.1145/3459637.3482373. Published 2021-10-26.
Citations: 7
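The MTAE abstract above combines CTR prediction with an auxiliary winning-probability task so that losing bids, which carry no click label, still contribute to training. The following is a rough sketch of how such a joint objective could be assembled, assuming binary cross-entropy for both tasks and fixed task weights; the abstract does not give MTAE's actual loss, the market price term is omitted, and every name here is hypothetical.

```python
import math


def bce(prob, label, eps=1e-12):
    """Binary cross-entropy for one prediction, clipped for numerical safety."""
    prob = min(max(prob, eps), 1.0 - eps)
    return -math.log(prob) if label else -math.log(1.0 - prob)


def joint_loss(samples, w_ctr=1.0, w_win=1.0):
    """Average multi-task loss over all bid requests.

    Every sample contributes a win-probability term; only won auctions
    contribute a CTR term, because clicks are observed only after a win.
    """
    total = 0.0
    for s in samples:
        total += w_win * bce(s["win_pred"], s["won"])
        if s["won"]:
            total += w_ctr * bce(s["ctr_pred"], s["click"])
    return total / len(samples)


batch = [
    {"win_pred": 0.5, "won": True, "ctr_pred": 0.5, "click": True},
    {"win_pred": 0.5, "won": False},  # losing bid: no click label available
]
print(joint_loss(batch))
```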
Structure Aware Experience Replay for Incremental Learning in Graph-based Recommender Systems
Kian Ahrabian, Yishi Xu, Yingxue Zhang, Jiapeng Wu, Yuening Wang, M. Coates
Large-scale recommender systems are integral parts of many services. With the recent rapid growth of accessible data, the need for efficient training methods has arisen. Given the high computational cost of training state-of-the-art graph neural network (GNN) based models, it is infeasible to train them from scratch with every new set of interactions. In this work, we present a novel framework for incrementally training GNN-based models. Our framework takes advantage of an experience replay technique built on top of a structurally aware reservoir sampling method tailored for this setting. This framework addresses catastrophic forgetting, allowing the model to preserve its understanding of users' long-term behavioral patterns while adapting to new trends. Our experiments demonstrate the superior performance of our framework on numerous datasets when combined with state-of-the-art GNN-based models.
DOI: https://doi.org/10.1145/3459637.3482193. Published 2021-10-26.
Citations: 11
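The experience replay entry above relies on reservoir sampling to keep a bounded memory of past interactions. The following sketch shows only the plain uniform version (the classic Algorithm R); the structure-aware weighting described in the abstract is not reproduced here, and the function name and parameters are illustrative.

```python
import random


def reservoir_sample(stream, capacity, rng=None):
    """Uniform reservoir sampling (Algorithm R) over a stream of interactions.

    After processing n items, each one is in the reservoir with probability
    capacity / n, without ever storing more than `capacity` items.
    """
    rng = rng or random.Random()
    reservoir = []
    for seen, item in enumerate(stream):
        if len(reservoir) < capacity:
            reservoir.append(item)
        else:
            # randint is inclusive, so slot is uniform over seen + 1 values.
            slot = rng.randint(0, seen)
            if slot < capacity:
                reservoir[slot] = item
    return reservoir


replay_buffer = reservoir_sample(range(1000), capacity=10, rng=random.Random(0))
print(len(replay_buffer))  # 10
```

During incremental training, items drawn from such a buffer would be mixed with the new interactions so that older behavior patterns keep influencing the model.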
Beast
A. Eldawy, Vagelis Hristidis, Saheli Ghosh, Majid Saeedan, Akil Sevim, A.B. Siddique, Samriddhi Singla, Ganeshram Sivaram, Tin Vu, Yaming Zhang
DOI: https://doi.org/10.1145/3459637.3481897. Published 2021-10-26.
Citations: 14
Predicting Instance Type Assertions in Knowledge Graphs Using Stochastic Neural Networks
T. Weller, Maribel Acosta
Instance type information is particularly relevant to perform reasoning and obtain further information about entities in knowledge graphs (KGs). However, during automated or pay-as-you-go KG construction processes, instance types might be incomplete or missing in some entities. Previous work focused mostly on representing entities and relations as embeddings based on the statements in the KG. While the computed embeddings encode semantic descriptions and preserve the relationship between the entities, the focus of these methods is often not on predicting schema knowledge, but on predicting missing statements between instances for completing the KG. To fill this gap, we propose an approach that first learns a KG representation suitable for predicting instance type assertions. Then, our solution implements a neural network architecture to predict instance types based on the learned representation. Results show that our representations of entities are much more separable with respect to their associations with classes in the KG, compared to existing methods. For this reason, the performance of predicting instance types on a large number of KGs, in particular on cross-domain KGs with a high variety of classes, is significantly better in terms of F1-score than previous work.
DOI: https://doi.org/10.1145/3459637.3482377. Published 2021-10-26.
Citations: 5
Popularity-Enhanced News Recommendation with Multi-View Interest Representation
Jingkun Wang, Yipu Chen, Zichun Wang, Wen Zhao
News recommendation is of vital importance to alleviating information overload. Recent research shows that precise modeling of news content and user interests has become critical for news recommendation. Existing methods usually utilize information such as the news title, abstract, and entities to predict the click-through rate (CTR), or add auxiliary tasks to a multi-task learning framework. However, none of them directly incorporates predicted news popularity and the degree of users' attention to popular news into the CTR prediction results. Meanwhile, multiple interests may arise throughout a user's browsing history, so it is hard to represent user interests with a single user vector. In this paper, we propose PENR, a Popularity-Enhanced News Recommendation method, which integrates a popularity prediction task to improve the performance of the news encoder. The news popularity score is predicted and added to the final CTR, while news popularity is also used to model the degree of users' tendency to follow hot news. Moreover, user interests are modeled from different perspectives via a subspace projection method that assembles the browsing history into multiple subspaces. In this way, we capture users' multi-view interest representations. Experiments on a real-world dataset validate the effectiveness of our PENR approach.
DOI: https://doi.org/10.1145/3459637.3482462. Published 2021-10-26.
Citations: 8
Modeling Inter-Claim Interactions for Verifying Multiple Claims
Shuai Wang, W. Mao
To inhibit the spread of rumors, fact checking aims at retrieving evidence to verify the truthfulness of a given statement. Fact checking methods typically use knowledge graphs (KGs) as external repositories and develop reasoning methods to retrieve evidence from KGs. As real-world statements are often complex and contain multiple claims, multi-claim fact verification is not only necessary but also more important for practical applications. However, existing methods focus only on verifying a single claim (i.e., a single-claim statement). Multiple claims imply rich context information, and modeling the interrelations between claims can facilitate better verification of a multi-claim statement as a whole. In this paper, we propose a computational method to model inter-claim interactions for multi-claim fact checking. To focus on relevant claims within a statement, our method first extracts topics from the statement and connects the triple claims in the statement to form a claim graph. It then learns a policy-based agent to sequentially select topic-related triples from the claim graph. To fully exploit information from the statement, our method further employs multiple agents and develops a hierarchical attention mechanism to verify multiple claims as a whole. Experimental results on two real-world datasets show the effectiveness of our method for multi-claim fact verification.
DOI: https://doi.org/10.1145/3459637.3482144 (published 2021-10-26)
Citations: 2
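The abstract above says that the triple claims in a statement are connected to form a claim graph, but it does not say how the connections are made. The following is a minimal sketch under one assumed rule: two triples are linked when they share an entity. The function name `build_claim_graph`, the `Triple` alias, and the sample triples are hypothetical illustrations, not the authors' implementation, and the policy-based agent and attention mechanism are not modeled here.

```python
from collections import defaultdict
from itertools import combinations
from typing import Dict, List, Set, Tuple

# A claim is represented as a (head entity, relation, tail entity) triple.
Triple = Tuple[str, str, str]


def build_claim_graph(triples: List[Triple]) -> Dict[int, Set[int]]:
    """Return an adjacency map over triple indices.

    Assumption (not from the paper): two triples are neighbours
    when they mention at least one common entity.
    """
    # Index every triple by the entities it mentions.
    by_entity: Dict[str, Set[int]] = defaultdict(set)
    for idx, (head, _relation, tail) in enumerate(triples):
        by_entity[head].add(idx)
        by_entity[tail].add(idx)

    # Start with an isolated node per triple, then link co-mentions.
    graph: Dict[int, Set[int]] = {idx: set() for idx in range(len(triples))}
    for members in by_entity.values():
        for a, b in combinations(sorted(members), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


# Hypothetical statement decomposed into four triple claims.
claims: List[Triple] = [
    ("Marie Curie", "born_in", "Warsaw"),
    ("Marie Curie", "won", "Nobel Prize in Physics"),
    ("Warsaw", "capital_of", "Poland"),
    ("Albert Einstein", "born_in", "Ulm"),
]
claim_graph = build_claim_graph(claims)
```

In this sample, the first claim is linked to the second (shared entity "Marie Curie") and to the third (shared entity "Warsaw"), while the fourth stays isolated. A sequential selector such as the agent described in the abstract would then walk this adjacency structure.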
Evaluating the Prediction Bias Induced by Label Imbalance in Multi-label Classification
Luca Piras, Ludovico Boratto, Guilherme Ramos
Prediction bias is a well-known problem in classification algorithms, which tend to be skewed toward better-represented classes. This phenomenon is even more pronounced in multi-label scenarios, where the number of underrepresented classes is usually larger. In light of this, we present the Prediction Bias Coefficient (PBC), a novel measure that aims to assess the bias induced by label imbalance in multi-label classification. The approach leverages Spearman's rank correlation coefficient between the label frequencies and the F-scores obtained for each label individually. After describing the theoretical properties of the proposed indicator, we illustrate its behaviour on a classification task performed with state-of-the-art methods on two real-world datasets, and we compare it experimentally with other metrics described in the literature.
DOI: https://doi.org/10.1145/3459637.3482100 (published 2021-10-26)
Citations: 1
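The abstract above describes the core ingredient of the PBC as Spearman's rank correlation between label frequencies and per-label F-scores. The sketch below computes only that correlation in plain Python, assuming binary indicator matrices and the F1 variant of the F-score. Any normalization or further steps in the published PBC definition are not given in the abstract and are not reproduced here; all function names and the sample data are hypothetical.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rho, computed as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (vx * vy) ** 0.5


def label_frequencies(y_true: List[List[int]]) -> List[int]:
    """Number of positive examples per label (column sums)."""
    return [sum(row[j] for row in y_true) for j in range(len(y_true[0]))]


def per_label_f1(y_true: List[List[int]], y_pred: List[List[int]]) -> List[float]:
    """F1 score of each label, treated as its own binary problem."""
    scores = []
    for j in range(len(y_true[0])):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 1 and p[j] == 1)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 0 and p[j] == 1)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 1 and p[j] == 0)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return scores


def frequency_f1_correlation(y_true: List[List[int]], y_pred: List[List[int]]) -> float:
    """Correlation between how common a label is and how well it is predicted."""
    return spearman(label_frequencies(y_true), per_label_f1(y_true, y_pred))


# Hypothetical data: 6 examples, 3 labels with frequencies 6, 3 and 1.
y_true = [[1, 1, 0], [1, 1, 0], [1, 0, 1], [1, 1, 0], [1, 0, 0], [1, 0, 0]]
y_pred = [[1, 1, 0], [1, 1, 0], [1, 0, 0], [1, 0, 0], [1, 0, 0], [1, 0, 0]]
bias_correlation = frequency_f1_correlation(y_true, y_pred)
```

In this sample the per-label F1 scores are 1.0, 0.8 and 0.0, which follow the same order as the label frequencies, so the correlation is 1.0. A value near zero would indicate that prediction quality does not track label frequency.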