2021 IEEE International Conference on Big Knowledge (ICBK): Latest Papers

Temporal Analysis of Knowledge Networks
Pub Date : 2021-12-01 DOI: 10.1109/ICKG52313.2021.00034
Xikun Huang, Chuanqing Wang, Qilin Sun, Yangyang Li, Weizhuo Li
Knowledge networks play an important role in revealing knowledge correlations, exploring innovation trends, and implementing knowledge-guided machine learning. Previous work has studied knowledge networks as static networks, and far less attention has been paid to how they evolve. In this paper, we investigate the evolution of knowledge networks from a temporal network perspective. We extract knowledge networks of different topics from Wikipedia and examine how local and global properties of these networks evolve over time. We find that many properties, such as the power-law exponent of the in(out)-degree distribution, density, clustering coefficient, effective diameter, and reciprocity, either stay stable or vary little over time after a certain stage. The macro topology of each network is shaped more like a coffee pot than a bow-tie. In addition, preferential attachment phenomena are found in the evolution of these knowledge networks. All the code and data are publicly available at https://github.com/XikunHuang/TAKN.
Citations: 0
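Illustrative sketch (not taken from the paper or its repository): two of the global properties named in the abstract above, density and reciprocity, can be tracked over cumulative snapshots of a directed network. The function names, the edge format `(source, target, timestamp)`, and the choice of cumulative snapshots are assumptions made for this example only.

```python
def density(nodes, edges):
    """Directed density: |E| / (|V| * (|V| - 1))."""
    n = len(nodes)
    return len(edges) / (n * (n - 1)) if n > 1 else 0.0


def reciprocity(edges):
    """Fraction of directed edges whose reverse edge is also present."""
    if not edges:
        return 0.0
    return sum((v, u) in edges for (u, v) in edges) / len(edges)


def snapshot_metrics(timed_edges, times):
    """Yield (t, density, reciprocity) for the network built from all
    edges with timestamp <= t (self-loops are ignored)."""
    for t in times:
        edges = {(u, v) for (u, v, ts) in timed_edges if ts <= t and u != v}
        nodes = {x for edge in edges for x in edge}
        yield t, density(nodes, edges), reciprocity(edges)


# Hypothetical toy data: hyperlinks between four pages, with creation times.
toy_edges = [("A", "B", 1), ("B", "A", 2), ("B", "C", 2), ("C", "D", 3)]
for row in snapshot_metrics(toy_edges, [1, 2, 3]):
    print(row)
```

Plotting such values against time is one simple way to see whether a property stabilises after a certain stage, which is the kind of behaviour the abstract reports.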
MTSC-GE: A Novel Graph based Method for Multivariate Time Series Clustering
Pub Date : 2021-12-01 DOI: 10.1109/ICKG52313.2021.00027
Ze Yang, Changyang Tai, Gongqing Wu, Zan Zhang, Xianyu Bao
Few clustering methods perform well on multivariate time series (MTS) data. Traditional methods rely too heavily on similarity measures and perform poorly on MTS data with complex structures. This paper proposes an MTS clustering algorithm based on graph embedding, called MTSC-GE, to improve the performance of MTS clustering. MTSC-GE maps MTS samples to feature representations in a low-dimensional space and then clusters them. While mining the information of the samples themselves, MTSC-GE builds the whole time series dataset into a graph, attending to the connections between samples from an overall perspective and discovering the local structural features of MTS data. The proposed MTSC-GE consists of three stages. The first stage builds a graph from the original dataset, where each MTS sample is regarded as a node in the graph. The second stage uses a graph embedding technique to obtain a new representation of each node. Finally, MTSC-GE uses the K-Means algorithm to cluster the samples based on the newly obtained representations. We compare MTSC-GE with six state-of-the-art benchmark methods on five public datasets; the experimental results show that MTSC-GE achieves good performance.
Citations: 2
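Illustrative sketch (not the authors' implementation): the three stages listed in the MTSC-GE abstract above can be mimicked with very simple stand-ins. The abstract does not say how the graph is built or which embedding technique is used, so everything here is an assumption: a k-nearest-neighbour graph on flattened samples for stage one, a single round of neighbourhood averaging as a crude substitute for a real graph embedding in stage two, and a small K-Means with farthest-point initialisation for stage three. All function names are hypothetical.

```python
import math


def flatten(sample):
    """Turn one MTS sample (list of time steps, each a list of variables) into a vector."""
    return [x for step in sample for x in step]


def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_knn_graph(vectors, k):
    """Stage 1 (assumed): every sample is a node linked to its k nearest samples."""
    n = len(vectors)
    neighbours = [set() for _ in range(n)]
    for i in range(n):
        ranked = sorted((j for j in range(n) if j != i),
                        key=lambda j: dist(vectors[i], vectors[j]))
        for j in ranked[:k]:
            neighbours[i].add(j)
            neighbours[j].add(i)  # keep the graph undirected
    return neighbours


def embed(vectors, neighbours):
    """Stage 2 (stand-in): average each node's vector with its neighbours' vectors."""
    embedded = []
    for i, nbrs in enumerate(neighbours):
        group = [vectors[i]] + [vectors[j] for j in sorted(nbrs)]
        embedded.append([sum(col) / len(group) for col in zip(*group)])
    return embedded


def kmeans(points, k, iters=20):
    """Stage 3: plain K-Means with deterministic farthest-point initialisation."""
    centres = [points[0]]
    while len(centres) < k:
        centres.append(max(points, key=lambda p: min(dist(p, c) for c in centres)))
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: dist(p, centres[c])) for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centres[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def mts_graph_clustering_sketch(samples, n_neighbours, n_clusters):
    vectors = [flatten(s) for s in samples]
    graph = build_knn_graph(vectors, n_neighbours)
    return kmeans(embed(vectors, graph), n_clusters)
```

In practice the second stage would be replaced by a proper graph embedding method; the averaging step is only there to keep the example self-contained.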
A Scheme for Kinship Reasoning based on Ontology
Pub Date : 2021-12-01 DOI: 10.1109/ICKG52313.2021.00038
Ru Chen, Guliu Liu, Yi Zhu, Xindong Wu
With the rapid development of artificial intelligence and semantic networks, knowledge graphs have received extensive attention in various application domains. As a domain knowledge graph, a genealogical knowledge graph has significant research value for the study of family lineage, family culture, and medical genetic analysis. However, because the same relationship often has different names and because complex relationships such as divorce, remarriage, and polygamy occur, reasoning over a genealogical knowledge graph is a challenging task. In response to this problem, we propose a scheme for kinship reasoning in the genealogical knowledge graph. First, based on real genealogical revision experience, a character ontology framework for the genealogical knowledge graph is defined, and basic kinship reasoning rules are designed. Then, given that different surnames define kinship differently, support for custom reasoning rules is integrated into the reasoning framework. In addition, a series of inference optimization methods are proposed for complex relationships in family trees, such as multiple generations of ancestors and multiple wives. Finally, we implement this scheme in the Huapu system, and experiments conducted on a real genealogical dataset demonstrate the effectiveness and practicality of the proposed scheme.
Citations: 0
Constructing COVID-19 Knowledge Graph from A Large Corpus of Scientific Articles
Pub Date : 2021-12-01 DOI: 10.1109/ICKG52313.2021.00040
Wei Emma Zhang, Queen Nguyen
Creating domain-specific glossaries is time-consuming and requires domain expertise. An effective and efficient automatic process will facilitate glossary generation and its downstream applications for better decision making. In this project, we aim to build a domain-specific glossary from a large text corpus. We frame the task as a knowledge graph construction problem with minimum supervision. We adapt both supervised pre-trained models and unsupervised methods to extract relations for terms that appear in a large corpus of scientific articles. We then utilize an off-the-shelf graph database to construct and store the knowledge graph. Furthermore, we develop an interactive Web-based tool for visualizing, exploring and querying the constructed knowledge graph. The project is sourced and funded by the AI4DM initiative from the Office of National Intelligence (ONI) and the Defence Science and Technology (DST) Group, Australia. Although the funding requires the use of a dataset of COVID-19 related literature, the solution presented in this paper is generic and could easily be applied to any domain.
Citations: 2
ICBK 2021 Track Chairs
Pub Date : 2021-12-01 DOI: 10.1109/ickg52313.2021.00008
Citations: 0
A survey on Optimisation-based Semi-supervised Clustering Methods
Pub Date : 2021-12-01 DOI: 10.1109/ICKG52313.2021.00070
Zahra Ghasemi, H. A. Khorshidi, U. Aickelin
Clustering methods categorize data points into groups so that the data points within each group are highly similar. Classic clustering algorithms are unsupervised, meaning that no complementary information is available to improve the clustering results. However, in some clustering problems, supplementary information is available that can be used to guide the clustering process. In the presence of such information, the problem is one of semi-supervised clustering. In some articles, semi-supervised clustering is modeled as an optimization problem. In this research, optimization-based semi-supervised clustering papers from 2013 to 2020 are reviewed. The review follows a four-step procedure. It explores the objective functions and optimization algorithms used in these articles, as well as their application domains and types of supervised information.
Citations: 2
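Illustrative sketch (not drawn from the survey): one common way to pose semi-supervised clustering as an optimization problem is to add a penalty for every violated pairwise constraint to the usual K-Means distortion. The function name, the uniform `penalty` weight, and the use of must-link and cannot-link pairs as the supervised information are assumptions chosen for this example; the survey above covers a range of objective functions and supervision types.

```python
def constrained_objective(points, labels, centres, must_link, cannot_link, penalty=1.0):
    """K-Means distortion plus a fixed penalty per violated pairwise constraint.

    must_link / cannot_link are lists of index pairs (i, j) into `points`.
    """
    distortion = sum(
        sum((x - c) ** 2 for x, c in zip(point, centres[label]))
        for point, label in zip(points, labels)
    )
    broken_must = sum(1 for i, j in must_link if labels[i] != labels[j])
    broken_cannot = sum(1 for i, j in cannot_link if labels[i] == labels[j])
    return distortion + penalty * (broken_must + broken_cannot)


# Hypothetical toy problem: points 0 and 1 should share a cluster, 0 and 2 should not.
points = [[0, 0], [0, 2], [10, 0]]
centres = [[0, 1], [10, 0]]
print(constrained_objective(points, [0, 0, 1], centres, [(0, 1)], [(0, 2)]))
print(constrained_objective(points, [0, 1, 1], centres, [(0, 1)], [(0, 2)]))
```

An optimization algorithm would then search over `labels` and `centres` to minimise this value; the first assignment above satisfies both constraints and scores lower than the second.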
Learning Dynamic Preference Structure Embedding From Temporal Networks
Pub Date : 2021-11-23 DOI: 10.1109/ICKG52313.2021.00059
Tongya Zheng, Zunlei Feng, Yu Wang, Chengchao Shen, Mingli Song, Xingen Wang, Xinyu Wang, Chun Chen, Hao Xu
The dynamics of temporal networks lie in the continuous interactions between nodes, which exhibit dynamic node preferences as time elapses. The challenges of mining temporal networks are thus two-fold: the dynamic structure of networks and the dynamic node preferences. In this paper, we investigate the dynamic graph sampling problem, aiming to capture the preference structure of nodes dynamically in cooperation with GNNs. Our proposed Dynamic Preference Structure (DPS) framework consists of two stages: structure sampling and graph fusion. In the first stage, two parameterized samplers are designed to learn the preference structure adaptively with network reconstruction tasks. In the second stage, an additional attention layer is designed to fuse two sampled temporal subgraphs of a node, generating temporal node embeddings for downstream tasks. Experimental results on many real-life temporal networks show that our DPS substantially outperforms several state-of-the-art methods owing to learning an adaptive preference structure. The code will be released soon at https://github.com/doujiang-zheng/Dynamic-Preference-Structure.
Citations: 1
Recurrent Neural Networks for Learning Long-term Temporal Dependencies with Reanalysis of Time Scale Representation
Pub Date : 2021-11-05 DOI: 10.1109/ICKG52313.2021.00033
Kentaro Ohno, Atsutoshi Kumagai
Recurrent neural networks with a gating mechanism such as an LSTM or GRU are powerful tools to model sequential data. In the mechanism, a forget gate, which was introduced to control information flow in a hidden state in the RNN, has recently been re-interpreted as a representative of the time scale of the state, i.e., a measure of how long the RNN retains information on inputs. On the basis of this interpretation, several parameter initialization methods that exploit prior knowledge on temporal dependencies in data have been proposed to improve learnability. However, the interpretation relies on various unrealistic assumptions, such as that there are no inputs after a certain time point. In this work, we reconsider this interpretation of the forget gate in a more realistic setting. We first generalize the existing theory on gated RNNs so that we can consider the case where inputs are successively given. We then argue that the interpretation of a forget gate as a temporal representation is valid when the gradient of the loss with respect to the state decreases exponentially as time goes back. We empirically demonstrate that existing RNNs satisfy this gradient condition at the initial training phase on several tasks, which is in good agreement with previous initialization methods. On the basis of this finding, we propose an approach to construct new RNNs that can represent a longer time scale than conventional models, which will improve the learnability for long-term sequential data. We verify the effectiveness of our method by experiments with real-world datasets.
Citations: 4
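Illustrative sketch (not the paper's analysis): the time-scale reading of the forget gate mentioned in the abstract above is easiest to see in the simplified setting the abstract calls unrealistic, where the gate value is constant and no further inputs arrive. A state updated as c_t = f * c_(t-1) then decays like exp(-t / T) with T = -1 / ln(f). The function names and the constant-gate, no-input setting are assumptions for this example.

```python
import math


def time_scale(forget_gate):
    """Characteristic time T such that forget_gate ** t == exp(-t / T), for 0 < forget_gate < 1."""
    return -1.0 / math.log(forget_gate)


def decayed_state(initial_state, forget_gate, steps):
    """Apply c_t = forget_gate * c_(t-1) for `steps` steps with no new input."""
    state = initial_state
    for _ in range(steps):
        state = forget_gate * state
    return state


# A gate value chosen so that the time scale is 10 steps.
gate = math.exp(-1.0 / 10.0)
print(time_scale(gate))             # close to 10
print(decayed_state(1.0, gate, 10)) # close to exp(-1)
```

A gate value nearer to 1 gives a longer time scale, which is why initialisation schemes that set the forget gate bias according to expected dependency lengths fit this interpretation.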
Transductive Data Augmentation with Relational Path Rule Mining for Knowledge Graph Embedding
Pub Date : 2021-11-01 DOI: 10.1109/ICKG52313.2021.00057
Yushi Hirose, M. Shimbo, Taro Watanabe
For knowledge graph completion, two major types of prediction models exist: one based on graph embeddings, and the other based on relation path rule induction. They have different advantages and disadvantages. To take advantage of both types, hybrid models have been proposed recently. One of the hybrid models, UniKER, alternately augments training data by relation path rules and trains an embedding model. Despite its high prediction accuracy, it does not take full advantage of relation path rules, as it disregards low-confidence rules in order to maintain the quality of augmented data. To mitigate this limitation, we propose transductive data augmentation by relation path rules and confidence-based weighting of augmented data. The results and analysis show that our proposed method effectively improves the performance of the embedding model by augmenting data that include true answers or entities similar to them.
Citations: 1
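Illustrative sketch (not the authors' method or UniKER's code): the general idea of augmenting training triples with relation path rules and weighting the additions by rule confidence can be shown with length-two path rules of the form r1(x, y) and r2(y, z) imply r3(x, z). The rule format, the function name, and the choice to weight each inferred triple by the highest confidence among the rules that derive it are assumptions for this example; the abstract above does not give those details.

```python
def apply_path_rules(triples, rules):
    """Infer new (head, relation, tail) triples from length-2 path rules.

    rules: list of ((r1, r2), head_relation, confidence).
    Returns a dict mapping each inferred triple that is not already known
    to the highest confidence of any rule that produced it.
    """
    known = set(triples)
    tails = {}
    for h, r, t in triples:
        tails.setdefault((h, r), set()).add(t)

    weighted = {}
    for (r1, r2), head_relation, confidence in rules:
        for (h, r), mids in tails.items():
            if r != r1:
                continue
            for mid in mids:
                for t in tails.get((mid, r2), ()):
                    inferred = (h, head_relation, t)
                    if inferred not in known:
                        weighted[inferred] = max(confidence, weighted.get(inferred, 0.0))
    return weighted


# Hypothetical toy knowledge graph and a single rule with confidence 0.8.
toy_triples = [
    ("alice", "born_in", "paris"), ("paris", "city_of", "france"),
    ("bob", "born_in", "lyon"), ("lyon", "city_of", "france"),
    ("bob", "nationality", "france"),
]
toy_rules = [(("born_in", "city_of"), "nationality", 0.8)]
print(apply_path_rules(toy_triples, toy_rules))
```

The returned weights could then scale each augmented triple's contribution to the embedding model's training loss, so that low-confidence rules are used rather than discarded.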