StackVAE-G: An efficient and interpretable model for time series anomaly detection
Pub Date : 2021-05-18  DOI: 10.1016/j.aiopen.2022.07.001
Wenkai Li, Wenbo Hu, Ting Chen, Ning Chen, Cheng Feng
AI Open, Pages 101-110
Heterogeneous graph knowledge enhanced stock market prediction
Pub Date : 2021-01-01  DOI: 10.1016/j.aiopen.2021.09.001
Kai Xiong, Xiao Ding, Li Du, Ting Liu, Bing Qin
We focus on the task of stock market prediction based on financial text, which contains information that can influence the movement of the stock market. Previous works mainly utilize a single semantic unit of financial text, such as words, events, or sentences, to predict the market's tendency. However, the interaction of different-grained information within financial text can supplement contextual knowledge, help select predictive information, and thereby improve the performance of stock market prediction. To this end, we propose constructing a heterogeneous graph whose nodes carry different-grained information from financial text, and we present a novel heterogeneous neural network to aggregate this multi-grained information. Experimental results demonstrate that our proposed approach outperforms the baselines.
{"title":"Heterogeneous graph knowledge enhanced stock market prediction","authors":"Kai Xiong, Xiao Ding, Li Du, Ting Liu, Bing Qin","doi":"10.1016/j.aiopen.2021.09.001","DOIUrl":"10.1016/j.aiopen.2021.09.001","url":null,"abstract":"<div><p>We focus on the task of stock market prediction based on financial text which contains information that could influence the movement of stock market. Previous works mainly utilize a single semantic unit of financial text, such as words, events, sentences, to predict the tendency of stock market. However, the interaction of different-grained information within financial text can be useful for context knowledge supplement and predictive information selection, and then improve the performance of stock market prediction. To facilitate this, we propose constructing a heterogeneous graph with different-grained information nodes from financial text for the task. A novel heterogeneous neural network is presented to aggregate multi-grained information. Experimental results demonstrate that our proposed approach reaches higher performance than baselines.</p></div>","PeriodicalId":100068,"journal":{"name":"AI Open","volume":"2 ","pages":"Pages 168-174"},"PeriodicalIF":0.0,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2666651021000243/pdfft?md5=618178faed3a536b57646ee675c7b211&pid=1-s2.0-S2666651021000243-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73852604","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
CPM: A large-scale generative Chinese Pre-trained language model
Pub Date : 2021-01-01  DOI: 10.1016/j.aiopen.2021.07.001
Zhengyan Zhang, Xu Han, Hao Zhou, Pei Ke, Yuxian Gu, Deming Ye, Yujia Qin, Yusheng Su, Haozhe Ji, Jian Guan, Fanchao Qi, Xiaozhi Wang, Yanan Zheng, Guoyang Zeng, Huanqi Cao, Shengqi Chen, Daixuan Li, Zhenbo Sun, Zhiyuan Liu, Minlie Huang, Maosong Sun
Pre-trained Language Models (PLMs) have proven to be beneficial for various downstream NLP tasks. Recently, GPT-3, with 175 billion parameters and 570 GB of training data, drew a lot of attention due to its capacity for few-shot (even zero-shot) learning. However, applying GPT-3 to Chinese NLP tasks remains challenging, as its training corpus is primarily English and its parameters are not publicly available. In this technical report, we release the Chinese Pre-trained Language Model (CPM), built with generative pre-training on large-scale Chinese training data. To the best of our knowledge, CPM, with 2.6 billion parameters and 100 GB of Chinese training data, is the largest Chinese pre-trained language model, and it can facilitate several downstream Chinese NLP tasks, such as conversation, essay generation, cloze test, and language understanding. Extensive experiments demonstrate that CPM achieves strong performance on many NLP tasks in few-shot (even zero-shot) settings. The code and parameters are available at https://github.com/TsinghuaAI/CPM.
AI Open, Volume 2, Pages 93-99
Discrete and continuous representations and processing in deep learning: Looking forward
Pub Date : 2021-01-01  DOI: 10.1016/j.aiopen.2021.07.002
Ruben Cartuyvels, Graham Spinks, Marie-Francine Moens
Discrete and continuous representations of content (e.g., of language or images) have interesting properties that can be exploited when machines seek to understand or reason about this content. This position paper puts forward our opinion on the role of discrete and continuous representations and their processing in the deep learning field. Current neural network models operate on continuous-valued data, compressing information into dense, distributed embeddings. In stark contrast, humans use discrete symbols in their communication with language. Such symbols represent a compressed version of the world that derives its meaning from shared contextual information. Additionally, human reasoning involves symbol manipulation at a cognitive level, which facilitates abstract reasoning, the composition of knowledge and understanding, generalization, and efficient learning. Motivated by these insights, we argue that combining discrete and continuous representations and their processing will be essential to build systems that exhibit a general form of intelligence. We suggest and discuss several avenues that could improve current neural networks with the inclusion of discrete elements, so as to combine the advantages of both types of representations.
{"title":"Discrete and continuous representations and processing in deep learning: Looking forward","authors":"Ruben Cartuyvels, Graham Spinks , Marie-Francine Moens","doi":"10.1016/j.aiopen.2021.07.002","DOIUrl":"10.1016/j.aiopen.2021.07.002","url":null,"abstract":"<div><p>Discrete and continuous representations of content (<em>e.g.</em>, of language or images) have interesting properties to be explored for the understanding of or reasoning with this content by machines. This position paper puts forward our opinion on the role of discrete and continuous representations and their processing in the deep learning field. Current neural network models compute continuous-valued data. Information is compressed into dense, distributed embeddings. By stark contrast, humans use discrete symbols in their communication with language. Such symbols represent a compressed version of the world that derives its meaning from shared contextual information. Additionally, human reasoning involves symbol manipulation at a cognitive level, which facilitates abstract reasoning, the composition of knowledge and understanding, generalization and efficient learning. Motivated by these insights, in this paper we argue that combining discrete and continuous representations and their processing will be essential to build systems that exhibit a general form of intelligence. We suggest and discuss several avenues that could improve current neural networks with the inclusion of discrete elements to combine the advantages of both types of representations.</p></div>","PeriodicalId":100068,"journal":{"name":"AI Open","volume":"2 ","pages":"Pages 143-159"},"PeriodicalIF":0.0,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2666651021000206/pdfft?md5=2930ea7a8804d90c964ce7206c845bec&pid=1-s2.0-S2666651021000206-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87346774","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A comprehensive survey of entity alignment for knowledge graphs
Pub Date : 2021-01-01  DOI: 10.1016/j.aiopen.2021.02.002
Kaisheng Zeng, Chengjiang Li, Lei Hou, Juanzi Li, Ling Feng
Knowledge Graphs (KGs) store structured human knowledge in a way that is easy for machines to store, recognize, and understand, and they provide a rich knowledge base for many artificial intelligence applications. However, current multi-source KGs are both heterogeneous and complementary, so it is necessary to fuse heterogeneous knowledge from different data sources or languages into a unified and consistent KG. Entity alignment, the most fundamental and essential technology in knowledge fusion, aims to find pairs of entities in different knowledge graphs that represent the same real-world object. This paper surveys almost all of the latest knowledge graph representation learning and entity alignment methods and summarizes their core technologies and features from different aspects. Our investigation also gives a comprehensive outlook on several promising research directions for future work. In addition, we provide an efficient and effective entity alignment toolkit to help researchers quickly start building their own entity alignment models.
{"title":"A comprehensive survey of entity alignment for knowledge graphs","authors":"Kaisheng Zeng , Chengjiang Li , Lei Hou , Juanzi Li , Ling Feng","doi":"10.1016/j.aiopen.2021.02.002","DOIUrl":"10.1016/j.aiopen.2021.02.002","url":null,"abstract":"<div><p>Knowledge Graphs (KGs), as a structured human knowledge, manage data in an ease-of-store, recognizable, and understandable way for machines and provide a rich knowledge base for different artificial intelligence applications. However, current multi-source KGs have heterogeneity and complementarity, and it is necessary to fuse heterogeneous knowledge from different data sources or different languages into a unified and consistent KG. Entity alignment aims to find equivalence relations between entities in different knowledge graphs but semantically represent the same real-world object, which is the most fundamental and essential technology in knowledge fusion. This paper investigated almost all the latest knowledge graph representations learning and entity alignment methods and summarized their core technologies and features from different aspects. Our full investigation gives a comprehensive outlook on several promising research directions for future work. We also provide an efficient and efficiency entity alignment toolkit to help researchers quickly start their own entity alignment models.</p></div>","PeriodicalId":100068,"journal":{"name":"AI Open","volume":"2 ","pages":"Pages 1-13"},"PeriodicalIF":0.0,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1016/j.aiopen.2021.02.002","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75984252","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A review of deep learning in question answering over knowledge bases
Pub Date : 2021-01-01  DOI: 10.1016/j.aiopen.2021.12.001
Chen Zhang, Yuxuan Lai, Yansong Feng, Dongyan Zhao
Question answering over knowledge bases (KBQA) is a challenging task in natural language processing. It requires machines to answer natural language questions based on large-scale knowledge bases. Recent years have witnessed remarkable success of neural network models on many natural language processing tasks, including KBQA. In this paper, we first review the recent advances of deep learning methods for solving simple questions in two streams: the information extraction style and the semantic parsing style. We then introduce how to extend these neural architectures to answer more complex questions with iteration and decomposition techniques, and we summarize current research challenges.
AI Open, Volume 2, Pages 205-215
Towards a universal continuous knowledge base
Pub Date : 2021-01-01  DOI: 10.1016/j.aiopen.2021.11.001
Gang Chen, Maosong Sun, Yang Liu
In artificial intelligence (AI), knowledge is the information required by an intelligent system to accomplish tasks. While traditional knowledge bases use discrete, symbolic representations, detecting knowledge encoded in the continuous representations learned from data has received increasing attention recently. In this work, we propose a method for building a continuous knowledge base (CKB) that can store knowledge imported from multiple, diverse neural networks. The key idea of our approach is to define an interface for each neural network and cast knowledge transfer as a function simulation problem. Experiments on text classification show promising results: the CKB imports knowledge from a single model and then exports the knowledge to a new model, achieving performance comparable with the original model. More interestingly, we import the knowledge from multiple models into the knowledge base, from which the fused knowledge is exported back to a single model, achieving a higher accuracy than the original model. With the CKB, it is also easy to achieve knowledge distillation and transfer learning. Our work opens the door to building a universal continuous knowledge base to collect, store, and organize all continuous knowledge encoded in various neural networks trained for different AI tasks.
{"title":"Towards a universal continuous knowledge base","authors":"Gang Chen , Maosong Sun , Yang Liu","doi":"10.1016/j.aiopen.2021.11.001","DOIUrl":"10.1016/j.aiopen.2021.11.001","url":null,"abstract":"<div><p>In artificial intelligence (AI), knowledge is the information required by an intelligent system to accomplish tasks. While traditional knowledge bases use discrete, symbolic representations, detecting knowledge encoded in the continuous representations learned from data has received increasing attention recently. In this work, we propose a method for building a continuous knowledge base (CKB) that can store knowledge imported from multiple, diverse neural networks. The key idea of our approach is to define an interface for each neural network and cast knowledge transferring as a function simulation problem. Experiments on text classification show promising results: the CKB imports knowledge from a single model and then exports the knowledge to a new model, achieving comparable performance with the original model. More interesting, we import the knowledge from multiple models to the knowledge base, from which the fused knowledge is exported back to a single model, achieving a higher accuracy than the original model. With the CKB, it is also easy to achieve knowledge distillation and transfer learning. Our work opens the door to building a universal continuous knowledge base to collect, store, and organize all continuous knowledge encoded in various neural networks trained for different AI tasks.</p></div>","PeriodicalId":100068,"journal":{"name":"AI Open","volume":"2 ","pages":"Pages 197-204"},"PeriodicalIF":0.0,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2666651021000280/pdfft?md5=6baa28b4172e47cb5e69435795e785e6&pid=1-s2.0-S2666651021000280-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85616245","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
CokeBERT: Contextual knowledge selection and embedding towards enhanced pre-trained language models
Pub Date : 2021-01-01  DOI: 10.1016/j.aiopen.2021.06.004
Yusheng Su, Xu Han, Zhengyan Zhang, Yankai Lin, Peng Li, Zhiyuan Liu, Jie Zhou, Maosong Sun
Several recent efforts have been devoted to enhancing pre-trained language models (PLMs) with extra heterogeneous knowledge from knowledge graphs (KGs), achieving consistent improvements on various knowledge-driven NLP tasks. However, most of these knowledge-enhanced PLMs embed static sub-graphs of KGs ("knowledge context"), ignoring that the knowledge required by PLMs may change dynamically with the specific text ("textual context"). In this paper, we propose a novel framework named Coke that dynamically selects contextual knowledge and embeds the knowledge context according to the textual context for PLMs, thereby avoiding the effect of redundant and ambiguous knowledge in KGs that does not match the input text. Our experimental results show that Coke outperforms various baselines on typical knowledge-driven NLP tasks, indicating the effectiveness of utilizing dynamic knowledge context for language understanding. Beyond the performance improvements, the dynamically selected knowledge in Coke can describe the semantics of text-related knowledge in a more interpretable form than conventional PLMs. Our implementation and datasets are publicly available.
{"title":"CokeBERT: Contextual knowledge selection and embedding towards enhanced pre-trained language models","authors":"Yusheng Su , Xu Han , Zhengyan Zhang , Yankai Lin , Peng Li , Zhiyuan Liu , Jie Zhou , Maosong Sun","doi":"10.1016/j.aiopen.2021.06.004","DOIUrl":"10.1016/j.aiopen.2021.06.004","url":null,"abstract":"<div><p>Several recent efforts have been devoted to enhancing pre-trained language models (PLMs) by utilizing extra heterogeneous knowledge in knowledge graphs (KGs), and achieved consistent improvements on various knowledge-driven NLP tasks. However, most of these knowledge-enhanced PLMs embed static sub-graphs of KGs (“knowledge context”), regardless of that the knowledge required by PLMs may change dynamically according to specific text (“textual context”). In this paper, we propose a novel framework named Coke to dynamically select contextual knowledge and embed knowledge context according to textual context for PLMs, which can avoid the effect of redundant and ambiguous knowledge in KGs that cannot match the input text. Our experimental results show that Coke outperforms various baselines on typical knowledge-driven NLP tasks, indicating the effectiveness of utilizing dynamic knowledge context for language understanding. Besides the performance improvements, the dynamically selected knowledge in Coke can describe the semantics of text-related knowledge in a more interpretable form than the conventional PLMs. Our implementation and datasets are publicly available.</p></div>","PeriodicalId":100068,"journal":{"name":"AI Open","volume":"2 ","pages":"Pages 127-134"},"PeriodicalIF":0.0,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1016/j.aiopen.2021.06.004","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78659489","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Lawformer: A pre-trained language model for Chinese legal long documents
Pub Date : 2021-01-01  DOI: 10.1016/j.aiopen.2021.06.003
Chaojun Xiao, Xueyu Hu, Zhiyuan Liu, Cunchao Tu, Maosong Sun
Legal artificial intelligence (LegalAI) aims to benefit legal systems with artificial intelligence technology, especially natural language processing (NLP). Recently, inspired by the success of pre-trained language models (PLMs) in the generic domain, many LegalAI researchers have devoted their efforts to applying PLMs to legal tasks. However, utilizing PLMs for legal tasks is still challenging, as legal documents usually contain thousands of tokens, far more than mainstream PLMs can process. In this paper, we release Lawformer, a Longformer-based pre-trained language model for understanding long Chinese legal documents. We evaluate Lawformer on a variety of LegalAI tasks, including judgment prediction, similar case retrieval, legal reading comprehension, and legal question answering. The experimental results demonstrate that our model achieves promising improvements on tasks with long documents as inputs. The code and parameters are available at https://github.com/thunlp/LegalPLMs.
{"title":"Lawformer: A pre-trained language model for Chinese legal long documents","authors":"Chaojun Xiao , Xueyu Hu , Zhiyuan Liu , Cunchao Tu , Maosong Sun","doi":"10.1016/j.aiopen.2021.06.003","DOIUrl":"10.1016/j.aiopen.2021.06.003","url":null,"abstract":"<div><p>Legal artificial intelligence (LegalAI) aims to benefit legal systems with the technology of artificial intelligence, especially natural language processing (NLP). Recently, inspired by the success of pre-trained language models (PLMs) in the generic domain, many LegalAI researchers devote their effort to applying PLMs to legal tasks. However, utilizing PLMs to address legal tasks is still challenging, as the legal documents usually consist of thousands of tokens, which is far longer than the length that mainstream PLMs can process. In this paper, we release the Longformer-based pre-trained language model, named as Lawformer, for Chinese legal long documents understanding. We evaluate Lawformer on a variety of LegalAI tasks, including judgment prediction, similar case retrieval, legal reading comprehension, and legal question answering. The experimental results demonstrate that our model can achieve promising improvement on tasks with long documents as inputs. The code and parameters are available at <span>https://github.com/thunlp/LegalPLMs</span><svg><path></path></svg>.</p></div>","PeriodicalId":100068,"journal":{"name":"AI Open","volume":"2 ","pages":"Pages 79-84"},"PeriodicalIF":0.0,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1016/j.aiopen.2021.06.003","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73107746","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}