AI Open最新文献

英文中文

Data Augmentation Approaches in Natural Language Processing: A Survey 自然语言处理中的数据增强方法综述

AI Open

Pub Date : 2021-10-05 DOI: 10.1016/j.aiopen.2022.03.001

Bohan Li, Yutai Hou, Wanxiang Che

引用次数: 110

StackVAE-G: An efficient and interpretable model for time series anomaly detection StackVAE-G:一种高效可解释的时间序列异常检测模型

AI Open

Pub Date : 2021-05-18 DOI: 10.1016/j.aiopen.2022.07.001

Wenkai Li, Wenbo Hu, Ting Chen, Ning Chen, Cheng Feng

引用次数: 3

Heterogeneous graph knowledge enhanced stock market prediction 异构图知识增强股票市场预测

AI Open

Pub Date : 2021-01-01 DOI: 10.1016/j.aiopen.2021.09.001

Kai Xiong, Xiao Ding, Li Du, Ting Liu, Bing Qin

We focus on the task of stock market prediction based on financial text which contains information that could influence the movement of stock market. Previous works mainly utilize a single semantic unit of financial text, such as words, events, sentences, to predict the tendency of stock market. However, the interaction of different-grained information within financial text can be useful for context knowledge supplement and predictive information selection, and then improve the performance of stock market prediction. To facilitate this, we propose constructing a heterogeneous graph with different-grained information nodes from financial text for the task. A novel heterogeneous neural network is presented to aggregate multi-grained information. Experimental results demonstrate that our proposed approach reaches higher performance than baselines.

本文重点研究了基于金融文本的股票市场预测任务，金融文本包含了影响股票市场走势的信息。以往的工作主要是利用金融文本的单个语义单位，如单词、事件、句子来预测股市走势。然而，金融文本中不同粒度信息之间的相互作用可以用于背景知识的补充和预测信息的选择，从而提高股票市场预测的性能。为了实现这一点，我们建议构建一个异构图，其中包含来自金融文本的不同粒度的信息节点。提出了一种基于异构神经网络的多粒度信息聚合方法。实验结果表明，我们提出的方法达到了比基线更高的性能。

引用次数: 6

CPM: A large-scale generative Chinese Pre-trained language model CPM:大规模生成中文预训练语言模型

AI Open

Pub Date : 2021-01-01 DOI: 10.1016/j.aiopen.2021.07.001

Zhengyan Zhang, Xu Han, Hao Zhou, Pei Ke, Yuxian Gu, Deming Ye, Yujia Qin, Yusheng Su, Haozhe Ji, Jian Guan, Fanchao Qi, Xiaozhi Wang, Yanan Zheng, Guoyang Zeng, Huanqi Cao, Shengqi Chen, Daixuan Li, Zhenbo Sun, Zhiyuan Liu, Minlie Huang, Maosong Sun

Pre-trained Language Models (PLMs) have proven to be beneficial for various downstream NLP tasks. Recently, GPT-3, with 175 billion parameters and 570 GB training data, drew a lot of attention due to the capacity of few-shot (even zero-shot) learning. However, applying GPT-3 to address Chinese NLP tasks is still challenging, as the training corpus of GPT-3 is primarily English, and the parameters are not publicly available. In this technical report, we release the Chinese Pre-trained Language Model (CPM) with generative pre-training on large-scale Chinese training data. To the best of our knowledge, CPM, with 2.6 billion parameters and 100 GB Chinese training data, is the largest Chinese pre-trained language model, which could facilitate several downstream Chinese NLP tasks, such as conversation, essay generation, cloze test, and language understanding. Extensive experiments demonstrate that CPM achieves strong performance on many NLP tasks in the settings of few-shot (even zero-shot) learning. The code and parameters are available at https://github.com/TsinghuaAI/CPM.

预训练语言模型(PLMs)已被证明对各种下游NLP任务是有益的。最近，拥有1750亿个参数和570gb训练数据的GPT-3因其少射(甚至零射)学习的能力而备受关注。然而，由于GPT-3的训练语料库主要是英语，且参数不公开，因此将GPT-3应用于汉语NLP任务仍然具有挑战性。在本技术报告中，我们发布了基于大规模中文训练数据生成式预训练的中文预训练语言模型(CPM)。据我们所知，CPM是最大的中文预训练语言模型，拥有26亿个参数和100 GB的中文训练数据，可以促进几个下游的中文NLP任务，如会话、文章生成、完形填空测试和语言理解。大量的实验表明，CPM在少量(甚至零次)学习的情况下，在许多NLP任务上取得了较好的表现。代码和参数可在https://github.com/TsinghuaAI/CPM上获得。

{"title":"CPM: A large-scale generative Chinese Pre-trained language model","authors":"Zhengyan Zhang, Xu Han, Hao Zhou, Pei Ke, Yuxian Gu, Deming Ye, Yujia Qin, Yusheng Su, Haozhe Ji, Jian Guan, Fanchao Qi, Xiaozhi Wang, Yanan Zheng, Guoyang Zeng, Huanqi Cao, Shengqi Chen, Daixuan Li, Zhenbo Sun, Zhiyuan Liu, Minlie Huang, Maosong Sun","doi":"10.1016/j.aiopen.2021.07.001","DOIUrl":"10.1016/j.aiopen.2021.07.001","url":null,"abstract":"<div><p>Pre-trained Language Models (PLMs) have proven to be beneficial for various downstream NLP tasks. Recently, GPT-3, with 175 billion parameters and 570 GB training data, drew a lot of attention due to the capacity of few-shot (even zero-shot) learning. However, applying GPT-3 to address Chinese NLP tasks is still challenging, as the training corpus of GPT-3 is primarily English, and the parameters are not publicly available. In this technical report, we release the Chinese Pre-trained Language Model (CPM) with generative pre-training on large-scale Chinese training data. To the best of our knowledge, CPM, with 2.6 billion parameters and 100 GB Chinese training data, is the largest Chinese pre-trained language model, which could facilitate several downstream Chinese NLP tasks, such as conversation, essay generation, cloze test, and language understanding. Extensive experiments demonstrate that CPM achieves strong performance on many NLP tasks in the settings of few-shot (even zero-shot) learning. The code and parameters are available at <span>https://github.com/TsinghuaAI/CPM</span><svg><path></path></svg>.</p></div>","PeriodicalId":100068,"journal":{"name":"AI Open","volume":"2 ","pages":"Pages 93-99"},"PeriodicalIF":0.0,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1016/j.aiopen.2021.07.001","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90523293","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 86

Discrete and continuous representations and processing in deep learning: Looking forward 深度学习中的离散和连续表示与处理:展望

AI Open

Pub Date : 2021-01-01 DOI: 10.1016/j.aiopen.2021.07.002

Ruben Cartuyvels, Graham Spinks , Marie-Francine Moens

Discrete and continuous representations of content (e.g., of language or images) have interesting properties to be explored for the understanding of or reasoning with this content by machines. This position paper puts forward our opinion on the role of discrete and continuous representations and their processing in the deep learning field. Current neural network models compute continuous-valued data. Information is compressed into dense, distributed embeddings. By stark contrast, humans use discrete symbols in their communication with language. Such symbols represent a compressed version of the world that derives its meaning from shared contextual information. Additionally, human reasoning involves symbol manipulation at a cognitive level, which facilitates abstract reasoning, the composition of knowledge and understanding, generalization and efficient learning. Motivated by these insights, in this paper we argue that combining discrete and continuous representations and their processing will be essential to build systems that exhibit a general form of intelligence. We suggest and discuss several avenues that could improve current neural networks with the inclusion of discrete elements to combine the advantages of both types of representations.

内容(如语言或图像)的离散和连续表示具有有趣的特性，值得机器对这些内容的理解或推理进行探索。本文提出了我们对离散和连续表示及其处理在深度学习领域中的作用的看法。目前的神经网络模型计算连续值数据。信息被压缩成密集的分布式嵌入。与之形成鲜明对比的是，人类在语言交流中使用离散的符号。这些符号代表了一个压缩版的世界，它的意义来源于共享的语境信息。此外，人类推理还涉及认知层面的符号操作，这有助于抽象推理、知识和理解的组成、泛化和高效学习。在这些见解的推动下，在本文中，我们认为将离散和连续表示及其处理相结合对于构建展示一般智能形式的系统至关重要。我们提出并讨论了几种方法，可以通过包含离散元素来改进当前的神经网络，以结合两种表示类型的优势。

{"title":"Discrete and continuous representations and processing in deep learning: Looking forward","authors":"Ruben Cartuyvels, Graham Spinks , Marie-Francine Moens","doi":"10.1016/j.aiopen.2021.07.002","DOIUrl":"10.1016/j.aiopen.2021.07.002","url":null,"abstract":"<div><p>Discrete and continuous representations of content (<em>e.g.</em>, of language or images) have interesting properties to be explored for the understanding of or reasoning with this content by machines. This position paper puts forward our opinion on the role of discrete and continuous representations and their processing in the deep learning field. Current neural network models compute continuous-valued data. Information is compressed into dense, distributed embeddings. By stark contrast, humans use discrete symbols in their communication with language. Such symbols represent a compressed version of the world that derives its meaning from shared contextual information. Additionally, human reasoning involves symbol manipulation at a cognitive level, which facilitates abstract reasoning, the composition of knowledge and understanding, generalization and efficient learning. Motivated by these insights, in this paper we argue that combining discrete and continuous representations and their processing will be essential to build systems that exhibit a general form of intelligence. We suggest and discuss several avenues that could improve current neural networks with the inclusion of discrete elements to combine the advantages of both types of representations.</p></div>","PeriodicalId":100068,"journal":{"name":"AI Open","volume":"2 ","pages":"Pages 143-159"},"PeriodicalIF":0.0,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2666651021000206/pdfft?md5=2930ea7a8804d90c964ce7206c845bec&pid=1-s2.0-S2666651021000206-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87346774","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 11

A comprehensive survey of entity alignment for knowledge graphs 知识图谱实体对齐的综合调查

AI Open

Pub Date : 2021-01-01 DOI: 10.1016/j.aiopen.2021.02.002

Kaisheng Zeng , Chengjiang Li , Lei Hou , Juanzi Li , Ling Feng

Knowledge Graphs (KGs), as a structured human knowledge, manage data in an ease-of-store, recognizable, and understandable way for machines and provide a rich knowledge base for different artificial intelligence applications. However, current multi-source KGs have heterogeneity and complementarity, and it is necessary to fuse heterogeneous knowledge from different data sources or different languages into a unified and consistent KG. Entity alignment aims to find equivalence relations between entities in different knowledge graphs but semantically represent the same real-world object, which is the most fundamental and essential technology in knowledge fusion. This paper investigated almost all the latest knowledge graph representations learning and entity alignment methods and summarized their core technologies and features from different aspects. Our full investigation gives a comprehensive outlook on several promising research directions for future work. We also provide an efficient and efficiency entity alignment toolkit to help researchers quickly start their own entity alignment models.

知识图(Knowledge Graphs, KGs)作为一种结构化的人类知识，以易于存储、可识别和可理解的方式对数据进行管理，为不同的人工智能应用提供了丰富的知识库。然而，当前的多源知识库存在异质性和互补性，需要将来自不同数据源或不同语言的异构知识融合为统一一致的知识库。实体对齐旨在寻找不同知识图中实体之间的等价关系，但在语义上表示相同的现实世界对象，是知识融合中最基本、最关键的技术。本文研究了几乎所有最新的知识图表示学习和实体对齐方法，并从不同方面总结了它们的核心技术和特点。我们的全面调查对未来工作的几个有前途的研究方向进行了全面的展望。我们还提供了一个高效的实体对齐工具包，以帮助研究人员快速启动他们自己的实体对齐模型。

{"title":"A comprehensive survey of entity alignment for knowledge graphs","authors":"Kaisheng Zeng , Chengjiang Li , Lei Hou , Juanzi Li , Ling Feng","doi":"10.1016/j.aiopen.2021.02.002","DOIUrl":"10.1016/j.aiopen.2021.02.002","url":null,"abstract":"<div><p>Knowledge Graphs (KGs), as a structured human knowledge, manage data in an ease-of-store, recognizable, and understandable way for machines and provide a rich knowledge base for different artificial intelligence applications. However, current multi-source KGs have heterogeneity and complementarity, and it is necessary to fuse heterogeneous knowledge from different data sources or different languages into a unified and consistent KG. Entity alignment aims to find equivalence relations between entities in different knowledge graphs but semantically represent the same real-world object, which is the most fundamental and essential technology in knowledge fusion. This paper investigated almost all the latest knowledge graph representations learning and entity alignment methods and summarized their core technologies and features from different aspects. Our full investigation gives a comprehensive outlook on several promising research directions for future work. We also provide an efficient and efficiency entity alignment toolkit to help researchers quickly start their own entity alignment models.</p></div>","PeriodicalId":100068,"journal":{"name":"AI Open","volume":"2 ","pages":"Pages 1-13"},"PeriodicalIF":0.0,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1016/j.aiopen.2021.02.002","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75984252","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 57

A review of deep learning in question answering over knowledge bases 基于知识库的深度学习问题回答研究综述

AI Open

Pub Date : 2021-01-01 DOI: 10.1016/j.aiopen.2021.12.001

Chen Zhang , Yuxuan Lai , Yansong Feng , Dongyan Zhao

Question answering over knowledge bases (KBQA) is a challenging task in natural language processing. It requires machines to answer natural language questions based on large-scale knowledge bases. Recent years have witnessed remarkable success of neural network models on many natural language processing tasks, including KBQA. In this paper, we first review the recent advances of deep learning methods on solving simple questions in two streams, the information extraction style and semantic parsing style. We then introduce how to extend the neural architectures to answer more complex questions with iteration and decomposition techniques, and summarize current research challenges.

知识库问答(KBQA)是自然语言处理中的一项具有挑战性的任务。它需要机器回答基于大规模知识库的自然语言问题。近年来，神经网络模型在许多自然语言处理任务上取得了显著的成功，其中包括KBQA。在本文中，我们首先回顾了深度学习方法在解决简单问题的两个方面的最新进展，即信息提取风格和语义解析风格。然后，我们介绍了如何扩展神经体系结构，以使用迭代和分解技术来回答更复杂的问题，并总结了当前的研究挑战。

引用次数: 5

Towards a universal continuous knowledge base 朝向一个通用的连续知识库

AI Open

Pub Date : 2021-01-01 DOI: 10.1016/j.aiopen.2021.11.001

Gang Chen , Maosong Sun , Yang Liu

In artificial intelligence (AI), knowledge is the information required by an intelligent system to accomplish tasks. While traditional knowledge bases use discrete, symbolic representations, detecting knowledge encoded in the continuous representations learned from data has received increasing attention recently. In this work, we propose a method for building a continuous knowledge base (CKB) that can store knowledge imported from multiple, diverse neural networks. The key idea of our approach is to define an interface for each neural network and cast knowledge transferring as a function simulation problem. Experiments on text classification show promising results: the CKB imports knowledge from a single model and then exports the knowledge to a new model, achieving comparable performance with the original model. More interesting, we import the knowledge from multiple models to the knowledge base, from which the fused knowledge is exported back to a single model, achieving a higher accuracy than the original model. With the CKB, it is also easy to achieve knowledge distillation and transfer learning. Our work opens the door to building a universal continuous knowledge base to collect, store, and organize all continuous knowledge encoded in various neural networks trained for different AI tasks.

在人工智能(AI)中，知识是智能系统完成任务所需的信息。虽然传统知识库使用离散的符号表示，但从数据中学习的连续表示中编码的知识检测近年来受到越来越多的关注。在这项工作中，我们提出了一种构建连续知识库(CKB)的方法，该方法可以存储从多个不同的神经网络导入的知识。我们的方法的关键思想是为每个神经网络定义一个接口，并将知识传递作为一个函数仿真问题。在文本分类方面的实验显示了令人满意的结果:CKB从单个模型中导入知识，然后将知识导出到新模型中，达到了与原始模型相当的性能。更有趣的是，我们将多个模型中的知识导入到知识库中，从知识库中融合的知识导出到单个模型中，达到了比原始模型更高的精度。使用CKB，也很容易实现知识的升华和迁移学习。我们的工作为建立一个通用的连续知识库打开了大门，以收集、存储和组织所有编码在各种神经网络中的连续知识，这些神经网络是为不同的人工智能任务训练的。

{"title":"Towards a universal continuous knowledge base","authors":"Gang Chen , Maosong Sun , Yang Liu","doi":"10.1016/j.aiopen.2021.11.001","DOIUrl":"10.1016/j.aiopen.2021.11.001","url":null,"abstract":"<div><p>In artificial intelligence (AI), knowledge is the information required by an intelligent system to accomplish tasks. While traditional knowledge bases use discrete, symbolic representations, detecting knowledge encoded in the continuous representations learned from data has received increasing attention recently. In this work, we propose a method for building a continuous knowledge base (CKB) that can store knowledge imported from multiple, diverse neural networks. The key idea of our approach is to define an interface for each neural network and cast knowledge transferring as a function simulation problem. Experiments on text classification show promising results: the CKB imports knowledge from a single model and then exports the knowledge to a new model, achieving comparable performance with the original model. More interesting, we import the knowledge from multiple models to the knowledge base, from which the fused knowledge is exported back to a single model, achieving a higher accuracy than the original model. With the CKB, it is also easy to achieve knowledge distillation and transfer learning. Our work opens the door to building a universal continuous knowledge base to collect, store, and organize all continuous knowledge encoded in various neural networks trained for different AI tasks.</p></div>","PeriodicalId":100068,"journal":{"name":"AI Open","volume":"2 ","pages":"Pages 197-204"},"PeriodicalIF":0.0,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2666651021000280/pdfft?md5=6baa28b4172e47cb5e69435795e785e6&pid=1-s2.0-S2666651021000280-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85616245","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 3

CokeBERT: Contextual knowledge selection and embedding towards enhanced pre-trained language models CokeBERT:面向增强预训练语言模型的上下文知识选择和嵌入

AI Open

Pub Date : 2021-01-01 DOI: 10.1016/j.aiopen.2021.06.004

Yusheng Su , Xu Han , Zhengyan Zhang , Yankai Lin , Peng Li , Zhiyuan Liu , Jie Zhou , Maosong Sun

Several recent efforts have been devoted to enhancing pre-trained language models (PLMs) by utilizing extra heterogeneous knowledge in knowledge graphs (KGs), and achieved consistent improvements on various knowledge-driven NLP tasks. However, most of these knowledge-enhanced PLMs embed static sub-graphs of KGs (“knowledge context”), regardless of that the knowledge required by PLMs may change dynamically according to specific text (“textual context”). In this paper, we propose a novel framework named Coke to dynamically select contextual knowledge and embed knowledge context according to textual context for PLMs, which can avoid the effect of redundant and ambiguous knowledge in KGs that cannot match the input text. Our experimental results show that Coke outperforms various baselines on typical knowledge-driven NLP tasks, indicating the effectiveness of utilizing dynamic knowledge context for language understanding. Besides the performance improvements, the dynamically selected knowledge in Coke can describe the semantics of text-related knowledge in a more interpretable form than the conventional PLMs. Our implementation and datasets are publicly available.

近年来，通过在知识图(KGs)中使用额外的异构知识来增强预训练语言模型(PLMs)，并在各种知识驱动的NLP任务上取得了一致的改进。然而，这些知识增强的plm大多嵌入了KGs的静态子图(“知识上下文”)，而不考虑plm所需的知识可能会根据特定的文本(“文本上下文”)动态变化。在本文中，我们提出了一种新的框架Coke，用于plm根据文本上下文动态选择上下文知识并嵌入知识上下文，从而避免了KGs中冗余和模糊的知识与输入文本不匹配的影响。我们的实验结果表明，Coke在典型的知识驱动型NLP任务上的表现优于各种基线，表明利用动态知识上下文进行语言理解的有效性。除了性能改进之外，Coke中动态选择的知识可以以比传统plm更可解释的形式描述文本相关知识的语义。我们的实现和数据集是公开的。

{"title":"CokeBERT: Contextual knowledge selection and embedding towards enhanced pre-trained language models","authors":"Yusheng Su , Xu Han , Zhengyan Zhang , Yankai Lin , Peng Li , Zhiyuan Liu , Jie Zhou , Maosong Sun","doi":"10.1016/j.aiopen.2021.06.004","DOIUrl":"10.1016/j.aiopen.2021.06.004","url":null,"abstract":"<div><p>Several recent efforts have been devoted to enhancing pre-trained language models (PLMs) by utilizing extra heterogeneous knowledge in knowledge graphs (KGs), and achieved consistent improvements on various knowledge-driven NLP tasks. However, most of these knowledge-enhanced PLMs embed static sub-graphs of KGs (“knowledge context”), regardless of that the knowledge required by PLMs may change dynamically according to specific text (“textual context”). In this paper, we propose a novel framework named Coke to dynamically select contextual knowledge and embed knowledge context according to textual context for PLMs, which can avoid the effect of redundant and ambiguous knowledge in KGs that cannot match the input text. Our experimental results show that Coke outperforms various baselines on typical knowledge-driven NLP tasks, indicating the effectiveness of utilizing dynamic knowledge context for language understanding. Besides the performance improvements, the dynamically selected knowledge in Coke can describe the semantics of text-related knowledge in a more interpretable form than the conventional PLMs. Our implementation and datasets are publicly available.</p></div>","PeriodicalId":100068,"journal":{"name":"AI Open","volume":"2 ","pages":"Pages 127-134"},"PeriodicalIF":0.0,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1016/j.aiopen.2021.06.004","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78659489","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 15

Lawformer: A pre-trained language model for Chinese legal long documents Lawformer:中文法律长文件的预训练语言模型

AI Open

Pub Date : 2021-01-01 DOI: 10.1016/j.aiopen.2021.06.003

Chaojun Xiao , Xueyu Hu , Zhiyuan Liu , Cunchao Tu , Maosong Sun

Legal artificial intelligence (LegalAI) aims to benefit legal systems with the technology of artificial intelligence, especially natural language processing (NLP). Recently, inspired by the success of pre-trained language models (PLMs) in the generic domain, many LegalAI researchers devote their effort to applying PLMs to legal tasks. However, utilizing PLMs to address legal tasks is still challenging, as the legal documents usually consist of thousands of tokens, which is far longer than the length that mainstream PLMs can process. In this paper, we release the Longformer-based pre-trained language model, named as Lawformer, for Chinese legal long documents understanding. We evaluate Lawformer on a variety of LegalAI tasks, including judgment prediction, similar case retrieval, legal reading comprehension, and legal question answering. The experimental results demonstrate that our model can achieve promising improvement on tasks with long documents as inputs. The code and parameters are available at https://github.com/thunlp/LegalPLMs.

法律人工智能(LegalAI)旨在利用人工智能技术，特别是自然语言处理(NLP)，使法律系统受益。最近，受到预训练语言模型(plm)在通用领域成功的启发，许多LegalAI研究人员致力于将plm应用于法律任务。然而，利用plm来处理法律任务仍然具有挑战性，因为法律文件通常由数千个令牌组成，这远远超过主流plm可以处理的长度。在本文中，我们发布了基于longformer的预训练语言模型Lawformer，用于中文法律长文件的理解。我们在各种LegalAI任务上对Lawformer进行了评估，包括判决预测、相似案例检索、法律阅读理解和法律问题回答。实验结果表明，我们的模型在以长文档为输入的任务上取得了很好的改进。代码和参数可在https://github.com/thunlp/LegalPLMs上获得。

{"title":"Lawformer: A pre-trained language model for Chinese legal long documents","authors":"Chaojun Xiao , Xueyu Hu , Zhiyuan Liu , Cunchao Tu , Maosong Sun","doi":"10.1016/j.aiopen.2021.06.003","DOIUrl":"10.1016/j.aiopen.2021.06.003","url":null,"abstract":"<div><p>Legal artificial intelligence (LegalAI) aims to benefit legal systems with the technology of artificial intelligence, especially natural language processing (NLP). Recently, inspired by the success of pre-trained language models (PLMs) in the generic domain, many LegalAI researchers devote their effort to applying PLMs to legal tasks. However, utilizing PLMs to address legal tasks is still challenging, as the legal documents usually consist of thousands of tokens, which is far longer than the length that mainstream PLMs can process. In this paper, we release the Longformer-based pre-trained language model, named as Lawformer, for Chinese legal long documents understanding. We evaluate Lawformer on a variety of LegalAI tasks, including judgment prediction, similar case retrieval, legal reading comprehension, and legal question answering. The experimental results demonstrate that our model can achieve promising improvement on tasks with long documents as inputs. The code and parameters are available at <span>https://github.com/thunlp/LegalPLMs</span><svg><path></path></svg>.</p></div>","PeriodicalId":100068,"journal":{"name":"AI Open","volume":"2 ","pages":"Pages 79-84"},"PeriodicalIF":0.0,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1016/j.aiopen.2021.06.003","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73107746","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 88

首页上一页

下一页尾页

类型

全部化学•材料生命科学医学物理工程技术环境•农林材料科学地球科学法学管理学化学环境科学与生态学计算机科学教育学经济学农林科学人文科学生物学数学物理与天体物理心理学综合性期刊其他工业工程理学历史学农学文学信息工程

数据库

全部 ACS Publications Elsevier ieeexplore Springer The Royal Society of Chemistry Wiley

期刊

AI Open

全部 Acc. Chem. Res. ACS Applied Bio Materials ACS Appl. Electron. Mater. ACS Appl. Energy Mater. ACS Appl. Mater. Interfaces ACS Appl. Nano Mater. ACS Appl. Polym. Mater. ACS BIOMATER-SCI ENG ACS Catal. ACS Cent. Sci. ACS Chem. Biol. ACS Chemical Health & Safety ACS Chem. Neurosci. ACS Comb. Sci. ACS Earth Space Chem. ACS Energy Lett. ACS Infect. Dis. ACS Macro Lett. ACS Mater. Lett. ACS Med. Chem. Lett. ACS Nano ACS Omega ACS Photonics ACS Sens. ACS Sustainable Chem. Eng. ACS Synth. Biol. Anal. Chem. BIOCHEMISTRY-US Bioconjugate Chem. BIOMACROMOLECULES Chem. Res. Toxicol. Chem. Rev. Chem. Mater. CRYST GROWTH DES ENERG FUEL Environ. Sci. Technol. Environ. Sci. Technol. Lett. Eur. J. Inorg. Chem. IND ENG CHEM RES Inorg. Chem. J. Agric. Food. Chem. J. Chem. Eng. Data J. Chem. Educ. J. Chem. Inf. Model. J. Chem. Theory Comput. J. Med. Chem. J. Nat. Prod. J PROTEOME RES J. Am. Chem. Soc. LANGMUIR MACROMOLECULES Mol. Pharmaceutics Nano Lett. Org. Lett. ORG PROCESS RES DEV ORGANOMETALLICS J. Org. Chem. J. Phys. Chem. J. Phys. Chem. A J. Phys. Chem. B J. Phys. Chem. C J. Phys. Chem. Lett. Analyst Anal. Methods Biomater. Sci. Catal. Sci. Technol. Chem. Commun. Chem. Soc. Rev. CHEM EDUC RES PRACT CRYSTENGCOMM Dalton Trans. Energy Environ. Sci. ENVIRON SCI-NANO ENVIRON SCI-PROC IMP ENVIRON SCI-WAT RES Faraday Discuss. Food Funct. Green Chem. Inorg. Chem. Front. Integr. Biol. J. Anal. At. Spectrom. J. Mater. Chem. A J. Mater. Chem. B J. Mater. Chem. C Lab Chip Mater. Chem. Front. Mater. Horiz. MEDCHEMCOMM Metallomics Mol. Biosyst. Mol. Syst. Des. Eng. Nanoscale Nanoscale Horiz. Nat. Prod. Rep. New J. Chem. Org. Biomol. Chem. Org. Chem. Front. PHOTOCH PHOTOBIO SCI PCCP Polym. Chem.

﹀