Joint Reference and Relation Extraction from Legal Documents with Enhanced Decoder Input

Cybernetics and Information Technologies (IF 1.1; Q4 in Computer Science, Information Systems). Pub Date: 2023-06-01. DOI: 10.2478/cait-2023-0014

Nguyen Thi Thanh Thuy, Nguyen Ngoc Diep, Ngo Xuan Bach, Tu Minh Phuong

Citations: 0

Abstract

This paper addresses an important task in legal text processing, namely reference and relation extraction from legal documents, which comprises two subtasks: 1) reference extraction; 2) relation determination. Motivated by the fact that the two subtasks are related and share common information, we propose a joint learning model that solves both subtasks simultaneously. Our model employs a Transformer-based encoder-decoder architecture with non-autoregressive decoding, which relaxes the sequentiality of traditional seq2seq models and extracts references and relations in a single inference step. We also propose a method to enrich the decoder input with learnable, meaningful information and thereby improve the model's accuracy. Experimental results on a dataset of 5,031 Vietnamese legal documents containing 61,446 references show that our proposed model outperforms several strong baselines and achieves an F1 score of 99.4% on the joint reference and relation extraction task.
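
To make the reported metric concrete, the following is a minimal sketch of how an F1 score for joint extraction could be computed. It assumes, since the abstract does not state the matching criterion, that a prediction counts as correct only when both the reference span and its relation label match the gold annotation. The tuple layout, the function name `joint_f1`, and the relation labels are hypothetical and chosen purely for illustration.

```python
from typing import Set, Tuple

# Hypothetical representation (not from the paper):
# (start_token, end_token, relation_label)
Extraction = Tuple[int, int, str]


def joint_f1(gold: Set[Extraction], predicted: Set[Extraction]) -> float:
    """F1 where a prediction counts only if span AND relation both match."""
    true_positives = len(gold & predicted)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Illustrative data only; the relation labels are made up.
gold = {(3, 7, "amends"), (12, 15, "cites"), (20, 24, "repeals")}
predicted = {(3, 7, "amends"), (12, 15, "amends"), (20, 24, "repeals")}
score = joint_f1(gold, predicted)
```

In this toy case the second prediction has the right span but the wrong relation, so under the assumed strict criterion it counts as an error: precision and recall are both 2/3, and so is the F1 score.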

Source journal