Low-resource court judgment summarization for common law systems

Information Processing & Management (IF 7.4; CAS Region 1, Management; JCR Q1, Computer Science, Information Systems) Pub Date: 2024-06-03 DOI: 10.1016/j.ipm.2024.103796

Shuaiqi Liu, Jiannong Cao, Yicong Li, Ruosong Yang, Zhiyuan Wen


Citations: 0

Abstract

Common law courts need to refer to the judgments of similar precedents to inform their current decisions. High-quality summaries of court judgment documents can help legal practitioners review previous cases efficiently and help the general public understand how the courts operate and how the law is applied. Previous court judgment summarization research focuses on civil law or on the judgments of a particular jurisdiction. However, judges can refer to judgments from all common law jurisdictions. Current summarization datasets are insufficient for summarizing precedents across multiple jurisdictions, especially since labeled data are scarce for many jurisdictions. To address the lack of datasets, we present CLSum, the first dataset for summarizing multi-jurisdictional common law court judgment documents. In addition, this is the first court judgment summarization work to adopt large language models (LLMs) in data augmentation, summary generation, and evaluation. Specifically, we design an LLM-based data augmentation method that incorporates legal knowledge. We also propose a legal-knowledge-enhanced, LLM-based evaluation metric to assess the quality of generated judgment summaries. Our experimental results verify that LLM-based summarization methods can perform well in few-shot and zero-shot settings, and that our LLM-based data augmentation method can mitigate the impact of low data resources. Furthermore, we carry out comprehensive comparative experiments to identify the model components and settings that are essential to improving summarization performance.




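Source journal

Information Processing & Management (Engineering & Technology: Computer Science, Information Systems)

CiteScore: 17.00

Self-citation rate: 11.60%

Articles published: 276

Review time: 39 days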

Journal description: Information Processing and Management is dedicated to publishing cutting-edge original research at the convergence of computing and information science. Our scope encompasses theory, methods, and applications across various domains, including advertising, business, health, information science, information technology marketing, and social computing. We aim to cater to the interests of both primary researchers and practitioners by offering an effective platform for the timely dissemination of advanced and topical issues in this interdisciplinary field. The journal places particular emphasis on original research articles, research survey articles, research method articles, and articles addressing critical applications of research. Join us in advancing knowledge and innovation at the intersection of computing and information science.