
Artificial Intelligence and Law: latest articles

Automated legal reasoning with discretion to act using s(LAW)
IF 3.1 Zone 2 (Sociology) Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2023-11-20 DOI: 10.1007/s10506-023-09376-5
Joaquín Arias, Mar Moreno-Rebato, Jose A. Rodriguez-García, Sascha Ossowski

Automated legal reasoning and its application in smart contracts and automated decisions are attracting increasing interest. In this context, ethical and legal concerns make it necessary for automated reasoners to justify the advice they give in human-understandable terms. Logic Programming, especially Answer Set Programming (ASP), has a rich semantics and has been used to express complex knowledge very concisely. However, discretion to act and other vague concepts such as ambiguity cannot be expressed in top-down execution models based on Prolog, and in bottom-up execution models based on ASP the justifications are incomplete and/or not scalable. We propose to use s(CASP), a top-down execution model for predicate ASP, to model vague concepts following a set of patterns. We have implemented a framework, called s(LAW), to model, reason about, and justify the applicable legislation, and we validate it by translating (and benchmarking) a representative use case, the criteria for the admission of students in the “Comunidad de Madrid”.

Artificial Intelligence and Law, vol. 32, no. 4, pp. 1141-1164.
Citations: 0
Bringing order into the realm of Transformer-based language models for artificial intelligence and law
IF 3.1 Zone 2 (Sociology) Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2023-11-20 DOI: 10.1007/s10506-023-09374-7
Candida M. Greco, Andrea Tagarelli

Transformer-based language models (TLMs) are widely recognized as a cutting-edge technology for developing deep-learning-based solutions to problems and applications that require natural language processing and understanding. As in other textual domains, TLMs have pushed the state of the art of AI approaches for many tasks of interest in the legal domain. Although the first Transformer model was proposed only about six years ago, the technology has progressed at an unprecedented rate, with BERT and related models serving as a major reference, also in the legal domain. This article provides the first systematic overview of TLM-based methods for AI-driven problems and tasks in the legal sphere. A major goal is to highlight research advances in this field so as to understand, on the one hand, how Transformers have contributed to the success of AI in supporting legal processes and, on the other hand, what the current limitations and opportunities for further research are.

Artificial Intelligence and Law, vol. 32, no. 4, pp. 863-1010. Open-access PDF: https://link.springer.com/content/pdf/10.1007/s10506-023-09374-7.pdf
Citations: 0
Natural language processing for legal document review: categorising deontic modalities in contracts
IF 3.1 Zone 2 (Sociology) Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2023-11-11 DOI: 10.1007/s10506-023-09379-2
S. Georgette Graham, Hamidreza Soltani, Olufemi Isiaq

The contract review process can be a costly and time-consuming task for lawyers and clients alike, requiring significant effort to identify and evaluate the legal implications of individual clauses. To address this challenge, we propose the use of natural language processing techniques, specifically text classification based on deontic tags, to streamline the process. Our research question is whether natural language processing techniques, specifically dense vector embeddings, can help semi-automate the contract review process and reduce time and costs for legal professionals reviewing deontic modalities in contracts. In this study, we create a domain-specific dataset and train both baseline and neural network models for contract sentence classification. This approach offers a more efficient and cost-effective solution for contract review, mimicking the work of a lawyer. Our approach achieves an accuracy of 0.90, showcasing its effectiveness in identifying and evaluating individual contract sentences.
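To make the idea of tagging contract sentences by deontic modality concrete, here is a minimal keyword-based sketch in Python. It is not the baseline or neural model from the paper: the three labels (obligation, permission, prohibition), the cue phrases, and the function name `tag_deontic` are all assumptions chosen for illustration.

```python
import re

# Hypothetical cue phrases (an assumption, not the paper's tag set).
# Prohibitions are checked first so that "shall not" is not read as "shall".
CUES = [
    ("prohibition", re.compile(r"\b(shall not|must not|may not|is prohibited from)\b", re.I)),
    ("obligation", re.compile(r"\b(shall|must|is required to|agrees to)\b", re.I)),
    ("permission", re.compile(r"\b(may|is entitled to|is permitted to)\b", re.I)),
]


def tag_deontic(sentence: str) -> str:
    """Return the first deontic label whose cue phrase occurs in the sentence."""
    for label, pattern in CUES:
        if pattern.search(sentence):
            return label
    return "none"


if __name__ == "__main__":
    for s in [
        "The Supplier shall deliver the goods within 30 days.",
        "The Tenant may not sublet the premises.",
        "The Buyer may terminate this Agreement.",
    ]:
        print(tag_deontic(s), "<-", s)
```

A rule list like this is brittle, since modal verbs are ambiguous out of context, which is one reason a learned classifier over sentence embeddings, as the abstract describes, may be preferable.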

Artificial Intelligence and Law, vol. 33, no. 1, pp. 79-100.
Citations: 0
A large scale benchmark for session-based recommendations on the legal domain
IF 3.1 Zone 2 (Sociology) Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2023-10-25 DOI: 10.1007/s10506-023-09378-3
Marcos Aurélio Domingues, Edleno Silva de Moura, Leandro Balby Marinho, Altigran da Silva

The proliferation of legal documents in various formats and their dispersion across multiple courts present a significant challenge for users seeking precise matches to their information requirements. Despite notable advancements in legal information retrieval systems, research into legal recommender systems remains limited. A plausible factor contributing to this scarcity could be the absence of extensive publicly accessible datasets or benchmarks. While a few studies have emerged in this field, a comprehensive analysis of the distinct attributes of legal data that influence the design of effective legal recommenders is notably absent from the current literature. This paper addresses this gap by first amassing a comprehensive session-based dataset from Jusbrasil, one of Brazil’s largest online legal platforms. Subsequently, we scrutinize and discuss key facets of legal session-based recommendation data, including session duration, types of recommendable legal artifacts, coverage, and popularity. Furthermore, we introduce the first session-based recommendation benchmark tailored to the legal domain, shedding light on the performance and constraints of several renowned session-based recommendation approaches. These evaluations are based on real-world data sourced from Jusbrasil.

Artificial Intelligence and Law, vol. 33, no. 1, pp. 43-78.
Citations: 0
Integrating legal event and context information for Chinese similar case analysis
IF 3.1 Zone 2 (Sociology) Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2023-10-25 DOI: 10.1007/s10506-023-09377-4
Jingpei Dan, Lanlin Xu, Yuming Wang

Similar case analysis (SCA) is an essential topic in legal artificial intelligence, serving as a reference for legal professionals. Most existing works treat SCA as a traditional text classification task and ignore important legal elements that affect the verdict and case similarity, such as legal events, and are thus easily misled by semantic structure. To address this issue, we propose a Legal Event-Context Model named LECM to improve the accuracy and interpretability of SCA on a Chinese legal corpus. The event-context integration mechanism, an essential component of LECM, integrates legal event and context information through an attention mechanism, enabling legal events to be associated with their relevant contexts. We introduce an event detection module to obtain the legal event information; it is pre-trained on a legal event detection dataset to avoid labeling events manually. We conduct extensive experiments on two SCA tasks, i.e., similar case matching (SCM) and similar case retrieval (SCR). Compared with baseline models, LECM achieves average improvements of about 13% in mean average precision on SCR and about 11% in accuracy on SCM. These results indicate that LECM effectively utilizes event-context knowledge to enhance SCA performance and suggest its potential for application in various legal document analysis tasks.
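Since the retrieval results above are reported as mean average precision, a short Python sketch of how that metric is commonly computed may help. This is the standard textbook definition, not code from the paper; the function names and the toy case identifiers are assumptions, and each ranked list is assumed to contain no duplicate ids.

```python
from typing import List, Sequence, Set, Tuple


def average_precision(ranked: Sequence[str], relevant: Set[str]) -> float:
    """Average precision of one ranked list against the set of relevant ids."""
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, case_id in enumerate(ranked, start=1):
        if case_id in relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this cut-off
    return precision_sum / len(relevant)


def mean_average_precision(runs: List[Tuple[Sequence[str], Set[str]]]) -> float:
    """Mean of the per-query average precision over all queries."""
    if not runs:
        return 0.0
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


if __name__ == "__main__":
    # Two hypothetical queries with made-up case ids.
    runs = [(["a", "b", "c"], {"a", "c"}), (["x", "y"], {"y"})]
    print(round(mean_average_precision(runs), 4))
```

In the first toy query the relevant cases sit at ranks 1 and 3, giving (1/1 + 2/3) / 2; in the second the only relevant case is at rank 2, giving 1/2.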

Artificial Intelligence and Law, vol. 33, no. 1, pp. 1-42.
Citations: 0
A novel network-based paragraph filtering technique for legal document similarity analysis
Zone 2 (Sociology) Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2023-10-19 DOI: 10.1007/s10506-023-09375-6
Mayur Makawana, Rupa G. Mehta
Citations: 0
Multi-language transfer learning for low-resource legal case summarization
IF 3.1 Zone 2 (Sociology) Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2023-09-25 DOI: 10.1007/s10506-023-09373-8
Gianluca Moro, Nicola Piscaglia, Luca Ragazzi, Paolo Italiani

Analyzing and evaluating legal case reports are labor-intensive tasks for judges and lawyers, who usually base their decisions on report abstracts, legal principles, and commonsense reasoning. Thus, summarizing legal documents is time-consuming and requires excellent human expertise. Moreover, public legal corpora of specific languages are almost unavailable. This paper proposes a transfer learning approach with extractive and abstractive techniques to cope with the lack of labeled legal summarization datasets, namely a low-resource scenario. In particular, we conducted extensive multi- and cross-language experiments. The proposed work outperforms the state-of-the-art results of extractive summarization on the Australian Legal Case Reports dataset and sets a new baseline for abstractive summarization. Finally, syntactic and semantic metrics assessments have been carried out to evaluate the accuracy and the factual consistency of the machine-generated legal summaries.

Artificial Intelligence and Law, vol. 32, no. 4, pp. 1111-1139. Open-access PDF: https://link.springer.com/content/pdf/10.1007/s10506-023-09373-8.pdf
Citations: 0
Ant: a process aware annotation software for regulatory compliance
IF 3.1 Zone 2 (Sociology) Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2023-08-09 DOI: 10.1007/s10506-023-09372-9
Raphaël Gyory, David Restrepo Amariles, Gregory Lewkowicz, Hugues Bersini

Accurate data annotation is essential to successfully implementing machine learning (ML) for regulatory compliance. Annotations allow organizations to train supervised ML algorithms and to adapt and audit the software they buy. The lack of annotation tools focused on regulatory data is slowing the adoption of established ML methodologies and process models, such as CRISP-DM, in various legal domains, including regulatory compliance. This article introduces Ant, an open-source annotation software for regulatory compliance. Ant is designed to adapt to complex organizational processes and to enable compliance experts to be in control of ML projects. By drawing on Business Process Modeling (BPM), we show that Ant can help lift major technical bottlenecks to implementing regulatory compliance through software, such as access to multiple sources of heterogeneous data and the integration of process complexities into the ML pipeline. We provide empirical data to validate the performance of Ant, illustrate its potential to speed up the adoption of ML in regulatory compliance, and highlight its limitations.

Artificial Intelligence and Law, vol. 32, no. 4, pp. 1075-1110. Open-access PDF: https://link.springer.com/content/pdf/10.1007/s10506-023-09372-9.pdf
Citations: 0
Lessons learned building a legal inference dataset
IF 3.1 Zone 2 (Sociology) Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2023-07-31 DOI: 10.1007/s10506-023-09370-x
Sungmi Park, Joshua I. James

Legal inference is fundamental for building and verifying hypotheses in police investigations. In this study, we build a Natural Language Inference dataset in Korean for the legal domain, focusing on criminal court verdicts. We developed an adversarial hypothesis collection tool that can challenge the annotators and give us a deep understanding of the data, and a hypothesis network construction tool with visualized graphs to show a use case scenario of the developed model. The data is augmented using a combination of Easy Data Augmentation approaches and round-trip translation, as crowd-sourcing might not be an option for datasets with sensitive data. We extensively discuss challenges we have encountered, such as the annotators’ limited domain knowledge, issues in the data augmentation process, and problems with handling long contexts, and suggest possible solutions. Our work shows that creating legal inference datasets with limited resources is feasible and proposes further research in this area.
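As a rough illustration of the kind of token-level perturbation that Easy Data Augmentation relies on, the Python sketch below implements two of its operations, random swap and random deletion. It is not the authors' pipeline: the function names, parameters, and whitespace tokenization are assumptions, and the other two EDA operations (synonym replacement and random insertion) are left out because they need a thesaurus.

```python
import random
from typing import List


def random_swap(tokens: List[str], n_swaps: int, rng: random.Random) -> List[str]:
    """Swap two randomly chosen positions, n_swaps times; the input is not modified."""
    out = list(tokens)
    if len(out) < 2:
        return out
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(out)), 2)
        out[i], out[j] = out[j], out[i]
    return out


def random_deletion(tokens: List[str], p: float, rng: random.Random) -> List[str]:
    """Drop each token with probability p, always keeping at least one token."""
    if len(tokens) <= 1:
        return list(tokens)
    kept = [t for t in tokens if rng.random() >= p]
    return kept if kept else [rng.choice(tokens)]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    tokens = "the defendant entered the building at night".split()
    print(" ".join(random_swap(tokens, 1, rng)))
    print(" ".join(random_deletion(tokens, 0.2, rng)))
```

For an inference dataset, perturbations like these can change whether a hypothesis still follows from its premise, so augmented pairs may need a label check before use.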

Artificial Intelligence and Law, vol. 32, no. 4, pp. 1011-1044.
Citations: 0
A topic discovery approach for unsupervised organization of legal document collections
IF 3.1 Zone 2 (Sociology) Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2023-07-19 DOI: 10.1007/s10506-023-09371-w
Daniela Vianna, Edleno Silva de Moura, Altigran Soares da Silva

Technology has substantially transformed the way legal services operate in many different countries. With a large and complex collection of digitized legal documents, the judiciary system worldwide presents a promising scenario for the development of intelligent tools. In this work, we tackle the challenging task of organizing and summarizing the constantly growing collection of legal documents, uncovering hidden topics, or themes that later can support tasks such as legal case retrieval and legal judgment prediction. Our approach to this problem relies on topic discovery techniques combined with a variety of preprocessing techniques and learning-based vector representations of words, such as Doc2Vec and BERT-like models. The proposed method was validated using four different datasets composed of short and long legal documents in Brazilian Portuguese, from legal decisions to chapters in legal books. Analysis conducted by a team of legal specialists revealed the effectiveness of the proposed approach to uncover unique and relevant topics from large collections of legal documents, serving many purposes, such as giving support to legal case retrieval tools and also providing the team of legal specialists with a tool that can accelerate their work of labeling/tagging legal documents.

Artificial Intelligence and Law, volume 32, issue 4, pages 1045-1074. Journal Article.
Citations: 0