Artificial Intelligence and Law: Latest Articles, Page 3

Bringing legal knowledge to the public by constructing a legal question bank using large-scale pre-trained language model

IF 3.1 Zone 2 (Sociology) Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Artificial Intelligence and Law

Pub Date : 2023-07-06 DOI: 10.1007/s10506-023-09367-6

Mingruo Yuan, Ben Kao, Tien-Hsuan Wu, Michael M. K. Cheung, Henry W. H. Chan, Anne S. Y. Cheung, Felix W. H. Chan, Yongxi Chen

Access to legal information is fundamental to access to justice. Yet accessibility means not only making legal documents available to the public, but also rendering legal information comprehensible to them. A vexing problem in bringing legal information to the public is how to turn formal legal documents such as legislation and judgments, which are often highly technical, into easily navigable and comprehensible knowledge for those without legal education. In this study, we formulate a three-step approach for bringing legal knowledge to laypersons, tackling the issues of navigability and comprehensibility. First, we translate selected sections of the law into snippets (called CLIC-pages), each being a short article that explains a certain technical legal concept in layperson’s terms. Second, we construct a Legal Question Bank (LQB), which is a collection of legal questions whose answers can be found in the CLIC-pages. Third, we design an interactive CLIC Recommender (CRec). Given a user’s verbal description of a legal situation that requires a legal solution, CRec interprets the user’s input, shortlists questions from the question bank that are most likely relevant to the given legal situation, and recommends their corresponding CLIC-pages where relevant legal knowledge can be found. In this paper we focus on the technical aspects of creating an LQB. We show how large-scale pre-trained language models, such as GPT-3, can be used to generate legal questions. We compare machine-generated questions (MGQs) against human-composed questions (HCQs) and find that MGQs are more scalable, cost-effective, and diversified, while HCQs are more precise. We also show a prototype of CRec and illustrate through an example how our three-step approach effectively brings relevant legal knowledge to the public.
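The shortlisting step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how CRec scores questions, so the token-overlap (Jaccard) ranking below is a stand-in chosen for simplicity, and the function names and the sample question bank are hypothetical.

```python
import re

def tokens(text):
    """Lower-case a text and return its set of alphabetic word tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))

def jaccard(a, b):
    """Jaccard similarity between two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def shortlist(description, question_bank, k=2):
    """Return up to k questions that share at least one token with the description."""
    query = tokens(description)
    scored = [(jaccard(query, tokens(q)), q) for q in question_bank]
    # Highest score first; ties broken alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [q for score, q in scored[:k] if score > 0]

# Hypothetical question bank, not taken from the paper.
BANK = [
    "Can my landlord increase the rent without notice?",
    "How do I apply for legal aid?",
    "What happens if I breach an employment contract?",
]
result = shortlist("My landlord wants to increase my rent", BANK)
print(result)
```

In this toy run only the landlord question shares tokens with the description, so it is the only one shortlisted. A real recommender would likely need semantic matching rather than exact word overlap.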

Volume 32, Issue 3, Pages 769-805

Citations: 0

M-LAMAC: a model for linguistic assessment of mitigating and aggravating circumstances of criminal responsibility using computing with words

IF 3.1 Zone 2 (Sociology) Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Artificial Intelligence and Law

Pub Date : 2023-07-04 DOI: 10.1007/s10506-023-09365-8

Carlos Rafael Rodríguez Rodríguez, Yarina Amoroso Fernández, Denis Sergeevich Zuev, Marieta Peña Abreu, Yeleny Zulueta Veliz

The general mitigating and aggravating circumstances of criminal liability are elements attached to the crime that, when they occur, affect the punishment quantum. Cuban criminal legislation provides a catalog of such circumstances and some general conditions for their application. Such norms give judges broad discretion in assessing circumstances and adjusting punishment based on the intensity of those circumstances. In the interest of broad judicial discretion, the law does not establish specific ways for measuring circumstances’ intensity. This gives judges more freedom and autonomy, but it also imposes on them more social responsibility and challenges them to manage the uncertainty and subjectivity inherent in this complex activity. This paper proposes a model to aid the linguistic assessment of circumstances’ intensity and to provide linguistic and numerical recommendations to determine an appropriate punishment interval. M-LAMAC determines the collective evaluation of circumstances of the same type, determines the prevalence of a type of circumstance by means of a compensation function, recommends the required modification in the input interval, and finally recommends a numerical interval adjusted to the judges’ initially expressed preferences. The model’s applicability is demonstrated by means of several experiments on a fictitious case of bank document forgery.

Volume 32, Issue 3, Pages 697-739

Citations: 0

Integrating text mining and system dynamics to evaluate financial risks of construction contracts

IF 3.1 Zone 2 (Sociology) Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Artificial Intelligence and Law

Pub Date : 2023-07-04 DOI: 10.1007/s10506-023-09366-7

Mahdi Bakhshayesh, Hamidreza Abbasianjahromi

Financial risks are among the most important risks in construction industry projects and significantly impact project objectives, including project cost. Financial risks also interact extensively with each other and with project parameters, and these interactions must be taken into account to analyze risks correctly. In addition, one source of financial risk in a project is the contract, which is the most important project document. Identifying terms related to financial risks in a contract and considering their effects on the risk management process is an essential issue that has been neglected. Hence, an integrated model for evaluating financial risks and their related contractual clauses was presented. To this end, the effect of financial risks on the project cost was simulated using a system dynamics model. Moreover, terms related to financial risks in a contract text were identified and extracted using text mining, and their effect was included in the system dynamics model. The model was implemented in a hospital construction project in Tehran as a case study, and its results were analyzed. The innovation of the research lies in integrating text mining with the system dynamics model to investigate the effect of financial risks and related contractual clauses on the project cost.

Volume 32, Issue 3, Pages 741-768

Citations: 0

A RDF-based graph to representing and searching parts of legal documents

IF 3.1 Zone 2 (Sociology) Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Artificial Intelligence and Law

Pub Date : 2023-07-01 DOI: 10.1007/s10506-023-09364-9

Francisco de Oliveira, Jose Maria Parente de Oliveira

Despite the public availability of legal documents, there is a need to find specific information contained in them, such as paragraphs, clauses, items and so on. With such support, users could find more specific information than whole legal documents alone. Some research efforts have been made in this area, but much remains to be done to make legal information easier to find. Because of the large number of published legal documents and their high degree of connectivity, simple access to a document is not enough. It is necessary to recover the related legal framework for a specific need; in other words, to retrieve the set of legal documents, and their parts, related to a specific subject. Therefore, in this work, we propose an RDF-based graph to represent and search parts of legal documents, returned as the output for a set of terms that represents the legal information sought. The proposal is grounded in an ontological view, which makes it possible to describe the general structure of a legal system and the structure of legal documents, and so provides the grounds for implementing the proposed RDF graph in terms of the meaning of their parts and relationships. We posed several queries to retrieve parts of legal documents related to sets of words and the results were significant.
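To make the idea of searching document parts concrete, here is a minimal sketch of a triple-based graph in plain Python. It is not the authors' ontology or implementation: the `law:` identifiers, the `law:partOf` and `law:text` predicates, and the sample provisions are all hypothetical, and a set of tuples stands in for a real RDF store.

```python
# Each triple is (subject, predicate, object). All identifiers are hypothetical.
TRIPLES = {
    ("law:Act1", "rdf:type", "law:LegalDocument"),
    ("law:Act1/art1", "law:partOf", "law:Act1"),
    ("law:Act1/art1/par1", "law:partOf", "law:Act1/art1"),
    ("law:Act1/art1/par1", "law:text", "The lessee shall pay rent monthly."),
    ("law:Act1/art2", "law:partOf", "law:Act1"),
    ("law:Act1/art2", "law:text", "This Act enters into force on publication."),
}

def match(triples, s=None, p=None, o=None):
    """Return triples matching the given pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]

def parts_mentioning(triples, terms):
    """Return parts whose text contains every search term (case-insensitive)."""
    wanted = [term.lower() for term in terms]
    return sorted(
        s for s, _, text in match(triples, p="law:text")
        if all(w in text.lower() for w in wanted)
    )

def ancestors(triples, part):
    """Follow law:partOf links upward to recover the enclosing structure."""
    chain = []
    current = part
    while True:
        parents = match(triples, s=current, p="law:partOf")
        if not parents:
            return chain
        current = parents[0][2]
        chain.append(current)

hits = parts_mentioning(TRIPLES, ["rent"])
context = ancestors(TRIPLES, hits[0])
print(hits, context)
```

The search returns the matching paragraph rather than the whole act, and the `law:partOf` chain recovers the article and document that contain it.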

Volume 32, Issue 3, Pages 667-695

Citations: 0

Predicting citations in Dutch case law with natural language processing

IF 3.1 Zone 2 (Sociology) Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Artificial Intelligence and Law

Pub Date : 2023-06-28 DOI: 10.1007/s10506-023-09368-5

Iris Schepers, Masha Medvedeva, Michelle Bruijn, Martijn Wieling, Michel Vols

With the ever-growing accessibility of case law online, it has become challenging to manually identify case law relevant to one’s legal issue. In the Netherlands, the planned increase in the online publication of case law is expected to exacerbate this challenge. In this paper, we tried to predict whether court decisions are cited by other courts after being published, thus in a way distinguishing between more and less authoritative cases. This type of system may be used to process the large amounts of available data by filtering out large quantities of non-authoritative decisions, thus helping legal practitioners and scholars to find relevant decisions more easily, and drastically reducing the time spent on preparation and analysis. For the Dutch Supreme Court, the match between our prediction and the actual data was relatively strong (with a Matthews Correlation Coefficient of 0.60). Our results were less successful for the Council of State and the district courts (MCC scores of 0.26 and 0.17, respectively). We also attempted to identify the most informative characteristics of a decision. We found that a completely explainable model, consisting only of handcrafted metadata features, performs almost as well as a less explainable system based on the full text of the decision.
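For readers unfamiliar with the Matthews Correlation Coefficient (MCC) reported above, the following sketch shows how it is computed from a binary confusion matrix. The counts in the usage lines are invented for illustration and are not taken from the paper.

```python
import math

def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews Correlation Coefficient for a binary confusion matrix."""
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0.0 when any marginal total is zero.
    return numerator / denominator if denominator else 0.0

# Hypothetical counts, with "cited" as the positive class.
perfect = matthews_corrcoef(tp=50, tn=50, fp=0, fn=0)
chance = matthews_corrcoef(tp=25, tn=25, fp=25, fn=25)
print(perfect, chance)  # prints "1.0 0.0"
```

MCC ranges from -1 (total disagreement) through 0 (no better than chance) to 1 (perfect prediction), and it stays informative when the classes are imbalanced, which is plausible for a cited versus not-cited task.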

Volume 32, Issue 3, Pages 807-837
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11291598/pdf/

Citations: 0

I beg to differ: how disagreement is handled in the annotation of legal machine learning data sets

IF 3.1 Zone 2 (Sociology) Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Artificial Intelligence and Law

Pub Date : 2023-06-27 DOI: 10.1007/s10506-023-09369-4

Daniel Braun

Legal documents, like contracts or laws, are subject to interpretation. Different people can have different interpretations of the very same document. Large parts of judicial branches all over the world are concerned with settling disagreements that arise, in part, from these different interpretations. In this context, it seems only natural that during the annotation of legal machine learning data sets, disagreement, how to report it, and how to handle it should play an important role. This article presents an analysis of the current state of the art in the annotation of legal machine learning data sets. The results of the analysis show that all of the analysed data sets remove all traces of disagreement, instead of trying to utilise the information that might be contained in conflicting annotations. Additionally, the publications introducing the data sets often provide little information about the process that derives the “gold standard” from the initial annotations, often making it difficult to judge the reliability of the annotation process. Based on the state of the art, the article provides easily implementable suggestions on how to improve the handling and reporting of disagreement in the annotation of legal machine learning data sets.
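The contrast the abstract draws, between collapsing annotations into a single gold label and keeping the information in conflicting annotations, can be sketched as follows. This is an illustration only and not one of the article's own suggestions, which the abstract does not spell out; the function names and the sample labels are hypothetical.

```python
from collections import Counter

def majority_label(labels):
    """Collapse annotations to one 'gold' label, discarding disagreement."""
    return Counter(labels).most_common(1)[0][0]

def label_distribution(labels):
    """Keep disagreement as a per-item distribution over labels."""
    counts = Counter(labels)
    total = len(labels)
    return {label: n / total for label, n in counts.items()}

def pairwise_agreement(labels):
    """Fraction of annotator pairs that chose the same label for one item."""
    n = len(labels)
    if n < 2:
        return 1.0
    counts = Counter(labels)
    agreeing_pairs = sum(c * (c - 1) for c in counts.values())
    return agreeing_pairs / (n * (n - 1))

# Hypothetical annotations of one contract clause by three annotators.
annotations = ["unfair", "unfair", "fair"]
gold = majority_label(annotations)
soft = label_distribution(annotations)
agreement = pairwise_agreement(annotations)
print(gold, soft, agreement)
```

The majority vote reports only `unfair`, whereas the distribution and the agreement score record that one of three annotators read the clause differently, which is the kind of information a data set could publish alongside its gold labels.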

Volume 32, Issue 3, Pages 839-862
Open-access PDF: https://link.springer.com/content/pdf/10.1007/s10506-023-09369-4.pdf

Citations: 0

Mining legal arguments in court decisions

IF 3.1 Zone 2 (Sociology) Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Artificial Intelligence and Law

Pub Date : 2023-06-23 DOI: 10.1007/s10506-023-09361-y

Ivan Habernal, Daniel Faber, Nicola Recchia, Sebastian Bretthauer, Iryna Gurevych, Indra Spiecker genannt Döhmann, Christoph Burchard

Identifying, classifying, and analyzing arguments in legal discourse has been a prominent area of research since the inception of the argument mining field. However, there has been a major discrepancy between the way natural language processing (NLP) researchers model and annotate arguments in court decisions and the way legal experts understand and analyze legal argumentation. While computational approaches typically simplify arguments into generic premises and claims, arguments in legal research usually exhibit a rich typology that is important for gaining insights into the particular case and applications of law in general. We address this problem and make several substantial contributions to move the field forward. First, we design a new annotation scheme for legal arguments in proceedings of the European Court of Human Rights (ECHR) that is deeply rooted in the theory and practice of legal argumentation research. Second, we compile and annotate a large corpus of 373 court decisions (2.3M tokens and 15k annotated argument spans). Finally, we train an argument mining model that outperforms state-of-the-art models in the legal NLP domain and provide a thorough expert-based evaluation. All datasets and source code are available under open licenses at https://github.com/trusthlt/mining-legal-arguments.

Volume 32, Issue 3, Pages 1-38
Open-access PDF: https://link.springer.com/content/pdf/10.1007/s10506-023-09361-y.pdf

Citations: 0

An approach to temporalised legal revision through addition of literals

IF 3.1 Zone 2 (Sociology) Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Artificial Intelligence and Law

Pub Date : 2023-06-08 DOI: 10.1007/s10506-023-09363-w

Martín O. Moguillansky, Diego C. Martinez, Luciano H. Tamargo, Antonino Rotolo

As lawmakers produce norms, the underlying normative system is affected, showing the intrinsic dynamism of law. Through actions of legal change, the normative system is continuously modified. In usual legislative practice, the time at which an enacted legal provision is in force may differ from that of its inclusion in the legal system, or from that in which it produces legal effects. Moreover, some provisions can produce effects retroactively. In this article we study a simulation of such a process through the formalisation of a temporalised logical framework upon which a novel belief revision model tackles the dynamic nature of law. Represented through intervals, the temporalisation of sentences allows differentiating the temporal parameters of norms. In addition, a proposed revision operator allows assessing change to the legal system by including a new temporalised literal while preserving time-based consistency. This can be achieved either by pushing out conflicting pieces of pre-existing norms or by modifying the intervals in which such norms are in force or produce effects. Finally, the construction of the temporalised revision operator is axiomatically characterised and its rational behavior proved through a corresponding representation theorem.
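The interval-modification idea in the abstract can be illustrated with a toy example. The sketch below is not the paper's revision operator or its logical framework: it assumes discrete integer time points, a single interval per literal, and a hypothetical `TLiteral` type, and it simply trims older intervals so that a literal and its negation never hold at the same time.

```python
from typing import NamedTuple

class TLiteral(NamedTuple):
    """A literal that holds over the inclusive interval [start, end]."""
    atom: str
    positive: bool
    start: int
    end: int

def conflicts(a: TLiteral, b: TLiteral) -> bool:
    """True when a and b are complementary and their intervals overlap."""
    return (
        a.atom == b.atom
        and a.positive != b.positive
        and a.start <= b.end
        and b.start <= a.end
    )

def revise(base, new):
    """Add `new`, trimming the intervals of conflicting literals in `base`."""
    revised = []
    for old in base:
        if not conflicts(old, new):
            revised.append(old)
            continue
        # Keep the part of the old interval before the new one, if any.
        if old.start < new.start:
            revised.append(old._replace(end=new.start - 1))
        # Keep the part of the old interval after the new one, if any.
        if old.end > new.end:
            revised.append(old._replace(start=new.end + 1))
    revised.append(new)
    return revised

# Hypothetical norm: a tax is in force from time 1 to 10, then suspended for 5 to 8.
base = [TLiteral("tax_in_force", True, 1, 10)]
suspension = TLiteral("tax_in_force", False, 5, 8)
result = revise(base, suspension)
print(result)
```

After the revision the tax holds over 1 to 4 and 9 to 10 and is negated over 5 to 8, so no time point carries both the literal and its negation.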

Vol. 32, No. 3, pp. 621-666

Citations: 0

Encoding legislation: a methodology for enhancing technical validation, legal alignment and interdisciplinarity

IF 3.1 Zone 2 (Sociology) Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Artificial Intelligence and Law

Pub Date : 2023-06-03 DOI: 10.1007/s10506-023-09350-1

Alice Witt, Anna Huggins, Guido Governatori, Joshua Buckley

This article proposes an innovative methodology for enhancing the technical validation, legal alignment and interdisciplinarity of attempts to encode legislation. In the context of an experiment that examines how different legally trained participants convert select provisions of the Australian Copyright Act 1968 (Cth) into machine-executable code, we find that a combination of manual and automated methods for coding validation, which focus on formal adherence to programming languages and conventions, can significantly increase the similarity of encoded rules between coders. Participants nonetheless encountered various interpretive difficulties, including syntactic ambiguity, and intra- and intertextuality, which necessitated legal evaluation, as distinct from and in addition to coding validation. Many of these difficulties can be resolved through what we call a process of ‘legal alignment’ that aims to enhance the congruence between encoded provisions and the true meaning of a statute as determined by the courts. However, some difficulties cannot be overcome in advance, such as factual indeterminacy. Given the inherently interdisciplinary nature of encoding legislation, we argue that it is desirable for ‘rules as code’ (‘RaC’) initiatives to have, at a minimum, legal subject matter, statutory interpretation and technical programming expertise. Overall, we contend that technical validation, legal alignment and interdisciplinary teamwork are integral to the success of attempts to encode legislation. While legal alignment processes will vary depending on jurisdictionally-specific principles and practices of statutory interpretation, the technical and interdisciplinary components of our methodology are transferable across regulatory contexts, bodies of law and Commonwealth and other jurisdictions.

Vol. 32, No. 2, pp. 293-324. PDF: https://link.springer.com/content/pdf/10.1007/s10506-023-09350-1.pdf

Citations: 0

Compliance checking on first-order knowledge with conflicting and compensatory norms: a comparison among currently available technologies

IF 3.1 Zone 2 (Sociology) Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Artificial Intelligence and Law

Pub Date : 2023-06-02 DOI: 10.1007/s10506-023-09360-z

Livio Robaldo, Sotiris Batsakis, Roberta Calegari, Francesco Calimeri, Megumi Fujita, Guido Governatori, Maria Concetta Morelli, Francesco Pacenza, Giuseppe Pisano, Ken Satoh, Ilias Tachmazidis, Jessica Zangari

This paper analyses and compares some of the automated reasoners that have been used in recent research for compliance checking. Although the list of the considered reasoners is not exhaustive, we believe that our analysis is representative enough to take stock of the current state of the art in the topic. We are interested here in formalizations at the first-order level. Past literature on normative reasoning mostly focuses on the propositional level. However, the propositional level is of little use for concrete LegalTech applications, in which compliance checking must be enforced on (large) sets of individuals. Furthermore, we are interested in technologies that are freely available and that can be further investigated and compared by the scientific community. In other words, this paper does not consider technologies that are only employed in industry and/or whose source code is not accessible. This paper formalizes a selected use case in the considered reasoners and compares the implementations, also in terms of simulations with respect to shared synthetic datasets. The comparison will highlight that a lot of further research still needs to be done to integrate the benefits featured by the different reasoners into a single standardized first-order framework, suitable for LegalTech applications. All source code is freely available at https://github.com/liviorobaldo/compliancecheckers, together with instructions to locally reproduce the simulations.
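To make the first-order setting concrete, here is a small Python sketch of compliance checking over a set of individuals with a compensatory norm. It does not use any of the reasoners compared in the paper and is not the paper's use case: the norm ("every licensee who publishes must have approval; failing that, the licensee must pay a fine"), the `Licensee` record and the status labels are all assumptions made for illustration. Conflicting norms are not modelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Licensee:
    """Hypothetical individual in the knowledge base."""
    name: str
    published: bool   # performed the regulated action
    approved: bool    # fulfilled the primary obligation
    paid_fine: bool   # fulfilled the compensatory obligation


def status(x: Licensee) -> str:
    """Evaluate the assumed norm for one individual."""
    if not x.published:
        return "not applicable"   # the norm's condition does not hold
    if x.approved:
        return "compliant"        # primary obligation fulfilled
    if x.paid_fine:
        return "compensated"      # violation repaired by the compensatory norm
    return "violation"            # neither obligation fulfilled


def check(individuals):
    """Apply the norm to every individual, as a first-order rule quantifies over them."""
    return {x.name: status(x) for x in individuals}


population = [
    Licensee("ann", published=True, approved=True, paid_fine=False),
    Licensee("bob", published=True, approved=False, paid_fine=True),
    Licensee("cy", published=True, approved=False, paid_fine=False),
    Licensee("dee", published=False, approved=False, paid_fine=False),
]
print(check(population))
```

The point of the sketch is the loop over individuals: a propositional encoding would need one copy of the rule per licensee, whereas a first-order rule states it once and is evaluated against the whole population.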

Vol. 32, No. 2, pp. 505-555. PDF: https://link.springer.com/content/pdf/10.1007/s10506-023-09360-z.pdf

Citations: 0