Pub Date: 2021-11-27 | DOI: 10.1007/s10506-021-09305-4
Diego de Vargas Feijo, Viviane P. Moreira
The standard approach for abstractive text summarization is to use an encoder-decoder architecture. The encoder is responsible for capturing the general meaning of the source text, and the decoder is in charge of generating the final text summary. While this approach can compose summaries that resemble human writing, some may contain unrelated or unfaithful information. This problem, called "hallucination", represents a serious issue in legal texts, as legal practitioners rely on these summaries when looking for precedents to support legal arguments. Another concern is that legal documents tend to be very long and may not fit entirely into the encoder. We propose a method called LegalSumm that addresses these issues by creating different "views" over the source text, training summarization models to generate independent candidate summaries, and applying an entailment module to judge how faithful these candidates are with respect to the source text. We show that the proposed approach can select candidate summaries that improve ROUGE scores on all metrics evaluated.
{"title":"Improving abstractive summarization of legal rulings through textual entailment","authors":"Diego de Vargas Feijo, Viviane P. Moreira","doi":"10.1007/s10506-021-09305-4","DOIUrl":"10.1007/s10506-021-09305-4","url":null,"abstract":"<div><p>The standard approach for abstractive text summarization is to use an encoder-decoder architecture. The encoder is responsible for capturing the general meaning from the source text, and the decoder is in charge of generating the final text summary. While this approach can compose summaries that resemble human writing, some may contain unrelated or unfaithful information. This problem is called “hallucination” and it represents a serious issue in legal texts as legal practitioners rely on these summaries when looking for precedents, used to support legal arguments. Another concern is that legal documents tend to be very long and may not be fed entirely to the encoder. We propose our method called LegalSumm for addressing these issues by creating different “views” over the source text, training summarization models to generate independent versions of summaries, and applying entailment module to judge how faithful these candidate summaries are with respect to the source text. We show that the proposed approach can select candidate summaries that improve ROUGE scores in all metrics evaluated.</p></div>","PeriodicalId":51336,"journal":{"name":"Artificial Intelligence and Law","volume":"31 1","pages":"91 - 113"},"PeriodicalIF":4.1,"publicationDate":"2021-11-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"47866335","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"社会学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2021-11-13 | DOI: 10.1007/s10506-021-09304-5
Paheli Bhattacharya, Shounak Paul, Kripabandhu Ghosh, Saptarshi Ghosh, Adam Wyner
The task of rhetorical role labeling is to assign labels (such as Fact, Argument, Final Judgement, etc.) to the sentences of a court case document. Rhetorical role labeling is an important problem in the field of Legal Analytics, since it can aid various downstream tasks and enhances the readability of lengthy case documents. The task is challenging because case documents vary widely in structure and the rhetorical labels are often subjective. Previous works on automatic rhetorical role identification (i) mainly used Conditional Random Fields over manually handcrafted features, and (ii) focused only on certain law domains (e.g., immigration cases, rent law) and on a particular jurisdiction/country (e.g., the US, Canada, or India). In this work, we improve upon prior work on rhetorical role identification by proposing novel Deep Learning models for automatically identifying rhetorical roles, which substantially outperform the prior methods. Additionally, we show the effectiveness of the proposed models on documents from five different law domains and from two different jurisdictions: the Supreme Court of India and the Supreme Court of the UK. Through extensive experiments over different variations of the Deep Learning models, including Transformer models based on BERT and LegalBERT, we show the robustness of the methods for the task. We also perform an extensive inter-annotator study and analyse the agreement between the predictions of the proposed model and the annotations of domain experts. We find that some rhetorical labels are inherently hard/subjective, and that both law experts and neural models are frequently confused when predicting them.
{"title":"DeepRhole: deep learning for rhetorical role labeling of sentences in legal case documents","authors":"Paheli Bhattacharya, Shounak Paul, Kripabandhu Ghosh, Saptarshi Ghosh, Adam Wyner","doi":"10.1007/s10506-021-09304-5","DOIUrl":"10.1007/s10506-021-09304-5","url":null,"abstract":"<div><p>The task of rhetorical role labeling is to assign labels (such as Fact, Argument, Final Judgement, etc.) to sentences of a court case document. Rhetorical role labeling is an important problem in the field of Legal Analytics, since it can aid in various downstream tasks as well as enhances the readability of lengthy case documents. The task is challenging as case documents are highly various in structure and the rhetorical labels are often subjective. Previous works for automatic rhetorical role identification (i) mainly used Conditional Random Fields over manually handcrafted features, and (ii) focused on certain law domains only (e.g., Immigration cases, Rent law), and a particular jurisdiction/country (e.g., US, Canada, India). In this work, we improve upon the prior works on rhetorical role identification by proposing novel Deep Learning models for automatically identifying rhetorical roles, which substantially outperform the prior methods. Additionally, we show the effectiveness of the proposed models over documents from five different law domains, and from two different jurisdictions—the Supreme Court of India and the Supreme Court of the UK. Through extensive experiments over different variations of the Deep Learning models, including Transformer models based on BERT and LegalBERT, we show the robustness of the methods for the task. We also perform an extensive inter-annotator study and analyse the agreement of the predictions of the proposed model with the annotations by domain experts. We find that some rhetorical labels are inherently hard/subjective and both law experts and neural models frequently get confused in predicting them correctly.</p></div>","PeriodicalId":51336,"journal":{"name":"Artificial Intelligence and Law","volume":"31 1","pages":"53 - 90"},"PeriodicalIF":4.1,"publicationDate":"2021-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44962668","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"社会学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2021-10-29 | DOI: 10.2139/ssrn.3840789
Hans-Theo Normann, Martin Sternberg
This paper investigates pricing in laboratory markets when human players interact with an algorithm. We compare the degree of competition when only humans interact with the case in which one firm delegates its decisions to an algorithm. We further vary whether participants know about the presence of the algorithm. When one of three firms in a market is an algorithm, we observe significantly higher prices compared to human-only markets. Firms employing an algorithm earn significantly less profit than their rivals. For four-firm markets, we find no significant differences. (Un)certainty about the actual presence of an algorithm does not significantly affect collusion, although humans seem to perceive algorithms as more disruptive.
{"title":"Human-Algorithm Interaction: Algorithmic Pricing in Hybrid Laboratory Markets","authors":"Hans-Theo Normann, Martin Sternberg","doi":"10.2139/ssrn.3840789","DOIUrl":"https://doi.org/10.2139/ssrn.3840789","url":null,"abstract":"This paper investigates pricing in laboratory markets when human players interact with an algorithm. We compare the degree of competition when exclusively humans interact to the case of one firm delegating its decisions to an algorithm. We further vary whether participants know about the presence of the algorithm. When one of three firms in a market is an algorithm, we observe significantly higher prices compared to humanonly markets. Firms employing an algorithm earn significantly less profit than their rivals. For four-firm markets, we find no significant differences. (Un)certainty about the actual presence of an algorithm does not significantly affect collusion, although humans seem to perceive algorithms as more disruptive.","PeriodicalId":51336,"journal":{"name":"Artificial Intelligence and Law","volume":"67 1","pages":""},"PeriodicalIF":4.1,"publicationDate":"2021-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78899557","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"社会学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2021-10-23 | DOI: 10.1007/s10506-021-09303-6
Amelia V. Taylor, Eva Mfutso-Bengo
Legal professionals in Malawi rely on a limited number of textbooks, outdated law reports, and inadequate library services. Most of the available documents are in image form and are unstructured, i.e., they contain no useful legal metadata, summaries, or keynotes, and they do not support the system of citation that is essential to legal research. While advances in document processing and machine learning have benefited many fields, legal research has been only marginally affected. In this interdisciplinary research, the authors build semi-automatic tools for creating a corpus of Malawi criminal law decisions annotated with legal metadata and with case and law citations. We used this corpus to extract legal metadata, including law and case citations as used in Malawi, employing the machine learning tools spaCy and Gensim LDA. We thereby lay the foundation for a new methodology for classifying Malawi criminal case law according to the recently introduced International Classification of Crime for Statistical Purposes (ICCS).
{"title":"Towards a machine understanding of Malawi legal text","authors":"Amelia V. Taylor, Eva Mfutso-Bengo","doi":"10.1007/s10506-021-09303-6","DOIUrl":"10.1007/s10506-021-09303-6","url":null,"abstract":"<div><p>Legal professionals in Malawi rely on a limited number of textbooks, outdated law reports and inadequate library services. Most documents available are in image form, are un-structured, i.e. contain no useful legal meta-data, summaries, keynotes, and do not support a system of citation that is essential to legal research. While advances in document processing and machine learning have benefited many fields, legal research is still only marginally affected. In this interdisciplinary research, the authors build semi-automatic tools for creating a corpus of Malawi criminal law decisions annotated with legal meta-data, case and law citations. We used this corpus to extract legal meta-data, including law and case citations as used in Malawi by employing machine learning tools, spaCy and Gensim LDA. We set the foundation for a new methodology for classifying Malawi criminal case law according to the recently introduced International Classification of Crime for Statistical Purposes (ICCS).</p></div>","PeriodicalId":51336,"journal":{"name":"Artificial Intelligence and Law","volume":"31 1","pages":"1 - 11"},"PeriodicalIF":4.1,"publicationDate":"2021-10-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"45360285","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"社会学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2021-10-13 | DOI: 10.1007/s10506-021-09300-9
Mark D. Flood, Oliver R. Goodenough
We show that the fundamental legal structure of a well-written financial contract follows a state-transition logic that can be formalized mathematically as a finite-state machine (specifically, a deterministic finite automaton or DFA). The automaton defines the states that a financial relationship can be in, such as “default,” “delinquency,” “performing,” etc., and it defines an “alphabet” of events that can trigger state transitions, such as “payment arrives,” “due date passes,” etc. The core of a contract describes the rules by which different sequences of events trigger particular sequences of state transitions in the relationship between the counterparties. By conceptualizing and representing the legal structure of a contract in this way, we expose it to a range of powerful tools and results from the theory of computation. These allow, for example, automated reasoning to determine whether a contract is internally coherent and whether it is complete relative to a particular event alphabet. We illustrate the process by representing a simple loan agreement as an automaton.
{"title":"Contract as automaton: representing a simple financial agreement in computational form","authors":"Mark D. Flood, Oliver R. Goodenough","doi":"10.1007/s10506-021-09300-9","DOIUrl":"10.1007/s10506-021-09300-9","url":null,"abstract":"<div><p>We show that the fundamental legal structure of a well-written financial contract follows a state-transition logic that can be formalized mathematically as a finite-state machine (specifically, a deterministic finite automaton or DFA). The automaton defines the states that a financial relationship can be in, such as “default,” “delinquency,” “performing,” etc., and it defines an “alphabet” of events that can trigger state transitions, such as “payment arrives,” “due date passes,” etc. The core of a contract describes the rules by which different sequences of events trigger particular sequences of state transitions in the relationship between the counterparties. By conceptualizing and representing the legal structure of a contract in this way, we expose it to a range of powerful tools and results from the theory of computation. These allow, for example, automated reasoning to determine whether a contract is internally coherent and whether it is complete relative to a particular event alphabet. We illustrate the process by representing a simple loan agreement as an automaton.</p></div>","PeriodicalId":51336,"journal":{"name":"Artificial Intelligence and Law","volume":"30 3","pages":"391 - 416"},"PeriodicalIF":4.1,"publicationDate":"2021-10-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10506-021-09300-9.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46590930","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"社会学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2021-09-15 | DOI: 10.2139/ssrn.3924212
J. Kevins
This paper examines the hindrances to copyright protection in the digital era. The author identifies six factors that pose a challenge and, in equal measure, presents remedies to mitigate those challenges.
{"title":"Copyright Protection in a Digital Environment: Some Introspection","authors":"J. Kevins","doi":"10.2139/ssrn.3924212","DOIUrl":"https://doi.org/10.2139/ssrn.3924212","url":null,"abstract":"This paper examines the hindrances to Copyright Protection in the digital era. The author is of the view that there are six factors that pose as a challenge and in equal measure presents remedies to mitigate the challenges.","PeriodicalId":51336,"journal":{"name":"Artificial Intelligence and Law","volume":"20 1","pages":""},"PeriodicalIF":4.1,"publicationDate":"2021-09-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74339660","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"社会学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2021-09-15 | DOI: 10.1007/s10506-021-09301-8
Andrea Tagarelli, Andrea Simeri
Modeling law search and retrieval as prediction problems has recently emerged as a predominant approach in law intelligence. Focusing on the law article retrieval task, we present a deep learning framework named LamBERTa, which is designed for civil-law codes and specifically trained on the Italian civil code. To our knowledge, this is the first study proposing an advanced approach to law article prediction for the Italian legal system based on a BERT (Bidirectional Encoder Representations from Transformers) learning framework, which has recently attracted increased attention among deep learning approaches, showing outstanding effectiveness in several natural language processing and learning tasks. We define LamBERTa models by fine-tuning an Italian pre-trained BERT on the Italian civil code, or portions of it, treating law article retrieval as a classification task. One key aspect of our LamBERTa framework is that we conceived it to address an extreme classification scenario, characterized by a high number of classes, the few-shot learning problem, and the lack of test query benchmarks for Italian legal prediction tasks. To address these issues, we define different methods for the unsupervised labeling of the law articles, which can in principle be applied to any law article code system. We provide insights into the explainability and interpretability of our LamBERTa models, and we present an extensive experimental analysis over query sets of different types, for single-label as well as multi-label evaluation tasks. Empirical evidence shows the effectiveness of LamBERTa and its superiority over widely used deep-learning text classifiers and over a few-shot learner conceived for an attribute-aware prediction task.
{"title":"Unsupervised law article mining based on deep pre-trained language representation models with application to the Italian civil code","authors":"Andrea Tagarelli, Andrea Simeri","doi":"10.1007/s10506-021-09301-8","DOIUrl":"10.1007/s10506-021-09301-8","url":null,"abstract":"<div><p>Modeling law search and retrieval as prediction problems has recently emerged as a predominant approach in law intelligence. Focusing on the law article retrieval task, we present a deep learning framework named LamBERTa, which is designed for civil-law codes, and specifically trained on the Italian civil code. To our knowledge, this is the first study proposing an advanced approach to law article prediction for the Italian legal system based on a BERT (Bidirectional Encoder Representations from Transformers) learning framework, which has recently attracted increased attention among deep learning approaches, showing outstanding effectiveness in several natural language processing and learning tasks. We define LamBERTa models by fine-tuning an Italian pre-trained BERT on the Italian civil code or its portions, for law article retrieval as a classification task. One key aspect of our LamBERTa framework is that we conceived it to address an extreme classification scenario, which is characterized by a high number of classes, the few-shot learning problem, and the lack of test query benchmarks for Italian legal prediction tasks. To solve such issues, we define different methods for the unsupervised labeling of the law articles, which can in principle be applied to any law article code system. We provide insights into the explainability and interpretability of our LamBERTa models, and we present an extensive experimental analysis over query sets of different type, for single-label as well as multi-label evaluation tasks. Empirical evidence has shown the effectiveness of LamBERTa, and also its superiority against widely used deep-learning text classifiers and a few-shot learner conceived for an attribute-aware prediction task.</p></div>","PeriodicalId":51336,"journal":{"name":"Artificial Intelligence and Law","volume":"30 3","pages":"417 - 473"},"PeriodicalIF":4.1,"publicationDate":"2021-09-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10506-021-09301-8.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"42634721","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"社会学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2021-08-28 | DOI: 10.1007/s10506-021-09299-z
Kiana Alikhademi, Emma Drobina, Diandra Prioleau, Brianna Richardson, Duncan Purves, Juan E. Gilbert
{"title":"Correction to: A review of predictive policing from the perspective of fairness","authors":"Kiana Alikhademi, Emma Drobina, Diandra Prioleau, Brianna Richardson, Duncan Purves, Juan E. Gilbert","doi":"10.1007/s10506-021-09299-z","DOIUrl":"10.1007/s10506-021-09299-z","url":null,"abstract":"","PeriodicalId":51336,"journal":{"name":"Artificial Intelligence and Law","volume":"30 1","pages":"19 - 20"},"PeriodicalIF":4.1,"publicationDate":"2021-08-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1007/s10506-021-09299-z","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48462414","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"社会学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2021-08-12 | DOI: 10.1007/s10506-021-09295-3
Robert Kowalski, Akber Datoo
In this paper, we present an informal introduction to Logical English (LE) and illustrate its use to standardise the legal wording of the Automatic Early Termination (AET) clauses of International Swaps and Derivatives Association (ISDA) Agreements. LE can be viewed both as an alternative to conventional legal English for expressing legal documents, and as an alternative to conventional computer languages for automating legal documents. LE is a controlled natural language (CNL), which is designed both to be computer-executable and to be readable by English speakers without special training. The basic form of LE is syntactic sugar for logic programs, in which all sentences have the same standard form, either as rules of the form conclusion if conditions or as unconditional sentences of the form conclusion. However, LE extends normal logic programming by introducing features that are present in other computer languages and other logics. These features include typed variables signalled by common nouns, and existentially quantified variables in the conclusions of sentences signalled by indefinite articles. Although LE translates naturally into a logic programming language such as Prolog or ASP, it can also serve as a neutral standard, which can be compiled into other lower-level computer languages.
{"title":"Logical English meets legal English for swaps and derivatives","authors":"Robert Kowalski, Akber Datoo","doi":"10.1007/s10506-021-09295-3","DOIUrl":"10.1007/s10506-021-09295-3","url":null,"abstract":"<div><p>In this paper, we present an informal introduction to Logical English (LE) and illustrate its use to standardise the legal wording of the Automatic Early Termination (AET) clauses of International Swaps and Derivatives Association (ISDA) Agreements. LE can be viewed both as an alternative to conventional legal English for expressing legal documents, and as an alternative to conventional computer languages for automating legal documents. LE is a controlled natural language (CNL), which is designed both to be computer-executable and to be readable by English speakers without special training. The basic form of LE is syntactic sugar for logic programs, in which all sentences have the same standard form, either as rules of the form <i>conclusion if conditions</i> or as unconditional sentences of the form <i>conclusion.</i> However, LE extends normal logic programming by introducing features that are present in other computer languages and other logics. These features include typed variables signalled by common nouns, and existentially quantified variables in the <i>conclusions</i> of sentences signalled by indefinite articles. Although LE translates naturally into a logic programming language such as Prolog or ASP, it can also serve as a neutral standard, which can be compiled into other lower-level computer languages.</p></div>","PeriodicalId":51336,"journal":{"name":"Artificial Intelligence and Law","volume":"30 2","pages":"163 - 197"},"PeriodicalIF":4.1,"publicationDate":"2021-08-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1007/s10506-021-09295-3","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"47103960","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"社会学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2021-08-04 | DOI: 10.1007/s10506-021-09297-1
Graziella De Martino, Gianvito Pio, Michelangelo Ceci
In an era characterized by fast technological progress that introduces new and unpredictable scenarios every day, working in the legal field can be very difficult without the support of the right tools. In this respect, several systems based on Artificial Intelligence methods have been proposed in the literature to support tasks in the legal sector. Following this line of research, in this paper we propose a novel method, called PRILJ, that identifies paragraph regularities in legal case judgments to support legal experts during the drafting of legal documents. Methodologically, PRILJ adopts a two-step approach that first groups documents into clusters according to their semantic content, and then identifies regularities in the paragraphs of each cluster. Embedding-based methods are adopted to represent documents and paragraphs in a semantic numerical feature space, and an approximate nearest-neighbor search method is adopted to efficiently retrieve the paragraphs most similar to those of a document under preparation. Our extensive experimental evaluation, performed on a real-world dataset provided by EUR-Lex, proves the effectiveness and efficiency of the proposed method. In particular, its ability to model different topics of legal documents and to capture the semantics of the textual content proves very beneficial for the considered task, and makes PRILJ robust to the possible presence of noise in the data.
{"title":"PRILJ: an efficient two-step method based on embedding and clustering for the identification of regularities in legal case judgments","authors":"Graziella De Martino, Gianvito Pio, Michelangelo Ceci","doi":"10.1007/s10506-021-09297-1","DOIUrl":"10.1007/s10506-021-09297-1","url":null,"abstract":"<div><p>In an era characterized by fast technological progress that introduces new unpredictable scenarios every day, working in the law field may appear very difficult, if not supported by the right tools. In this respect, some systems based on Artificial Intelligence methods have been proposed in the literature, to support several tasks in the legal sector. Following this line of research, in this paper we propose a novel method, called PRILJ, that identifies paragraph regularities in legal case judgments, to support legal experts during the redaction of legal documents. Methodologically, PRILJ adopts a two-step approach that first groups documents into clusters, according to their semantic content, and then identifies regularities in the paragraphs for each cluster. Embedding-based methods are adopted to properly represent documents and paragraphs into a semantic numerical feature space, and an Approximated Nearest Neighbor Search method is adopted to efficiently retrieve the most similar paragraphs with respect to the paragraphs of a document under preparation. Our extensive experimental evaluation, performed on a real-world dataset provided by EUR-Lex, proves the effectiveness and the efficiency of the proposed method. In particular, its ability of modeling different topics of legal documents, as well as of capturing the semantics of the textual content, appear very beneficial for the considered task, and make PRILJ very robust to the possible presence of noise in the data.</p></div>","PeriodicalId":51336,"journal":{"name":"Artificial Intelligence and Law","volume":"30 3","pages":"359 - 390"},"PeriodicalIF":4.1,"publicationDate":"2021-08-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10506-021-09297-1.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43617883","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"社会学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}