Combining WordNet and Word Embeddings in Data Augmentation for Legal Texts
Sezen Perçin, Andrea Galassi, F. Lagioia, Federico Ruggeri, Piera Santin, G. Sartor, Paolo Torroni
Proceedings of the Natural Legal Language Processing Workshop 2022
DOI: 10.18653/v1/2022.nllp-1.4
Citations: 3
Abstract
Creating balanced labeled textual corpora for complex tasks, like legal analysis, is a challenging and expensive process that often requires the collaboration of domain experts. To address this problem, we propose a data augmentation method based on the combination of GloVe word embeddings and the WordNet ontology. We present an example of application in the legal domain, specifically on decisions of the Court of Justice of the European Union. Our evaluation with human experts confirms that our method is more robust than the alternatives.
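The abstract does not spell out how the two resources are combined. One plausible reading, offered here only as an illustrative sketch and not as the authors' actual procedure, is to draw candidate substitutes from a synonym resource and keep only those that are also close to the original word in embedding space. In the sketch below, `TOY_SYNONYMS` stands in for WordNet and `TOY_VECTORS` stands in for GloVe; both tables, the function names, and the 0.8 threshold are hypothetical and chosen purely for demonstration.

```python
import math

# Hypothetical stand-in for WordNet: word -> candidate synonyms.
TOY_SYNONYMS = {
    "court": ["tribunal", "courtyard"],
    "decision": ["ruling", "determination"],
}

# Hypothetical stand-in for GloVe: word -> small dense vector (made-up values).
TOY_VECTORS = {
    "court": [0.9, 0.1, 0.0],
    "tribunal": [0.8, 0.2, 0.1],
    "courtyard": [0.1, 0.9, 0.2],
    "decision": [0.2, 0.1, 0.9],
    "ruling": [0.3, 0.1, 0.8],
    "determination": [0.1, 0.7, 0.3],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def pick_substitute(word, threshold=0.8):
    """Return the synonym most similar to `word` in embedding space,
    or None if no candidate reaches `threshold` (an assumed cut-off)."""
    if word not in TOY_VECTORS:
        return None
    best, best_sim = None, threshold
    for candidate in TOY_SYNONYMS.get(word, []):
        if candidate not in TOY_VECTORS:
            continue
        sim = cosine(TOY_VECTORS[word], TOY_VECTORS[candidate])
        if sim >= best_sim:
            best, best_sim = candidate, sim
    return best


def augment(sentence, threshold=0.8):
    """Replace each whitespace-separated token that has an acceptable
    substitute; tokens without one are kept unchanged. Lookup is
    lowercased, so a substituted word loses its original casing."""
    tokens = sentence.split()
    return " ".join(pick_substitute(t.lower(), threshold) or t for t in tokens)


print(augment("the court issued a decision"))
```

With these toy tables, "courtyard" is listed as a synonym of "court" but is rejected because its vector points in a different direction, which illustrates why an embedding filter might help when a synonym resource offers senses that do not fit the legal context. A real pipeline would substitute an actual WordNet interface and pretrained GloVe vectors for the two dictionaries.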