A new conceptual framework for enhancing legal information retrieval at the Brazilian Superior Court of Justice

Proceedings of the 12th International Conference on Management of Digital EcoSystems. Pub Date: 2020-11-02. DOI: 10.1145/3415958.3433087

Thiago Gomes, M. Ladeira


Citations: 8

Abstract



Effective retrieval of jurisprudence (case law) is imperative for achieving consistency and predictability in any legal system. In this work, we propose and empirically evaluate a framework for jurisprudence retrieval at the Brazilian Superior Court of Justice, with the goal of easing the retrieval of other decisions that share the same legal opinion. The experimental results show that our approach, based on text similarity, performs better than the Court's legacy system, which is based on Boolean queries. Building complex Boolean queries is a highly specialized task, so we aim to offer a tool that accepts free text as a query, without any operators. With the legacy system as the baseline, we compare the traditional TF-IDF retrieval model, the BM25 probabilistic model, and the Word2Vec model. Our results indicate that the Word2Vec Skip-Gram model, trained on a specialized legal corpus, and BM25 yield similar performance, and that both surpass the legacy system. Combining the BM25 model with embedding models improved performance by up to 19%.
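As an illustration of the free-text ranking the abstract contrasts with Boolean queries, the following is a minimal sketch of Okapi BM25 scoring. It is not the authors' implementation: the whitespace tokenizer, the parameter values `k1=1.5` and `b=0.75` (common defaults), the function name `bm25_scores`, and the three toy documents are all assumptions made for this example.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document against a free-text query with Okapi BM25.

    Tokenization is plain lowercase whitespace splitting (an assumption;
    a real legal corpus would need language-specific preprocessing).
    """
    tokenized = [d.lower().split() for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(d) for d in tokenized) / n_docs
    # Document frequency: how many documents contain each term.
    df = Counter(term for d in tokenized for term in set(d))

    scores = []
    for d in tokenized:
        tf = Counter(d)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            # Smoothed inverse document frequency (always positive).
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


# Hypothetical toy documents standing in for court decisions.
docs = [
    "appeal denied for lack of jurisdiction",
    "tax refund granted to the appellant",
    "jurisdiction of the court over tax appeal",
]
scores = bm25_scores("tax jurisdiction", docs)
# Document indices ordered from most to least relevant.
ranking = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
```

The query is plain text with no operators, and the third document ranks first because it is the only one that contains both query terms. How the paper combines BM25 with embedding models is not stated in the abstract, so that step is left out of this sketch.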


