A new conceptual framework for enhancing legal information retrieval at the Brazilian Superior Court of Justice
Thiago Gomes, M. Ladeira
Proceedings of the 12th International Conference on Management of Digital EcoSystems, 2020-11-02
DOI: 10.1145/3415958.3433087 (https://doi.org/10.1145/3415958.3433087)
Effective retrieval of jurisprudence (case law) is imperative for consistency and predictability in any legal system. In this work, we propose and empirically evaluate a framework for jurisprudence retrieval at the Brazilian Superior Court of Justice, with the goal of making it easier to retrieve other decisions that share the same legal opinion. The experimental results show that our approach, based on text similarity, performs better than the Court's legacy system, which is based on Boolean queries. Building complex Boolean queries requires specialized skill, so we aim to offer a tool that accepts free-text queries without any operators. With the legacy system as the baseline, we compare the traditional TF-IDF retrieval model, the BM25 probabilistic model, and the Word2Vec model. Our results indicate that the Word2Vec Skip-Gram model, trained on a specialized legal corpus, and BM25 yield similar performance, and that both surpass the legacy system. Combining the BM25 model with embedding models improved performance by up to 19%.
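To make the idea of combining BM25 with an embedding model concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not say how the two scores were combined, so the linear interpolation with weight `alpha`, the min-max normalization, and the averaging of word vectors into a document vector are all assumptions. The two-dimensional vectors in `embeddings` are hypothetical toy values standing in for a trained Word2Vec Skip-Gram model, and the BM25 variant uses the non-negative `log(1 + ...)` form of IDF.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each tokenized document against a tokenized query with BM25."""
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n
    # Document frequency: number of documents that contain each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            idf = math.log((n - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            norm = tf[term] + k1 * (1 - b + b * len(d) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def mean_vector(tokens, embeddings, dim):
    """Average the vectors of known tokens (assumed document representation)."""
    vecs = [embeddings[t] for t in tokens if t in embeddings]
    if not vecs:
        return [0.0] * dim
    return [sum(component) / len(vecs) for component in zip(*vecs)]


def cosine(u, v):
    """Cosine similarity, returning 0.0 when either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def min_max(values):
    """Rescale scores to [0, 1] so the two models are comparable."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)
    return [(x - lo) / (hi - lo) for x in values]


def hybrid_rank(query, docs, embeddings, dim, alpha=0.5):
    """Rank document indices by alpha * BM25 + (1 - alpha) * embedding cosine."""
    lexical = min_max(bm25_scores(query, docs))
    q_vec = mean_vector(query, embeddings, dim)
    semantic = min_max(
        [cosine(q_vec, mean_vector(d, embeddings, dim)) for d in docs]
    )
    combined = [alpha * l + (1 - alpha) * s for l, s in zip(lexical, semantic)]
    return sorted(range(len(docs)), key=lambda i: combined[i], reverse=True)


# Hypothetical toy corpus and toy 2-d vectors, for illustration only.
docs = [
    ["appeal", "tax", "exemption", "denied"],
    ["contract", "breach", "damages", "awarded"],
    ["levy", "exemption", "granted", "appeal"],
]
embeddings = {
    "tax": [1.0, 0.0], "levy": [0.9, 0.1], "exemption": [0.8, 0.2],
    "contract": [0.0, 1.0], "breach": [0.1, 0.9], "damages": [0.0, 1.0],
}
query = ["tax", "exemption"]  # free-text query, no Boolean operators

ranking = hybrid_rank(query, docs, embeddings, dim=2)
print(ranking)  # [0, 2, 1]
```

In this toy run the third document ranks above the second even though it never mentions "tax": it shares "exemption" with the query, and the embedding score adds that "levy" lies close to "tax". Whether a fixed `alpha` is a good choice would depend on the corpus and would need tuning against relevance judgments.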