How Could Semantic Processing and Other NLP Tools Improve Online Legal Databases?

TalTech Journal of European Studies (IF 0.6, Q2, Law) | Pub Date: 2023-12-01 | DOI: 10.2478/bjes-2023-0018

Renátó Vági

Pages: 138-151 | Publication type: Journal Article | Open access: no

Citations: 0


Abstract

The spread of online databases and increasingly sophisticated search solutions over the past 10–15 years has opened up many new opportunities for lawyers to find relevant documents. However, legal databases and legal search engines still commonly face an information crisis. Legal database providers use various information extraction solutions, especially named entity recognition (NER), to mitigate this problem. These solutions can improve the relevance of result lists. Their limitation, however, is that they can only extract and create searchable metadata entities when those entities have a well-defined location or regular form in the text. Therefore, the next era of search support for legal databases is semantic processing. Semantic processing solutions differ fundamentally from information extraction and NER because they do not merely extract specific information elements from the text and make them visible and/or searchable, but allow the text to be analyzed as a whole. In addition, developing legal databases with machine learning can be a significant burden on a company, as it is not always known what kind of AI solution is needed or how providers could compare the different solutions. Legal database providers need to tailor the processing of their documents and texts as closely as possible to their legal, linguistic, statistical, and other characteristics. This is where text-processing pipelines can help. The article therefore reviews the two main natural language processing (NLP) solutions that can help legal database providers increase the value of legal data within their databases. It then shows the importance of text-processing pipelines and frameworks in the era of digitized documents and presents the digital-twin-distiller.


Source journal

TalTech Journal of European Studies POLITICAL SCIENCE-

CiteScore

1.90

Self-citation rate

62.50%

Articles published